Summary
This paper presents a new protocol for wireless LAN's: MACAW. MACAW builds upon the MACA media access protocols by adding 2 new messages: Data-Sending and ACK resulting in a RTS-CTS-DS-DATA-ACK message exchange. MACAW also uses a different backoff algorithm.
The authors base their proposal on 4 observations of MACA proposal. 1) contention occurs at the receiver, not the sender 2) congestion is not homogenous, 3) all stations must work together to learn about the congestion levels and 4) the protocol should send info about contention periods to all devices so they can respond accordingly.
To make the use of bandwidth more fair, the authors propose a new backoff algorithm. In this algorithm, each of the stations shares their current value of the backoff counter so that all stations have the same backoff counter, resulting in fair allocation. Using multiple streams instead of single streams is another method to achieving fair allocation.
The authors add an ACK message which allows the receiver to tell the sender that they have received all the data. To solve the "exposed terminal" problem, they introduce the DS message which tells all nearby stations that a data transmission is about to occur. RRTS (Request-for-Request-to-Send) messages are also used to give all streams fair access to the media.
When evaluating MACAW, the authors found it to have an overhead of approximately 8% higher than MACA. However, it also has a throughput that is 37% higher than MACA, which more than makes up for the higher overhead.
Criticism & Questions
Although it wasn't highly critical to the protocol, I would be interested in knowing how they plan to change the packet header to accomodate the backoff counter value required for their backoff algorithm.
In addition, their backoff algorithm requires each station to copy over the backoff counter that they see in other stations' packets. I'm curious about how this could be misused. It doesn't seem like there's anyway to prevent a station from putting in a false value for other stations to use so that it can have an unfair advantage.
Feedback
I really enjoyed reading this paper. I think it was very well organized and easy to follow. I vote to keep it in the syllabus.
Wednesday, September 30, 2009
Tuesday, September 29, 2009
Understanding TCP Incast Throughput Collapse in Datacenter Networks
Summary
This paper also explored the Incast problem. The authors aim to understand the dynamics of Incast, propose an analytical model for Incast symptoms and evaluate different solutions to Incast.
The authors put forth some criteria required in a solution to Incast: (1) solution must be general and not for any particular environment, (2) an analytical model that can explain the observed Incast symptoms and (3) modifications to TCP that help improve the problem.
The authors first replicated the Incast problem in their environment. Then, they started with decreasing the minimum RTO timer value. They tried out 2 suggestions made by the authors of the previous paper: use high resolution timers and turn off delayed ACK's where possible. After trying out various combinations of the 2, they found that a low resolution RTO timer of 1 ms with delayed ACK's turned on was optimal.
Next, they propose an analytical model that accounts for the shape of the intial product and the "recovery" portion but is still missing the second order goodput decrease and doesn't predict the curves for small minimum RTO values.
Criticism & Questions
I enjoyed reading this paper. It was laid out well and easy to follow. I especially liked the graphs that were included - they were very helpful in understanding the results. One comment on the graphs is that some of the graphs were a little hard to read, mostly because the lines were very close and the symbols were hard to differentiate.
This paper also explored the Incast problem. The authors aim to understand the dynamics of Incast, propose an analytical model for Incast symptoms and evaluate different solutions to Incast.
The authors put forth some criteria required in a solution to Incast: (1) solution must be general and not for any particular environment, (2) an analytical model that can explain the observed Incast symptoms and (3) modifications to TCP that help improve the problem.
The authors first replicated the Incast problem in their environment. Then, they started with decreasing the minimum RTO timer value. They tried out 2 suggestions made by the authors of the previous paper: use high resolution timers and turn off delayed ACK's where possible. After trying out various combinations of the 2, they found that a low resolution RTO timer of 1 ms with delayed ACK's turned on was optimal.
Next, they propose an analytical model that accounts for the shape of the intial product and the "recovery" portion but is still missing the second order goodput decrease and doesn't predict the curves for small minimum RTO values.
Criticism & Questions
I enjoyed reading this paper. It was laid out well and easy to follow. I especially liked the graphs that were included - they were very helpful in understanding the results. One comment on the graphs is that some of the graphs were a little hard to read, mostly because the lines were very close and the symbols were hard to differentiate.
Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
Summary
This paper explores the Incast Problem which is generally seen in TCP workloads in datacenter networks. The incast problem occurs when a client simultaneously requests data from many servers and can't continue until it receives all the data. All the servers respond at the same time, filling the client's inbound link, reducing overall throughput drastically.
This paper focuses on using high-resolution timers to enable more granular TCP timeouts. The authors test their solution in simulations and real-world experiments. Finally, they show that this solution is safe for wide area networks.
While explaining the background of the incast problem, the authors outline 3 main features that generally lead to TCP collapse: (1) high bandwidth, low latency networks with small switch buffers (2) clients that issue barrier-synchronized requests in parallel and doesn't issue next request until all come back from previous request and (3) servers that return small amount of data each request.
For current-day datacenters, the authors find that reducing the RTO to at least 1 ms will avoid degradation when having up to 16 servers. For next-generation datacenters, they suggest adding a randomized component to the RTO.
Finally, they investigate how well their solution would work in the wide area network. They conclude that reducing RTO to 200 microseconds doesn't affect TCP performance in the wide area network.
Criticism & Questions
I think this paper did a good job of explaining the incast problem and especially the preconditions required to go into TCP incast collapse. I think they also did a good job exploring how changes in RTO affects TCP performance. I would have liked to see more information on what else could be changed to fix the incast problem. This paper makes it seem like the timer was the biggest problem, but I'm unsure if that's the case, especially after reading the second paper.
This paper explores the Incast Problem which is generally seen in TCP workloads in datacenter networks. The incast problem occurs when a client simultaneously requests data from many servers and can't continue until it receives all the data. All the servers respond at the same time, filling the client's inbound link, reducing overall throughput drastically.
This paper focuses on using high-resolution timers to enable more granular TCP timeouts. The authors test their solution in simulations and real-world experiments. Finally, they show that this solution is safe for wide area networks.
While explaining the background of the incast problem, the authors outline 3 main features that generally lead to TCP collapse: (1) high bandwidth, low latency networks with small switch buffers (2) clients that issue barrier-synchronized requests in parallel and doesn't issue next request until all come back from previous request and (3) servers that return small amount of data each request.
For current-day datacenters, the authors find that reducing the RTO to at least 1 ms will avoid degradation when having up to 16 servers. For next-generation datacenters, they suggest adding a randomized component to the RTO.
Finally, they investigate how well their solution would work in the wide area network. They conclude that reducing RTO to 200 microseconds doesn't affect TCP performance in the wide area network.
Criticism & Questions
I think this paper did a good job of explaining the incast problem and especially the preconditions required to go into TCP incast collapse. I think they also did a good job exploring how changes in RTO affects TCP performance. I would have liked to see more information on what else could be changed to fix the incast problem. This paper makes it seem like the timer was the biggest problem, but I'm unsure if that's the case, especially after reading the second paper.
Tuesday, September 22, 2009
VL2: A Scalable and Flexible Data Center Network
Summary
This paper presents a network architecture that supports dynamic resource allocation across many servers. They aim to achieve uniform high capacity, performance isolation and layer-2 semantics. They achieve these goals by using flat addressing, load balancing and address resolution at the end-system.
The authors do an initial analysis of current data center traffic. Then, they built VL2 and ran experiments to evaluate how it performs based on the 3 criteria listed above. They find that VL2 does achieve their goals of high capacity, loading balancing fairness, isolation and convergence following link failures.
Criticism & Questions
I liked how this paper was structured. It was very easy to follow their thought process as they did their study. I also liked that they studied current traffic patterns in datacenters. It was good to have a baseline to compare with when reading this paper.
This paper presents a network architecture that supports dynamic resource allocation across many servers. They aim to achieve uniform high capacity, performance isolation and layer-2 semantics. They achieve these goals by using flat addressing, load balancing and address resolution at the end-system.
The authors do an initial analysis of current data center traffic. Then, they built VL2 and ran experiments to evaluate how it performs based on the 3 criteria listed above. They find that VL2 does achieve their goals of high capacity, loading balancing fairness, isolation and convergence following link failures.
Criticism & Questions
I liked how this paper was structured. It was very easy to follow their thought process as they did their study. I also liked that they studied current traffic patterns in datacenters. It was good to have a baseline to compare with when reading this paper.
PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
Summary
This paper discussed layer 2 protocols that would be optimal for data centers. The goals they try to achieve are i) VM's should be easily migratable without changing IP addresses, ii) no admin configuration required, iii) every node should be able to communicate with every other node, iv) no forwarding loops and v) recover from faults quickly.
PortLand uses a centralized fabric manager that maintains soft state about the network. Nodes uses a psuedo MAC address instead of their actual MAC address. The fabric manager is also able to reduce the overhead that results from broadcasting ARPs.
The authors compare PortLand to 2 other architectures with similar goals: TRILL and SEATTLE. Finally, they run experiments on their testbed to evaluate PortLand using the following criteria: convergence time with increasing faults, tcp convergence, multicast convergence, scalability and VM Migration.
Criticism & Questions
The authors do several experiments to evaluate the criteria given above and present those results quantitatively. However, it would be helpful to know how those numbers compare to what currently happens (ie. they say that after vm migration, communication resumes after 32 seconds - how does this compare to current vm migration?)
This paper discussed layer 2 protocols that would be optimal for data centers. The goals they try to achieve are i) VM's should be easily migratable without changing IP addresses, ii) no admin configuration required, iii) every node should be able to communicate with every other node, iv) no forwarding loops and v) recover from faults quickly.
PortLand uses a centralized fabric manager that maintains soft state about the network. Nodes uses a psuedo MAC address instead of their actual MAC address. The fabric manager is also able to reduce the overhead that results from broadcasting ARPs.
The authors compare PortLand to 2 other architectures with similar goals: TRILL and SEATTLE. Finally, they run experiments on their testbed to evaluate PortLand using the following criteria: convergence time with increasing faults, tcp convergence, multicast convergence, scalability and VM Migration.
Criticism & Questions
The authors do several experiments to evaluate the criteria given above and present those results quantitatively. However, it would be helpful to know how those numbers compare to what currently happens (ie. they say that after vm migration, communication resumes after 32 seconds - how does this compare to current vm migration?)
Thursday, September 17, 2009
Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises
Summary
This paper puts forth a new network architecture, SEATTLE, that combines the simplicity of Ethernet and the scalability of IP. SEATTLE achieves this with 3 main features. First, it uses a one-hop, network layer DHT, which doesn't require hosts to flood the network to disseminate host locations and instead has a global switch-level view that stores the location of each host. Second, it uses a traffic-driven location resolution and caching, which helps packets travel the shortest path to their destination. Finally, it uses a scalable, prompt cache-update protocol which uses unicast to inform other switches when a hash table entry is invalid, leading to faster reaction times.
The authors run several simulations using real traces supplemented with synthetic data. They use 4 topologies of various sizes. They find that SEATTLE uses much smaller tables, has 1000 times less control overhead, a smaller stretch than ROFL and reacts faster to network changes and host mobility. SEATTLE is also tested in a real deployment and the results were found to be comparable to those in the simulation.
Criticism & Questions
I think this was paper was well-written and conveyed the features included in SEATTLE well. I think this network architecture has real potential and would be interested in seeing how it performs when implemented in a more wide-scale deployment. As the authors mention, it would also be interesting to see how this architecture could be improved. One thing I would have liked to see is more information on the issues that would come up when implementing SEATTLE in practice.
This paper puts forth a new network architecture, SEATTLE, that combines the simplicity of Ethernet and the scalability of IP. SEATTLE achieves this with 3 main features. First, it uses a one-hop, network layer DHT, which doesn't require hosts to flood the network to disseminate host locations and instead has a global switch-level view that stores the location of each host. Second, it uses a traffic-driven location resolution and caching, which helps packets travel the shortest path to their destination. Finally, it uses a scalable, prompt cache-update protocol which uses unicast to inform other switches when a hash table entry is invalid, leading to faster reaction times.
The authors run several simulations using real traces supplemented with synthetic data. They use 4 topologies of various sizes. They find that SEATTLE uses much smaller tables, has 1000 times less control overhead, a smaller stretch than ROFL and reacts faster to network changes and host mobility. SEATTLE is also tested in a real deployment and the results were found to be comparable to those in the simulation.
Criticism & Questions
I think this was paper was well-written and conveyed the features included in SEATTLE well. I think this network architecture has real potential and would be interested in seeing how it performs when implemented in a more wide-scale deployment. As the authors mention, it would also be interesting to see how this architecture could be improved. One thing I would have liked to see is more information on the issues that would come up when implementing SEATTLE in practice.
Detailed Diagnosis in Enterprise Networks
Summary
The paper describes a new diagnostic system that provides operators more details on what occurred. The goals of the system are to get the diagnostic information without knowing anything about the applications as well as to use inference to see which entities in the network are impacting each other.
They do an analysis of several example problems and classify them into 3 categories: what was impacted, symptom and identified cause. Their inference model uses history to gauge the current impact. They use a strategy similar to probabilistic inference [ie. Prob (D = Dnow | S = Snow) and make the assumption that conditional probability affects causality.
The system, NetMedic, is composed of 3 main functions: capture current component state, generate dependency graph, and diagnose using the component state and dependency graph.
They evaluate NetMedic in many different situations and on many different faults. NetMedic performed well. It identified the correct component 80% of the time and the number is only slightly lower for simultaneously occurring faults.
Criticism & Questions
I enjoyed this paper. The authored laid our their assumptions initially, suggesting how those could affect their results. They did evaluate NetMedic in many different situations, but unfortunately they weren't able to test it on real faults and had to make do with faults they injected. I would be interested in seeing how this would perform when met with real faults, happening in real time (ie. many simultaneous and overlapping faults occurring)
The paper describes a new diagnostic system that provides operators more details on what occurred. The goals of the system are to get the diagnostic information without knowing anything about the applications as well as to use inference to see which entities in the network are impacting each other.
They do an analysis of several example problems and classify them into 3 categories: what was impacted, symptom and identified cause. Their inference model uses history to gauge the current impact. They use a strategy similar to probabilistic inference [ie. Prob (D = Dnow | S = Snow) and make the assumption that conditional probability affects causality.
The system, NetMedic, is composed of 3 main functions: capture current component state, generate dependency graph, and diagnose using the component state and dependency graph.
They evaluate NetMedic in many different situations and on many different faults. NetMedic performed well. It identified the correct component 80% of the time and the number is only slightly lower for simultaneously occurring faults.
Criticism & Questions
I enjoyed this paper. The authored laid our their assumptions initially, suggesting how those could affect their results. They did evaluate NetMedic in many different situations, but unfortunately they weren't able to test it on real faults and had to make do with faults they injected. I would be interested in seeing how this would perform when met with real faults, happening in real time (ie. many simultaneous and overlapping faults occurring)
Tuesday, September 15, 2009
Congestion Control for High Bandwidth-Delay Product Networks
Summary
This paper addresses the problems resulting from increase in per-flow bandwidth and high latency links, and proposes a solution to this problem.
As more high-bandwidth links and low-latency satellite links get added to the internet, TCP becomes more and more inefficient, no matter what queueing scheme is used. The authors put forth a new protocol called Explicit Control Protocol (XCP), which they've found to outperform TCP in both high-bandwidth, low-latency situations as well as conventional situations. The authors also argue that XCP achieves fair bandwidth allocation, short queues, high utilization and near-zero packet drops. One major feature of XCP is that the controllers for fairness and efficiency are decoupled, allowing each controller to optimize for the one area.
XCP does not use a binary feedback mechanism to signal congestion. Instead, it adds a feedback field in the header, which holds a number that indicates how much the sender should slow down, based on the bottleneck link through the network. XCP routers need to compute the feedback so as to optimize efficiency and min-max fairness.
The authors run several simulations to test XCP performance against TCP Reno. It uses the following queueing protocols: random early discard, random early marking, adaptive virtual queue, and core stateless fair queueing. Their simulations confirm that XCP has high utilization and fairness. It has short queues and near-zero packet drops. Over all the queueing protocols, XCP always performed at least as well as TCP.
Criticism and Questions
I thought this paper was interesting. It's one of the few papers that tries to rebuild a protocol from scratch, which definitely has its own problems. Generally, those problems surround the practicality of the protocol. I liked how the authors stated their assumptions upfront, but also addressed the practicality of deploying a protocol using a new packet format, and provided suggestions on how to do so. One thing I am still wondering about is what the impact would be on the sources to use and understand the XCP packet format.
This paper addresses the problems resulting from increase in per-flow bandwidth and high latency links, and proposes a solution to this problem.
As more high-bandwidth links and low-latency satellite links get added to the internet, TCP becomes more and more inefficient, no matter what queueing scheme is used. The authors put forth a new protocol called Explicit Control Protocol (XCP), which they've found to outperform TCP in both high-bandwidth, low-latency situations as well as conventional situations. The authors also argue that XCP achieves fair bandwidth allocation, short queues, high utilization and near-zero packet drops. One major feature of XCP is that the controllers for fairness and efficiency are decoupled, allowing each controller to optimize for the one area.
XCP does not use a binary feedback mechanism to signal congestion. Instead, it adds a feedback field in the header, which holds a number that indicates how much the sender should slow down, based on the bottleneck link through the network. XCP routers need to compute the feedback so as to optimize efficiency and min-max fairness.
The authors run several simulations to test XCP performance against TCP Reno. It uses the following queueing protocols: random early discard, random early marking, adaptive virtual queue, and core stateless fair queueing. Their simulations confirm that XCP has high utilization and fairness. It has short queues and near-zero packet drops. Over all the queueing protocols, XCP always performed at least as well as TCP.
Criticism and Questions
I thought this paper was interesting. It's one of the few papers that tries to rebuild a protocol from scratch, which definitely has its own problems. Generally, those problems surround the practicality of the protocol. I liked how the authors stated their assumptions upfront, but also addressed the practicality of deploying a protocol using a new packet format, and provided suggestions on how to do so. One thing I am still wondering about is what the impact would be on the sources to use and understand the XCP packet format.
Labels:
congestion avoidance,
high bandwidth-delay,
TCP,
XCP
Random Early Detection Gateways for Congestion Avoidance
Summary
This paper explains how Random Early Detection can be used in gateways for congestion avoidance. RED is based on the idea that gateways are the best places to detect congestion. It has the best view and can detect the difference between propagation delay and persistent queueing delay.
RED uses average queue size to detect congestion just as it begins to appear. The RED gateway would then notify the sender of this congestion by marking the packet (either by dropping the packet or setting a bit in the header). The gateway decides which packet to randomly mark based on a certain probability. With this strategy, RED can ensure that a sender is told to reduce its window size proportional to its share of the bandwidth. Additional features of RED are that it is able to keep the average queue size low, accommodate the occasional burst of traffic and avoid global synchronization.
When evaluating RED gateways, the authors used 7 criteria. These included congestion avoidance, appropriate time scales, no global synchronization, simplicity, fairness and appropriate for a range of environments. They explained how RED met each of those criteria.
Criticism and Questions
The authors mention that fairness is inherently built into the RED protocol. It cannot ensure that all users get the same amount of bandwidth. It is mentioned that it is possible to add in more mechanisms that implement fairness. I think it would be interesting to see the result of adding in those mechanisms. Fairness has been one of the major goals of the network and would help determine if a protocol could be used in practice. Therefore, it would nice to see RED's performance once the fairness mechanisms were added in.
Feedback
I enjoyed reading this paper. It laid out the basics of the RED protocol clearly and had some great simulations. I would say keep this paper in the syllabus as it shows another approach to congestion avoidance.
This paper explains how Random Early Detection can be used in gateways for congestion avoidance. RED is based on the idea that gateways are the best places to detect congestion. It has the best view and can detect the difference between propagation delay and persistent queueing delay.
RED uses average queue size to detect congestion just as it begins to appear. The RED gateway would then notify the sender of this congestion by marking the packet (either by dropping the packet or setting a bit in the header). The gateway decides which packet to randomly mark based on a certain probability. With this strategy, RED can ensure that a sender is told to reduce its window size proportional to its share of the bandwidth. Additional features of RED are that it is able to keep the average queue size low, accommodate the occasional burst of traffic and avoid global synchronization.
When evaluating RED gateways, the authors used 7 criteria. These included congestion avoidance, appropriate time scales, no global synchronization, simplicity, fairness and appropriate for a range of environments. They explained how RED met each of those criteria.
Criticism and Questions
The authors mention that fairness is inherently built into the RED protocol. It cannot ensure that all users get the same amount of bandwidth. It is mentioned that it is possible to add in more mechanisms that implement fairness. I think it would be interesting to see the result of adding in those mechanisms. Fairness has been one of the major goals of the network and would help determine if a protocol could be used in practice. Therefore, it would nice to see RED's performance once the fairness mechanisms were added in.
Feedback
I enjoyed reading this paper. It laid out the basics of the RED protocol clearly and had some great simulations. I would say keep this paper in the syllabus as it shows another approach to congestion avoidance.
Wednesday, September 9, 2009
Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks
Summary
This paper explores an alternative fair queueing algorithm that would require much less complexity in the routers. The authors are attempting to get the fairness that results from FQ and DRR without storing any state in the core routers. The authors propose a distributed algorithm, where the edge routers label the packets based on their arrival rate, etc and the core routers simply use FIFO along with the the labels to determine which packets to drop. This means that only the edge routers will be slowed down. The authors describe in detail several algorithms to calculate the flow arrival rate, link fair rate and label values as well as an alternative parameter to support flows with different weights. They use performance bounds to determine that ill-behaved users can only exploit the system for short amounts of time.
They run several simulations with the CSFQ, comparing it to several other algorithms. Two of the algorithms, FIFO and RED, are used as the baseline with no fairness attempted. The other two algorithms, FRED and DRR, represent two approaches to fairness. DRR is the optimal algorithm for fairness. The results of the simulations are that CSFQ achieves a reasonable amount of fairness. CSFQ is much better than FIFO and RED. It is equivalent to FRED in many cases but has the advantage of requiring less complexity. It's generally worse than DRR.
The authors finish up with an explanation of the unfriendly flow problem and two approaches to solve it as well as how to punish ill-behaved hosts.
Criticism & Questions
A large part of the CSFQ algorithm involves the edge routers labeling the packets, however they don't explain where in the packet header they plan to put the label.
When looking at the weighted CSFQ, they say that the algorithm can't accommodate flows with different weights within an island. It would be good to know how often this occurs and whether or not it's a practical assumption (especially because they talk about an entire ISP being one island).
When explaining the unfriendly flow problem, they mention that the percentage of noncooperative hosts on the on the network is unknown. Many of the congestion control and flow control algorithms are based on the assumption that there are many noncoopertive hosts in the network - it would be very beneficial to be able to quantify this.
This paper explores an alternative fair queueing algorithm that would require much less complexity in the routers. The authors are attempting to get the fairness that results from FQ and DRR without storing any state in the core routers. The authors propose a distributed algorithm, where the edge routers label the packets based on their arrival rate, etc and the core routers simply use FIFO along with the the labels to determine which packets to drop. This means that only the edge routers will be slowed down. The authors describe in detail several algorithms to calculate the flow arrival rate, link fair rate and label values as well as an alternative parameter to support flows with different weights. They use performance bounds to determine that ill-behaved users can only exploit the system for short amounts of time.
They run several simulations with the CSFQ, comparing it to several other algorithms. Two of the algorithms, FIFO and RED, are used as the baseline with no fairness attempted. The other two algorithms, FRED and DRR, represent two approaches to fairness. DRR is the optimal algorithm for fairness. The results of the simulations are that CSFQ achieves a reasonable amount of fairness. CSFQ is much better than FIFO and RED. It is equivalent to FRED in many cases but has the advantage of requiring less complexity. It's generally worse than DRR.
The authors finish up with an explanation of the unfriendly flow problem and two approaches to solve it as well as how to punish ill-behaved hosts.
Criticism & Questions
A large part of the CSFQ algorithm involves the edge routers labeling the packets, however they don't explain where in the packet header they plan to put the label.
When looking at the weighted CSFQ, they say that the algorithm can't accommodate flows with different weights within an island. It would be good to know how often this occurs and whether or not it's a practical assumption (especially because they talk about an entire ISP being one island).
When explaining the unfriendly flow problem, they mention that the percentage of noncooperative hosts on the on the network is unknown. Many of the congestion control and flow control algorithms are based on the assumption that there are many noncoopertive hosts in the network - it would be very beneficial to be able to quantify this.
Analysis and Simulation of a Fair Queueing Algorithm
Summary
This paper talks about a fair queueing algorithm that is based on Nagle's fair queueing (FQ) algorithm. The authors set forth 3 quantities that they want to be able to measure and optimize: bandwidth, promptness and buffer space. The main requirement they want to meet is fair bandwidth and buffer space allocation. They use the max-min criterion to define fairness. They also define what a user could be and decide that a user will be a source-destination pair.
They describe the algorithm in detail, which is trying to simulate the bit-by-bit round-robin algorithm. They claim the algorithm they describe asymptotically approaches the fair bandwidth allocation of the bit-by-bit round-robin algorithm. They also describe an addition to the algorithm that gives users who use less than their share of the bandwidth more promptness. And, when the queue becomes full, the last packet of the user with the highest amount of buffer space is dropped, which essentially penalizes ill-behaved hosts.
Finally, the authors run several simulations to compare FQ to FCFS using various flow control algorithms. When running the simulations, they looked at 4 performance criterion: total throughput, average rtt of the packets, number of packet retransmissions and number of dropped packets. The results showed that FQ works the best when paired with the JK flow control algorithm, giving end-hosts an incentive to implement more intelligent flow control algorithms.
Criticism & Questions
When choosing their definition of user, they mention the drawback that comes from an ill-behaved host opening up multiple connections with different destinations in order to congest the network. However, they didn't try this scenario out in the simulations - that would have been interesting to see.
They also put forth several alternative definitions to user and mention that the definition can be easily changed in their algorithm. Their algorithm would be more compelling if they could run their simulations with all the types of users to show that it works equally well for all of them or how it works better or worse for some definitions of user.
This paper talks about a fair queueing algorithm that is based on Nagle's fair queueing (FQ) algorithm. The authors set forth 3 quantities that they want to be able to measure and optimize: bandwidth, promptness and buffer space. The main requirement they want to meet is fair bandwidth and buffer space allocation. They use the max-min criterion to define fairness. They also define what a user could be and decide that a user will be a source-destination pair.
They describe the algorithm in detail, which is trying to simulate the bit-by-bit round-robin algorithm. They claim the algorithm they describe asymptotically approaches the fair bandwidth allocation of the bit-by-bit round-robin algorithm. They also describe an addition to the algorithm that gives users who use less than their share of the bandwidth more promptness. And, when the queue becomes full, the last packet of the user with the highest amount of buffer space is dropped, which essentially penalizes ill-behaved hosts.
Finally, the authors run several simulations to compare FQ to FCFS using various flow control algorithms. When running the simulations, they looked at 4 performance criterion: total throughput, average rtt of the packets, number of packet retransmissions and number of dropped packets. The results showed that FQ works the best when paired with the JK flow control algorithm, giving end-hosts an incentive to implement more intelligent flow control algorithms.
Criticism & Questions
When choosing their definition of user, they mention the drawback that comes from an ill-behaved host opening up multiple connections with different destinations in order to congest the network. However, they didn't try this scenario out in the simulations - that would have been interesting to see.
They also put forth several alternative definitions to user and mention that the definition can be easily changed in their algorithm. Their algorithm would be more compelling if they could run their simulations with all the types of users to show that it works equally well for all of them or how it works better or worse for some definitions of user.
Labels:
fair queueing,
fcfs,
first come first served,
mix-max fairness,
queueing
Tuesday, September 8, 2009
Congestion Avoidance and Control
Summary
This paper provides some simple algorithms to deal with congestion avoidance and control. It also gives some examples of what happens with and without those algorithms. Specifically, the authors put forth 7 new algorithms: RTT variance estimation, exponential retransmit timer backoff, slow-start, more aggressive receiver ack policy and dynamic window sizing on congestion. They determined that the main cause for congestion control is a violation of conversation of packets, which can be caused by (i) the connection failing to get to equilibrium, (ii) sender injecting a new packet before old one has exited and (iii) by being unable to reach equilibrium because of resource limits along the path.
To solve the first problem, they propose the slow-start algorithm. This algorithm increases the congestion window by 1 each time it gets an ack for new data. One the first problem is solved, the second problem is caused by a incorrect implementation of the retransmit timer. The authors say that not estimating the variation in RTT is usually the cause of an incorrect implementation and they provide some suggestions on how to improve it. They also state that exponential backoff is the best scheme for backoff after a retransmit. Finally, for the third problem, they suggest using a retransmit to indicate that the network is congested. Once the sender knows this, they should use the additive increase/multiplicative decrease algorithm to adjust their congestion window size.
Finally, the authors give information about their future work. It involves using the gateways to ensure fair sharing of the network capacity. They argue that the gateway should 'self-protect' against misbehaving hosts. They finish off by mentioning work on being able to detect congestion early because it can be fixed faster the earlier it is detected.
Criticisms & Questions
I liked that they talked about how to deal with uncooperative hosts and suggested gateways as a possible place to automatically moderate them. I would like to know more about this mechanism and if it's been implemented, whether or not it works in the real world.
I also liked that they point out that congestion avoidance and slow-start are 2 different algorithms with 2 different objectives. I was a little confused about that myself and was glad to have that cleared up.
This paper provides some simple algorithms to deal with congestion avoidance and control. It also gives some examples of what happens with and without those algorithms. Specifically, the authors put forth 7 new algorithms: RTT variance estimation, exponential retransmit timer backoff, slow-start, more aggressive receiver ack policy and dynamic window sizing on congestion. They determined that the main cause for congestion control is a violation of conversation of packets, which can be caused by (i) the connection failing to get to equilibrium, (ii) sender injecting a new packet before old one has exited and (iii) by being unable to reach equilibrium because of resource limits along the path.
To solve the first problem, they propose the slow-start algorithm. This algorithm increases the congestion window by 1 each time it gets an ack for new data. One the first problem is solved, the second problem is caused by a incorrect implementation of the retransmit timer. The authors say that not estimating the variation in RTT is usually the cause of an incorrect implementation and they provide some suggestions on how to improve it. They also state that exponential backoff is the best scheme for backoff after a retransmit. Finally, for the third problem, they suggest using a retransmit to indicate that the network is congested. Once the sender knows this, they should use the additive increase/multiplicative decrease algorithm to adjust their congestion window size.
Finally, the authors give information about their future work. It involves using the gateways to ensure fair sharing of the network capacity. They argue that the gateway should 'self-protect' against misbehaving hosts. They finish off by mentioning work on being able to detect congestion early because it can be fixed faster the earlier it is detected.
Criticisms & Questions
I liked that they talked about how to deal with uncooperative hosts and suggested gateways as a possible place to automatically moderate them. I would like to know more about this mechanism and if it's been implemented, whether or not it works in the real world.
I also liked that they point out that congestion avoidance and slow-start are 2 different algorithms with 2 different objectives. I was a little confused about that myself and was glad to have that cleared up.
Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks
Summary
The authors in this paper aim to find the "best" congestion avoidance algorithm. The criteria they use to measure an algorithm are efficiency, fairness, convergence, time and size of oscillations. Congestion avoidance algorithms aim to keep the load oscillating around the "knee". They try a set of algorithms which fall under "increase/decrease" algorithms using a binary feedback scheme. This means that the sender will only get feedback about whether the network is overloaded or not. If the sender gets feedback that the network is overloaded, the sender should decrease the packets its sending and similarly if the network is not overloaded. The 4 main combinations that the authors test are 1) additive increase, additive decrease 2) additive increase, multiplicative decrease 3) multiplicative increase, additive decrease and 4) multiplicative increase, multiplicative decrease. The authors then lay out a mechanism to determine if and when a control converges to efficiency. Then, they figure out which policy of the 4 listed above minimize convergence time and the oscillations that occur. They conclude that additive increase with multiplicative decrease is the optimal policy when considering efficiency and fairness.
Criticisms & Questions
One assumption that the authors make is that "all users receive the same feedback and react to it". I would like to know if this assumption holds in the real world. I'm curious to find out if senders are always required to follow congestion control feedback or if there are ways to bypass them. And, if it is possible to bypass them, are there mechanisms in the congestion control protocol that would be able to control these uncooperative senders?
Feedback
I really liked this paper. It was easy to read and very clear in laying out their plan and their logic.
The authors in this paper aim to find the "best" congestion avoidance algorithm. The criteria they use to measure an algorithm are efficiency, fairness, convergence, time and size of oscillations. Congestion avoidance algorithms aim to keep the load oscillating around the "knee". They try a set of algorithms which fall under "increase/decrease" algorithms using a binary feedback scheme. This means that the sender will only get feedback about whether the network is overloaded or not. If the sender gets feedback that the network is overloaded, the sender should decrease the packets its sending and similarly if the network is not overloaded. The 4 main combinations that the authors test are 1) additive increase, additive decrease 2) additive increase, multiplicative decrease 3) multiplicative increase, additive decrease and 4) multiplicative increase, multiplicative decrease. The authors then lay out a mechanism to determine if and when a control converges to efficiency. Then, they figure out which policy of the 4 listed above minimize convergence time and the oscillations that occur. They conclude that additive increase with multiplicative decrease is the optimal policy when considering efficiency and fairness.
Criticisms & Questions
One assumption that the authors make is that "all users receive the same feedback and react to it". I would like to know if this assumption holds in the real world. I'm curious to find out if senders are always required to follow congestion control feedback or if there are ways to bypass them. And, if it is possible to bypass them, are there mechanisms in the congestion control protocol that would be able to control these uncooperative senders?
Feedback
I really liked this paper. It was easy to read and very clear in laying out their plan and their logic.
Thursday, September 3, 2009
Understanding BGP Misconfiguration
Summary
This paper does a quantitative study and analysis of BGP misconfiguration errors. The authors spent 3 weeks analyzing routing table advertisements to track potential misconfiguration errors. After the 3 weeks, they used the final routing tables to poll a random host within each AS to see if it was reachable. If it wasn't reachable, they emailed the network operators to get more information about the misconfiguration.
Their study focused on 2 types of misconfigurations: origin and export. An origin misconfiguration is when an AS accidentally introduces a route into the global BGP tables. To test this, they looked at short-lived routes because they assumed that an operator would quickly correct configuration errors. An export misconfiguration is when an AS accidentally exports a route to a peer that violates the AS's policy. They use Gao's algorithm to infer the relationships between the ASes. Along with Gao's algorithm, they looked for short-lived paths that violated the "valley free" property.
Once they had the potential misconfigurations, they needed to confirm. They emailed the network operators that they could reach to ask if the incident was in fact a misconfiguration. They further confirmed that by trying to reach random hosts in each AS to see if the AS was still live.
In their results, they found that at least 72% of new routes seen in a day resulted from an origin misconfiguration. Out of those, only 4% resulted in a loss of connectivity. In terms of their duration, over half lasted less than 10 minutes and 80% were corrected within the hour.
Then, the authors explain the various causes of origin misconfiguration (initialization bugs, old configuration, hijacks, forgotten filter, etc.) and export misconfiguration (prefix based configuration, bad ACL).
Finally, the authors describe several solutions to mitigate the problem of misconfiguration. The first is improving the human interface design by using principles such as safe defaults and large edit distance between the wrong and correct version. The second is to provide the operators with a high level language to configure the routers rather than the error-prone low-level language. The third is to do more consistency checks on the data directly in the routers. The last is to extend the protocol like S-BGP does.
Criticisms & Questions
One of their statistics is that about 75% of all new prefix advertisements were results of misconfigurations. However, they say that only 1 in 25 affects connectivity. I'm curious about what happens to the remaining packets that aren't affecting connectivity. Are they simply being ignored by the router and not being inserted in the BGP table, or are they added in and just aren't affecting connectivity? I was fairly surprised that such a high fraction of packets are results of misconfigurations.
When talking about the causes of the misconfigurations, the authors focused on slips and mistakes, both of which are not malicious. However, I would like to know if and how many of the misconfigurations were caused by someone using it as an attack. It seems like an attacker or a shady operator could intentionally cause problems with these misconfigurations.
Feedback
I enjoyed reading the paper and the authors did a good job answering many of the concerns I had as I read the paper.
This paper does a quantitative study and analysis of BGP misconfiguration errors. The authors spent 3 weeks analyzing routing table advertisements to track potential misconfiguration errors. After the 3 weeks, they used the final routing tables to poll a random host within each AS to see if it was reachable. If it wasn't reachable, they emailed the network operators to get more information about the misconfiguration.
Their study focused on 2 types of misconfigurations: origin and export. An origin misconfiguration is when an AS accidentally introduces a route into the global BGP tables. To test this, they looked at short-lived routes because they assumed that an operator would quickly correct configuration errors. An export misconfiguration is when an AS accidentally exports a route to a peer that violates the AS's policy. They use Gao's algorithm to infer the relationships between the ASes. Along with Gao's algorithm, they looked for short-lived paths that violated the "valley free" property.
Once they had the potential misconfigurations, they needed to confirm. They emailed the network operators that they could reach to ask if the incident was in fact a misconfiguration. They further confirmed that by trying to reach random hosts in each AS to see if the AS was still live.
In their results, they found that at least 72% of new routes seen in a day resulted from an origin misconfiguration. Out of those, only 4% resulted in a loss of connectivity. In terms of their duration, over half lasted less than 10 minutes and 80% were corrected within the hour.
Then, the authors explain the various causes of origin misconfiguration (initialization bugs, old configuration, hijacks, forgotten filter, etc.) and export misconfiguration (prefix based configuration, bad ACL).
Finally, the authors describe several solutions to mitigate the problem of misconfiguration. The first is improving the human interface design by using principles such as safe defaults and large edit distance between the wrong and correct version. The second is to provide the operators with a high level language to configure the routers rather than the error-prone low-level language. The third is to do more consistency checks on the data directly in the routers. The last is to extend the protocol like S-BGP does.
Criticisms & Questions
One of their statistics is that about 75% of all new prefix advertisements were results of misconfigurations. However, they say that only 1 in 25 affects connectivity. I'm curious about what happens to the remaining packets that aren't affecting connectivity. Are they simply being ignored by the router and not being inserted in the BGP table, or are they added in and just aren't affecting connectivity? I was fairly surprised that such a high fraction of packets are results of misconfigurations.
When talking about the causes of the misconfigurations, the authors focused on slips and mistakes, both of which are not malicious. However, I would like to know if and how many of the misconfigurations were caused by someone using it as an attack. It seems like an attacker or a shady operator could intentionally cause problems with these misconfigurations.
Feedback
I enjoyed reading the paper and the authors did a good job answering many of the concerns I had as I read the paper.
Interdomain Internet Routing
Summary
This paper focuses on BGP (Border Gateway Protocol) - both the technical details of the protocol as well as the policy that controls the flow of traffic.
BGP is the interdomain routing protocol, used by routers at the boundaries of ISP's to share routing information. What actually controls the flow of traffic are the policies put in place by AS's. An AS is owned by a commercial entity and determines its policies based largely on financial reasons as well as its customers' needs. Each AS uses its own Internal Gateway Protocol.
There are 2 types of inter-AS relationships: transit and peering. A transit relationship usually involves a financial settlement from one AS (the customer) to the other (the provider). A peering relationship, on the other hand, does not include a financial settlement. Instead, they are usually between business competitors and involve reciprocal agreements allowing each AS to access a subset of the other's routing tables.
An AS must decide which routes it will export to its neighboring AS's. Exporting a route means that it agrees to take all traffic to that destination IP address. Because an ISP agrees to provide its customers access to anywhere on the web, it will prioritize the exporting of routes to customers first. In addition, it will export all paths to its customers to the other AS's. An AS will also export some peering routes.
When deciding which routes to import, an AS will always prioritize a customer over a peer over a provider. All the import and export rules are stored in the LOCAL PREF.
BGP itself is a simple protocol, but is designed so that it can also implement policy. BGP was designed to meet 3 goals: scalability, policy and cooperation under competitive circumstances. A BGP router initially shares the subset of routes it wants to share with another BGP router. After that, it simply sends "update" and "withdraw" messages.
There are 2 types of BPG sessions: iBGP and eBPG. eBPG sessions are used between BPG routers in different ASes while iBPG sessions are between routers within the same AS. iBGP is used to share externally learned routes within the AS.
Then the paper provides the details for the various attributes in a BGP route announcement. It also goes into some of the security problems with BGP, giving a real-world example involving Pakistan Telecom and Youtube.
Criticisms & Questions
One thing that was unclear to me throughout the paper was the relationship between an AS and an ISP. I couldn't figure out if an ISP was made up several ASes, if an AS can cover several ISP's, if an ISP is an AS, etc. I would have liked it if the paper had made that point clear when it introduced ASes.
The paper went into detail about the "full-mesh" implementation of iBGP. It also offered route reflectors and confederation of BGP routers as 2 alternative implementations. I would like to know which of those implementations is most commonly used in practice? It also wasn't clear to me if the iBGP implementation is something that each AS chooses and so would have to be independent of any other AS's iBGP.
The problem with the lack of origin authentication really surprised me. It seems to me that "hijacking" routes is not very hard to do. And, if that is the case, why is it that more attackers are not taking advantage of it (possibly to launch a DOS attack)? And, are there any other measures being taken to protect against this vulnerability aside from the poorly-maintained registry?
Feedback
I would say definitely keep this paper in the syllabus. It helped me to understand a lot more about the reasoning behind the inter-AS policies. It's also nice to see the clear contrast between what's technically possible and ideal and what actually happens in practice.
This paper focuses on BGP (Border Gateway Protocol) - both the technical details of the protocol as well as the policy that controls the flow of traffic.
BGP is the interdomain routing protocol, used by routers at the boundaries of ISP's to share routing information. What actually controls the flow of traffic are the policies put in place by AS's. An AS is owned by a commercial entity and determines its policies based largely on financial reasons as well as its customers' needs. Each AS uses its own Internal Gateway Protocol.
There are 2 types of inter-AS relationships: transit and peering. A transit relationship usually involves a financial settlement from one AS (the customer) to the other (the provider). A peering relationship, on the other hand, does not include a financial settlement. Instead, they are usually between business competitors and involve reciprocal agreements allowing each AS to access a subset of the other's routing tables.
An AS must decide which routes it will export to its neighboring AS's. Exporting a route means that it agrees to take all traffic to that destination IP address. Because an ISP agrees to provide its customers access to anywhere on the web, it will prioritize the exporting of routes to customers first. In addition, it will export all paths to its customers to the other AS's. An AS will also export some peering routes.
When deciding which routes to import, an AS will always prioritize a customer over a peer over a provider. All the import and export rules are stored in the LOCAL PREF.
BGP itself is a simple protocol, but is designed so that it can also implement policy. BGP was designed to meet 3 goals: scalability, policy and cooperation under competitive circumstances. A BGP router initially shares the subset of routes it wants to share with another BGP router. After that, it simply sends "update" and "withdraw" messages.
There are 2 types of BPG sessions: iBGP and eBPG. eBPG sessions are used between BPG routers in different ASes while iBPG sessions are between routers within the same AS. iBGP is used to share externally learned routes within the AS.
Then the paper provides the details for the various attributes in a BGP route announcement. It also goes into some of the security problems with BGP, giving a real-world example involving Pakistan Telecom and Youtube.
Criticisms & Questions
One thing that was unclear to me throughout the paper was the relationship between an AS and an ISP. I couldn't figure out if an ISP was made up several ASes, if an AS can cover several ISP's, if an ISP is an AS, etc. I would have liked it if the paper had made that point clear when it introduced ASes.
The paper went into detail about the "full-mesh" implementation of iBGP. It also offered route reflectors and confederation of BGP routers as 2 alternative implementations. I would like to know which of those implementations is most commonly used in practice? It also wasn't clear to me if the iBGP implementation is something that each AS chooses and so would have to be independent of any other AS's iBGP.
The problem with the lack of origin authentication really surprised me. It seems to me that "hijacking" routes is not very hard to do. And, if that is the case, why is it that more attackers are not taking advantage of it (possibly to launch a DOS attack)? And, are there any other measures being taken to protect against this vulnerability aside from the poorly-maintained registry?
Feedback
I would say definitely keep this paper in the syllabus. It helped me to understand a lot more about the reasoning behind the inter-AS policies. It's also nice to see the clear contrast between what's technically possible and ideal and what actually happens in practice.
Tuesday, September 1, 2009
The Design Philosophy of the DARPA Internet Protocols
Summary
This paper describes the reasoning that went into creating the major internet protocols. Many people know the "what" of the protocols; this paper explains the "why". The 2 main protocols explored in this paper are IP and TCP.
The author explains that the fundamental goal of the internet was to connect the many smaller networks that existed at the time. Using several assumptions and the knowledge they had of current networks, they came up with the following design: a network made up several interconnected networks that used packet-switched communication, connected together with store-and-forward gateways.
The author lists the 7 second level goals of the internet in order of importance, and stresses that this list strongly reflects its use at the time, which was in a military context. This explains why survivability was so high on the list and accountability so low.
The author goes on to explain each goal in depth. Survivability made it necessary for state to be stored somewhere in the network; the designers decided to take the "fate-sharing" approach and stored the state in the hosts. IP and TCP, which were originally designed to be combined, were split in order to be able to support many different types of services. This led to a datagram being the basic building block. The internet is successfully able to support many different network technologies because it makes a very minimal set of assumptions about what the internet will provide. Then, the paper goes over the remaining goals and explains how some of the main problems we have with the internet today are due to these goals being so low on the priority list and therefore not well implemented.
The paper emphasizes how a datagram is not a service in itself, but rather was designed to be a building block. It also includes a discussion on TCP and some of the issues due to counting by bytes rather than packets. The paper finishes off with an for a new building block for the "next generation of architecture" called flows, which would be able to better address the goals of resource management and accountability.
Criticisms & Questions
I thought it was interesting how the author mentions "the next generation of architecture". The way he talks about it, it almost seems like there was a plan to redesign and rebuild the internet from scratch. It would be really interesting to know if at that time people did believe it was possible to switch over to a new, redesigned internet.
It would also be really interesting to know what the metrics were when evaluating the various goals, particularly the second level goals. Most of these goals are very hard to measure so I would imagine that knowing whether the goal was achieved or to what extent it was achieved would be hard to determine.
Feedback
I really enjoyed reading this paper. It was great to get more insight into what the designers considerations were when designing the internet. It definitely teaches one to look at it from their perspective before criticizing the design of the Internet. I would agree to keep this paper in the syllabus.
This paper describes the reasoning that went into creating the major internet protocols. Many people know the "what" of the protocols; this paper explains the "why". The 2 main protocols explored in this paper are IP and TCP.
The author explains that the fundamental goal of the internet was to connect the many smaller networks that existed at the time. Using several assumptions and the knowledge they had of current networks, they came up with the following design: a network made up several interconnected networks that used packet-switched communication, connected together with store-and-forward gateways.
The author lists the 7 second level goals of the internet in order of importance, and stresses that this list strongly reflects its use at the time, which was in a military context. This explains why survivability was so high on the list and accountability so low.
The author goes on to explain each goal in depth. Survivability made it necessary for state to be stored somewhere in the network; the designers decided to take the "fate-sharing" approach and stored the state in the hosts. IP and TCP, which were originally designed to be combined, were split in order to be able to support many different types of services. This led to a datagram being the basic building block. The internet is successfully able to support many different network technologies because it makes a very minimal set of assumptions about what the internet will provide. Then, the paper goes over the remaining goals and explains how some of the main problems we have with the internet today are due to these goals being so low on the priority list and therefore not well implemented.
The paper emphasizes how a datagram is not a service in itself, but rather was designed to be a building block. It also includes a discussion on TCP and some of the issues due to counting by bytes rather than packets. The paper finishes off with an for a new building block for the "next generation of architecture" called flows, which would be able to better address the goals of resource management and accountability.
Criticisms & Questions
I thought it was interesting how the author mentions "the next generation of architecture". The way he talks about it, it almost seems like there was a plan to redesign and rebuild the internet from scratch. It would be really interesting to know if at that time people did believe it was possible to switch over to a new, redesigned internet.
It would also be really interesting to know what the metrics were when evaluating the various goals, particularly the second level goals. Most of these goals are very hard to measure so I would imagine that knowing whether the goal was achieved or to what extent it was achieved would be hard to determine.
Feedback
I really enjoyed reading this paper. It was great to get more insight into what the designers considerations were when designing the internet. It definitely teaches one to look at it from their perspective before criticizing the design of the Internet. I would agree to keep this paper in the syllabus.
Subscribe to:
Posts (Atom)