MultiWAN: WAN Aggregation for Developing Regions




Kristin Stephens, Shaddi Hasan, Yahel Ben-David
UC Berkeley

ABSTRACT

For rural ISPs and organizations, purchasing high-bandwidth, high-quality Internet connections is expensive, if such connections are even available. We propose MultiWAN, a system for load-balancing at packet granularity across a number of low-bandwidth, low-quality wide area network links. Doing so allows us to create an aggregated connection with performance properties similar to those of a single high-bandwidth connection; in particular, a single flow is able to use available capacity on any connection, rather than being tied to a single link. MultiWAN requires no modification to end hosts and operates as a transparent middlebox. We demonstrate that it provides performance at least as good as, and in some cases better than, traditional flow-based load balancers, approximating that of a single high-bandwidth connection, and show that implementing MultiWAN can be cost-effective for rural ISPs and organizations.

1. INTRODUCTION

As organizations grow, they face a limited number of options for increasing the capacity of their Internet connection from their ISP. One option is simply upgrading to a higher class of service, which provides higher bandwidth or usage limits, or better quality of service. This is ideal from both a management and a performance perspective: the organization does not have to deal with the complexities of multihoming, and the single fat pipe allows users within that organization to burst to the full capacity of the link. However, this option quickly becomes more expensive as the class of service increases. Moreover, higher classes of service are not available in all areas from all ISPs; this is particularly the case for organizations operating in rural or developing regions with poor infrastructure. The main alternative available to such organizations is purchasing additional circuits from one or more ISPs and bonding them together.
Even if such links were present, purchasing multiple small links and aggregating them has the potential to be more cost-effective than purchasing one single large link. For example, India's largest telecommunications provider, BSNL, charges roughly three times as much for a 155Mbps link as it does for five 34Mbps links [8], which provide similar aggregate bandwidth. If aggregating the smaller links proves more economical than purchasing a single high-bandwidth link, rural ISPs and organizations would benefit from a substantial reduction in costs. When upgrading a single connection to handle more traffic is not an option, a common solution is to use a load balancer to distribute flows at the transport level across multiple connections. Choosing the flow as the unit of load balancing is a logical decision due to the predominance of TCP traffic on the Internet: splitting a single TCP flow across multiple network paths causes packet re-ordering, greatly reducing the flow's performance [, 7]. While this approach is both simple to implement and simple to manage, it limits the burst rate of individual flows to that of the connection on which they are placed. This is particularly unfortunate for typical web-heavy traffic mixes, where the bulk of flows are short and can complete more quickly if able to burst to high bandwidths. In this paper, we propose MultiWAN, a system for load-balancing traffic at the packet level, allowing full utilization of multiple WAN connections without significantly impacting TCP performance. We achieve this by creating a logical tunnel for each WAN connection from the organization using MultiWAN to another location under their control with cheap, plentiful bandwidth, such as a cloud provider's data center or an educational institution. Because MultiWAN runs at both ends of these tunnels, we are able to apply more sophisticated load balancing techniques than previous approaches.
We use in-band available bandwidth measurements to dynamically redistribute load across connections and to control the degree of reordering caused by splitting a single flow across multiple network paths. By doing so, we enable a single flow to achieve a throughput near the aggregate throughput of the available WAN connections, significantly improving utilization of network resources. MultiWAN also decreases the average flow completion time, improving a key metric for user-perceived performance of web traffic. In addition to improving the web-usage experience for users in the developing world, MultiWAN is useful for improving performance for mobile Internet access, as one might find on a bus or train. These systems typically load-balance users connected via 802.11 WiFi across multiple Internet connections to the cell phone network, where bandwidth is both very limited and highly erratic. The remainder of this paper is structured as follows. We first describe related work on load balancing across multiple networks and the active measurement techniques we use. In section 3, we explain our design goals, and section 4 describes the design of MultiWAN. We then evaluate the performance of MultiWAN and compare it against both a flow-based load balancer and a single "fat pipe". We also provide a comparison of costs for each before concluding.

2. RELATED WORK

Most of the individual components of MultiWAN are not novel; we proudly borrow solutions to particular sub-problems we face from other work, in particular our flowlet-level traffic splitting scheme to avoid re-ordering [5] and our latency-based congestion detection [?]. However, we are the first to combine these pieces into a real-time, dynamically adaptive system and to optimize our system for the unique needs of organizations and ISPs in the developing world.

Devices that perform flow-based load balancing are available commercially, and this feature is supported in modern operating systems such as Linux. These systems operate by binding transport layer flows to particular upstream connections, but do not allow any single flow to take advantage of excess capacity on other connections. Link aggregation is another commonly used load balancing technique, which like MultiWAN uses packets (frames) as the unit of load balancing. However, link aggregation operates at the link layer and neither works across a WAN nor provides any accommodation for preventing re-ordering. There have been a variety of proposals for taking advantage of the aggregate bandwidth of multiple independent paths to improve application performance, such as MPTCP [3] and SCTP []. However, these protocols are meant to be run on end systems as replacements for standard single-path transport protocols. MPTCP in particular maintains a congestion window for each path that a sub-flow uses, and determines allocation across paths based on the relative growth rate of each path's congestion window. MultiWAN operates as a middlebox, requires no modification or special configuration of end hosts, and, because it operates at the network layer, supports arbitrary transport protocols. Several recent papers have considered the problem of dynamic load balancing, such as COPE [2], TeXCP [4], and MATE [2]. These have primarily considered the case of load balancing traffic within an ISP's backbone network. In particular, they assume some degree of cooperation from routers across all paths utilized, which is infeasible for the deployment scenarios MultiWAN is designed for. While MATE is specifically designed for MultiProtocol Label Switching (MPLS) networks, it uses a method similar to our own (described in 4.2) for detecting congestion based on relative latency differences between probe arrivals. We adopt the reordering-reducing traffic splitting mechanism used by FLARE [5].
While the original FLARE design assumed the desired traffic distribution was provided by an oracle and attempted to match a periodically-updated desired distribution of traffic, our approach uses in-band measurements to dynamically adapt to changing network conditions in real time. The novelty of our approach is this integrated system, as well as its deployment model. [9]

3. DESIGN GOALS

Small Internet Service Providers (ISPs) operating in rural communities, especially in developing countries, must rely on scarce, expensive, and poor-quality upstream Internet bandwidth. As providing reliable service is a priority for a service provider, these operators prefer to buy bandwidth from multiple upstream ISPs to increase their availability. High-bandwidth Internet connections are often not available in rural areas; in the rare locations where they are, customers must often cope with extremely long lead times to establish connectivity and generally unaffordable prices. As a result, purchasing multiple residential-targeted connections and aggregating them together via some form of load balancing becomes an attractive alternative, promising higher reliability and bandwidth at reasonable costs. A secondary effect of the poor connectivity situation in rural developing regions is that network operators must struggle to achieve profitability by using an arsenal of techniques aimed at increasing their oversubscription ratio while attempting to minimize the negative impact such measures may have on user satisfaction. Common techniques used to make more efficient use of limited upstream bandwidth include aggressive caching, traffic prioritization, and filtering or rate-limiting of resource-intensive applications such as high-definition video and peer-to-peer file transfer. As bandwidth demand increases, operators may experiment with more drastic measures such as selective compression of incoming traffic.
Such a proxy, ideally located in a well-connected data center, compresses incoming traffic and routes the scaled-down content back to the rural community. Our core insight is that once an operator begins using a tunneling proxy to improve their network performance, the operator can apply much more sophisticated load balancing techniques to aggregate their multiple WAN connections, in addition to other desirable traffic engineering mechanisms. We aim to decrease costs for a network operator in the developing world by increasing the degree of oversubscription of their upstream Internet connections without affecting user-perceived performance. When we consider how to build a system to achieve this, several key design goals come to mind. Emulate a single high-bandwidth link as closely as possible. The ideal Internet connection is both high-capacity and reliable. In particular, a single large connection allows traffic to burst to the full capacity of the link, leading to efficient utilization of available resources. This allows user flows to complete more quickly, increasing user-perceived performance and delaying the onset of congestion in the link. Unfortunately, such connections are not available in much of the developing world, and even when they are, they are not cost-effective. Perform no worse than a flow-based load balancer. The current state of the art for aggregating multiple Internet connections is flow-based load balancing, in which traffic is load balanced at the transport layer such that a single flow is bound to one particular connection. Network appliances that implement this scheme are commercially available and widely deployed, with well-understood and accepted performance characteristics. While a flow-based load balancer is not the most efficient approach to aggregating multiple Internet connections, deploying one is a low-risk proposition for a network operator.
As a result, any alternative load balancing technique, particularly one which violates conventional wisdom by splitting a single flow across multiple paths, must perform no worse than a flow-based load balancer. Otherwise, few network providers would be willing to deploy it. Improve the performance of web traffic. Numerous analyses have shown that web traffic comprises the bulk of the traffic on the Internet [?], and users are highly sensitive to web performance. While other applications such as peer-to-peer file sharing and VoIP are popular and important, a load balancing system should not optimize for their performance. In the former case, the volume of traffic generated by peer-to-peer applications presents significant challenges to network operators; indeed, a system that reduced the performance of these applications would be considered beneficial by many network operators. This goal also imposes important and challenging constraints. Web traffic uses TCP as the transport layer, which provides both end-to-end congestion control and reliability for each flow. In order to provide good performance, any load balancing technique must not create conditions which cause TCP to suffer poor performance. An ideal load balancing system would allow TCP flows to reach their steady-state fair traffic allocation as quickly as possible, while not unnecessarily causing a flow to reduce its sending rate. In particular, this means that the load balancer should keep a flow in the aggressive-growth "slow-start" phase as long as possible and should keep packet re-ordering to a minimum. We specifically target a usage profile that consists primarily of asymmetric web traffic, with the network connected to the local

endpoint making primarily small HTTP requests, and hosts beyond the remote endpoint sending primarily large responses in return. Our key metric is decreasing flow completion time, as this is a key factor determining users' perception of network performance. Queuing theory suggests that processor sharing at the packet level would result in the shortest average service times for each flow. However, because the bulk of our traffic uses TCP, the sending rate of an individual flow will be dominated first by the size of the TCP window, and later by the available bandwidth of the particular connection. Measurement results indicate that a large percentage [?] of web requests complete while the TCP connection is still in the slow-start phase of operation, so ensuring that connections can remain in slow-start as long as possible is a desirable property. Because window growth rate in TCP is determined primarily by RTT [?], this suggests that an intelligent load balancer should keep new flows on the lowest-latency link possible for as long as possible. Once a flow has left the slow-start phase and is no longer aggressively growing its window size, the primary determinant of throughput is the available bandwidth of the connection it is using. Thus, we should minimize the congestion on all WAN connections in the tunnel. This will both keep flows in slow-start for as long as possible and optimize the utilization of available bandwidth in the tunnel. Intelligently route traffic across available paths. Since most of our traffic is TCP, and TCP connections often complete while still in slow-start, we need to intelligently choose which paths these flows take. Keeping flows in slow-start on connections with low latency will increase their window size much faster than on high-latency lines, simply because TCP uses the RTT to clock itself and grow its window. In addition, not all traffic is well suited to travel over every available path.
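To make the slow-start argument above concrete, here is a back-of-the-envelope sketch (our own illustration, not from the paper) assuming an initial window of 10 segments and a window that doubles each RTT. The number of rounds needed to deliver a response is fixed, so completion time scales directly with the RTT of the link a flow is placed on.

```python
# Illustrative sketch (assumptions: 1460-byte MSS, initial window of 10
# segments, window doubles each RTT while the flow stays in slow-start).
MSS = 1460  # bytes per segment

def rtts_to_complete(size_bytes, init_segments=10):
    """RTT rounds needed if the flow finishes within slow-start."""
    window, sent, rounds = init_segments * MSS, 0, 0
    while sent < size_bytes:
        sent += window
        window *= 2          # slow-start: exponential window growth
        rounds += 1
    return rounds

# A 100 KB web response takes the same number of ROUNDS on any link,
# so its completion time is proportional to the RTT of that link.
rounds = rtts_to_complete(100_000)
```

At a 300 ms RTT those rounds cost six times as much wall-clock time as at 50 ms, which is why a load balancer that cares about flow completion time should start new flows on its lowest-latency connection.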
The path's remote endpoint may be nowhere near the intended destination; this is a particular concern for domestic traffic intended for local destinations. A possible solution is to use geolocated IP addresses to classify traffic as domestic or international and send each class down particular paths accordingly.

4. THE MULTIWAN SYSTEM

4.1 Overview

The MultiWAN system consists of two tunnel endpoints, each of which runs the congestion-proportional load balancer described in section 4.2. The local endpoint is physically located within the organization which is bonding together multiple independent WAN connections. The remote endpoint is physically located in a facility with a high-bandwidth Internet connection, such as an educational institution or data center. Ideally, the remote endpoint would be located at a site with low latency to the providers of the content most requested by users at the organization running MultiWAN. For this reason, we envision remote endpoints being run on public cloud infrastructure. Each endpoint runs a symmetrical instance of MultiWAN, and each has one or more physical network connections as well. The physical network connections may (and in fact should) be independent at the IP level; in particular, we do not require that the connections share the same routing view, ISP, or even link layer technology. The MultiWAN instances establish a number of logical MultiWAN tunnels between the endpoints equal to the number of independent physical network connections available at the local endpoint. We do not require any particular tunneling strategy, nor do we require that the number of physical connections available at each endpoint be equal. While we used a simple IP-in-IP tunneling scheme in our evaluations, MultiWAN could run equally well over tunnels created with standard tools such as OpenVPN; doing so would allow MultiWAN users to enjoy encryption and compression of their data. While designing our system we put ourselves under the restriction of non-synchronized clocks.
Since one endpoint is likely to be in a developing region where even power is unreliable, keeping the clocks on either side of the tunnel synchronized seemed beyond our reach. However, we do assume the clocks progress at the same rate. To fulfill our goals of emulating a single high-bandwidth link as closely as possible and improving the performance of web traffic, we aim to maximize both WAN connection utilization and TCP performance. We maximize connection utilization through the traffic allocation scheme explained in section 4.2. To perform no worse than flow-based load balancing, we implemented FLARE [5] along with the in-band latency measurements described in section 4.3. Finally, the goal of intelligently routing traffic across available paths led us to group our connections by latency, explained in section 4.4.

4.2 Traffic Allocation

Our goal of maximizing WAN connection utilization can be restated as minimizing the congestion on each connection. Notice we are not trying to eliminate congestion: we do not control the packet rate, nor do we want to throttle it ourselves. We wanted to disturb the traffic as little as possible, so we let the network itself do the throttling while we simply try to keep the congestion on any given connection as low as possible. It is also important to note that we are not omniscient about what goes on inside the tunnel. What happens to packets between our two endpoints is unknown; the only information we have is the measurements we can take at the endpoints. This also means we cannot predict the rate at which packets arrive to travel across our tunnel. However, since the connections are small relative to what most web servers today expect a single user to have, we assume that the packet rate will usually be enough to fill our tunnel, and we engineer for this common case.

4.2.1 Congestion Measurement

A common practice for measuring available bandwidth is to send packet trains with a specific elapsed time between packets and then measure the difference between the packets' arrivals [?]. If the gap between the packets has grown, the path is considered congested, because the latter packet spent more time waiting in a router's queue than the former due to more packets in the network. Instead of creating packet trains, we use the actual traffic we are sending down each path. Since we cannot have synchronized clocks, we insert our own header with a timestamp, referred to in the rest of the paper as the MultiWAN header, between the encapsulating IP header and the packet. We use the timestamp in the MultiWAN header to calculate the inter-send time for the packets on each WAN connection in the tunnel, and compare the inter-send time gap to the inter-arrival time gap. If the inter-arrival time gap is greater than the inter-send time gap, we consider the packets to have detected congestion, which we call a congestion event. However, this is a single measurement and could be just a temporary phenomenon. Moreover, we detect congestion toward the receiving tunnel endpoint, yet the endpoint that can use the congestion information is the sender. To solve both of these issues we include another field in the MultiWAN header: a bitmap recording the last 16 packet inter-send/arrival gap comparisons. A bit is 1 if the inter-arrival gap was greater than the inter-send gap, and 0 otherwise.

Figure 1: System diagram. (The diagram shows each endpoint's FlareSwitch, CalcDistribution, and StripMWANHeader elements, with VSAT and DSL links connecting the local network to the Internet.) In this example, one tunnel endpoint is at an ISP in rural India, while the other endpoint is at a colocation facility in California. The ISP has both a VSAT connection and a DSL connection from two separate providers, and MultiWAN distributes traffic across them. Note that incoming traffic to an endpoint is immediately forwarded after removing the MultiWAN header; a copy of the packet is used to calculate per-connection congestion. MultiWAN uses no internal queuing.

This bitmap is placed on all outgoing packets, so the endpoint receiving the packets has the most up-to-date information on the state of the WAN connections leading from it to the other endpoint. A sequence number is included in the MultiWAN header to indicate the freshness of the bitmap. It is incremented for each new measurement, so the endpoints know how many of the measurements are new and have not yet been reacted to.

4.2.2 Distribution Algorithm

To reiterate, our goal is to minimize the congestion across the connections. The general idea of the algorithm is to use the bitmap information for each WAN connection to move traffic from more congested connections to less congested ones. First we convert the bitmap to what we term a congestion score: the number of 1s in the congestion bitmap, which gives the number of congestion events since our last update, i.e., over the last 16 packet intervals. The average of the congestion scores is calculated, and all connections with scores above the average have a percentage of their traffic moved to a connection below the average. There are two possible ways to choose which below-average WAN connection receives the moved traffic. The first possibility is that one or more connections have a congestion score of zero.
If such connections exist, each receives a percentage of traffic from the above-average-congestion WAN connections. The second, and more likely, possibility is that all the connections have a congestion score greater than zero. In this case, for each connection with an above-average congestion score, a below-average connection is chosen at random, weighted by the inverse of the congestion scores; in other words, the lower the congestion score, the more likely that connection is to receive the moved percentage of traffic. The update interval, the percentage of traffic moved per interval, and the minimum traffic allocation per line are important variables in our implementation. The update interval and traffic percentage control how quickly we react to congestion. We chose 100ms for the update interval due to the bitmap size and the RTT of the tunnel. Since the bitmap is only 16 bits (small for the sake of keeping the header as small as possible), the interval should not be so big that we lose information, nor so small that the bitmap has barely any new congestion measurements in it. The highest bit rate we tested was Mbps, so for 500-byte packets we have 16 packets every 400ms. However, the tunnel's RTT in our experiments is 300ms, and an interval of 400ms means we could potentially be reacting to congestion that the endpoints have already noticed and reacted to. After experimenting in our testbed we found that 100ms gave us the trade-offs we wanted. Since we do not want the traffic distribution to change very quickly or vary drastically over time, a small update interval means we should move a small percentage of traffic, while a large interval should accordingly have a larger percentage. With the update interval set at 100ms, we set the percentage of traffic we move per interval to 1%. Finally, because both our traffic measurements and our information exchange are in-band, we needed all the connections to carry a minimum amount of traffic.
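One update step of the distribution algorithm can be sketched as follows. This is our reconstruction, not the authors' Click code; the 1% step, 5% floor, and 16-bit bitmap come from the text, while the tie-breaking details (e.g. picking one zero-score connection at random) are our assumptions.

```python
import random

# Sketch of one redistribution step. `bitmaps` maps each connection to
# its 16-bit congestion bitmap echoed back in the MultiWAN header;
# `alloc` maps each connection to its fraction of outgoing traffic.
STEP = 0.01   # 1% of traffic moved per 100 ms update interval
FLOOR = 0.05  # 5% minimum allocation so in-band measurements keep flowing

def congestion_score(bitmap):
    """Number of 1s in the 16-bit bitmap = congestion events observed."""
    return bin(bitmap & 0xFFFF).count("1")

def rebalance(alloc, bitmaps, rng=random):
    scores = {c: congestion_score(b) for c, b in bitmaps.items()}
    avg = sum(scores.values()) / len(scores)
    below = [c for c in scores if scores[c] < avg]
    zeros = [c for c in below if scores[c] == 0]
    for conn in scores:
        # Only above-average connections shed traffic, and never below
        # the minimum allocation.
        if scores[conn] <= avg or alloc[conn] - STEP < FLOOR:
            continue
        if zeros:
            target = rng.choice(zeros)  # idle connections absorb load first
        else:
            # Weight the choice by inverse congestion score: the less
            # congested a connection, the more likely it receives traffic.
            weights = [1.0 / scores[c] for c in below]
            target = rng.choices(below, weights=weights)[0]
        alloc[conn] -= STEP
        alloc[target] += STEP
    return alloc
```

When all scores are equal no connection is above average and the step is a no-op; the exploration nudge toward a uniform distribution, described next, handles that case.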
After testing we set this number to 5%. After testing we found the distribution would periodically become completely off balance with some connections set to the minimum allocation. Deciding this was caused by a lack of exploring for potentially available bandwidth, we added a mechanism that would move the distribution towards uniform. Whenever the congestion scores for all the connections is equal, we move % of traffic from the connection with the biggest allocation to the smallest allocation. This was found to help stabilize the distributions. 4.3 Reorder Handling TCP is very sensitive to reordering and reacts to it as if there is a loss and therefore congestion on the network. However since we are sending packets down connections that have potentially different latencies, we are introducing the potential of reordering that has nothing to do with loss. To prevent possible reordering we implemented FLARE [5], however it required periodic measurements to find the maximum RTT delta to use as its minimum time before switching-ability (MTBS). We already send the timestamp in our own MultiWAN header to detect congestion, we realized that we can also use this information to determine the max one-waylatency delta, which is a more accurate value to use for deciding what line a packet can travel. To calculate the max one-way-latency delta we use the timestamp in the MultiWAN header and the timestamp of when packets arrive. Using these timestamps we can get the latency between two lines, line A and B by doing the following. Let line A s send time of a packet be t A and arrival time be t A, and line

Figure 2: Experimental testbed (local and remote traffic generators behind GigE switches, two MultiWAN routers joined by the constrained links, and a monitor on the path between the routers)

B's send and arrival times be t_B and t'_B respectively. Even though the clocks on the two tunnel endpoints are not synchronized, we can assume they progress at the same rate; therefore the difference between the send and arrival times is simply the latency plus the difference between the endpoints' clocks. So the latencies of lines A and B are l_A = t'_A - t_A + c and l_B = t'_B - t_B + c respectively, where c is the difference between the two clocks. To get the latency delta between the two lines one simply subtracts them, and c cancels: D_AB = l_A - l_B = (t'_A - t_A) - (t'_B - t_B). To find the maximum latency delta we calculate this difference for all pairs of lines and take the largest.

4.4 Latency Control

Developing regions utilize a diverse set of Internet connections, including DSL, satellite, and broadband. Therefore there is a high possibility that the tunnel will include both a low-latency and a high-latency connection. These very different latencies can cause high variance in a flow's RTT if its packets are sent down both connections, and high RTT variance causes problems for detecting lost packets and, more importantly, for window-size growth. To handle this we divide the WAN connections into classes based on latency. All flows start on the class with the lowest latency. When that latency class becomes too congested, the oldest flows are moved to the next higher latency class. We move the oldest flows because they are the most likely to have finished their slow-start phase: their aggressive window growth has already passed, so increasing their RTT is not as strong a performance hit as it would be for flows that are still in slow-start.
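The one-way latency delta computation of Section 4.3 can be sketched directly from the D_AB formula. This is an illustrative Python sketch (names are ours), assuming one recent (send, arrival) timestamp pair per line and at least two lines; the unknown clock offset c cancels in the pairwise subtraction exactly as in the derivation.

```python
from itertools import combinations

def max_one_way_delta(samples):
    """samples: line -> (send_ts, recv_ts), where send_ts comes from the
    MultiWAN header and recv_ts from the receiver's clock.  recv - send
    equals the one-way latency plus a clock offset c that is the same
    for every line, so it cancels when two lines are subtracted."""
    raw = {line: recv - send for line, (send, recv) in samples.items()}
    # D_AB = (t'_A - t_A) - (t'_B - t_B); take the largest over all pairs
    return max(abs(raw[a] - raw[b]) for a, b in combinations(raw, 2))
```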
Also, most TCP flows are relatively short; the older a flow is, the more likely it is long-lived, and therefore finishing it quickly is not as important as it is for the short flows that are still in slow-start. We define a class's congestion score to be an exponentially weighted moving average of the mean of its connections' congestion scores. When a class's congestion score rises above a threshold of 3, we move a flow from the current class to the next higher class. The weight of the moving average is set to 0.5; this way we react to congestion that is currently happening, but with equal emphasis on historical congestion, so we are not reacting to something that is only momentary. However, we noticed that with the weighted moving average, after reacting to congestion we would continue to react to it because of the history. To remedy this, whenever we move flows to another class we also reset the moving average to zero.

5. EVALUATION

5.1 Methodology

We evaluated the performance of three scenarios that a rural network operator could face: using MultiWAN, a flow-based load balancer, and a single "fat pipe" connection.

Figure 3: Original input trace throughput

Figure 4: Replay trace throughput

We implemented both MultiWAN and a simple flow-based load balancer in Click [6], and used Dummynet [10] to simulate network latency and bandwidth constraints across simulated WAN connections between the routers. We simulate multiple WAN connections between the routers by differentiating packets' source and destination IP addresses using NAT in Click, or in the case of MultiWAN, through the outer header's source and destination IP addresses from our IP-in-IP encapsulation scheme.
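The latency-class promotion policy of Section 4.4 can be sketched as follows. This is a minimal Python illustration with names of our own choosing, not the authors' implementation; the threshold of 3, the 0.5 weight, the oldest-flow-first ordering, and the reset-on-move are taken from the text.

```python
THRESHOLD = 3.0   # class congestion score that triggers a promotion
ALPHA = 0.5       # equal weight on current and historical congestion

class LatencyClass:
    def __init__(self):
        self.flows = []   # ordered oldest-first
        self.ewma = 0.0   # exponentially weighted moving average score

    def update(self, line_scores):
        """EWMA of the mean of this class's per-connection scores."""
        mean = sum(line_scores) / len(line_scores)
        self.ewma = ALPHA * mean + (1 - ALPHA) * self.ewma

def promote_if_congested(cur, nxt):
    """Move the oldest flow up one latency class when the score crosses
    the threshold, then reset the history so we do not keep reacting
    to congestion we have already handled."""
    if cur.ewma > THRESHOLD and cur.flows:
        nxt.flows.append(cur.flows.pop(0))
        cur.ewma = 0.0
```

Resetting the EWMA after a move is what prevents the history term from triggering repeated promotions for a single congestion event.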
Our experimental testbed is arranged in a dumbbell topology, shown in Figure 2. Both routers are identically-configured servers with two 2.6GHz Opteron CPUs and a dual-port Broadcom BCM5704 Gigabit NIC. Each router is connected via a switch to its respective traffic generator and is directly connected to the monitor, a quad-core 2.4GHz Opteron server with its two Gigabit NICs operating bridged together. All traffic between the routers passes through the monitor, allowing us to capture and analyze our experimental traffic as it appears on the wire.

We run two traffic-generator tools on our end systems. The first is iperf, which we use to analyze how MultiWAN affects the performance of a small number of flows. The second is tmix [3]. The tmix traffic generator replays the application-level behavior of a trace, which allows us to generate highly realistic, trace-derived traffic for our experiments. Thus, we are able to evaluate MultiWAN performance for the typical web-heavy traffic mix we would expect to see in a rural network operator's network. (Our flow-based load balancer randomly assigns new flows to a particular WAN connection based on a hash of the flow's source and destination IP addresses and ports.)

Figure 5: Throughput of a single flow using MultiWAN. Both lines were limited to a maximum throughput of 2Mbps.

Figure 6: Connection duration of each scenario. Both lines were limited to a maximum throughput of 256Kbps.

Our input to the traffic generators was derived from a network trace we obtained from a rural Internet service provider in India, taken at the border gateway between their network and their upstream provider. The trace was taken from a point after their caching proxy, so it includes only requests that were not handled by the cache and that would incur a WAN access under normal circumstances. The input trace was four hours long and was captured in February 2011. We consider only the first 30 minutes of the trace for our experiments. For the input to our traffic generators, we further consider only connections that complete within that first 30 minutes, allowing us to focus on relatively short connections. In Figure 4, we see the throughput over time of a replay using our sample of the original trace. The difference in offered load between our replay trace and the original input trace is accounted for by our restriction to only short, completed connections.

5.2 Throughput

We first compare the maximum achieved throughput of a single TCP flow under our three experimental conditions. We simulated two WAN connections at the local endpoint, each constrained to 2Mbps, with an RTT of 300 milliseconds on both, a typical latency for intercontinental Internet traffic. We modeled this scenario with a single flow generated by iperf from the local-side traffic generator to the remote-side traffic generator.
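The flow-based load balancer we compare against assigns each new flow to a line by hashing the flow's 4-tuple. A minimal sketch of that policy (our own Python illustration, not the Click element used in the experiments; CRC32 stands in for whatever hash the implementation used):

```python
import zlib

def pick_line(src_ip, dst_ip, src_port, dst_port, n_lines):
    """Flow-based load balancing: hash the 4-tuple so every packet of a
    flow takes the same WAN line.  This avoids reordering, but a single
    flow can never use more than one line's bandwidth."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_lines
```

The determinism of the hash is the point: it keeps flows pinned to one line, which is exactly the property that prevents per-flow aggregation and motivates MultiWAN's packet-level approach.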
Each line was given a bandwidth constraint of 2Mbps, and we emulated an RTT of 300ms per line. The results of our experiment are shown in Figure 5. As expected, traffic is effectively split between the two lines, and the aggregate performance exceeds the per-line bandwidth constraint of the experiment.

5.3 Improving Flow Completion Times

While MultiWAN can enhance the performance of a small number of long-lived flows, the real metric that matters to users is how quickly their web pages load. Thus, we next consider how MultiWAN affects the flow completion time (FCT) of typical web traffic. We consider two scenarios, and compare the performance of MultiWAN to the flow-based load balancer and the single "fat pipe".

Constrained. We first constrain both links to 256Kbps, well below the offered load of our replay trace. Our results are in Figure 6.

Figure 7: Connection duration of each scenario. Both lines were limited to a maximum throughput of 1Mbps.

Unconstrained. Next, we set the bandwidth constraint of each line to 1Mbps. This ensures the replay does not suffer any constraint. Our results are in Figure 7.

6. DISCUSSION

6.1 Limitations

While the results of our evaluation indicate that MultiWAN achieves its design goals, we do not consider MultiWAN to be a philosopher's stone for Internet connections, transmuting a number of poor-quality connections into a high-quality one. Indeed, the aggregate connection that MultiWAN provides is only as good as its constituent underlying physical links. For instance, aggregating multiple VSAT connections with MultiWAN may increase performance by allowing flows to use bandwidth on all connections rather than just one, but latency sees no improvement. MultiWAN specifically addresses only network-layer issues such as congestion and routing outages, and in general cannot compensate for poorly-performing underlying infrastructure.
We do not see MultiWAN as a replacement for high-quality network connections, but rather as a system for extracting the most performance out of a limited set of existing resources. (It can and does, however, compensate for unreliable underlying infrastructure by dynamically re-allocating traffic when a link becomes unusable.)

6.2 Cost-Effectiveness

Our key goal for MultiWAN is to allow rural network operators in developing regions to operate more efficiently and at lower per-user cost. Thus, any performance advantage MultiWAN offers is useless if it cannot be deployed cost-effectively. Indeed, one key disadvantage of MultiWAN is that it requires high-quality bandwidth at the remote endpoint: a user of MultiWAN must pay not only for their expensive local Internet connections, but for the remote endpoint's Internet connection as well. Costs of deployment are highly dependent on the business model used by the network operator and the entity that deploys the MultiWAN system, which may not be the same. We believe that MultiWAN is cost-effective under certain reasonable business models. Exploiting the fact that MultiWAN has very low CPU and memory demands, a single remote endpoint can feasibly be shared among several local endpoints. The operator of the remote endpoint (either a separate business entity or a consortium of network operators) would deploy the remote MultiWAN endpoint in a colocation facility, taking advantage of the relatively low-cost bandwidth available there. Such a colocation facility buys wholesale bandwidth rather than retail bandwidth, which can be orders of magnitude more expensive than wholesale.

7. CONCLUSIONS

Our proposed system, MultiWAN, achieves the key design objectives of an ideal load balancer for network operators in rural developing regions. Not only does MultiWAN outperform a traditional flow-based load balancer, it roughly approximates the performance characteristics of a single high-bandwidth Internet connection. At the same time, it preserves the availability benefits of having multiple independent upstream Internet providers. We have shown with trace-derived experiments that MultiWAN can improve flow completion times for a typical web-heavy traffic mix of a rural ISP, thereby improving user-perceived performance and increasing user satisfaction. While MultiWAN is not a panacea for the challenging connectivity situation faced by rural network operators, it is a practical and cost-effective system in many scenarios. The system is a drop-in middlebox that operates transparently (and without any modifications) to both end hosts and any other middleboxes the network operator may be using to achieve their traffic engineering goals.

8. REFERENCES

[1] J. C. R. Bennett, C. Partridge, and N. Shectman. Packet reordering is not pathological network behavior. IEEE/ACM Trans. Netw., 7:789-798, December 1999.
[2] A. Elwalid, C. Jin, S. Low, and I. Widjaja. MATE: MPLS adaptive traffic engineering. In Proceedings of IEEE INFOCOM 2001, pages 1300-1309, 2001.
[3] F. Hernandez-Campos, K. Jeffay, and F. D. Smith. Modeling and generation of TCP application workloads. In Proceedings of the Fourth IEEE International Conference on Broadband Communications, Networks and Systems, 2007.
[4] S. Kandula, D. Katabi, B. Davie, and A. Charny. Walking the tightrope: Responsive yet stable traffic engineering. In ACM SIGCOMM, Philadelphia, PA, August 2005.
[5] S. Kandula, D. Katabi, S. Sinha, and A. Berger. Dynamic load balancing without packet reordering. SIGCOMM Comput. Commun. Rev., 37(2):51-62, March 2007.
[6] E. Kohler, R. Morris, B. Chen, J. Jannotti, and M. F. Kaashoek. The Click modular router. ACM Transactions on Computer Systems, 18(3):263-297, August 2000.
[7] M. Laor and L. Gendel. The effect of packet reordering in a backbone link on application throughput. IEEE Network, 16(5):28-36, Sep/Oct 2002.
[8] Bharat Sanchar Nigam Ltd. Leased line tariff. http://www.bsnl.co.in/service/leased_tariff.htm, December 2010.
[9] D. Phatak and T. Goff. A novel mechanism for data streaming across multiple IP links for improving throughput and reliability in mobile environments. In Proceedings of IEEE INFOCOM 2002, volume 2, pages 773-781, 2002.
[10] L. Rizzo. Dummynet: A simple approach to the evaluation of network protocols. ACM SIGCOMM Computer Communication Review, 27(1):31-41, January 1997.
[11] R. R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. J. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960, IETF, October 2000.
[12] H. Wang, H. Xie, L. Qiu, Y. R. Yang, Y. Zhang, and A. Greenberg. COPE: Traffic engineering in dynamic networks. In Proceedings of ACM SIGCOMM 2006, pages 99-110, New York, NY, USA, 2006. ACM.
[13] D. Wischik, C. Raiciu, A. Greenhalgh, and M. Handley. Design, implementation and evaluation of congestion control for multipath TCP. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI '11), Berkeley, CA, USA, 2011. USENIX Association.