Data Center Challenges Building Networks for Agility



Data Center Challenges: Building Networks for Agility
Sreenivas Addagatla, Albert Greenberg, James Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, Parantap Lahiri, David A. Maltz, Parveen Patel, Sudipta Sengupta

Capacity Issues in Real Data Centers
- Bing has many applications that turn network bandwidth into useful work
  - Data mining: more jobs, more data, more analysis
  - Index: more documents, more frequent updates
- These apps can consume lots of bandwidth and press the data center's bottlenecks to their breaking point
  - Core links in the intra-data-center fabric at 85% utilization and growing
  - Reached the point where losing even one aggregation router would cause massive congestion and incidents
- Demand is always growing (a good thing): one team wanted to ramp up traffic by 10 Gbps over one month

The Capacity Well Runs Dry
- We had already exhausted all ability to add capacity to the current network architecture
- Capacity upgrades:
  - June 25: 80G to 120G
  - July 20: 120G to 240G
  - July 27: 240G to 320G
- [Figure: 100% utilization on a core intra-DC link]
- We had to do something radically different

Target Architecture
- Simplify management: broad layer of devices for resilience & ROC ("RAID for the network")
- More capacity: Clos network mesh, VLB traffic engineering
- Fault domains for resilience and scalability: Layer 3 routing
- Reduce COGS: commodity devices

Deployment Successful!
- Draining traffic from congested locations

<shameless plug> Want to design some of the biggest data centers in the world? Want to experience what scalable and reliable really mean? Think measuring compute capacity in millions of MIPS is small potatoes? Bing's Autopilot team is hiring! </shameless plug>

Agenda
- Brief characterization of mega cloud data centers
  - Costs
  - Pain points with today's network
  - Traffic pattern characteristics in data centers
- VL2: a technology for building data center networks
  - Provides what data center tenants & owners want
    - Network virtualization
    - Uniform high capacity and performance isolation
    - Low cost and high reliability with simple management
  - Principles and insights behind VL2
  - VL2 prototype and evaluation
- (VL2 is also known as project Monsoon)

What's a Cloud Service Data Center?
[Figure by Advanced Data Centers]
- Electrical power and economies of scale determine total data center size: 50,000-200,000 servers today
- Servers are divided up among hundreds of different services
- Scale-out is paramount: some services have 10s of servers, some have 10s of 1000s

Data Center Costs
Amortized Cost*   Component              Sub-components
~45%              Servers                CPU, memory, disk
~25%              Power infrastructure   UPS, cooling, power distribution
~15%              Power draw             Electrical utility costs
~15%              Network                Switches, links, transit
- Total cost varies; upwards of $1/4 B for a mega data center
- Server costs dominate; network costs are significant
"The Cost of a Cloud: Research Problems in Data Center Networks." SIGCOMM CCR 2009. Greenberg, Hamilton, Maltz, Patel.
*3 yr amortization for servers, 15 yr for infrastructure; 5% cost of money
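The footnote's amortization assumptions can be made concrete with a small sketch of the arithmetic: converting capex into a monthly charge at the stated 5% cost of money. The annuity-style formula and the helper name are my assumptions; the talk does not specify the exact amortization method.

```python
def monthly_cost(capex, years, annual_rate=0.05):
    """Annuity-style amortization: monthly payment that retires `capex`
    over `years` at `annual_rate` cost of money (per the slide footnote)."""
    r = annual_rate / 12.0          # monthly rate
    n = years * 12                  # number of payments
    return capex * r / (1.0 - (1.0 + r) ** -n)

# Monthly cost per $1M of capex under the two schedules in the footnote.
print(monthly_cost(1e6, years=3))   # servers:        ~$30.0K per month per $1M
print(monthly_cost(1e6, years=15))  # infrastructure: ~$7.9K per month per $1M
```

This is one reason server costs dominate the table: a dollar of server capex costs several times more per month than a dollar of long-lived infrastructure capex.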

Data Centers are Like Factories
- Number 1 goal: maximize useful work per dollar spent
- Ugly secrets:
  - 10% to 30% CPU utilization is considered good in DCs
  - There are servers that aren't doing anything at all
- Cause:
  - Servers are purchased rarely (roughly quarterly)
  - Reassigning servers among tenants is hard
  - Every tenant hoards servers
- Solution: more agility - any server, any service

Improving Server ROI: Need Agility
- Turn the servers into a single large fungible pool
  - Let services "breathe": dynamically expand and contract their footprint as needed
- Requirements for implementing agility
  - Means for rapidly installing a service's code on a server
    - Virtual machines, disk images
  - Means for a server to access persistent data
    - Data too large to copy during the provisioning process
    - Distributed filesystems (e.g., blob stores)
  - Means for communicating with other servers, regardless of where they are in the data center
    - Network

The Network of a Modern Data Center
[Figure: conventional hierarchical topology - Internet at top, Layer 3 core routers (CR) and access routers (AR), Layer 2 switches (S) and load balancers (LB), racks of servers (A) at the bottom. Ref: Data Center: Load Balancing Data Center Services, Cisco 2004]
Key: CR = L3 Core Router, AR = L3 Access Router, S = L2 Switch, LB = Load Balancer, A = Rack of 20 servers with a Top of Rack switch; ~2,000 servers/podset
- Hierarchical network; 1+1 redundancy
- Equipment higher in the hierarchy handles more traffic, is more expensive, and gets more effort spent on availability - a scale-up design
- Servers connect via 1 Gbps UTP to Top of Rack switches
- Other links are a mix of 1G and 10G, fiber and copper

Internal Fragmentation Prevents Applications from Dynamically Growing/Shrinking
[Figure: hierarchical topology with VLANs partitioning racks among services]
- VLANs are used to isolate properties from each other
- IP addresses are topologically determined by the ARs
- Reconfiguration of IPs and VLAN trunks is painful, error-prone, slow, and often manual

No Performance Isolation
[Figure: hierarchical topology; collateral damage spreads across the shared subtree]
- VLANs typically provide only reachability isolation
- One service sending/receiving too much traffic hurts all services sharing its subtree

Network has Limited Server-to-Server Capacity, and Requires Traffic Engineering to Use What It Has
[Figure: hierarchical topology with 10:1 over-subscription or worse (80:1, 240:1) above the ToR layer]
- Data centers run two kinds of applications:
  - Outward facing (serving web pages to users)
  - Internal computation (computing the search index - think HPC)
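To make the over-subscription ratios concrete, here is a small illustrative helper (mine, not from the talk) showing the worst-case bandwidth left per 1 Gbps server for traffic that has to cross the over-subscribed layers.

```python
def effective_bandwidth_gbps(nic_gbps, oversubscription):
    """Worst-case per-server bandwidth for traffic that must cross the
    over-subscribed layer, if all servers under it transmit at once."""
    return nic_gbps / oversubscription

for ratio in (10, 80, 240):
    print(f"{ratio:3d}:1 over-subscription -> "
          f"{effective_bandwidth_gbps(1.0, ratio) * 1000:6.1f} Mbps per server")
# 10:1 -> 100.0 Mbps, 80:1 -> 12.5 Mbps, 240:1 -> ~4.2 Mbps
```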

Network Needs Greater Bisection BW, and Requires Traffic Engineering to Use What It Has
[Figure: hierarchical topology]
- Dynamic reassignment of servers and Map/Reduce-style computations mean the traffic matrix is constantly changing
- Explicit traffic engineering is a nightmare
- Data centers run two kinds of applications:
  - Outward facing (serving web pages to users)
  - Internal computation (computing the search index - think HPC)

Measuring Traffic in Today's Data Centers
- 80% of the packets stay inside the data center
  - Data mining, index computations, back end to front end
  - Trend is towards even more internal communication
- Detailed measurement study of a data mining cluster
  - 1,500 servers, 79 ToRs
  - Logged: 5-tuple and size of all socket-level R/W ops
  - Aggregated into flow and traffic matrices every 100 s
    - Src, Dst, Bytes of data exchanged
- More info:
  - DCTCP: Efficient Packet Transport for the Commoditized Data Center http://research.microsoft.com/en-us/um/people/padhye/publications/dctcp-sigcomm2010.pdf
  - The Nature of Datacenter Traffic: Measurements and Analysis http://research.microsoft.com/en-us/um/people/srikanth/data/imc09_dctraffic.pdf
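A minimal sketch of the kind of aggregation described above: rolling logged socket-level operations (5-tuple plus bytes) into a ToR-to-ToR traffic matrix per 100-second window. The record layout and the `tor_of` mapping are assumptions for illustration, not the actual measurement pipeline.

```python
from collections import defaultdict

WINDOW_S = 100  # aggregation interval from the slide

def traffic_matrices(records, tor_of):
    """records: iterable of (timestamp_s, src_ip, dst_ip, src_port, dst_port, proto, nbytes)
    tor_of:  dict mapping a server IP to its ToR id (assumed available from inventory).
    Returns {window_index: {(src_tor, dst_tor): total_bytes}}."""
    matrices = defaultdict(lambda: defaultdict(int))
    for ts, src, dst, sport, dport, proto, nbytes in records:
        window = int(ts // WINDOW_S)
        matrices[window][(tor_of[src], tor_of[dst])] += nbytes
    return matrices
```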

Flow Characteristics
- DC traffic != Internet traffic
- Most of the flows: various mice
- Most of the bytes: within 100 MB flows
- Median of 10 concurrent flows per server

Traffic Matrix Volatility
- Collapse similar traffic matrices into clusters
- Need 50-60 clusters to cover a day's traffic
- Traffic pattern changes nearly constantly
- Run length is 100 s up to the 80th percentile; the 99th percentile is 800 s
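The slide does not say how the matrices were clustered; purely as an illustration of "collapse similar traffic matrices into clusters", here is a greedy sketch that groups matrices whose distance to an existing representative falls under a relative threshold. The distance metric and threshold are my assumptions.

```python
import numpy as np

def cluster_matrices(matrices, threshold=0.25):
    """matrices: list of equally-shaped numpy arrays (one traffic matrix per window).
    Greedy clustering: assign each matrix to the first representative within
    `threshold` relative distance; otherwise it starts a new cluster."""
    reps, labels = [], []
    for m in matrices:
        for i, rep in enumerate(reps):
            if np.linalg.norm(m - rep) <= threshold * np.linalg.norm(rep):
                labels.append(i)
                break
        else:
            reps.append(m)
            labels.append(len(reps) - 1)
    return labels, reps
```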

Today, Computation Constrained by Network*
[Figure: heat map of ln(bytes/10 sec) exchanged between server pairs (axes: server from vs. server to) in an operational cluster; color scale spans roughly 0.2 Kbps to 1 Gbps]
- Great effort is required to place communicating servers under the same ToR
- Most traffic lies on the diagonal
- Stripes show there is a need for inter-ToR communication
*Kandula, Sengupta, Greenberg, Patel

What Do Data Center Faults Look Like?
- Need very high reliability near the top of the tree
  - Very hard to achieve
  - Example: failure of a temporarily unpaired core switch affected ten million users for four hours
- 0.3% of failure events knocked out all members of a network redundancy group
  - Typically at lower layers in the tree, but not always
[Figure: hierarchical topology. Ref: Data Center: Load Balancing Data Center Services, Cisco 2004]

Objectives for the Network of a Single Data Center
Developers want network virtualization: a mental model where all their servers, and only their servers, are plugged into an Ethernet switch
- Uniform high capacity
  - Capacity between two servers limited only by their NICs
  - No need to consider topology when adding servers
- Performance isolation
  - Traffic of one service should be unaffected by others
- Layer-2 semantics
  - Flat addressing, so any server can have any IP address
  - Server configuration is the same as in a LAN
  - Legacy applications depending on broadcast must work

VL2: Distinguishing Design Principles
- Randomizing to cope with volatility
  - Tremendous variability in traffic matrices
- Separating names from locations
  - Any server, any service
- Leverage strengths of end systems
  - Programmable; big memories
- Building on proven networking technology
  - We can build with parts shipping today
  - Leverage low-cost, powerful merchant silicon ASICs, though do not rely on any one vendor
  - Innovate in software

What Enables a New Solution Now?
- Programmable switches with high port density
  - Fast: ASIC switches-on-a-chip (Broadcom, Fulcrum, ...)
  - Cheap: small buffers, small forwarding tables
  - Flexible: programmable control planes
- Centralized coordination
  - Scale-out data centers are not like enterprise networks
  - Centralized services already control/monitor the health and role of each server (Autopilot)
  - Centralized directory and control plane are acceptable (4D)
[Photo: 24-port 10GE switch. List price: $10K]

An Example VL2 Topology: Clos Network
[Figure: D/2 intermediate node switches (D ports each, used in VLB) connect to D aggregation switches (D/2 ports up, D/2 ports down, 10G links), which connect to Top of Rack switches (20 ports each); the node degree D of the available switches determines the number of servers supported: [D^2/4] * 20 servers]
- A scale-out design with broad layers
- Same bisection capacity at each layer - no oversubscription
- Extensive path diversity
- Graceful degradation under failure
- ROC philosophy can be applied to the network switches
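The sizing formulas on the slide can be checked directly. A small sketch (the helper name and the down-port/uplink reasoning in the comments are mine) that derives switch and server counts from the switch node degree D:

```python
def clos_sizing(D, servers_per_tor=20):
    """Switch and server counts for the two-layer Clos on the slide,
    driven by the node degree D of the available switches."""
    intermediate = D // 2                  # D/2 intermediate switches, D ports each
    aggregation  = D                       # D aggregation switches, D/2 ports up + D/2 down
    tors         = (D * D) // 4            # D x D/2 aggregation down-ports, 2 uplinks per ToR
    servers      = tors * servers_per_tor  # [D^2/4] * 20 servers
    return intermediate, aggregation, tors, servers

print(clos_sizing(48))    # -> (24, 48, 576, 11520)
print(clos_sizing(144))   # -> (72, 144, 5184, 103680)
```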

Use Randomization to Cope with Volatility
[Figure: the same Clos topology as on the previous slide]
- Valiant Load Balancing
  - Every flow is bounced off a random intermediate switch
  - Provably hotspot-free for any admissible traffic matrix
  - Servers could randomize flow-lets if needed
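A minimal sketch of the forwarding decision VLB implies: each flow is sent via an intermediate switch chosen effectively at random, but stably for that flow so packets stay in order. The switch names, the number of intermediates, and the hash choice are illustrative only.

```python
import hashlib

INTERMEDIATES = [f"int-{i}" for i in range(72)]  # e.g. the D/2 intermediate switches for D = 144

def intermediate_for_flow(src_ip, dst_ip, src_port, dst_port, proto):
    """Pick the random-but-stable intermediate switch a flow is bounced off."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int(hashlib.md5(key).hexdigest(), 16)
    return INTERMEDIATES[digest % len(INTERMEDIATES)]

# Every packet of a flow follows the same bounce: src ToR -> intermediate -> dst ToR.
print(intermediate_for_flow("10.0.1.5", "10.0.9.7", 45320, 80, "tcp"))
```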

Separating Names from Locations: How Smart Servers Use Dumb Switches
[Figure: a packet from source S to destination D carries nested headers - an outer header addressed to an intermediate node (N), a middle header addressed to the destination's ToR (TD), and an inner header addressed to D; each hop on the path S -> N -> TD -> D strips one layer]
- Encapsulation is used to transfer complexity to servers
  - Commodity switches have simple forwarding primitives
  - Complexity is moved to computing the headers
- Many types of encapsulation are available
  - IEEE 802.1ah defines MAC-in-MAC encapsulation; VLANs; etc.
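A sketch of the header bookkeeping the figure shows: the source wraps the original packet in two outer headers, and each hop pops one. The dictionaries and names below are purely illustrative, not an actual packet format.

```python
def encapsulate(payload, dst_server, dst_tor, intermediate):
    """Build the nested headers the source host adds: outermost first."""
    inner = {"dst": dst_server,   "payload": payload}  # what D finally receives
    mid   = {"dst": dst_tor,      "payload": inner}    # gets the packet to ToR TD
    outer = {"dst": intermediate, "payload": mid}      # bounces it off intermediate N
    return outer

def decapsulate(packet):
    """What each switch on the path does: pop one header, forward the rest."""
    return packet["payload"]

pkt = encapsulate("hello", dst_server="D", dst_tor="TD", intermediate="N")
print(decapsulate(decapsulate(pkt)))   # after N and TD: {'dst': 'D', 'payload': 'hello'}
```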

Leverage Strengths of End Systems
[Figure: server network stack - the application, TCP, IP, and ARP layers sit above a VL2 agent/encapsulator shim and the NIC; the agent resolves remote IPs against a MAC resolution cache via Lookup(AA)/EncapInfo(AA) calls to the Directory System, which the Provisioning System populates via Provision(AA, ...), CreateVL2VLAN(...), AddToVL2VLAN(...)]
- Data center OSes are already heavily modified for VMs, storage, etc.
  - A thin shim for network support is no big deal
- Applications work with Application Addresses (AAs)
  - AAs are flat names; infrastructure addresses are invisible to apps
- No change to applications or to clients outside the DC
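A sketch of what the thin shim does conceptually: intercept traffic addressed to an application address (AA), resolve it through a local cache backed by the directory system, then encapsulate toward the destination's ToR via a VLB intermediate. Class and method names are assumptions, not the actual agent API.

```python
class VL2Agent:
    """Illustrative host shim: resolves application addresses (AAs) to ToR
    locators through a cache backed by the directory system, then encapsulates."""

    def __init__(self, directory, intermediates):
        self.directory = directory        # assumed to expose lookup(aa) -> ToR locator
        self.intermediates = intermediates
        self.cache = {}                   # local resolution cache (like the cache on the slide)

    def resolve(self, aa):
        if aa not in self.cache:                        # cache miss:
            self.cache[aa] = self.directory.lookup(aa)  # ask the directory system
        return self.cache[aa]

    def send(self, aa, payload, flow_hash):
        tor = self.resolve(aa)
        bounce = self.intermediates[flow_hash % len(self.intermediates)]
        # Nested headers: intermediate -> destination ToR -> destination AA.
        return {"dst": bounce,
                "payload": {"dst": tor, "payload": {"dst": aa, "payload": payload}}}
```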

Separating Network Changes from Tenant Changes
How to implement VLB while avoiding the need to update state on every host at every topology change?
[Figure: L3 network running OSPF; all intermediate switches advertise the same anycast address (I-ANY), with separate links used for up paths and down paths; answer: IP anycast + flow-based ECMP]
- Harness huge bisection bandwidth
- Obviate esoteric traffic engineering or optimization
- Ensure robustness to failures
- Work with switch mechanisms available today
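A sketch of flow-based ECMP as it is used here: a switch hashes the 5-tuple and picks one of its equal-cost next hops toward the anycast intermediate address, so a flow stays on one path while different flows spread out, and topology changes only update switch routing state, not host state. The hash function and names are illustrative.

```python
import zlib

def ecmp_next_hop(next_hops, src_ip, dst_ip, src_port, dst_port, proto):
    """Deterministically map a flow onto one of the equal-cost next hops."""
    flow_key = f"{src_ip}{dst_ip}{src_port}{dst_port}{proto}".encode()
    return next_hops[zlib.crc32(flow_key) % len(next_hops)]

uplinks = ["agg-1", "agg-2", "agg-3", "agg-4"]   # hypothetical equal-cost uplinks
print(ecmp_next_hop(uplinks, "10.0.1.5", "10.0.9.7", 45320, 80, "tcp"))
```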

VL2 Analysis and Prototyping
- Will it work? VLB traffic engineering depends on there being few long flows
- Will it work? The control plane has to be stable at large scale

DHCP Discovers/s   CPU load   DHCP Discovers Delivered   DHCP Offers Delivered
100                7%         100%                       100%
200                9%         100%                       73.3%
300                10%        100%                       50.0%
400                11%        100%                       37.4%
500                12%        100%                       31.2%
1000               17%        99.8%                      16.8%
1500               22%        99.7%                      12.0%
2000               27%        99.4%                      11.2%
2500               30%        99.4%                      9.0%

Prototype results: huge amounts of traffic with excellent efficiency
- 154 Gbps goodput sustained among 212 servers
- 10.2 TB of data moved in 530 s
- Fairness of 0.95/1.0 - great performance isolation
- 91% of maximum capacity
- TCP RTT of 100-300 microseconds on a quiet network - low latency
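The fairness number above is on a 0-1 scale; the standard metric for such a score is Jain's fairness index over per-flow (or per-server) goodputs. The slide does not name the metric, so treat this as an assumed definition used for illustration.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: 1.0 means every flow got identical goodput."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

print(jain_fairness([1.0, 1.0, 1.0, 1.0]))   # 1.0
print(jain_fairness([1.0, 0.9, 1.1, 0.6]))   # ~0.96, roughly the 0.95 regime quoted
```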

VL2 Prototype
- Experiments conducted with 40, 80, and 300 servers
- Results show near-perfect scaling
- Gives us some confidence that the design will scale out as predicted

VL2 Achieves Uniform High Throughput
- Experiment: all-to-all shuffle of 500 MB among 75 servers (2.7 TB)
- Excellent metric of overall efficiency and performance
  - All-to-all shuffle is a superset of other traffic patterns
- Results: average goodput 58.6 Gbps; fairness index 0.995; average link utilization 86%
- Perfect system-wide efficiency would yield an aggregate goodput of 75 Gbps
  - Monsoon efficiency is 78% of perfect
    - 10% inefficiency due to duplexing issues; 6% header overhead
  - VL2 efficiency is 94% of optimal
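The headline numbers follow from simple arithmetic; a small sanity-check sketch, assuming 1 Gbps NICs as described on the earlier slides (the exact overhead accounting behind the 94% figure is not given, so the last step is approximate):

```python
servers = 75
pair_mb = 500                                         # each server sends 500 MB to every other

total_tb = servers * (servers - 1) * pair_mb / 1e6    # ~2.78 TB, close to the 2.7 TB quoted
ideal_gbps = servers * 1.0                            # 75 saturated 1 Gbps NICs
measured_gbps = 58.6

print(round(total_tb, 2))                             # 2.78
print(round(measured_gbps / ideal_gbps, 2))           # 0.78 of perfect
# Discounting the quoted ~10% duplexing loss and ~6% header overhead gives an
# achievable ceiling of roughly 75 * 0.9 * 0.94 ~= 63 Gbps; 58.6 / 63 lands in the
# low-90s percent range, consistent with the "94% of optimal" on the slide.
```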

VL2 Provides Performance Isolation
- Service 1 is unaffected by service 2's activity

VLB vs. Adaptive vs. Best Oblivious Routing
- VLB does as well as adaptive routing (traffic engineering using an oracle) on data center traffic
- The worst link is 20% busier with VLB; the median is the same

Related Work
- OpenFlow
  - Shares the idea of simple switches controlled by external software
  - VL2 is a philosophy for how to use the switches
- Fat-trees of commodity switches [Al-Fares, et al., SIGCOMM 08]
  - Shares a preference for a Clos topology
  - Monsoon provides a virtual layer 2 using different techniques: changes to servers, an existing forwarding primitive, a directory service
- DCell [Guo, et al., SIGCOMM 08]
  - Uses servers themselves to forward packets
- SEATTLE [Kim, et al., SIGCOMM 08]
  - Shared goal of a large L2; different approach to the directory service
- Formal network theory and HPC
  - Valiant Load Balancing, Clos networks
- Logically centralized routing
  - 4D, Tesseract, Ethane

Summary
- The key to economical data centers is agility: any server, any service
  - Today, the network is the largest blocker
- The right network model to create is a virtual layer 2 per service
  - Uniform high bandwidth
  - Performance isolation
  - Layer 2 semantics
- VL2 implements this model via several techniques
  - Randomizing to cope with volatility (VLB) -> uniform bandwidth / performance isolation
  - Name/location separation & end-system changes -> L2 semantics
  - End-system changes & proven technology -> deployable now
- Performance is scalable
- VL2: any server/any service agility via scalable virtual L2 networks that eliminate fragmentation of the server pool

<shameless plug> Want to design some of the biggest data centers in the world? Want to experience what scalable and reliable really mean? Think measuring compute capacity in millions of MIPS is small potatoes? Bing's Autopilot team is hiring! </shameless plug>

More Information
- The Cost of a Cloud: Research Problems in Data Center Networks http://research.microsoft.com/~dmaltz/papers/dc-costs-ccr-editorial.pdf
- VL2: A Scalable and Flexible Data Center Network http://research.microsoft.com/apps/pubs/default.aspx?id=80693
- Towards a Next Generation Data Center Architecture: Scalability and Commoditization http://research.microsoft.com/~dmaltz/papers/monsoon-presto08.pdf
- DCTCP: Efficient Packet Transport for the Commoditized Data Center http://research.microsoft.com/en-us/um/people/padhye/publications/dctcp-sigcomm2010.pdf
- The Nature of Datacenter Traffic: Measurements and Analysis http://research.microsoft.com/en-us/um/people/srikanth/data/imc09_dctraffic.pdf
- What Goes into a Data Center? http://research.microsoft.com/apps/pubs/default.aspx?id=81782
- James Hamilton's Perspectives blog http://perspectives.mvdirona.com
- Designing & Deploying Internet-Scale Services http://mvdirona.com/jrh/talksandpapers/jamesrh_lisa.pdf
- Cost of Power in Large-Scale Data Centers http://perspectives.mvdirona.com/2008/11/28/costofpowerinlargescaledatacenters.aspx

BACKUP SLIDES

Other Issues
- Dollar costs of a VL2 network
- Cabling costs and complexity
- Directory System performance
- TCP incast
- Buffer allocation policies on the switches

Cabling Costs and Issues
- Cabling complexity is not a big deal
  - Monsoon network cabling fits nicely into a conventional open-floor-plan data center
  - Containerized designs are available
- Cost is not a big deal
  - Computation shows it as 12% of total network cost
  - Estimate: SFP+ cable = $190, two 10G ports = $1K, so cabling should be ~19% of switch cost
[Figure: network cage layout with ToR, intermediate, and aggregation switches]

Directory System Performance
- Key issues:
  - Lookup latency (SLA set at 10 ms)
  - How many servers are needed to handle a DC's lookup traffic?
  - Update latency
  - Convergence latency

Directory System
[Figure: directory system architecture. Lookups: an agent sends 1. Lookup to a directory server (DS) and gets 2. Reply. Updates: an agent sends 1. Update to a DS, which issues 2. Set to the replicated state machine (RSM) servers; the RSM performs 3. Replicate, returns 4. Ack, the DS sends 5. Ack back to the agent, and (6. Disseminate) pushes the change out to the DS servers]

Directory System Performance
- Lookup latency
  - Each server assigned to the directory system can handle 17K lookups/sec with 99th-percentile latency < 10 ms
  - Scaling is linear as expected (verified with 3, 5, and 7 directory servers)
- Directory system sizing
  - How many lookups per second? The median node has 10 connections; 100K servers = 1M entries
  - Assume (worst case?) that all need to be refreshed at once
  - 64 servers handle the load within the 10 ms SLA
  - The directory system consumes 0.06% of total servers
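The sizing argument is simple division; a sketch with the slide's numbers. The one-second refresh window is my reading of "all refreshed at once" and is not stated on the slide.

```python
entries = 100_000 * 10        # 100K servers x median 10 connections each = 1M directory entries
per_server_rate = 17_000      # lookups/s one directory server sustains within the 10 ms SLA

# If all 1M entries had to be re-resolved within roughly one second (an assumed
# worst-case window), the fleet would need about this many directory servers:
servers_needed = -(-entries // per_server_rate)   # ceiling division
print(servers_needed, "directory servers (the slide provisions 64)")   # 59
print(f"{64 / 100_000:.2%} of total servers")                          # 0.06%
```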

Directory System Performance

The Topology Isn't the Most Important Thing
- A two-layer Clos network seems optimal for our current environment, but other topologies can be used with Monsoon
  - A ring/Chord topology makes organic growth easier
  - Multi-level fat trees, parallel Clos networks
[Figure: an alternative topology built from type (1) switches (d1 = 40 ports, n1 = 144 switches) and type (2) switches (d2 = 100 ports, n2 = 72 switches), with ToRs attached in n/(d1-2) positions over layer 1 and layer 2 links; number of servers = 2 x 144 x 36 x 20 = 207,360]

VL2 is Resilient to Link Failures
- Performance degrades and recovers gracefully as links are failed and restored

Abstract (this won't be part of the presented slide deck; I'm just keeping the information together)
Here's an abstract and slide deck for a 30- to 45-minute presentation on VL2, our data center network. I can add more details on the Monsoon design or more background on the enabling hardware, the traffic patterns, etc. as desired. See http://research.microsoft.com/apps/pubs/default.aspx?id=81782 for possibilities. (We could reprise the tutorial if you'd like; it ran in 3 hours originally.) We can do a demo if that would be appealing (takes about 5 min).

To be agile and cost effective, data centers must allow dynamic resource allocation across large server pools. Today, the highest barriers to achieving this agility are limitations imposed by the network, such as bandwidth bottlenecks, subnet layout, and VLAN restrictions. To overcome this challenge, we present VL2, a practical network architecture that scales to support huge data centers with 100,000 servers while providing uniform high capacity between servers, performance isolation between services, and Ethernet layer-2 semantics. VL2 uses (1) flat addressing to allow service instances to be placed anywhere in the network, (2) Valiant Load Balancing to spread traffic uniformly across network paths, and (3) end-system-based address resolution to scale to large server pools without introducing complexity to the network control plane. VL2's design is driven by detailed measurements of traffic and fault data from a large operational cloud service provider. VL2's implementation leverages proven network technologies, already available at low cost in high-speed hardware implementations, to build a scalable and reliable network architecture. As a result, VL2 networks can be deployed today, and we have built a working prototype with 300 servers. We evaluate the merits of the VL2 design using measurement, analysis, and experiments. Our VL2 prototype shuffles 2.7 TB of data among 75 servers in 395 seconds, sustaining a rate that is 94% of the maximum possible.