Netflix Open Connect: 2 Terabits from 6 Racks
Who am I?
Ryan Woolley, Network Architect for Netflix
rwoolley@netflix.com
What is Netflix?
- Unlimited, flat-fee streaming to hundreds of devices
- Movies, television, and original programming
Netflix in Brazil
- Launched in September 2011
- Present at PTT São Paulo; ATM (MLPA) participant
- Open peering policy
- Planning to build into PTT Rio in February 2014
- Evaluating expansion into other locations
Netflix Share of US ISP Traffic
Netflix 29%, Other 24%, YouTube 15%, HTTP 11%, BitTorrent 9%, Facebook 2%, Flash Video 2%, Hulu 2%, iTunes 2%, MPEG 2%, SSL 2%
Source: Sandvine Global Internet Phenomena 1H 2013
Netflix-Developed Adaptive Client
- All content delivered via HTTP
- Clients actively measure network performance to select bitrate and CDN (primary and backup cache clusters), both at stream start and during playback (sketched below)
- Very large library of catalog titles
- Wide distribution of viewing across the entire library, driven by a highly personalized recommendation engine
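As a rough illustration of that client behavior, here is a minimal sketch; it assumes a hypothetical probe() throughput helper and an invented bitrate ladder, and is not Netflix's actual client code:

    # Illustrative sketch, not Netflix client code: `probe` is a
    # hypothetical callable returning measured throughput in kbps.
    BITRATES_KBPS = [235, 375, 750, 1750, 3000, 5800]  # invented ladder

    def pick_bitrate(throughput_kbps, headroom=0.8):
        """Highest bitrate that fits inside a safety margin."""
        usable = throughput_kbps * headroom
        fits = [b for b in BITRATES_KBPS if b <= usable]
        return fits[-1] if fits else BITRATES_KBPS[0]

    def pick_cluster(clusters, probe):
        """Probe each candidate cluster (primary first, then backups)
        and keep whichever sustains the highest throughput."""
        best, best_tput = None, 0.0
        for cluster in clusters:
            tput = probe(cluster)  # client-side measurement, kbps
            if tput > best_tput:
                best, best_tput = cluster, tput
        return best, pick_bitrate(best_tput)

Re-running this logic during playback, not just at stream start, is what lets a client step bitrates up or down and fail over from a primary to a backup cache cluster mid-stream.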
Netflix CDN delivery mechanisms
We deliver our content to our customers via one of three methods:
- Open Connect Appliances embedded within providers
- Peering at carrier-neutral data center sites
- Transit from carrier-neutral data center sites
Some Background on Open Connect
- We began the Open Connect project approximately two years ago
- 100% of our traffic is served from the Open Connect platform, in 40 countries
- We have >16 Terabits of network and server capacity located around the world
Not a typical network hardware deployment
- No aggregation layer: we have no east-west traffic
- Load-balancing and content-routing intelligence is in the application
- I need high-density 10GE, but also full BGP tables
- I know what will be popular, so I can place content appropriately to level load (see the sketch below)
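A minimal sketch of popularity-driven placement; the greedy heuristic here is our assumption for illustration, not Netflix's actual fill algorithm:

    import heapq

    def place_titles(titles, appliances):
        """Greedy load-leveling: most popular titles first, each onto the
        least-loaded appliance that still has space (simplified sketch)."""
        # heap entries: (predicted_load, appliance_name, free_gb)
        heap = [(0.0, name, cap_gb) for name, cap_gb in appliances]
        heapq.heapify(heap)
        placement = {}
        for title, popularity, size_gb in sorted(titles, key=lambda t: -t[1]):
            load, name, free = heapq.heappop(heap)
            if size_gb <= free:
                placement[title] = name
                heapq.heappush(heap, (load + popularity, name, free - size_gb))
            else:
                heapq.heappush(heap, (load, name, free))  # full: skip title
        return placement

    print(place_titles(
        titles=[("title-a", 9.0, 40), ("title-b", 7.0, 40), ("title-c", 2.0, 40)],
        appliances=[("oca-1", 80), ("oca-2", 80)],
    ))

Because popularity is predicted ahead of time, placement can be computed offline and pushed to appliances, rather than balanced reactively in the network.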
(Diagram: transit and peering terminate on one big, expensive router, which connects directly to servers; ~4000 ports deployed)
But there's a lot of logic in the back-end
(Diagram: broadband ISP, Netflix control servers, Netflix OCA)
1. User routing is done by Netflix control servers, not dependent on client DNS configuration
2. The request is routed to the nearest available OCA
3. Client connects to local OCA
4. Local OCA delivers video stream
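A hedged sketch of that steering flow; the real control-plane protocol is Netflix-internal, so the field names and tier model below are invented for illustration:

    # Invented field names; the real control-plane protocol is internal.
    def rank_ocas(ocas):
        """Control-server side: rank candidate OCAs for a client,
        preferring embedded sites, then peering, then transit, and
        breaking ties by current load."""
        tiers = {"embedded": 0, "peering": 1, "transit": 2}
        live = [o for o in ocas if o["healthy"] and o["has_title"]]
        return sorted(live, key=lambda o: (tiers[o["tier"]], o["load"]))

    # Client side: try the ranked list in order (primary, then backups).
    ocas = [
        {"url": "https://oca2.ix.example", "tier": "peering",
         "load": 0.3, "healthy": True, "has_title": True},
        {"url": "https://oca1.isp.example", "tier": "embedded",
         "load": 0.6, "healthy": True, "has_title": True},
    ]
    for oca in rank_ocas(ocas):
        print("try", oca["url"])  # embedded site first despite higher load

Because the control servers make this decision, steering works correctly even when a client uses a third-party DNS resolver that hides its true location.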
Multi-Tier Architecture
Cache hardware is identical in each tier; variations in content sharding create different roles (see the sketch after this list):
- Headend: each cache has identical content; 80% offload
- Small Aggregation Location: sharded content; 95+% offload
- Large Aggregation Location: sharded content; 100% of active catalog
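A minimal sketch of what sharding can look like (illustrative only; the slides do not specify Netflix's actual scheme): hash each title to a single owner cache so the cluster as a whole covers the catalog without duplication, unlike the headend tier where every cache carries the same popular content.

    import hashlib

    def shard_owner(title_id, caches):
        """Stable hash from title to exactly one cache, so the cluster
        jointly covers the catalog with no duplication."""
        digest = hashlib.md5(title_id.encode()).hexdigest()
        return caches[int(digest, 16) % len(caches)]

    caches = ["oca-1", "oca-2", "oca-3", "oca-4"]
    for title in ["title-123", "title-456", "title-789"]:
        print(title, "->", shard_owner(title, caches))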
The Netflix Open Connect Appliance (OCA)
- Developed in response to ISP requests to help scale Netflix traffic efficiently
- Reduces ISP cost by serving Netflix traffic from the local ISP datacenter, CO, or headend, rather than from upstream network interconnects
- Speeds up internet access for consumers to all third-party internet sites, because Netflix traffic is no longer a source of middle-mile or backbone congestion
- Netflix bears the capital and maintenance costs, not the ISP; the ISP provides space, power, and a network port
- An OCA is a component of the Netflix CDN (vs. a cache)
The OCA Hardware
- Space optimized: 4U high-density storage
- Power optimized for low power/cooling requirements (~500W)
- 10GE optical interfaces
- Redundant power supplies (AC or DC)
- 216 TB in one unit
Disk-based OCAs
- 8.6 PB in four racks; contains full catalog
- Deployed in two stacks of 20
- ~15 Gbps each, 600 Gbps total
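These figures are self-consistent: two stacks of 20 is 40 appliances, and 40 × 216 TB ≈ 8.6 PB of storage, while 40 × ~15 Gbps = 600 Gbps of serving capacity.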
900 Gbit/sec in another rack
SSD-based Open Connect Appliances
In sites with 1+ Tbps of Netflix traffic at peak:
- 14 TB per 1U system: commodity SSD (< US$0.60/GB, Micron M500), 1 TB in 2.5" form factor
- 3x 10 Gbps SFP+ NICs; the 4th is left unused due to bus limitations, except on Juniper installations (to manage oversubscription)
- Total system power: 125W per 1U
- Software stack (same as the spinning-disk systems, which these complement): FreeBSD / nginx / bird / Netflix application code
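Assuming all three active NICs run near line rate, a rack of 30 such hosts serves 30 × 3 × 10 Gbps = 900 Gbps, which matches the "900 Gbit/sec in another rack" slide above, with 30 × 14 TB = 420 TB of flash.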
2 Terabits in a day
- We keep configurations templated and homogeneous (sketched below)
- Cabling is custom-made pre-wire bundles (MTP to LC breakout); the only option we select is length
- Every colo looks basically the same: 5-7 racks
- We decide how much infrastructure to deploy based on geographic sizing
- Colo vendors never touch our routers; cross connects are run to MTP panels which are pre-wired to routers
- All of this means that we can deploy 2T of infrastructure in ~1 day
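A minimal sketch of that templating idea; the template, interface names, and host naming below are invented, not Netflix's configuration:

    from string import Template

    # Invented template and naming; illustrates the approach only.
    PORT = Template(
        "interface $port\n"
        " description OCA $host\n"
        " mtu 9216\n"
    )

    def render_rack(rack_id, n_hosts=20):
        """Emit an identical stanza for every appliance port in a rack,
        so each site is a copy of the same template."""
        return "\n".join(
            PORT.substitute(port=f"TenGigE0/{rack_id}/0/{i}",
                            host=f"oca-r{rack_id}-{i:02d}")
            for i in range(1, n_hosts + 1)
        )

    print(render_rack(rack_id=1))

Generating every port from one template is what makes every colo "look basically the same" and keeps the build to a single day.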
MTP Connector Overview
- Multiple-fiber push-on
- 12 fibers (available as 24): 6 duplex pairs
- Critical for cable management at this scale
MTP Cabling
Aggregates 20 10GE ports (one rack of 4U appliances) into 4 MTP connectors
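The fiber math, assuming 12-fiber trunks: 20 duplex 10GE ports need 40 strands, and 4 × 12-fiber MTP connectors provide 48 strands (24 duplex pairs), so one rack lands on four connectors with four pairs to spare.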
MTP Cabling (SSD)
- Each host uses an MTP-to-LC whip that allows for rapid deployment of cabling to each rack
- A rack of 30 flash hosts (120 10G ports) takes approximately 45 minutes to wire
MTP Cabling (Router)
Custom-built MTP-to-LC cables connect racks to routers
MTP Cabling (Demarc)
We do the same for demarcation of transit and peering
MTP enables growth
In addition to a quick build, we use this to enable low-impact upgrades
Netflix BGP Outage Events
- Blue dots are advertisements; yellow-red dots are withdrawals
- Nearly 400 IPv4 peers represented here, one Renesys peer per line
- Percentage of peers with a valid route plotted at top
© 2013 Renesys Corporation
LAX2: Friday 8 March 2013 (1h35m)
(Chart annotations: 43m12s, 1m53s, 0m55s)
© 2013 Renesys Corporation
ATL1: Friday 22 March 2013 (22m7s)
12:58:19 - 13:20:26
© 2013 Renesys Corporation
LAX1: Monday 1 April 2013 (31 seconds)
20:20:51 - 20:21:22
© 2013 Renesys Corporation
ORD1: Tuesday 16 April 2013 (7m42s)
19:22:29 - 19:30:11
© 2013 Renesys Corporation
ATL2: Tuesday 23 April 2013 (1m28s)
12:48:45 - 12:50:13
© 2013 Renesys Corporation
What next?
- Double-size cluster: 200 OCAs per site
- Cisco ASR 9922: 1440 10GE ports (2x 9922), or some combination of 10GE and 100GE
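That port count implies a particular line-card mix (our assumption, not stated in the slide): the ASR 9922 has 20 line-card slots, so two chassis fitted with 36-port 10GE cards give 2 × 20 × 36 = 1440 10GE ports.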
Questions?