CDN and Traffic-structure
Outline Basics CDN Traffic Analysis 2
Outline Basics CDN Building Blocks Services Evolution Traffic Analysis 3
A Centralized Web! Slow content must traverse multiple backbones and long distances Unreliable delivery may be prevented by congestion or backbone peering problems Not scalable usage limited by bandwidth available at master site Inferior streaming quality packet loss, congestion, and narrow pipes degrade stream quality The old centralized server oriented web did not scale. 4 Source: Bruce Maggs, CCGrid 2001 Keynote
Content Delivery Network (CDN) Content Delivery Networks (CDNs) emerged as a solution to Internet service degradation Moving content to the edge of the Internet, close to end-users Alternatives Increased bandwidth, Web caching, Web pre-fetching CDN advantages Reduced server loads Distributed network traffic Reduced latency First CDN emerged in 2001 Source: Content Delivery Networks: Overlay Networks for Scaling and Enhancing the Web www.gridbus.org 5
Content Delivery Network (CDN) Basic Building Blocks of a CDN. Basic Building Blocks Infrastructure: servers that replicate the content of the server that publishes content (infrastructure, replication) Redirection mechanism to forward requests to servers, typically done through DNS Content synchronization and consistency: mechanism that guarantees that the content is fresh Operation support Simple building blocks, Elementary interaction with DNS needed 6
Content Delivery Network (CDN) Details of building blocks. CDN functional components Content delivery component Origin server and a set of edge servers (surrogates) to replicate content Request-routing component Direct user requests to edge servers Interact with the distribution component to keep an up-to-date view of content Content distribution component Moves content from the origin to edge servers and ensures consistency Accounting component Maintains logs of client accesses and records usage of the servers Assists in traffic reporting and usage-based billing Web User 1 Request Routing System ` Replica Server 1 Origin Server CDN Distribution System Replica Server N Accounting System Web User M Billing Organization CDNs can be realized on top of carriers IP networks: Easy traffic management by third parties Source: Content Delivery Networks: Overlay Networks for Scaling and Enhancing the Web www.gridbus.org ` 7
Content Delivery Network (CDN) CDN Supported Content and Services. Static content Static HTML pages, images, documents, software patches Streaming media Audio, real-time video User Generated Video (UGV) Content services Directory, e-commerce, file transfer services Sources of content Large enterprises, Web service providers, media companies, and news broadcasters Customers Media and Internet advertisement companies, data centers, ISPs, online music retailers, mobile operators, consumer electronics manufacturers, and other carrier companies User interaction Cell phone, smart phone/pda, laptop, and desktop Contents / services E - docs Web Pages Streaming media Music ( MP 3 ) / Audio CDN Clients Cell Phone Smart phone / PDA Desktop Laptop 8
Content Delivery Network (CDN). Content replication and synchronization. Content Content outsourcing outsourcing mode Push: mostly for real-time, dynamic Pull: static content Non-cooperative pull-based Upon cache miss, surrogate servers pull content from the origin server Used by most CDN providers Cooperative pull-based Surrogate servers cooperate with each other to get the requested content in case of a cache miss Only used by Coral CDN, making use of DHTs CDN is highly decoupled from standard IP transport business 9
Content Delivery Network (CDN). Deployment and redirection. CDN Infrastructure Deployment Schemes Location of servers: close to customers (akamai model): put hardware inside the ISP (leased or not), advanced virtualization, acts as a cache/proxy for the ISP so reduces ISP backbone network utilization well-connected data-centers (Google, Limelight): directly peer with large ISPs and at IXPs at strategic locations, not so ISP backbone friendly Relation to peerings: close to customers does not require large peerings, only to populate the caches large peerings that require timely renegociations and SLAs Sharing of infrastructure: CDN dedicated to a single service/ customer base general-purpose Static: pre-allocation of user to a (set of) server (IPTV) Dynamic: CDN Redirection Schemes load-based: load-balancing across the CDN servers (irrespective of the network) (CDN optimization) network-based: aware of network properties such as latency and proximity-based (to the user) (service-oriented optimization) Highly decoupled from standard IP transport business 10
Easy change of traffic structure by third parties. CDN Evolutions. C h a n g e d f o c u s, i n c r e a s e d f u n c t i o n a l a b i l i t y, i m p r o v e d p e r f o r m a n c e, u s e r - c e n t r i c Pre CDN Evolution Hierarchical caching Caching proxy deployment Improved Web server Server farms First Gen.CDN Static and Dynamic Content Second Gen.CDN Video on Demand, media streaming, mobile CDNs Third Gen.CDN Community - based CDNs Pre - evolutionary period Late 90 ' s 2002 2005 2007 2010 onwards 2010 Usually CDN provider do not have detailed information about the entry point of the customer Most of the incumbent are not active in the CDN market Traffic of ISP strongly depend on activities of CDN provides and OTT CDN: Is there an opportunity for incumbent operator? 11
Outline Basics CDN Traffic Analysis Application MIX Diversity of Traffic Sources Application ISP collaboration 12
Application Mix. Which applications DT network. Volume (GB) 350 300 250 200 150 Unclassified Classified NTTP HTTP Edonkey BitTorrent TOP Applications Based on Aug 09 Data 1. HTTP 63% 2. BitTorrent 9% 3. RTMP 3% 4. edonkey 3% 5. NNTP 2% 6. SSL 2% 7. Shoutcast 1% 100 50 0 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 Traffic dominated by HTTP ~ 63% 13
Measurement results and analysis: Application mix. Application mix & HTTP by content types and Top domains. HTTP dominates (57% of Bytes) Content types used: 25% of HTTP is Flash- Video types, e.g., YouTube 15% of HTTP is RAR- Archive, e.g. RapidShare Top Domains include: One-Click Hoster Video streaming Software Downloads/ Updates No significant hiding/ tunneling via HTTP HTTP dominance due to popular high-volume content 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% HTTP Content-Types Flash-video 25% RAR-archive 15% image 11% video 8% other 23% unclassified 17% TOP Domains (Aug 09) 1. 12.6% 2. 10.8% 3. 2.8% 4. 2.2% 5. 1.8% 6. 1.5% 7. 1.4% 8. 1.4% 9. 1.3% 10. 1.1% 11. 1.0% 14
Application mix: Which applications comparison to other networks? 15
Application mix: Residential Traces Details (Maier 2009). Source: Maier et al, IMC 09 16
Outline Basics CDN Traffic Analysis Application MIX Diversity of Traffic Sources Application ISP collaboration 17
Internet Traffic Characteristics. Location diversity.. Internet and P2P Traffic Characteristics Textbox Headline Location Diversity of HTTP Content Textbox Headline 100% 80% 60% 40% 20% 0% Other HTTP P2P ISP Arbor (global) Sandvine (global) Internet Traffic Distribution 2009 Availability at # locations 120 100 80 60 40 20 0 75% 50% 25% 10% Proportion of P2P traffic in Internet decreasing. HTTP is the dominant traffic. High location diversity of important content. 18
Internet Traffic Characteristics. Consolidation of Content Diversity of Paths Top-10 Applications or Content Providers are responsible for around 50% the HTTP traffic. More than 60% of the HTTP traffic can be download from at least 3 different locations 19
Diversity of content Public peering points: Google. General trend for huge OTT provider: Peering close to the customer, bypassing of tier 1 level. 20
Diversity of content Public peering points: Akamai 21
Traffic Engineering Opportunities. Opportunities for redirection. 80% of HTTP traffic can be re-directed DNS queries of the top 10,000 hosts by volume as monitored in residential network of 20,000 DSL users. Opportunity from for DNS usage: 80% of HTTP traffic can be re-directed 22 5
Traffic redirection via DNS 5 External DNS 2 1 3 Provider DNS Internet Service Provider (ISP) 4 Client Servers 23
Download times Example: CDN Zum Teil suboptimale Downloadzeiten! 24
Traffic engineering via server selection Host A Host B Host C Client 25
Traffic engineering via server selection Host A Host B Host C Client 26
Traffic engineering via server selection Host A Host B Host C Client 27
Server selection: How? External DNS 2 1 6 3 Provider DNS Internet Service Provider 5 (ISP) 4 PaDIS Client Host Provider-aided Distance Information System 28