Content Delivery Networks Terena 2000 ftp://ftpeng.cisco.com/sgai/t2000cdn.pdf Silvano Gai Cisco Systems, USA Politecnico di Torino, IT sgai@cisco.com Terena 2000 1
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 2
At the beginning were Web Caches S1 S2 IP Network IP Network S3 Web Cache A Web Cache is a device that stores a local copy of more recently required HTTP objects and reacts as proxy server to clients requests 3
Motivations for Content Delivery Networks Web site performance improvement Static Web pages or nonlive streaming media Server farms are far from users Better RTT better user s experience Bandwidth Cost saving Less probability to incur in a bottleneck Availability Load balancing Traffic peaks crashes sites 4
Content Delivery Networks (CDNs) Distributed Web Hosting Video-On-Demand MPEG on LAN Low/Mid-rate streaming on WAN Scalable Live Streaming Dynamic Content Conditional-Access Content advertisements 5
Example of CDN Content Updates Server Live Streams 1K to 10M Client Requests 6
SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH SIGHTPATH An Overlay Network over Internet A CDN is an overlaid network of Caches, a.k.a. Content Servers, a.k.a. Delivery Nodes, a.k.a. Replicas 7
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 8
The idea: a new DNS Server Architecture DNS queries for www.terena2000.com Standard DNS interface Routing Engine {DNS-names, IP-SA} IP-DA 9
DNS-based CDNs Host Names are used to redirect the traffic to the best replica the replica selections happens when the name is translated to an IP address DNS servers become Content Routers they measure as many metric as possible (RTT, Server Load, Layer 3 metrics, response time, etc.) to compute a replica routing table {{DNS-names, IP-SA} IP-DA} Metric measurement is not easy Layer 3 metrics alone are not particularly meaningful 10
Traditional Browsing Access Provider MSPG DNS XYZ Backbone Provider AT&T UUNET MCI Hosting Provider EXDS ABOV DNS Content Provider CNN 11
DNS-based CDN Browsing Access Provider MSPG DNS XYZ Replica Replica Backbone Provider AT&T UUNET Replica MCI Hosting Provider EXDS GBLX DNS Content Provider CNN CDN DNS 12
Example Akamaized content Akamaization HTML delivered by CNN Entire page delivered by CNN 13
DNS-based CDNs Limitations There are limitations The granularity of redirection is an host name, not a URL Content of large web sites cannot be split into multiple caches It is difficult to use the same host name for static and dynamic content The Akamai approach: Akamaized URLs: http://a836.g.akamaitech.net/7/836/123/e358f5db0045e/ www.terena2000.com/logo.gif 14
How Akamai works (1) It creates new domain names for each client content provider E.g., a128.g.akamai.net The CDNs DNS servers are authoritative for the new domains CDN s The content provider modifies its content so that embedded URLs reference the new domains http://cnn.com/a.gif --> http://a128.g.akamai.net/7/23/cnn.com/a.gif Apache module or IIS ISAPI filter 15
How Akamai works (2) Using multiple domain names for each client allows the CDN to further subdivide the content into groups DNS sees only the requested domain name, but it can route requests for different domains independently www.repubblica.it may point to mainly Italian caches www.elpais.es may point to mainly Spanish caches 16
Extension to DNS-based CDNs How to implement more granular DNS-based CDNs (e.g. how to look for the complete URL)? HTTP/RTSP Redirect Redirection can be obtained in two ways every server in the farm is capable to redirect An SLB (Server Load Balancer) is capable to redirect Effective only in a Local Area 17
The Next Step: URL-based CDNs URLs are used to redirect the traffic to the best Content Server URL routing requires TCP termination TCP termination is complex and expensive TCP termination introduces delay There will be only one TCP termination point Close to the client? Close to the server? 18
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 19
Server Farms A reality today Clients see a unique Virtual Server (IP address) Traffic destined to the Virtual Server is load balanced among different Real Server R IP Network Client (Browser) Data Server Application Server Web Server Server Farm 20
Server Load Balancing Real Server S1 Real Server S2 SLB IP Network Real Server S3 Client (Browser) Virtual Server 21
Server Load Balancing Content-unaware (layer 4 switching) TCP connections are not terminated by the SLB All packets belonging to a given TCP connection must be terminated always on the same server Examples Load distribution based on source IP address Load distribution based on hash (IP src/dst, TCP sport/dport), Content-aware (layer 7 switching) TCP connections with both clients and servers are terminated To support SSL (https) the SLB requires the server keys 22
TCP Proxy Client SYN Content Aware Router (Layer 7 Switch) Server SYN/ACK ACK GET URL SYN SYN/ACK ACK GET URL Data Data 23
Limitations with SLB Some applications require that TCP connections from the same client are redirected to the same server (Sticky Connections): Shopping Cart Searches Forms Economic Transactions Stickiness may be addressed/complicated by: source IP address cookies SSL ID 24
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 25
URL routing Can we build a router that routes on URLs? YES, but: statefull (we must terminate TCP) complex packet parsing (we need the URL) anycast router (a URL is associated to multiple replicas) Do we have URL routing tables? Do we have URL routing protocols? Do we have metrics? How do we compute them? 26
IP routing IP vs. Content Routing H1 R1 R2 R3 H2 Content routing or CR2 H1 CR1 H2 CR3 27
Or even more complex H1 CR1 CR2 CR3 H2 CR2 CR4 H1 CR1 CR5 H2 CR3 28
Content Delivery Control Protocols Content Routers in series cannot all terminate the TCP session: we don t want to reinvent X.25 URL must be extracted by the first Content Router propagate by a Content Delivery Control Protocol Some protocols have been proposed: HUP Christmas Tree ICAP Still in a very preliminary phase: if successful, they can be integrated in the hosts. 29
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 30
The Ethical question Is it ethical to deploy Content Routers in the Internet? They hijack the packets They spoof the addresses They break the end-to-end model of IP 31
Where, is the question Access Provider Here No, or may be Backbone Provider NOT HERE!!! Hosting Provider Content Provider Here YES 32
Agenda What are Content Delivery Networks? DNS based routing Server Load Balancing Content Routers Ethical questions Conclusion 33
Sometime CDNs are very good! 34
Sometime are not so good! 35
Content Provider Content Peering CDN1 CDN2 Control POP Routing POP Delivery POP CDN3 36
Conclusions Content Delivery Networks (CDNs) DNS-based will be widely deployed CDNs are not only for web traffic, but also for multimedia streaming Replicas will have slightly different content (e.g. local advertisement) Content Peering is still an unsolved problem Server Farms and Server Load Balancing will be widely deployed Intrusive content routing poses: ethical questions scalability concerns 37
The End Thank You 38