Web Caching and CDNs. Aditya Akella



Similar documents
DATA COMMUNICATOIN NETWORKING

Content Distribu-on Networks (CDNs)

Measuring the Web: Part I - - Content Delivery Networks. Prof. Anja Feldmann, Ph.D. Dr. Ramin Khalili Georgios Smaragdakis, PhD

How To Understand The Power Of A Content Delivery Network (Cdn)

Distributed Systems 19. Content Delivery Networks (CDN) Paul Krzyzanowski

Distributed Systems. 24. Content Delivery Networks (CDN) 2013 Paul Krzyzanowski. Rutgers University. Fall 2013

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015

Distributed Systems. 25. Content Delivery Networks (CDN) 2014 Paul Krzyzanowski. Rutgers University. Fall 2014

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at

CSC2231: Akamai. Stefan Saroiu Department of Computer Science University of Toronto

Data Center Content Delivery Network

Global Server Load Balancing

Internet Content Distribution

Content Delivery Networks (CDN) Dr. Yingwu Zhu

SiteCelerate white paper

Content Delivery Networks

ICP. Cache Hierarchies. Squid. Squid Cache ICP Use. Squid. Squid

The Value of Content Distribution Networks Mike Axelrod, Google Google Public

GLOBAL SERVER LOAD BALANCING WITH SERVERIRON

Measuring CDN Performance. Hooman Beheshti, VP Technology

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

Advanced Networking Technologies

Request Routing, Load-Balancing and Fault- Tolerance Solution - MediaDNS

Overview. Tor Circuit Setup (1) Tor Anonymity Network

Content Delivery Networks. Shaxun Chen April 21, 2009

The Critical Role of an Application Delivery Controller

Overlay Networks. Slides adopted from Prof. Böszörményi, Distributed Systems, Summer 2004.

Internet Content Distribution

FortiBalancer: Global Server Load Balancing WHITE PAPER

Zscaler Internet Security Frequently Asked Questions

THE MASTER LIST OF DNS TERMINOLOGY. v 2.0

Global Server Load Balancing

Akamai CDN, IPv6 and DNS security. Christian Kaufmann Akamai Technologies DENOG 5 14 th November 2013

Single Pass Load Balancing with Session Persistence in IPv6 Network. C. J. (Charlie) Liu Network Operations Charter Communications

THE MASTER LIST OF DNS TERMINOLOGY. First Edition

Implementing Reverse Proxy Using Squid. Prepared By Visolve Squid Team

AKAMAI WHITE PAPER. Delivering Dynamic Web Content in Cloud Computing Applications: HTTP resource download performance modelling

CS514: Intermediate Course in Computer Systems

EECS 489 Winter 2010 Midterm Exam

Rapid IP redirection with SDN and NFV. Jeffrey Lai, Qiang Fu, Tim Moors December 9, 2015

High Availability HTTP/S. R.P. (Adi) Aditya Senior Network Architect

Advanced Computer Networks. Layer-7-Switching and Loadbalancing

Scalable Internet Services and Load Balancing

FAQs for Oracle iplanet Proxy Server 4.0

Content Delivery Networks

high-quality steaming over the Internet

The Application Front End Understanding Next-Generation Load Balancing Appliances

High volume Internet data centers. MPLS-based Request Routing. Current dispatcher technology. MPLS-based architecture

Tema 5: Distribución de contenidos

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

CDN Brokering. Content Distribution Internetworking

How To Model The Content Delivery Network (Cdn) From A Content Bridge To A Cloud (Cloud)

The Effectiveness of Request Redirection on CDN Robustness

Testing & Assuring Mobile End User Experience Before Production. Neotys

Product Overview. UNIFIED COMPUTING Managed Load Balancing Data Sheet

A Tool for Evaluation and Optimization of Web Application Performance

Characterizing and Mitigating Web Performance Bottlenecks in Broadband Access Networks

The Effect of Caches for Mobile Broadband Internet Access

COMP 631: COMPUTER NETWORKS. How to distribute content without requiring centralized, heavy-duty servers? Peer-to-peer content distribution

Enabling Media Rich Curriculum with Content Delivery Networking

CHAPTER 4 PERFORMANCE ANALYSIS OF CDN IN ACADEMICS

Content Distribution Networks (CDN)

Indirection. science can be solved by adding another level of indirection" -- Butler Lampson. "Every problem in computer

Load Balancing and Sessions. C. Kopparapu, Load Balancing Servers, Firewalls and Caches. Wiley, 2002.

Scalable Internet Services and Load Balancing

Demand Routing in Network Layer for Load Balancing in Content Delivery Networks

CDN and Traffic-structure

Combining Global Load Balancing and Geo-location with Emissary TM

DNS traffic analysis -- Issues of IPv6 and CDN --

Comparative Performance Report

John S. Otto Fabián E. Bustamante

The Web History (I) The Web History (II)

Cisco Videoscape Distribution Suite Service Broker

Front-End Performance Testing and Optimization

Web Application Hosting Cloud Architecture

Enterprise Buyer Guide

DEPLOYMENT GUIDE Version 1.1. Deploying F5 with IBM WebSphere 7

Accelerating Wordpress for Pagerank and Profit

ExamPDF. Higher Quality,Better service!

Citrix NetScaler Global Server Load Balancing Primer:

FortiGate Multi-Threat Security Systems I

From Internet Data Centers to Data Centers in the Cloud

Reverse Proxy with SSL - ProxySG Technical Brief

Azure Media Service Cloud Video Delivery KILROY HUGHES MICROSOFT AZURE MEDIA

Project #2. CSE 123b Communications Software. HTTP Messages. HTTP Basics. HTTP Request. HTTP Request. Spring Four parts

Configuration Guide. BES12 Cloud

DEPLOYMENT GUIDE Version 1.2. Deploying the BIG-IP System v10 with Microsoft IIS 7.0 and 7.5

WAN Optimization, Web Cache, Explicit Proxy, and WCCP. FortiOS Handbook v3 for FortiOS 4.0 MR3

Transcription:

Web Caching and CDNs Aditya Akella 1

Where can bottlenecks occur? First mile: client to its ISPs Last mile: server to its ISP Server: compute/memory limitations ISP interconnections/peerings: congestion inside the network Caching at various locations to overcome these bottlenecks

3 Proxy Caches Proxy server origin server client client 3

Forward Proxy Cache close to the client Under administrative control of client-side AS Explicit proxy Requires configuring browser client Proxy server Implicit proxy client Service provider deploys an on path proxy that intercepts and handles Web requests 4

Reverse Proxy Cache close to server Either by proxy run by server or in third-party content distribution network (CDN) Directing clients to the proxy Map the site name to the IP address of the proxy Proxy server origin server origin server 5

Google Design... Servers Data Centers Servers Router Router Private Backbone Reverse Proxy Internet Reverse Proxy Requests Client Client Client 6

Limitations of Web Caching Much content is not cacheable Dynamic data: stock prices, scores, web cams CGI scripts: results depend on parameters Cookies: results may depend on passed data SSL: encrypted data is not cacheable Analytics: owner wants to measure hits Stale data Or, overhead of refreshing the cached data 7

Proxy Caches (A) Forward (B) Reverse (C) Both (D) Neither Reactively replicates popular content Reduces origin server costs Reduces client ISP costs Intelligent load balancing between origin servers Offload form submissions (POSTs) and user auth Content reassembly or transcoding on behalf of origin Smaller round-trip times to clients Maintain persistent connections to avoid TCP setup delay (handshake, slow start) 8

Content Distribution Networks 10

Content Distribution Network Proactive content replication Content provider (e.g., CNN) contracts with a CDN CDN replicates the content On many servers spread throughout the Internet origin server in North America CDN distribution node Updating the replicas Updates pushed to replicas when the content changes CDN server in S. America CDN server in Europe CDN server in Asia 11

Server Selection Policy Requires continuous monitoring of liveness, load, and performance Live server For availability Lowest load To balance load across the servers Closest Nearest geographically, or in round-trip time Best performance Throughput, latency, Cheapest bandwidth, electricity, 12

Server Selection Mechanism Application HTTP redirection GET Redirect GET OK Advantages Fine-grain control Selection based on client IP address Disadvantages Extra round-trips for TCP connection to server Overhead on the server 13

Server Selection Mechanism Routing Anycast routing 1.2.3.0/24 1.2.3.0/24 Advantages No extra round trips Route to nearby server Disadvantages Does not consider network or server load Different packets may go to different servers Used only for simple request-response apps 14

DNS query Server Selection Mechanism Naming DNS-based server selection local 1.2.3.4 1.2.3.5 Advantages Avoid TCP set-up delay DNS caching reduces overhead Relatively fine control Disadvantage Based on IP address of local Hidden load effect DNS TTL limits adaptation 15

How Works 16

Distributed servers Servers: ~100,000 Networks: ~1,000 Countries: ~70 Many customers Apple, BBC, FOX, GM IBM, MTV, NASA, NBC, NFL, NPR, Puma, Red Bull, Rutgers, SAP, Statistics Client requests Hundreds of billions per day Half in the top 45 networks 15-20% of all Web traffic worldwide 17

18 How Uses DNS cnn.com (content provider) DNS root server GET index. html http://cache.cnn.com/foo.jpg 1 2 HTTP HTTP global regional End user Nearby

19 How Uses DNS cnn.com (content provider) DNS TLD server 1 2 HTTP DNS lookup cache.cnn.com 3 ALIAS: 4 g.akamai.net global regional End user Nearby

20 How Uses DNS cnn.com (content provider) DNS TLD server 1 2 HTTP 3 5 4 6 ALIAS a73.g.akamai.net DNS lookup g.akamai.net global regional End user Nearby

21 How Uses DNS cnn.com (content provider) DNS TLD server 1 2 HTTP 3 5 4 6 7 global regional 8 Address End user 1.2.3.4 Nearby

22 How Uses DNS cnn.com (content provider) DNS TLD server 1 2 HTTP 3 5 4 6 7 global regional 8 End user GET /foo.jpg Host: cache.cnn.com 9 Nearby

23 How Uses DNS cnn.com (content provider) DNS TLD server GET foo.jpg 11 12 1 2 HTTP 3 5 4 6 7 global regional 8 End user GET /foo.jpg Host: cache.cnn.com 9 Nearby

24 How Uses DNS cnn.com (content provider) DNS TLD server 11 12 1 2 HTTP 3 5 4 6 7 global regional 8 End user 9 10 Nearby

25 How Works: Cache Hit cnn.com (content provider) DNS TLD server 1 2 HTTP 3 4 global regional End user 5 6 Nearby

Mapping System Equivalence classes of IP addresses IP addresses experiencing similar performance Quantify how well they connect to each other Collect and combine measurements Ping, traceroute, BGP routes, server logs E.g., over 100 TB of logs per days Network latency, loss, and connectivity 26

Mapping System Map each IP class to a preferred server Based on performance, health, etc. Updated roughly every minute Map client request to a server in the Load balancer selects a specific server E.g., to maximize the cache hit rate 27