Measuring the Web: Part I - Content Delivery Networks
Prof. Anja Feldmann, Ph.D.; Dr. Ramin Khalili; Georgios Smaragdakis, Ph.D.


Acknowledgement
Material presented in these slides is borrowed from presentations by: Mike Freedman (Princeton University), Ravi Sundaram (Northeastern University), Craig Labovitz (Arbor Networks, DeepField), Fabian Bustamante (Northwestern University), Tobias Flach (University of Southern California)

Today's Internet Traffic
- The Web has become the main transport for video and everything else
- Thousands of small web sites are subsumed by large content providers

Single Server, Poor Performance
- Single server
  - Single point of failure
  - Easily overloaded
  - Far from most clients
- Popular content
  - Popular site
  - Flash crowd (aka "Slashdot effect")
  - Denial-of-service attack

The Web: Simple on the Outside
[Figure: content providers and end users connected through the Internet]

But Problematic on the Inside
[Figure: content providers, peering points, IXPs, network providers, and end users]

For TCP, Distance Matters
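One way to see why distance matters: a TCP sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The sketch below only illustrates that bound; the 64 KiB window and the two RTT values are assumptions chosen for the example, not figures from the slides.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput in bit/s: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    window = 64 * 1024  # assumed window of 64 KiB (hypothetical value)
    # Assumed RTTs for a nearby and a distant server (hypothetical values).
    for label, rtt in [("nearby server", 0.010), ("distant server", 0.200)]:
        mbps = max_tcp_throughput_bps(window, rtt) / 1e6
        print(f"{label}: RTT {rtt * 1000:.0f} ms, at most {mbps:.1f} Mbit/s")
```

With the window held fixed, the bound shrinks in proportion to the RTT, which is the motivation for serving content from a server close to the client.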

Utilizing Caching: Proxy Caches
[Figure: clients, proxy server, origin server]

Forward Proxy
- Cache close to the client
  - Under administrative control of the client-side AS
- Explicit proxy
  - Requires configuring the browser
- Implicit/transparent proxy
  - Service provider deploys an on-path proxy that intercepts and handles Web requests
[Figure: clients and proxy server]

Reverse Proxy
- Cache close to the server
  - Either a proxy run by the server or one in a third-party content distribution network (CDN)
- Directing clients to the proxy
  - Map the site name to the IP address of the proxy
[Figure: proxy server and origin servers]

Content Distribution Network
- Proactive content replication
  - Content provider (e.g., CNN) contracts with a CDN
- CDN replicates the content
  - On many servers spread throughout the Internet
- Updating the replicas
  - Updates pushed to replicas when the content changes
[Figure: origin server in North America, CDN distribution node, CDN servers in S. America, Europe, and Asia]

Server Selection Policy
- Live server
  - For availability
- Lowest load
  - To balance load across the servers
- Closest
  - Nearest geographically, or in round-trip time
- Best performance
  - Throughput, latency, ...
- Cheapest
  - Bandwidth, electricity, ...
Requires continuous monitoring of liveness, load, and performance. Monitoring includes traceroutes, pings, BGP updates, etc.
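The criteria above can be combined in many ways. As a minimal sketch, assuming monitoring already supplies liveness, load, and RTT per server, one simple policy filters out dead and overloaded servers and then picks the closest one. The `Server` fields, the 0.8 load threshold, and the tie-breaking rule are illustrative assumptions, not how any particular CDN works.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    alive: bool      # liveness as reported by monitoring
    load: float      # fraction of capacity in use, 0.0 to 1.0
    rtt_ms: float    # measured round-trip time to the client


def select_server(servers, max_load=0.8):
    """Pick a live server with the lowest RTT, preferring non-overloaded ones."""
    live = [s for s in servers if s.alive]
    if not live:
        return None  # nothing available
    # Fall back to all live servers if every one exceeds the load threshold.
    candidates = [s for s in live if s.load <= max_load] or live
    # Lowest RTT first; break ties by lower load.
    return min(candidates, key=lambda s: (s.rtt_ms, s.load))


if __name__ == "__main__":
    pool = [
        Server("edge-a", alive=True, load=0.95, rtt_ms=5.0),
        Server("edge-b", alive=True, load=0.40, rtt_ms=12.0),
        Server("edge-c", alive=False, load=0.10, rtt_ms=3.0),
    ]
    print(select_server(pool).name)
```

A cost-aware policy would add bandwidth or electricity price as another term in the sort key.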

Server Selection Mechanism: Application
- HTTP redirection
  - Message exchange: GET, Redirect, GET, OK
- Advantages
  - Fine-grained control
  - Selection based on client IP address
- Disadvantages
  - Extra round trips for the TCP connection to the server
  - Overhead on the server
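As a rough illustration of selection based on the client IP address, the sketch below maps a client address to a replica and returns the status code and `Location` header of an HTTP redirect. The prefixes, hostnames, and the `redirect_for` helper are all hypothetical.

```python
import ipaddress

# Hypothetical mapping from client prefixes to replica hostnames.
REPLICAS = [
    (ipaddress.ip_network("192.0.2.0/24"), "eu.cdn.example.com"),
    (ipaddress.ip_network("198.51.100.0/24"), "us.cdn.example.com"),
]
DEFAULT_REPLICA = "origin.example.com"  # used when no prefix matches


def redirect_for(client_ip: str, path: str):
    """Return (status, headers) redirecting the client to a chosen replica."""
    addr = ipaddress.ip_address(client_ip)
    host = next((h for net, h in REPLICAS if addr in net), DEFAULT_REPLICA)
    return 302, {"Location": f"http://{host}{path}"}


if __name__ == "__main__":
    print(redirect_for("192.0.2.7", "/foo.jpg"))
```

The client then has to open a new connection to the host named in `Location`, which is the extra round-trip cost listed under the disadvantages.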

Server Selection Mechanism: Routing
- Anycast routing
- Advantages
  - No extra round trips
  - Route to nearby server
- Disadvantages
  - Does not consider network or server load
  - Different packets may go to different servers
  - Used only for simple request-response apps
[Figure: the same prefix, 1.2.3.0/24, shown at two locations]

Server Selection Mechanism: Naming
- DNS-based server selection
- Advantages
  - Avoids TCP set-up delay
  - DNS caching reduces overhead
  - Relatively fine control
- Disadvantages
  - Based on the IP address of the local DNS server
  - "Hidden load" effect
  - DNS TTL limits adaptation
[Figure: DNS query via the local DNS server; addresses 1.2.3.4 and 1.2.3.5]
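The point that the DNS TTL limits adaptation can be sketched with a toy caching resolver: once the local DNS server has cached an answer, the CDN's new mapping is not seen until the TTL expires. The `CachingResolver` class and the way time is passed in are assumptions made for the illustration; the addresses and the 20-second TTL are only example values.

```python
class CachingResolver:
    """Toy local DNS server that caches answers until their TTL expires."""

    def __init__(self, authoritative):
        # authoritative: callable mapping a name to (ip, ttl_seconds)
        self.authoritative = authoritative
        self.cache = {}  # name -> (ip, expiry_time)

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # cached answer is still valid
        ip, ttl = self.authoritative(name)
        self.cache[name] = (ip, now + ttl)
        return ip


if __name__ == "__main__":
    mapping = {"ip": "1.2.3.4"}  # what the CDN currently wants to hand out
    resolver = CachingResolver(lambda name: (mapping["ip"], 20))
    print(resolver.resolve("a73.g.akamai.net", now=0))
    mapping["ip"] = "1.2.3.5"  # the CDN changes its mapping
    print(resolver.resolve("a73.g.akamai.net", now=10))  # still the old answer
    print(resolver.resolve("a73.g.akamai.net", now=20))  # TTL expired, new answer
```

A shorter TTL lets the CDN react faster, at the price of more DNS queries.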

CDN example:
[Figure: content providers, servers at the network edge, IXPs, end users]

Statistics
- Distributed servers
  - Servers: ~150,000
  - Networks: ~1,100
  - Countries: ~80
- Many customers
  - Web portals
  - Streaming
  - E-commerce
- Client requests
  - Hundreds of billions per day
  - 15-20% of all Web traffic worldwide

How Akamai Uses DNS
[Figure, built up over several slides: end user, cnn.com (content provider), DNS root/TLD server, global DNS server, regional DNS server, cluster, nearby cluster]
- Steps 1-2: the end user fetches index.html from cnn.com over HTTP; the page references http://cache.cnn.com/foo.jpg
- Steps 3-4: DNS lookup for cache.cnn.com returns ALIAS g.akamai.net
- Steps 5-6: DNS lookup for g.akamai.net returns ALIAS a73.g.akamai.net
- Steps 7-8: DNS lookup for a73.g.akamai.net returns address 1.2.3.4
- Step 9: the end user sends "GET /foo.jpg" with "Host: cache.cnn.com" to the nearby cluster
- Steps 11-12: the cluster requests foo.jpg from cnn.com (GET foo.jpg)
- Step 10: the nearby cluster answers the end user
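The alias chain in these slides can be traced with a small lookup table. This is only a sketch of following CNAME-style aliases until an address record is reached; the `RECORDS` table copies the names and the address 1.2.3.4 from the slides, while the `resolve` helper and its hop limit are assumptions made for the example.

```python
# Records taken from the slide example; "CNAME" stands for the ALIAS steps.
RECORDS = {
    "cache.cnn.com": ("CNAME", "g.akamai.net"),
    "g.akamai.net": ("CNAME", "a73.g.akamai.net"),
    "a73.g.akamai.net": ("A", "1.2.3.4"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow aliases from name to an address; return (address, chain of names)."""
    chain = [name]
    for _ in range(max_hops):
        rtype, value = records[name]
        if rtype == "A":
            return value, chain
        name = value  # follow the alias to the next name
        chain.append(name)
    raise RuntimeError("alias chain too long")


if __name__ == "__main__":
    address, chain = resolve("cache.cnn.com")
    print(" -> ".join(chain), "->", address)
```

In the real system each hop may be answered by a different DNS server, which is where the CDN gets the chance to pick a nearby cluster.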

How Akamai Works: Cache Hit
[Figure: same setting as the DNS slides, with six steps: 1-2 (HTTP with cnn.com), 3-4 (DNS), 5-6 (end user and nearby cluster)]

Optimizations to Improve Performance and Increase Cache Hits
- Terminate the connection close to the end user
- Utilize proprietary protocols between the servers

Example

    $ dig www.audi.com

    ; <<>> DiG 9.6-ESV-R4-P3 <<>> www.audi.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47918
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 8, ADDITIONAL: 8

    ;; QUESTION SECTION:
    ;www.audi.com.                  IN  A

    ;; ANSWER SECTION:
    www.audi.com.                900   IN  CNAME  www.audi.com.edgesuite.net.
    www.audi.com.edgesuite.net.  8850  IN  CNAME  a1805.r.akamai.net.
    a1805.r.akamai.net.          20    IN  A      23.62.61.73
    a1805.r.akamai.net.          20    IN  A      23.62.61.65

    ;; AUTHORITY SECTION:
    r.akamai.net.  5386  IN  NS  n5r.akamai.net.
    r.akamai.net.  5386  IN  NS  n6r.akamai.net.
    r.akamai.net.  5386  IN  NS  n7r.akamai.net.
    r.akamai.net.  5386  IN  NS  n3r.akamai.net.
    r.akamai.net.  5386  IN  NS  n4r.akamai.net.
    r.akamai.net.  5386  IN  NS  n0r.akamai.net.
    r.akamai.net.  5386  IN  NS  n1r.akamai.net.
    r.akamai.net.  5386  IN  NS  n2r.akamai.net.

    ;; ADDITIONAL SECTION:

Redirection: the CNAME records in the ANSWER SECTION point www.audi.com to Akamai names.

Example (continued)
Using one of the A records from the dig answer for www.audi.com, try:

    curl -H "Host: www.audi.com" http://23.62.61.73/index.html

Measuring CDNs
- Utilize a number of vantage points or open DNS resolvers
- Every, e.g., 60 seconds, each vantage point queries an appropriate URL delivered by Akamai, or its CNAMEs (e.g., *.akamai.net)
- A similar technique can be used for other CDNs (e.g., Limelight)
[Figure: PL node, low-level DNS server, edge servers 1, 2, and 3]
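A single vantage point's part of this measurement can be sketched as a loop that repeatedly resolves a CDN hostname and accumulates the distinct addresses seen. The helper names and the injectable `resolve` and `wait` parameters are assumptions made so the loop can be exercised without network access; `lookup_ipv4` uses the standard `socket.gethostbyname_ex` call and whatever resolver the host is configured with.

```python
import socket


def lookup_ipv4(hostname):
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def collect_ips(hostname, rounds, resolve=lookup_ipv4, wait=None):
    """Resolve hostname `rounds` times and return every distinct address seen.

    resolve: callable hostname -> iterable of addresses (replaceable for testing)
    wait:    optional callable invoked between rounds, e.g. a 60-second sleep
    """
    seen = set()
    for i in range(rounds):
        seen |= set(resolve(hostname))
        if wait is not None and i < rounds - 1:
            wait()
    return seen
```

A run against the name from the dig example might look like `collect_ips("a1805.r.akamai.net", rounds=10, wait=lambda: time.sleep(60))` after `import time`. Merging the sets from many vantage points gives the kind of per-CDN address list the measurement aims for.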

Measuring CDNs
By utilizing a large number of vantage points or open resolvers, it is possible to collect all the IPs of the CDNs!
Example of measurement in 2009:

Akamai CDN:
Country             # of IPs
United States         16,843
United Kingdom         1,690
Japan                  1,622
Germany                1,103
Netherlands              857
France                   722
Australia                514
Canada                   438
Sweden                   396
Hong Kong SAR            370
Others                 3,018
Total                 27,573

Limelight CDN:
Country             # of IPs
United States          2,830
Germany                  314
United Kingdom           300
Netherlands              199
Japan                    126
Canada                   121
France                   120
Hong Kong SAR             83
China                     53
Australia                  1
Total                  4,147
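To compare the two footprints, the per-country counts can be turned into shares. The sketch below copies the 2009 counts from the tables and computes each country's percentage of the total; the `share` helper is an illustrative addition.

```python
# IP counts per country from the 2009 measurement tables.
AKAMAI = {
    "United States": 16843, "United Kingdom": 1690, "Japan": 1622,
    "Germany": 1103, "Netherlands": 857, "France": 722, "Australia": 514,
    "Canada": 438, "Sweden": 396, "Hong Kong SAR": 370, "Others": 3018,
}
LIMELIGHT = {
    "United States": 2830, "Germany": 314, "United Kingdom": 300,
    "Netherlands": 199, "Japan": 126, "Canada": 121, "France": 120,
    "Hong Kong SAR": 83, "China": 53, "Australia": 1,
}


def share(counts, country):
    """Percentage of all observed IPs that are located in the given country."""
    return 100.0 * counts[country] / sum(counts.values())


if __name__ == "__main__":
    for cdn_name, counts in [("Akamai", AKAMAI), ("Limelight", LIMELIGHT)]:
        print(f"{cdn_name}: {sum(counts.values())} IPs, "
              f"{share(counts, 'United States'):.1f}% in the United States")
```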

Measuring CDNs
[Figure: measurements from Berkeley and Purdue, day vs. night]

Where to Measure CDN and Web Traffic?
Option 1: IXP infrastructure
- Complex system
- Centralized monitoring
Source: DE-CIX, 2012

Internet Exchange Point (IXP)
[Figure: AS1 through AS6 connected to a layer-2 switch]

Estimation of CDN and Web Traffic
[Figure: an AS (AS1) with a link to the IXP, together with AS2, AS3, and AS4]

IXP Results
- Around 70% of the traffic is Web (large IXP, 2012)
- A relatively small number of large CDNs, hosters, and streaming providers are responsible for >50% of the traffic
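A concentration claim of this kind can be checked on any per-AS traffic breakdown by counting how many of the largest contributors are needed to reach a target share. The sketch below shows that computation; the AS labels and byte counts in the usage example are made up for illustration and are not measurement results.

```python
def ases_needed_for_share(bytes_by_as, target=0.5):
    """Count how many of the largest ASes are needed to reach the target share."""
    total = sum(bytes_by_as.values())
    running = 0
    ranked = sorted(bytes_by_as.values(), reverse=True)  # largest first
    for count, volume in enumerate(ranked, start=1):
        running += volume
        if running >= target * total:
            return count
    return len(ranked)


if __name__ == "__main__":
    # Hypothetical traffic volumes per AS, in arbitrary units.
    sample = {"AS1": 50, "AS2": 30, "AS3": 10, "AS4": 10}
    print(ases_needed_for_share(sample, target=0.5))
```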

Where to Measure CDN and Web Traffic?
Option 2: Private peering locations
- Arbor study:
  - 110+ ISPs / content providers
  - Including 3,000 edge routers and 100,000 interfaces
  - An estimated ~25% of all inter-domain traffic
- AT&T backbone study: backbone, access, and mobile network
- Deutsche Telekom study: passive measurements from 20K residential users
Source: DE-CIX, 2012

A Few Large CDNs and Datacenters Are Responsible for Most of the Web
Arbor study (2009-13): consolidation of Web traffic
% of traffic due to CDNs:
- 2009: 25%
- 2011: 35%
- 2013: 50%
AT&T study (2010)

A Few Large CDNs and Datacenters Are Responsible for Most of the Web
Deutsche Telekom study (2010)

... and the CDN traffic increases