CS640: Introduction to Computer Networks. HTTP Caching. Example Cache Check Request


CS640: Introduction to Computer Networks
Aditya Akella
Lecture 1 - Improving Web Experience: Caching and CDNs

Why caching?

HTTP Caching
  - Clients often cache documents
  - Challenge: update of documents
      - If-Modified-Since requests to check
      - HTTP 0.9/1.0 used just the date
      - HTTP 1.1 also has an opaque "entity tag" (could be a file signature, etc.)
  - When/how often should the original be checked for changes?
      - Check every time?
      - Check each session? Each day? Etc.?
      - Use the Expires header
      - If no Expires, often use Last-Modified as an estimate

Example Cache Check Request
    GET / HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
    If-None-Match: "7a11f-10ed-3a75ae4a"
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
    Host: www.intel-iris.net
    Connection: Keep-Alive
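The cache check above can be sketched from the server's side. This is a minimal illustration, not how any particular server implements it: the function name `check_conditional_get`, the plain-dict header representation, and the exact string comparison of entity tags are all assumptions made for this example. It follows the HTTP/1.1 rule that If-None-Match takes precedence over If-Modified-Since when both are present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def check_conditional_get(headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    headers       -- dict of request headers (hypothetical representation)
    current_etag  -- entity tag of the server's current copy
    last_modified -- timezone-aware datetime of the last change
    """
    # HTTP/1.1 validator: when an entity tag is sent, it decides the outcome.
    client_etag = headers.get("If-None-Match")
    if client_etag is not None:
        return 304 if client_etag == current_etag else 200

    # HTTP/1.0-style validator: compare modification dates.
    since = headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304

    # No validator, or the document changed: send the full body.
    return 200


# Usage with the entity tag from the example request.
request_headers = {"If-None-Match": '"7a11f-10ed-3a75ae4a"'}
modified = datetime(2001, 1, 29, tzinfo=timezone.utc)  # illustrative date
print(check_conditional_get(request_headers, '"7a11f-10ed-3a75ae4a"', modified))
```

A real server would also handle lists of entity tags, weak validators, and malformed dates, which this sketch leaves out.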

Example Cache Check Response
    HTTP/1.1 304 Not Modified
    Date: Tue, 27 Mar 2001 03:50:51 GMT
    Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
    Connection: Keep-Alive
    Keep-Alive: timeout=15, max=100
    ETag: "7a11f-10ed-3a75ae4a"

Web Caches
  Assumptions
  - Average object size = 100,000 bits
  - Avg. request rate from the institution's browsers to origin servers = 15/sec
  - Delay from institutional router to any origin server and back to router = 2 sec
  Consequences
  - Utilization on LAN = 15%
  - Utilization on access link = 100%
  - Total delay = Internet delay + access delay + LAN delay
                = 2 sec + minutes + milliseconds
  [Figure: institutional network (10 Mbps LAN) connected over a 1.5 Mbps access link to origin servers on the public Internet]

Web Caches
  Possible solution
  - Increase bandwidth of access link to, say, 10 Mbps
  - Often a costly upgrade
  Consequences
  - Utilization on LAN = 15%
  - Utilization on access link = 15%
  - Total delay = Internet delay + access delay + LAN delay
                = 2 sec + msecs + msecs
  [Figure: institutional network (10 Mbps LAN) connected over a 10 Mbps access link to origin servers on the public Internet]
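The utilization figures on the two Web Caches slides follow from dividing the offered traffic by the link capacity. The short sketch below reproduces that arithmetic from the stated assumptions; the function name `utilization` is made up for this example.

```python
def utilization(request_rate_per_sec, object_size_bits, link_bps):
    """Fraction of a link's capacity consumed by the offered traffic."""
    return request_rate_per_sec * object_size_bits / link_bps


REQUEST_RATE = 15        # requests per second (from the slide)
OBJECT_BITS = 100_000    # average object size in bits (from the slide)

lan = utilization(REQUEST_RATE, OBJECT_BITS, 10_000_000)         # 10 Mbps LAN
slow_access = utilization(REQUEST_RATE, OBJECT_BITS, 1_500_000)  # 1.5 Mbps link
fast_access = utilization(REQUEST_RATE, OBJECT_BITS, 10_000_000) # upgraded link

print(f"LAN: {lan:.0%}, 1.5 Mbps access: {slow_access:.0%}, "
      f"10 Mbps access: {fast_access:.0%}")
```

At 100% utilization the queue on the access link grows without bound, which is why the first slide puts the access delay at minutes rather than milliseconds.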

Web Caches
  Install cache
  - Suppose hit rate is 0.4
  Consequence
  - 40% of requests will be satisfied almost immediately (say 10 msec)
  - 60% of requests satisfied by the origin server
  - Utilization of access link reduced to 60%, resulting in negligible delays
  - Weighted average of delays = 0.6 * 2 sec + 0.4 * 10 msecs < 1.3 secs
  [Figure: institutional network (10 Mbps LAN) with an institutional cache, connected over a 1.5 Mbps access link to origin servers on the public Internet]

Web Proxy Caches
  - User configures browser: Web accesses via cache
  - Browser sends all HTTP requests to cache
      - Object in cache: cache returns object
      - Else cache requests object from origin server, then returns object to client
  [Figure: clients exchange HTTP requests and responses with the proxy server, which exchanges HTTP requests and responses with the origin server]

Problems
  - Over 50% of all HTTP objects are uncacheable. Why?
  - Not easily solvable
      - Dynamic data: stock prices, scores, web cams
      - CGI scripts: results based on passed parameters
      - SSL: encrypted data is not cacheable
          - Most web clients don't handle mixed pages well, so many generic objects are transferred with SSL
      - Cookies: results may be based on passed data
      - Hit metering: owner wants to measure # of hits for revenue, etc.
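Returning to the "Install cache" estimate earlier in this part of the lecture, the weighted-average delay and the reduced access-link utilization can be checked with a few lines. This is a sketch under the slide's own simplifications (a miss costs the 2 sec Internet delay, a hit costs 10 msec); the function names are invented for this example.

```python
def average_delay(hit_rate, hit_delay_s, miss_delay_s):
    """Expected per-request delay for a cache with the given hit rate."""
    return hit_rate * hit_delay_s + (1.0 - hit_rate) * miss_delay_s


def access_link_utilization(hit_rate, uncached_utilization):
    """Only misses cross the access link, so its load scales by the miss rate."""
    return (1.0 - hit_rate) * uncached_utilization


# Values from the slide: hit rate 0.4, 10 msec per hit, 2 sec per miss.
delay = average_delay(hit_rate=0.4, hit_delay_s=0.010, miss_delay_s=2.0)
link = access_link_utilization(hit_rate=0.4, uncached_utilization=1.0)

print(f"average delay: {delay:.3f} s")
print(f"access link utilization: {link:.0%}")
```

The result stays under the slide's bound of 1.3 secs, and the cache achieves it without the costly access-link upgrade.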

Server Selection
  - Replicate content on many servers
      - Load and latency savings
  - Challenges
      - Which content to replicate
      - How to replicate content
      - Where to place replicas
      - How to find replicated content
      - How to choose among known replicas
      - How to direct clients towards a replica

Server Selection
  - Which server?
      - Lowest load: to balance load on servers
      - Best performance: to improve client performance
          - Based on geography? RTT? Throughput? Load?
      - Any alive node: to provide fault tolerance
  - How to direct clients to a particular server?
      - As part of routing: anycast, cluster load balancing (not covered today)
      - As part of application: HTTP redirect
      - As part of naming: DNS

Application-Based Redirection
  - HTTP supports a simple way to indicate that a Web page has moved (30X responses)
  - Server receives GET request from client
      - Decides which server is best suited for the particular client and object
      - Returns HTTP redirect to that server
  - Can make an informed, application-specific decision
  - May introduce additional overhead: multiple connection setups, name lookups, etc.
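Application-based redirection as described above can be sketched as a front-end server that picks a replica and answers with a 30X response. The selection policy here (lowest load among alive replicas) is just one of the options the slides list, and the host names, the dictionary layout, and the function names are hypothetical.

```python
def pick_replica(replicas):
    """Choose the least-loaded replica that is alive, or None if none is."""
    alive = [r for r in replicas if r["alive"]]
    if not alive:
        return None
    return min(alive, key=lambda r: r["load"])


def redirect_response(replicas, path):
    """Build the raw HTTP response that sends the client to a replica."""
    replica = pick_replica(replicas)
    if replica is None:
        return "HTTP/1.1 503 Service Unavailable\r\n\r\n"
    location = f"http://{replica['host']}{path}"
    return f"HTTP/1.1 302 Found\r\nLocation: {location}\r\n\r\n"


# Hypothetical replica set; example.com names are placeholders.
REPLICAS = [
    {"host": "replica1.example.com", "load": 0.9, "alive": True},
    {"host": "replica2.example.com", "load": 0.2, "alive": True},
    {"host": "replica3.example.com", "load": 0.1, "alive": False},
]

print(redirect_response(REPLICAS, "/index.html"))
```

The client must then open a second connection to the host in the Location header, which is the extra overhead the slide mentions.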

Naming Based
  - Client does name lookup for service
  - Name server chooses appropriate server address
      - A-record returned is the "best" one for the client
  - What information can the name server base its decision on?
      - Server load/location: must be collected
      - Information in the name lookup request
          - Name service client: typically the local name server for the client

Content Distribution Networks (CDNs)
  - The content providers are the CDN customers
  - Content replication
      - CDN company installs hundreds of CDN servers throughout the Internet
          - Close to users
      - CDN replicates its customers' content on CDN servers in an on-demand fashion
  - Example: Akamai

How Akamai Works
  - Clients fetch html document from primary server
      - E.g. fetch index.html from cnn.com
  - URLs for replicated content are replaced in the html with "Akamaized" URLs
      - E.g. <img src="http://cnn.com/af/x.gif"> replaced with
        <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
  - Client is forced to resolve aXYZ.g.akamaitech.net hostname
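The URL replacement in the Akamai example can be illustrated with a small rewriting function. This is only a sketch of the string transformation shown on the slide, not Akamai's actual tooling: the function name `akamaize` is invented, and the edge host `a73.g.akamaitech.net` and the `/7/23` prefix are copied from the slide's example and treated as opaque values.

```python
from urllib.parse import urlsplit


def akamaize(url, edge_host="a73.g.akamaitech.net", prefix="/7/23"):
    """Rewrite an origin URL so the client fetches it from a CDN host.

    The original host and path are kept inside the new path, so the CDN
    server can tell which provider and file the request refers to.
    """
    parts = urlsplit(url)
    return f"{parts.scheme}://{edge_host}{prefix}/{parts.netloc}{parts.path}"


print(akamaize("http://cnn.com/af/x.gif"))
# prints http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif
```

Because the rewritten URL names a host under g.akamaitech.net, the client's next step is a DNS lookup that the CDN's name servers control.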

How Akamai Works
  - Only static content is Akamaized
  - Modified name contains original file name and content provider ID
  - Akamai server is asked for content
      - First checks local cache
      - If not in cache, requests file from primary server; caches file

Overview of How Akamai Works
  [Figure: numbered message flow among the end user, cnn.com (content provider), the DNS root server, the Akamai high-level DNS server, the Akamai low-level DNS server, and a nearby matching Akamai server. Labeled requests: Get index.html, Get foo.jpg, Get /cnn.com/foo.jpg]

Akamai Subsequent Requests
  [Figure: the same participants with fewer numbered steps. Labeled requests: Get index.html, Get /cnn.com/foo.jpg]

Recap: How Akamai Works
  - Root server gives NS record for akamai.net
  - akamai.net name server returns NS record for g.akamaitech.net
      - Name server chosen to be in region of client's name server
      - Out-of-band measurements to obtain this
  - g.akamaitech.net name server chooses server in region
      - A collection of servers in each region
      - Which server to choose?
      - Uses aXYZ name

Simple Hashing
  - Given document XYZ, we need to choose a server to use
  - Suppose we use the mod function
      - Number servers from 1...n
      - Place document XYZ on server (XYZ mod n)
  - What happens when a server fails? n -> n-1
  - Why might this be bad?

Consistent Hash
  - Desired features
      - Balanced: load is equal across buckets
      - Smoothness: little impact on hash bucket contents when buckets are added/removed
      - Spread: small set of hash buckets that may hold a set of objects
      - Load: # of objects assigned to hash bucket is small
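The question on the Simple Hashing slide ("Why might this be bad?") can be explored numerically. The sketch below counts how many documents change servers when one of n servers fails under the mod scheme. It assumes integer document IDs and numbers servers from 0 rather than 1 so that `doc_id % n` can be used directly; the function name and the choice of 1000 documents and 10 servers are illustrative.

```python
def moved_documents(num_docs, n_before, n_after):
    """Count documents whose server changes when the server count changes."""
    moved = 0
    for doc_id in range(num_docs):
        if doc_id % n_before != doc_id % n_after:
            moved += 1
    return moved


NUM_DOCS = 1000
moved = moved_documents(NUM_DOCS, n_before=10, n_after=9)
print(f"{moved} of {NUM_DOCS} documents move ({moved / NUM_DOCS:.0%})")
```

Nearly every document is remapped even though only one server in ten was lost, so most cached copies become useless at once. This is the lack of smoothness that consistent hashing is meant to fix.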

Consistent Hash Example
  - Construction
      - Assign each of C hash buckets to random points on a mod 2^n circle, where hash key size = n
      - Map object to random position on circle
      - Hash of object = closest clockwise bucket
  [Figure: hash circle with numbered positions and a bucket]
  - Smoothness: addition of a bucket does not cause movement between existing buckets
  - Spread & Load: small set of buckets that lie near object
  - Balance: no bucket is responsible for a large number of objects

Taking Server Load into Account
    SelectServer(URL, S)    // S: set of servers
        for each server s_i in the set S
            weight_i = hash(URL) - hash(address(s_i))
            // the difference is not the arithmetic difference;
            // instead, it is the difference on a circle
        sort weight_i
        for each s_i in increasing order of weight_i
            if load(s_i) < threshold
                return s_i
        else return server with highest weight

Proximity
  - How to select servers closest to client?
      - Same ISP as client?
      - Same AS as client?
          - Use BGP to identify network-aware clusters
      - Localize client location by using ping triangulation
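The consistent-hash construction and the load-aware SelectServer pseudocode above can be combined into a small runnable sketch. Several choices here are assumptions not found in the slides: SHA-256 truncated to 32 bits stands in for the hash function, the ring size is 2^32, server "addresses" are plain strings, and loads are supplied as a dictionary. The fallback when every server is overloaded mirrors the pseudocode's last line.

```python
import hashlib

RING_BITS = 32              # assumed hash key size n
RING_SIZE = 1 << RING_BITS  # positions on the mod 2^n circle


def ring_hash(key):
    """Map a string to a position on the circle (assumed hash function)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def clockwise_distance(start, end):
    """Distance on the circle, not the arithmetic difference."""
    return (end - start) % RING_SIZE


def select_server(url, servers, load, threshold):
    """Return the closest clockwise server whose load is below threshold."""
    point = ring_hash(url)
    # Sort servers by increasing weight, i.e. clockwise distance from the URL.
    ranked = sorted(servers,
                    key=lambda s: clockwise_distance(point, ring_hash(s)))
    for server in ranked:
        if load[server] < threshold:
            return server
    # Every server is overloaded: follow the pseudocode's fallback.
    return ranked[-1]


# Hypothetical server addresses and loads.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
load = {s: 0.0 for s in servers}
first = select_server("/cnn.com/foo.jpg", servers, load, threshold=0.8)
print("chosen server:", first)
```

If the chosen server becomes overloaded, the same URL moves only to the next server clockwise, and removing an unrelated server leaves the choice unchanged, which is the smoothness property listed on the slide.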