Content Delivery Infrastructure. HTTP Overview. A Web Page. Computer Networks. Lecture 9: HTTP


Content Delivery Infrastructure

Computer Networks
Lecture 9: HTTP

Peer-to-peer (p2p):
  - hybrid p2p with a centralized server
  - pure p2p
  - hierarchical p2p
  - end-host (p2p) multicast
Content Distribution Network (CDN):
  - HTTP Overview
  - HTTP Performance
  - HTTP Caching
  - Content Distribution Network

A Web Page

A web page consists of a base HTML file, which may include references to one or more objects:
  - an object can be an HTML file, a JPEG image, a Java applet, an audio file, a Flash video, etc.
  - each object is addressable by a URL
  - example URL: http://www.mgoblue.com/images/pic.gif
      protocol: http
      hostname: www.mgoblue.com
      pathname: /images/pic.gif

HTTP Overview

HTTP: HyperText Transfer Protocol
  - the Web's application-layer protocol
  - client/server model using TCP
  - client: browser that requests, receives, and "displays" Web objects
  - server: sends objects in response to requests
  - client connects to server at port 80
  - HTTP messages are exchanged between browser and Web server

Two versions of HTTP:
  - HTTP 1.0: RFC 1945
  - HTTP 1.1: RFC 2068

[Figure: a PC running Firefox and a Mac running Safari exchange HTTP messages with a server running the Apache Web server]
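The three URL parts named in the example (protocol, hostname, pathname) can be pulled apart with Python's standard library. This is only an illustrative sketch: the variable names are chosen here, and falling back to port 80 simply restates the slide's note that clients connect to the server at port 80.

```python
from urllib.parse import urlsplit

url = "http://www.mgoblue.com/images/pic.gif"
parts = urlsplit(url)

protocol = parts.scheme      # the part before "://"
hostname = parts.hostname    # the server to contact
pathname = parts.path        # the object requested from that server
port = parts.port or 80      # no explicit port in this URL, so use HTTP's default

print(protocol, hostname, pathname, port)
```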

HTTP Request Message

Two types of HTTP messages: request, response

HTTP request message: in ASCII (human-readable format)

general format, by example:

    GET /somedir/page.html HTTP/1.1
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language: fr
    (extra carriage return, line feed)

A carriage return, line feed on a line by itself indicates the end of the message.

Method Types (HTTP 1.1)
  - GET, POST, HEAD
  - PUT: uploads file in entity body to the path specified in the URL field
  - DELETE: deletes the file specified in the URL field

Uploading form input, two alternatives:
  1. POST method:
     - web pages often include form input
     - input is uploaded to the server in the entity body
  2. as a parameter to the URL of a GET method:
     - input is uploaded in the URL field of the request line:
       www.somesite.com/animalsearch?monkeys&banana
       (everything after the "?" is the input parameters)

HTTP Response Message: Example

    HTTP/1.1 200 OK                          (status line)
    Connection: close                        (header lines)
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998...
    Content-Length: 6821
    Content-Type: text/html
                                             (blank line)
    data data data data data...              (data, e.g., requested HTML file)

HTTP Response: Status Line

first line: status line (protocol, status code, status phrase)

    HTTP-version  3-digit-response-code  Reason-phrase

  - 1XX informational
  - 2XX success
      200 OK: request succeeded, requested object later in this message
  - 3XX redirection
      301 Moved Permanently: requested object moved, new location specified later in this message ("Location:" in header)
      302 Found (called Moved Temporarily in HTTP 1.0)
      304 Not Modified
  - 4XX client error
      400 Bad Request: request message not understood by server
      404 Not Found: requested document not found on this server
  - 5XX server error
      505 HTTP Version Not Supported
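To make the message layout concrete, here is a minimal sketch that serializes a request like the example and classifies a status line by its first digit. The helper names `build_request` and `parse_status_line` are made up for this illustration; real clients would normally use an HTTP library rather than format messages by hand.

```python
CRLF = "\r\n"

def build_request(method, path, headers):
    """Serialize a request line plus header lines, ending with a blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The extra CRLF produces the empty line that marks the end of the headers.
    return CRLF.join(lines) + CRLF + CRLF

def parse_status_line(line):
    """Split 'HTTP-version code reason-phrase' and name the class of the code."""
    version, code, reason = line.rstrip(CRLF).split(" ", 2)
    classes = {"1": "informational", "2": "success", "3": "redirection",
               "4": "client error", "5": "server error"}
    return version, int(code), reason, classes[code[0]]

request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
status = parse_status_line("HTTP/1.1 404 Not Found\r\n")

print(request)
print(status)
```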

Client-side States: Cookies

HTTP is "stateless":
  - server maintains no information about past client requests
  - but sometimes it may be useful to keep per-client states, for example for:
      - authorization
      - shopping carts
      - wish list
      - recommendations
      - user session state (Web e-mail)

aside: protocols that maintain "state" are complex!
  - past history (state) must be maintained
  - if server/client crashes, their views of "state" may be inconsistent, and must be reconciled

States or user ID are kept at the client side using cookies.

User-side Server State: Cookies

Four components:
  1. cookie header line in the HTTP response message
  2. cookie header line in the HTTP request message
  3. cookie file kept on the client host and managed by the client browser
  4. back-end database at the Web server

[Figure: message exchange between a client and the Amazon server]
  - client's cookie "file" initially holds: ebay:8734
  - client sends usual HTTP request msg; server creates ID 1678 for the user
  - server replies with usual HTTP response + Set-cookie: 1678; client's cookie "file" now holds: amazon:1678, ebay:8734
  - client sends usual HTTP request msg with cookie: 1678; server takes a cookie-specific action (e.g., wish list) and sends usual HTTP response msg
  - one week later (cookie "file" still holds amazon:1678, ebay:8734): client sends usual HTTP request msg with cookie: 1678; server takes a cookie-specific action and sends usual HTTP response msg

Client-side States: Cookies

Excellent marketing opportunities and concerns for privacy:
  - cookies permit sites to learn a lot about you
  - you may unknowingly supply personal info to sites
  - search engines use redirection and cookies to learn even more about your preferences
  - advertising companies track your preferences and viewing history across sites
      - an ad company is contracted with a social networking site, a bookstore, and a clothing store
      - you view your friend's travel photos of Hawaii at the social networking site
      - when you visit the bookstore, a travel book about Hawaii is pushed to you
      - when you visit the clothing store, swimming goggles are pushed to you
      - at all three places, a travel agency's extra-low-price Hawaii vacation package, expiring in 30 seconds, is pushed to you

Object Request Response Time

RTT (round-trip time): time for a small packet to travel from client to server and back

Response time:
  - 1 RTT to initiate the TCP connection
  - 1 RTT for the HTTP request and the first few bytes of the HTTP response to return
  - file transmission time
  - total = 2 RTT + transmit time

[Figure: client/server timeline: initiate TCP connection (RTT), request file (RTT), time to transmit file, file received]
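The cookie exchange described earlier in this section (the server creates an ID, the browser stores it in its cookie file and returns it on later requests) can be sketched as a small simulation. The `Server` and `Browser` classes and their methods are hypothetical stand-ins that model only the bookkeeping, not real HTTP traffic; the header is reduced to a bare ID as on the slide.

```python
import itertools

class Server:
    """Toy server: issues an ID on first contact and keeps per-ID state."""
    def __init__(self, first_id=1678):
        self._next_id = itertools.count(first_id)
        self.database = {}                      # back-end database: ID -> visited paths

    def handle(self, path, cookie=None):
        headers = {}
        if cookie is None or cookie not in self.database:
            cookie = str(next(self._next_id))   # server creates an ID for this user
            self.database[cookie] = []
            headers["Set-cookie"] = cookie      # sent only when a new ID is created
        self.database[cookie].append(path)      # the "cookie-specific action"
        return headers

class Browser:
    """Toy browser: keeps a cookie file keyed by site and replays cookies."""
    def __init__(self):
        self.cookie_file = {}                   # site name -> cookie value

    def request(self, site, server, path):
        reply = server.handle(path, cookie=self.cookie_file.get(site))
        if "Set-cookie" in reply:
            self.cookie_file[site] = reply["Set-cookie"]
        return reply

amazon = Server()
browser = Browser()
browser.cookie_file["ebay"] = "8734"            # left over from an earlier visit to eBay

first = browser.request("amazon", amazon, "/")            # no cookie yet
second = browser.request("amazon", amazon, "/wishlist")   # cookie 1678 is sent back
print(first, second, browser.cookie_file)
```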

HTTP 1.0

HTTP 1.0 uses non-persistent connections:
  - at most one object is sent over a TCP connection
  - object transmission completion is detected by recv() returning 0 (connection closed)
  - why is this not a good design?

HTTP 1.1

HTTP 1.1 uses persistent connections:
  - server leaves the connection open after sending responses
  - subsequent HTTP messages between the same client/server to fetch multiple objects are sent over the same connection

[Figure: client/server timelines for fetching an HTML page and one image]
  - HTTP 1.0: client opens TCP connection (SYN) at 0 RTT; client sends HTTP request for HTML at 1 RTT; server reads from disk; at 2 RTT the client parses the HTML, the connection is closed (FIN), and the client opens a new TCP connection (SYN); client sends HTTP request for image at 3 RTT; server reads from disk; image begins to arrive at 4 RTT
  - HTTP 1.1 (connection already open): client sends HTTP request for HTML at 0 RTT; server reads from disk; at 1 RTT the client parses the HTML and sends HTTP request for image; server reads from disk; image begins to arrive at 2 RTT

How to Mark End of Message?
  - Content-Length in header
  - implied length, e.g., 304 (cache fresh) never has content
  - Transfer-Encoding: chunked (HTTP 1.1)
      - after the headers, each chunk comprises the content length in hex, CRLF, then the chunk body and CRLF; a chunk of length 0 indicates the end of the message

    HTTP/1.1 200 OK<CRLF>
    Transfer-Encoding: chunked<CRLF>
    <CRLF>
    23<CRLF>
    This is the data in the first chunk<CRLF>
    1A<CRLF>
    and this is the second one<CRLF>
    0<CRLF>
    <CRLF>

Pipelined and Parallel Connections

Persistent without pipelining:
  - client issues a new request only when the previous response has been received
  - one RTT for each referenced object

Persistent with pipelining:
  - client sends requests as soon as it encounters a referenced object
  - as little as one RTT for all referenced objects
  - supported in HTTP 1.1, where persistent connections are the default

Browsers can open parallel TCP connections to fetch referenced objects (even in HTTP 1.0).
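The chunked framing can be sketched as a small encoder and decoder. This is an illustrative sketch only: the function names are invented here, chunk sizes are computed from the byte length of each chunk's data, and trailers and chunk extensions (which the real protocol allows) are ignored apart from stripping anything after a ";" on the size line.

```python
def encode_chunked(chunks):
    """Frame each non-empty chunk as hex length, CRLF, data, CRLF; end with a 0-length chunk."""
    out = b""
    for chunk in chunks:
        if chunk:  # an empty chunk would be mistaken for the terminator
            out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"

def decode_chunked(body):
    """Reassemble the payload from a chunked message body (no trailer support)."""
    data, pos = b"", 0
    while True:
        eol = body.index(b"\r\n", pos)                    # end of the size line
        size_field = body[pos:eol].split(b";")[0]         # drop any chunk extension
        size = int(size_field.decode("ascii"), 16)
        if size == 0:                                     # last chunk: payload complete
            return data
        start = eol + 2
        data += body[start:start + size]
        pos = start + size + 2                            # skip data and its trailing CRLF

first = b"This is the data in the first chunk"
second = b"and this is the second one"
wire = encode_chunked([first, second])
print(wire)
print(decode_chunked(wire))
```

Because the receiver learns each chunk's length before reading it, the server can start sending a response whose total size it does not yet know while still keeping the connection open afterwards.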

HTTP Modeling

Assume a Web page consists of:
  - 1 base HTML page (of size L bits)
  - M images (each also of size L bits)

Non-persistent HTTP:
  - M+1 TCP connections in series
  - response time = (M+1)*2*RTT + (M+1)*L/µ, where µ is the path speed

Persistent HTTP (with pipelining):
  - 2 RTTs to request and receive the base HTML file
  - 1 RTT to request and receive the M images
  - response time = 3*RTT + (M+1)*L/µ

HTTP Modeling

Assume a Web page consists of:
  - 1 base HTML page (of size L bits)
  - M images (each also of size L bits)

Non-persistent HTTP with n parallel connections:
  - suppose n divides M evenly
  - 1 TCP connection for the base file
  - M/n rounds of n parallel connections for the images
  - n-parallel response time = (M/n + 1)*2*RTT + (M/n + 1)*L/µ

compare:
  - non-persistent response time = (M+1)*2*RTT + (M+1)*L/µ
  - persistent response time = 3*RTT + (M+1)*L/µ

[Figure: HTTP response time (in seconds) for RTT = 100 msec, L = 5 Kbytes, M = 10, and n = 5]

For low bandwidth, transmission time dominates over connection and response time:
  - performance of persistent connections is comparable to that of parallel connections

[Figure: HTTP response time (in seconds) for RTT = 1 sec, L = 5 Kbytes, M = 10, and n = 5]

For larger RTT, TCP establishment and slow start delays dominate over response time:
  - persistent connections now give a significant improvement, particularly in high bandwidth-delay networks
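The three response-time formulas are easy to evaluate side by side. The sketch below uses the slide parameters RTT = 100 msec, L = 5 Kbytes, M = 10, n = 5; the path speeds in the loop are assumptions chosen only for illustration, and 1 Kbyte is taken as 1000 bytes.

```python
def non_persistent(M, L, rtt, mu):
    """M+1 connections in series: 2 RTTs plus one transmission each."""
    return (M + 1) * 2 * rtt + (M + 1) * L / mu

def persistent_pipelined(M, L, rtt, mu):
    """2 RTTs for the base file, 1 RTT for all images, plus all transmissions."""
    return 3 * rtt + (M + 1) * L / mu

def parallel_non_persistent(M, L, rtt, mu, n):
    """1 connection for the base file, then M/n rounds of n parallel connections."""
    rounds = M / n + 1
    return rounds * 2 * rtt + rounds * L / mu

RTT = 0.1            # seconds
L = 5 * 1000 * 8     # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
M, n = 10, 5

for mu in (28_000, 1_000_000, 10_000_000):   # assumed path speeds in bits/sec
    print(f"mu = {mu:>10} bps:",
          f"non-persistent {non_persistent(M, L, RTT, mu):7.2f} s,",
          f"persistent {persistent_pipelined(M, L, RTT, mu):7.2f} s,",
          f"{n}-parallel {parallel_non_persistent(M, L, RTT, mu, n):7.2f} s")
```

Note that the n-parallel formula, as given, charges only M/n + 1 transmission times, so it implicitly treats each parallel connection as getting the full path speed.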

HTTP 2.0

Based on Google's SPDY (2009)
  - RFC to come out "any day now" (written by the two authors of SPDY)
  - Chrome browser already has SPDY built in

Limitations of HTTP:
  - pipelining still suffers from head-of-line blocking (if the first item is large, the rest have to wait)
  - parallel streams solve HOL blocking, but on a bandwidth-limited channel, too many streams clog up the channel

HTTP 2.0

Some changes from 1.1:
  - headers no longer in text format
  - separate control and data headers
  - stream multiplexing over a single TCP connection:
      - each stream has an ID, data is tagged with the stream ID
      - each stream can also have a different priority
  - server push: don't have to wait for the client to parse the page before initiating download
  - header compression

Performance improvement: up to 64% reduction in page load time [Grigorik]

Variable Delay

Sources of variability of delay, by stage:
  - browser cache: browser cache hit/miss, need for cache revalidation
  - DNS resolution: DNS cache hit/miss, multiple DNS servers, errors
  - TCP open: TCP handshake, packet loss, high RTT, server accept queue
  - 1st byte of response: RTT, busy server, CPU overhead (e.g., CGI script)
  - last byte of response: response size, receive buffer size, congestion

Web Caches (Proxy Server)

Goal: satisfy client request without involving the origin server
  - user sets browser to direct all Web accesses via the cache
  - browser sends all HTTP requests to the cache
      - if the object is not cached, the cache requests the object from the origin server, then returns the object to the client
      - else the cache returns the object
  - cache acts as both client and server
  - typically the cache is installed by an ISP (university, company, residential ISP)
  - must be transparent, allow for plug-n-play

[Figure: clients send requests to a proxy server, which sits between them and the origin server]
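The proxy cache's decision (return the cached object on a hit, otherwise fetch from the origin, store, and return) can be sketched in a few lines. `ProxyCache` and `origin_fetch` are hypothetical names; the sketch deliberately ignores expiry, revalidation, and cacheability rules.

```python
class ProxyCache:
    """Toy Web cache: a client toward the origin, a server toward browsers."""
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object bytes
        self._store = {}                  # url -> cached object
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url not in self._store:
            # Miss: act as a client and request the object from the origin server.
            self.misses += 1
            self._store[url] = self._fetch(url)
        else:
            # Hit: the origin server is not involved at all.
            self.hits += 1
        return self._store[url]

origin_requests = []

def origin_fetch(url):
    """Stand-in for the origin server; records every request that reaches it."""
    origin_requests.append(url)
    return f"<object at {url}>".encode("ascii")

cache = ProxyCache(origin_fetch)
for url in ["/a.html", "/b.gif", "/a.html", "/a.html"]:
    cache.get(url)

print(cache.hits, cache.misses, origin_requests)
```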

Web Caching Example: No Caching

Parameters:
  - average object size = 100,000 bits
  - avg. # of requests to servers = 15/sec
  - Internet latency between a router on the public Internet and any server = 2 secs
  - access link: 1.5 Mbps; LAN: 10 Mbps

Resulting performance:
  - utilization on LAN = 15%
  - utilization on access link = 100%!
  - over-utilized link causes a long queue (delay of minutes)
  - total delay = Internet delay + access delay + LAN delay
                = 2 secs + minutes + milliseconds

[Figure: institutional network with a 10 Mbps LAN, connected by a 1.5 Mbps access link to the public Internet and the origin servers]

Web Caching Example: No Caching

Possible solution:
  - increase access link bandwidth to, say, 10 Mbps (often a costly upgrade)

Performance:
  - utilization on LAN = 15%
  - utilization on access link = 15%
  - total delay = Internet delay + access delay + LAN delay
                = 2 secs + msecs + msecs

[Figure: same network with a 10 Mbps access link]

Web Caching Example: With Caching

Another solution: install a cache
  - assume a hit rate of 0.4

Performance:
  - 40% of requests will be satisfied almost immediately
  - 60% of requests are satisfied by the origin server
  - utilization of the access link is reduced to 60%, resulting in negligible delays (say 10 msecs)
  - avg. total delay = Internet delay + access delay + LAN delay
                     = .6*(2.01) secs + msecs < 1.4 secs

[Figure: same network with the 1.5 Mbps access link and a cache attached to the 10 Mbps LAN]

Conditional GET

Goal: don't send the object if the cache has an up-to-date version
  - cache: specifies the date of the cached copy in the HTTP request
      If-modified-since: <date>
  - server: response contains no object if the cached copy is up to date:
      HTTP/1.0 304 Not Modified
  - may be used with or without a TTL; a TTL is hard to set, as it depends on site content

[Figure: two cache/server exchanges]
  - object not modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 304 Not Modified
  - object modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 200 OK <data>
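The utilization and delay figures in the caching example follow from a few lines of arithmetic. In this sketch the 10 msec access delay is the "say 10 msecs" value from the example, cache hits are treated as taking roughly zero time, and the function and constant names are chosen here for illustration.

```python
OBJECT_BITS = 100_000        # average object size
REQUESTS_PER_SEC = 15        # average request rate to origin servers
INTERNET_DELAY = 2.0         # seconds
ACCESS_DELAY_CACHED = 0.010  # seconds, once the access link is no longer saturated

def utilization(link_bps, miss_rate=1.0):
    """Fraction of a link's capacity used by the traffic that crosses it."""
    return miss_rate * REQUESTS_PER_SEC * OBJECT_BITS / link_bps

lan = utilization(10e6)                                # all requests cross the LAN
access_no_cache = utilization(1.5e6)                   # saturated: long queues
access_upgraded = utilization(10e6)                    # after the costly upgrade
access_with_cache = utilization(1.5e6, miss_rate=0.6)  # hit rate 0.4

# Only the 60% of requests that miss pay the Internet and access delays.
avg_delay_with_cache = 0.6 * (INTERNET_DELAY + ACCESS_DELAY_CACHED)

print(f"LAN {lan:.0%}, access {access_no_cache:.0%}, "
      f"upgraded {access_upgraded:.0%}, with cache {access_with_cache:.0%}")
print(f"average delay with cache: {avg_delay_with_cache:.3f} s")
```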

Cooperative Caching

Multiple caches may form a distributed cache:
  - instead of going directly to the origin server, a cache may query one or more other caches for the object first, e.g., the cse cache queries the ece cache first
  - to eliminate frequent inter-cache query-reply, each cache may push an index of its contents to the other caches, i.e., the ece cache tells the cse cache all the objects it is holding
  - frequently, this "index" is in the form of a Bloom filter

[Figure: cse and ece networks, each with its own cache (cse cache, ece cache), connected through the public Internet to the origin servers]

Bloom Filter

An efficient, lossy way of describing a set, comprising:
  - a bit vector of length w
  - a family of independent hash functions
      - each maps an element of the set to an integer in [0, w)

To insert an element:
  - for each hash function, set the bit the element hashes to

To search for an element:
  - for each hash function, examine the bit the element hashes to
  - if any bit is not set, the element is definitely not in the set
  - if all the bits are set, the element may be in the set (potential for false positive)

[Figure: bit vector showing an insert and two searches]

Bloom Filter

The false positive rate is a well-defined (non-linear) function of:
  1. w,
  2. the number of hash functions, and
  3. the number of elements in the set

  - wider filters are always more accurate
  - the optimal tradeoff between filter storage and accuracy is when about half of the bits are set

Bloom filters are also useful in maintaining a p2p supernode backbone and distributed storage in a datacenter network.

Limitations of Web Caching

A significant fraction (>50%) of HTTP objects are not cacheable. Why not?
  - dynamic data: stock prices, scores, web cams
  - scripts: results based on passed parameters
  - use of cookies: results may be based on passed data
  - advertising/analytics: owner wants to measure # hits
      - random strings in content ensure unique counting
  - HTTPS: encrypted data is not cacheable
  - multimedia: object larger than the cache or not allowed to be cached due to intellectual property rights

How to ensure scalability of a Web server when content is not cacheable?
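A minimal Bloom filter along the lines described (a bit vector of length w plus several hash functions mapping into [0, w)) might look like the sketch below. Using salted SHA-256 digests as the "family of independent hash functions" is an assumption made for this illustration, and the class and method names are invented; production filters typically use faster non-cryptographic hashes and a packed bit array.

```python
import hashlib

class BloomFilter:
    def __init__(self, w, num_hashes):
        self.w = w                        # length of the bit vector
        self.num_hashes = num_hashes      # size of the hash-function family
        self.bits = [False] * w

    def _positions(self, item):
        """Yield one index in [0, w) per hash function (SHA-256 salted with i)."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.w

    def insert(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        """False means definitely absent; True means present or a false positive."""
        return all(self.bits[pos] for pos in self._positions(item))

# Hypothetical use: the ece cache summarizes the URLs it holds for the cse cache.
ece_index = BloomFilter(w=1024, num_hashes=3)
held = ["http://home.ex/index.html", "http://home.ex/logo.gif"]
for url in held:
    ece_index.insert(url)

print([ece_index.might_contain(url) for url in held])
```

A cache receiving this index would query its peer only when `might_contain` returns True, accepting that an occasional false positive costs one wasted inter-cache query.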

Content Distribution Networks (CDNs)

Streaming large files (e.g., video) from a single origin server in real time requires a large amount of bandwidth.

Solution: replicate content to hundreds of servers throughout the Internet
  - maintaining your own network of such servers is expensive (both capex and opex)
  - CDN providers maintain a network of servers and sell content replication service to multiple content owners
      - examples of content owners: ABC, HBO, Netflix
      - examples of CDN providers: Akamai, Limelight
  - Akamai has ~25K servers spread over ~1K clusters world-wide
  - place servers in edge/access networks
  - content is pre-downloaded to the servers
  - when a user downloads content, direct the user to the server closest to it
  - placing content close to the user avoids the network delay and loss of long paths

[Figure: origin server in N. America pushing content through a CDN distribution node to CDN servers in S. America, Asia, and Europe]

CDNs vs. Content Owners

  - CDN replicates owners' content in CDN servers
  - when an owner updates content, the CDN updates its servers
  - some large content owners operate their own CDNs: Amazon, Google/YouTube, Netflix (virtual)

Sample Web Page (Example Only)

home.ex/index.html references:
  - home.ex/logo.gif
  - cdi.ex/stadium.mp4
  - adserver.ex/shirtad.gif
  - cdi.ex/tvlogo.mp4

Sample Delivery (Example Only)

[Figure [Frank13]: index.html and logo.gif are served by the content owner's server, shirtad.gif by the ad server, and stadium.mp4 and tvlogo.mp4 by CDN servers (www1.cdi.ex, www2.cdi.ex, www3.cdi.ex)]

Why don't we store index.html and shirtad.gif at the CDN also?

Content Distribution Network

  - CDN nodes create an application-layer overlay network
  - larger CDNs may have their own WANs, e.g., Google's B4, that interconnect with the rest of the Internet like any other ISP's network
  - CDN directs a request to the server closest to the client (how?)

[Figure, after Walrand: Internet hierarchy of Tier-1 backbones, IXPs, ISPs, access aggregators, and CDNs (e.g., Akamai, Amazon, Google)]

Client Redirection

Two issues:
  - How to direct clients to a particular server?
  - How to choose which server to direct a client to?

Client Redirection

How to direct clients to a particular server?
  - As part of the application: HTTP redirect
      - pros: application-level, fine-grained control
      - cons: additional load and RTTs, hard to cache
  - As part of naming: DNS
      - pros: well-suited to caching, reduces RTTs
      - cons: relies on proxies and estimations, not accurate

DNS-based Redirection

Clients are directed to the closest server as part of the DNS name resolution process:
  1. the client asks its local DNS resolver to resolve the CDN server's name
  2. the local DNS resolver is directed to the CDN's authoritative name server by DNS
  3. the CDN's name server either returns the address of the server closest to the DNS resolver or an ordered list of addresses, ranked by distance to the local DNS resolver

Pros and cons of each?

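CDN Example

[Figure: numbered steps 1 to 5 among the client, the origin server, the client's local name server, the CDN's authoritative DNS server, and a nearby CDN server]
  - HTTP request for home.ex/index.html to the origin server; the page contains cdi.ex/stadium.mp4
  - DNS query for cdi.ex from the client to the client's local name server
  - DNS query for cdi.ex from the local name server to the CDN's authoritative DNS server
  - HTTP request for cdi.ex/stadium.mp4 to the nearby CDN server

Server Selection

How to choose which server to direct a client to?
  - server load
  - client-server distance
      - CDN maintains a "map", estimating distances between leaf ISPs and CDN nodes
      - CDN's name server uses the "map" to determine the server closest to the local DNS resolver
      - DNS resolver is used as a proxy for the client: inaccurate location
          - CDN doesn't know the client's address at name resolution time
      - distance can be measured using different metrics, e.g., latency, loss rate: only estimated
  - delivery cost (ISP pricing)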
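The map-based selection could be sketched as follows: given the leaf ISP of the local DNS resolver, the CDN's name server ranks candidate servers by estimated distance and skips overloaded ones. Everything specific here is hypothetical: the ISP names, the distances, the load values, the 0.9 load threshold, and the function `rank_servers` are invented for illustration, and how a real CDN combines load, distance, and cost is not stated in the source.

```python
def rank_servers(resolver_isp, distance_map, load, max_load=0.9):
    """Return CDN servers ordered by estimated distance to the resolver's ISP.

    distance_map: {(leaf_isp, server): estimated distance, e.g. latency in msec}
    load:         {server: current load as a fraction of capacity}
    """
    candidates = [
        (distance, server)
        for (isp, server), distance in distance_map.items()
        if isp == resolver_isp and load.get(server, 0.0) < max_load
    ]
    return [server for distance, server in sorted(candidates)]

# Hypothetical "map": estimated latency from two leaf ISPs to three CDN nodes.
distance_map = {
    ("isp-a", "www1.cdi.ex"): 12.0,
    ("isp-a", "www2.cdi.ex"): 45.0,
    ("isp-a", "www3.cdi.ex"): 80.0,
    ("isp-b", "www1.cdi.ex"): 70.0,
    ("isp-b", "www2.cdi.ex"): 20.0,
    ("isp-b", "www3.cdi.ex"): 15.0,
}
load = {"www1.cdi.ex": 0.95, "www2.cdi.ex": 0.40, "www3.cdi.ex": 0.10}

# The name server only sees the resolver's location, not the client's.
answer_for_isp_a = rank_servers("isp-a", distance_map, load)
answer_for_isp_b = rank_servers("isp-b", distance_map, load)
print(answer_for_isp_a, answer_for_isp_b)
```

The ordered list corresponds to the option of returning several addresses ranked by distance to the local DNS resolver; returning only the first entry corresponds to the single-address option.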