Computer Networks
Lecture 9: HTTP

Content Delivery Infrastructure
- Peer-to-peer (p2p):
  - hybrid p2p with a centralized server
  - pure p2p
  - hierarchical p2p
  - end-host (p2p) multicast
- Content-Distribution Network (CDN):
  - HTTP Overview
  - HTTP Performance
  - HTTP Caching
  - Content Distribution Network

A Web Page
- A web page consists of a base HTML file, which may include references to one or more objects
- An object can be an HTML file, a JPEG image, a Java applet, an audio file, a Flash video, etc.
- Each object is addressable by a URL
- Example URL:
    http://www.mgoblue.com/images/pic.gif
    protocol: http
    host name: www.mgoblue.com
    path name: /images/pic.gif

HTTP Overview
- HTTP: HyperText Transfer Protocol
- Web's application-layer protocol
- Client/server model using TCP
  - client: browser that requests, receives, and displays Web objects
  - client connects to server at port 80
  - HTTP messages exchanged between browser and Web server
  - server: sends objects in response to requests
- Two versions of HTTP:
  - HTTP 1.0: RFC 1945
  - HTTP 1.1: RFC 2068
[Figure: a PC running Firefox and a Mac running Safari exchange HTTP messages with a server running the Apache Web server]
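The example URL from the "A Web Page" slide can be split into its protocol, host name, and path name programmatically. A minimal sketch using Python's standard `urllib.parse`; the variable names are illustrative only, and the fallback to port 80 reflects the default HTTP port mentioned in the overview:

```python
from urllib.parse import urlsplit

url = "http://www.mgoblue.com/images/pic.gif"
parts = urlsplit(url)

protocol = parts.scheme      # the "http" before "://"
hostname = parts.hostname    # the Web server's host name
pathname = parts.path        # the object's path on that server

# The URL names no port, so fall back to HTTP's default port 80.
port = parts.port if parts.port is not None else 80

print(protocol, hostname, pathname, port)
```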
HTTP Request Message
- Two types of HTTP messages: request, response
- HTTP request message: in ASCII (human-readable format)
- Example:

    GET /somedir/page.html HTTP/1.1
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language: fr
    (extra carriage return, line feed)

- The extra carriage return, line feed indicates end of message

Method Types (HTTP 1.1)
- GET, POST, HEAD
- PUT: uploads file in entity body to path specified in URL field
- DELETE: deletes file specified in the URL field

Uploading form input, alternatives:
1. POST method:
   - web pages often include form input
   - input is uploaded to server in entity body
2. As parameters in the URL of a GET request:
   - input is uploaded in URL field of request line:
       www.somesite.com/animalsearch?monkeys&banana
     (monkeys&banana are the input parameters)

HTTP Response Message: Example

    HTTP/1.1 200 OK                          <- status line
    Connection: close                        <- header lines
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998...
    Content-Length: 6821
    Content-Type: text/html
                                             <- blank line
    data data data data data...              <- data, e.g., requested HTML file

HTTP Response: Status Line
- First line: status line (protocol status code, status phrase)
    HTTP-version 3-digit-response-code Reason-phrase
- 1XX informational
- 2XX success
  - 200 OK: request succeeded, requested object later in this message
- 3XX redirection
  - 301 Moved Permanently: requested object moved, new location specified later in this message (Location: in header)
  - 302 Moved Temporarily
  - 304 Not Modified
- 4XX client error
  - 400 Bad Request: request message not understood by server
  - 404 Not Found: requested document not found on this server
- 5XX server error
  - 505 HTTP Version Not Supported
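The request and response formats above can be exercised without a network. The following is a minimal sketch, not a full HTTP implementation: `build_request` and `parse_response` are hypothetical helper names, and the shortened response used in the example is made up for illustration.

```python
def build_request(method, path, headers):
    """Return an HTTP/1.1 request message as ASCII bytes."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The extra CRLF (blank line) marks the end of the message headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response(raw):
    """Split a response into (version, code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    # Status line: HTTP-version, 3-digit response code, reason phrase.
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


request = build_request(
    "GET",
    "/somedir/page.html",
    {"Host": "www.someschool.edu", "Connection": "close"},
)

# Illustrative response, shortened from the slide's example.
response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Content-Length: 4\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"data"
)
version, code, reason, headers, body = parse_response(response)
print(code, reason, headers["content-type"], body)
```

Header names are lower-cased on parsing because HTTP header names are case-insensitive.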
Client-side States: Cookies
- HTTP is "stateless": server maintains no information about past client requests
- But sometimes it may be useful to keep per-client states, for example for:
  - authorization
  - shopping carts
  - wish list
  - recommendations
  - user session state (web e-mail)
- Aside: protocols that maintain "state" are complex!
  - past history (state) must be maintained
  - if server/client crashes, their views of "state" may be inconsistent, and must be reconciled
- States or user ID kept at client side using cookies

User-side Server State: Cookies
Four components:
1. cookie header line in the HTTP response message
2. cookie header line in HTTP request message
3. cookie file kept on client host and managed by client browser
4. back-end database at Web server

Example exchange between a client and the Amazon server:
- client's cookie "file" initially holds ebay: 8734
- client sends usual HTTP request msg; server creates ID 1678 for user
- server returns usual HTTP response + Set-cookie: 1678; client's cookie "file" now holds amazon: 1678, ebay: 8734
- client sends usual HTTP request msg with cookie: 1678; server takes cookie-specific action, e.g., wish list, and returns usual HTTP response msg
- one week later: client (cookie "file": amazon: 1678, ebay: 8734) sends usual HTTP request msg with cookie: 1678; server takes cookie-specific action and returns usual HTTP response msg

Client-side States: Cookies
Excellent marketing opportunities and concerns for privacy:
- cookies permit sites to learn a lot about you
- you may unknowingly supply personal info to sites
- search engines use redirection and cookies to learn even more about your preferences
- advertising companies track your preferences and viewing history across sites
  - ad company contracted with a social networking site, a bookstore, and a clothing store
  - you view your friend's travel photos of Hawaii at the social networking site
  - when you visit the bookstore, a travel book about Hawaii is pushed to you
  - when you visit the clothing store, a pair of swimming goggles is pushed to you
  - at all three places, a travel agency's extra-low-price Hawaii vacation package, expiring in 30 seconds, is pushed to you

Object Request Response Time
- RTT (round-trip time): time for a small packet to travel from client to server and back
- Response time:
  - 1 RTT to initiate TCP connection
  - 1 RTT for HTTP request and the first few bytes of HTTP response to return
  - file transmission time
  - total = 2 RTT + transmit time
[Figure: client/server timeline: initiate TCP connection (1 RTT), request file (1 RTT), time to transmit file, file received]
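As a quick worked instance of "2 RTT + transmit time", the sketch below evaluates the formula for one object. The RTT, object size, and link rate are assumed values chosen only for illustration; they do not come from the slide.

```python
def object_response_time(rtt_s, size_bits, rate_bps):
    """Response time in seconds: 1 RTT for TCP setup, 1 RTT for the
    request and first bytes of the response, plus transmit time."""
    transmit_time = size_bits / rate_bps
    return 2 * rtt_s + transmit_time


# Assumed values: RTT = 100 msec, 100,000-bit object, 1 Mbps path.
example = object_response_time(rtt_s=0.1, size_bits=100_000, rate_bps=1_000_000)
print(f"{example:.2f} s")
```

With these assumed numbers the two RTTs (0.2 s) cost twice as much as the transmission itself (0.1 s), which is why the following slides focus on saving round trips.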
HTTP 1.0
- HTTP 1.0 uses non-persistent connections:
  - at most one object is sent over a TCP connection
  - object transmission completion detected by recv() returning 0 (connection closed)
- Why is this not a good design?
[Figure: client/server timeline]
  0 RTT: client opens TCP connection (SYN)
  1 RTT: client sends HTTP request for HTML; server reads from disk
  2 RTT: client parses HTML; client opens TCP connection (FIN, SYN)
  3 RTT: client sends HTTP request for image; server reads from disk
  4 RTT: image begins to arrive

HTTP 1.1
- HTTP 1.1 uses persistent connections:
  - server leaves connection open after sending responses
  - subsequent HTTP messages between the same client/server to fetch multiple objects are sent over the same connection
[Figure: client/server timeline]
  0 RTT: client sends HTTP request for HTML; server reads from disk
  1 RTT: client parses HTML; client sends HTTP request for image; server reads from disk
  2 RTT: image begins to arrive

How to Mark End of Message?
- Content-Length in header
- Implied length, e.g., 304 (cache fresh) never has content
- Transfer-Encoding: chunked (HTTP 1.1)
  - after headers, each chunk comprises content length in hex, CRLF, then body; length 0 indicates end of chunks

    HTTP/1.1 200 OK<CRLF>
    Transfer-Encoding: chunked<CRLF>
    <CRLF>
    23<CRLF>
    This is the data in the first chunk<CRLF>
    1A<CRLF>
    and this is the second one<CRLF>
    0<CRLF>
    <CRLF>

Pipelined and Parallel Connections
- Persistent without pipelining:
  - client issues new request only when previous response has been received
  - one RTT for each referenced object
- Persistent with pipelining:
  - client sends requests as soon as it encounters a referenced object
  - as little as one RTT for all referenced objects
  - supported in HTTP 1.1, where persistent connections are the default
- Browsers can open parallel TCP connections to fetch referenced objects (even in HTTP 1.0)
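The chunked transfer encoding from the "How to Mark End of Message?" slide can be sketched as an encoder/decoder pair. This is a simplified illustration that assumes no chunk extensions and no trailer headers; `encode_chunked` and `decode_chunked` are hypothetical helper names.

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for chunk in chunks:
        # Each chunk: length in hex, CRLF, the data, CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    # A zero-length chunk followed by a final CRLF ends the body.
    return out + b"0\r\n\r\n"


def decode_chunked(body):
    """Decode a chunked body (no chunk extensions, no trailers)."""
    data = b""
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol], 16)      # chunk length is hexadecimal
        if size == 0:
            return data                    # length 0: end of chunks
        start = eol + 2
        data += body[start:start + size]
        pos = start + size + 2             # skip the CRLF after the data


chunks = [b"This is the data in the first chunk", b"and this is the second one"]
wire = encode_chunked(chunks)
print(wire)
print(decode_chunked(wire))
```

Because the receiver learns each chunk's length up front, the server can start sending before it knows the total size and the connection can stay open afterward, which is what persistent connections require.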
HTTP Modeling
Assume Web page consists of:
- 1 base HTML page (of size L bits)
- M images (each also of size L bits)
Non-persistent HTTP:
- M+1 TCP connections in series
- response time = (M+1)*2*RTT + (M+1)*L/µ, where µ is the path speed
Persistent HTTP (with pipelining):
- 2 RTTs to request and receive base HTML file
- 1 RTT to request and receive M images
- response time = 3*RTT + (M+1)*L/µ
Non-persistent HTTP with n parallel connections:
- suppose n divides M evenly
- 1 TCP connection for base file
- M/n sets of n parallel connections for images
- n-parallel response time = (M/n + 1)*2*RTT + (M/n + 1)*L/µ
Compare:
- non-persistent response time = (M+1)*2*RTT + (M+1)*L/µ
- persistent response time = 3*RTT + (M+1)*L/µ

[Figure: HTTP response time (in seconds), RTT = 100 msec, L = 5 Kbytes, M = 10, and n = 5]
- For low bandwidth, connection and response time are dominated by transmission time
- Performance of persistent connections is comparable to that of parallel connections

[Figure: HTTP response time (in seconds), RTT = 1 sec, L = 5 Kbytes, M = 10, and n = 5]
- For larger RTT, response time is dominated by TCP establishment and slow start delays
- Persistent connections now give significant improvement, particularly in high bandwidth-delay networks
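The three response-time formulas can be compared numerically. The sketch below uses the slide's RTT = 100 msec, L = 5 Kbytes, M = 10, and n = 5; the path speed µ = 1 Mbps and the conversion 1 Kbyte = 1000 bytes are assumptions made here for illustration, since the slide's plots are not recoverable.

```python
def nonpersistent(M, L, rtt, mu):
    """M+1 TCP connections in series."""
    return (M + 1) * 2 * rtt + (M + 1) * L / mu


def persistent_pipelined(M, L, rtt, mu):
    """2 RTTs for the base file, 1 RTT for all M images."""
    return 3 * rtt + (M + 1) * L / mu


def parallel_nonpersistent(M, L, rtt, mu, n):
    """1 connection for the base file, then M/n sets of n connections."""
    return (M / n + 1) * 2 * rtt + (M / n + 1) * L / mu


M, n = 10, 5
L = 5 * 1000 * 8      # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
rtt = 0.1             # 100 msec
mu = 1_000_000        # assumed path speed: 1 Mbps

t_nonpersistent = nonpersistent(M, L, rtt, mu)
t_persistent = persistent_pipelined(M, L, rtt, mu)
t_parallel = parallel_nonpersistent(M, L, rtt, mu, n)
print(f"non-persistent: {t_nonpersistent:.2f} s")
print(f"persistent:     {t_persistent:.2f} s")
print(f"n-parallel:     {t_parallel:.2f} s")
```

Under these assumptions the model gives 2.64 s, 0.74 s, and 0.72 s respectively, so persistent and parallel connections perform comparably while both clearly beat serial non-persistent connections.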
HTTP 2.0
- Based on Google's SPDY (2009)
- RFC to come out "any day now" (written by the two authors of SPDY)
- Chrome browser already has SPDY built in
- Limitations of HTTP:
  - pipelining still suffers from head-of-line (HOL) blocking (if first item is large, the rest have to wait)
  - parallel streams solve HOL blocking, but on a bandwidth-limited channel, too many streams clog up the channel

HTTP 2.0
Some changes from 1.1:
- headers no longer in text format
- separate control and data headers
- stream multiplexing over a single TCP connection:
  - each stream has an ID, data is tagged with stream ID
  - each stream can also have different priority
- server push: don't have to wait for client to parse page before initiating download
- header compression
Performance improvement: up to 64% reduction in page load time [Grigorik]

Variable Delay
Sources of variability of delay:
- browser cache: browser cache hit/miss, need for cache revalidation
- DNS resolution: DNS cache hit/miss, multiple DNS servers, errors
- TCP open: TCP handshake, packet loss, high RTT, server accept queue
- 1st byte response: RTT, busy server, CPU overhead (e.g., CGI script)
- Last byte response: response size, receive buffer size, congestion

Web Caches (Proxy Server)
Goal: satisfy client request without involving origin server
- user sets browser to direct all Web accesses via cache
- browser sends all HTTP requests to cache
  - if object is not cached, cache requests object from origin server, then returns object to client
  - else cache returns object
- cache acts as both client and server
- typically cache is installed by ISP (university, company, residential ISP)
- must be transparent, allow for plug-n-play
[Figure: clients connect through a proxy server to the origin server]
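The hit/miss logic of the Web cache described above can be sketched in a few lines. This is an in-memory illustration only: `ProxyCache` is a hypothetical class, the `fetch_from_origin` callable stands in for a real HTTP request to the origin server, and freshness checks and eviction are deliberately left out.

```python
class ProxyCache:
    """Toy Web cache: acts as a server to clients and as a client
    to the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object bytes
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # Object is cached: return it without involving the origin.
            self.hits += 1
            return self._store[url]
        # Object is not cached: request it from the origin server,
        # keep a copy, then return it to the client.
        self.misses += 1
        obj = self._fetch(url)
        self._store[url] = obj
        return obj


# Stand-in origin server that records every request it receives.
origin_objects = {"http://www.someschool.edu/somedir/page.html": b"base HTML file"}
origin_requests = []


def fetch_from_origin(url):
    origin_requests.append(url)
    return origin_objects[url]


cache = ProxyCache(fetch_from_origin)
first = cache.get("http://www.someschool.edu/somedir/page.html")    # miss
second = cache.get("http://www.someschool.edu/somedir/page.html")   # hit
print(cache.hits, cache.misses, len(origin_requests))
```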
Web Caching Example: No Caching
Parameters:
- average object size = 100,000 bits
- avg. # of requests to servers = 15/sec
- Internet latency between a router on the public Internet and any server = 2 secs
Resulting performance:
- utilization on LAN = 15%
- utilization on access link = 100%!
- over-utilized link causes long queue (delay of minutes)
- total delay = Internet delay + access delay + LAN delay
              = 2 secs + minutes + milliseconds
[Figure: institutional network (10 Mbps LAN) connected over a 1.5 Mbps access link to the public Internet and origin servers]

Web Caching Example: No Caching
Possible solution:
- increase access link bandwidth to, say, 10 Mbps (often a costly upgrade)
Performance:
- utilization on LAN = 15%
- utilization on access link = 15%
- total delay = Internet delay + access delay + LAN delay
              = 2 secs + msecs + msecs
[Figure: institutional network (10 Mbps LAN) connected over a 10 Mbps access link to the public Internet and origin servers]

Web Caching Example: With Caching
Another solution: install cache
- assume hit rate of 0.4
Performance:
- 40% of requests will be satisfied almost immediately
- 60% of requests satisfied by origin server
- utilization of access link reduced to 60%, resulting in negligible delays (say 10 msecs)
- avg. total delay = Internet delay + access delay + LAN delay
                   = .6*(2.01) secs + msecs < 1.4 secs
[Figure: institutional network (10 Mbps LAN, with cache) connected over a 1.5 Mbps access link to the public Internet and origin servers]

Conditional GET
Goal: don't send object if cache has up-to-date version
- cache: specifies date of cached copy in HTTP request
    If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date:
    HTTP/1.0 304 Not Modified
- May be used with or without TTL; TTL is hard to set, depends on site content
[Figure: cache/server exchanges]
- object not modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 304 Not Modified
- object modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 200 OK <data>
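The server side of the conditional GET can be sketched as a single decision. This is a minimal illustration assuming the server knows each object's last-modified time; `respond_to_get` is a hypothetical function, and the dates are made-up values, not taken from the slides.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_to_get(last_modified, body, if_modified_since=None):
    """Return (status line, body) for a GET, honoring If-modified-since."""
    if if_modified_since is not None:
        cached_copy_date = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_copy_date:
            # Cached copy is up to date: send no object.
            return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body


# Hypothetical date of the cache's copy, formatted as an HTTP date.
cached_on = datetime(2020, 1, 1, tzinfo=timezone.utc)
header_value = format_datetime(cached_on, usegmt=True)

# Object not modified since the cache fetched it.
unchanged = respond_to_get(cached_on, b"data", header_value)
# Object modified one day after the cache fetched it.
changed = respond_to_get(cached_on + timedelta(days=1), b"data", header_value)
print(unchanged)
print(changed)
```

In the 304 case only headers cross the access link, so a revalidation still costs a round trip but not the object's transmission time.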
Cooperative Caching
- Multiple caches may form a distributed cache
- Instead of going directly to origin server, a cache may query one or more other caches for object first, e.g., cse cache queries ece cache first
- To eliminate frequent inter-cache query-reply, each cache may push an index of its contents to other caches, i.e., ece cache tells cse cache all the objects it is holding
- Frequently, this "index" is in the form of a Bloom filter
[Figure: cse and ece networks, each with its own cache, connected through the public Internet to origin servers]

Bloom Filter
An efficient, lossy way of describing a set, comprising:
- a bit vector of length w
- a family of independent hash functions, each of which maps an element of the set to an integer in [0, w)
To insert an element:
- for each hash function, set the bit the element hashes to
To search for an element:
- for each hash function, examine the bit the element hashes to
- if any bit is not set, the element is definitely not in the set
- if all the bits are set, the element may be in the set (potential for false positive)
[Figure: insert and search examples on the bit vector]

Bloom Filter
The false positive rate is a well-defined (nonlinear) function of:
1. w,
2. the number of hash functions, and
3. the number of elements in the set
- wider filters are always more accurate
- optimal tradeoff between filter storage and accuracy is when about half of the bits are set
Bloom filters are also useful in maintaining p2p supernode backbone and distributed storage in data center networks

Limitations of Web Caching
Significant fraction (> 50%) of HTTP objects are not cacheable
Why not?
- dynamic data: stock prices, scores, webcams
- scripts: results based on passed parameters
- use of cookies: results may be based on passed data
- advertising/analytics: owner wants to measure # hits
  - random strings in content ensure unique counting
- HTTPS: encrypted data is not cacheable
- multimedia: object larger than cache or not allowed to be cached due to intellectual property rights
How to ensure scalability of Web server when content is not cacheable?
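The Bloom filter insert and search operations described earlier can be sketched as follows. This is a minimal illustration, not the construction any particular cache uses: deriving the k hash functions by salting SHA-256 with an index is an assumption made here, and `false_positive_rate` uses the standard textbook approximation (1 - e^(-kn/w))^k, which does not appear on the slides.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, w, k):
        self.w = w                    # length of the bit vector
        self.k = k                    # number of hash functions
        self.bits = [False] * w

    def _positions(self, item):
        # Assumed construction: k hash functions obtained by salting
        # SHA-256 with the function index; each maps into [0, w).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.w

    def insert(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def might_contain(self, item):
        # False: definitely not in the set. True: possibly in the set.
        return all(self.bits[p] for p in self._positions(item))


def false_positive_rate(w, k, n):
    """Standard approximation for n inserted elements."""
    return (1 - math.exp(-k * n / w)) ** k


ece_cache_index = BloomFilter(w=1024, k=3)
for name in ["index.html", "logo.gif", "stadium.mp4"]:
    ece_cache_index.insert(name)

print(ece_cache_index.might_contain("logo.gif"))
print(false_positive_rate(w=1024, k=3, n=3))
```

A cache that receives such an index from a peer can skip the inter-cache query whenever `might_contain` returns False; a True answer still has to be confirmed by asking the peer.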
Content Distribution Networks (CDNs)
- Streaming large files (e.g., video) from a single origin server in real time requires large amount of bandwidth
- Solution: replicate content to hundreds of servers throughout the Internet
- Maintaining your own network of such servers is expensive (both capex and opex)
- CDN providers maintain a network of servers and sell content replication service to multiple content owners
  - example of content owners: ABC, HBO, Netflix
  - example of CDN providers: Akamai, Limelight
  - Akamai has ~25K servers spread over ~1K clusters world-wide
- place servers in edge/access network
- content pre-downloaded to servers
- when user downloads content, direct user to the server closest to it
- placing content "close" to user avoids network delay and loss of long paths
[Figure: origin server in N. America; CDN distribution node; CDN servers in S. America, Asia, and Europe]

CDNs vs. Content Owners
- CDN replicates owners' content in CDN servers
- When owner updates content, CDN updates servers
- Some large content owners operate their own CDNs: Amazon, Google/YouTube, Netflix (virtual)

Sample Web Page (Example Only)
- home.ex/index.html
- home.ex/logo.gif
- cdi.ex/stadium.mp4
- adserver.ex/shirtad.gif
- cdi.ex/tvlogo.mp4

Sample Delivery (Example Only)
[Figure: delivery of the sample page; CDN servers www1.cdi.ex, www2.cdi.ex, and www3.cdi.ex; object labels: index.html, logo.gif; shirtad.gif; stadium.mp4, tvlogo.mp4] [Frank13]
Why don't we store index.html and shirtad.gif at the CDN also?
Content Distribution Network
- CDN nodes create application-layer overlay network
- Larger CDNs may have their own WANs, e.g., Google's B4, that interconnect with the rest of the Internet like any other ISP's network
- CDN directs a request to the server closest to the client (how?)
[Figure: Tier-1 backbones, IXPs, ISPs, access aggregators, and CDNs (e.g., Akamai, Amazon, Google)] [after Walrand]

Client Redirection
Two issues:
- How to direct clients to a particular server?
- How to choose which server to direct a client to?

Client Redirection
How to direct clients to a particular server?
- As part of application: HTTP redirect
  - pros: application-level, fine-grained control
  - cons: additional load and RTTs, hard to cache
- As part of naming: DNS
  - pros: well-suited to caching, reduces RTTs
  - cons: relies on proxies and estimations, not accurate

DNS-based Redirection
Clients are directed to the closest server as part of the DNS name resolution process:
1. client asks its local DNS resolver to resolve CDN's server's name
2. the local DNS resolver is directed to CDN's authoritative name server by DNS
3. CDN's name server either returns the address of the server closest to the DNS resolver or an ordered list of addresses, ranked by distance to local DNS resolver
Pros and cons of each?
CDN Example
[Figure: numbered steps 1-5 among client, origin server, client's local name server, CDN's authoritative DNS server, and nearby CDN server]
- HTTP request for home.ex/index.html (to origin server); the page contains cdi.ex/stadium.mp4
- DNS query for cdi.ex (client to client's local name server)
- DNS query for cdi.ex (local name server to CDN's authoritative DNS server)
- HTTP request for cdi.ex/stadium.mp4 (to nearby CDN server)

Server Selection
How to choose which server to direct a client to?
- server load
- client-server distance
  - CDN maintains a "map", estimating distances between leaf ISPs and CDN nodes
  - CDN's name server uses "map" to determine server closest to the local DNS resolver
  - DNS resolver used as proxy for client (inaccurate location)
    - CDN doesn't know client's address at name resolution time
  - distance can be measured using different metrics, e.g., latency, loss rate (only estimated)
- delivery cost (ISP pricing)
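The selection step at the CDN's name server can be sketched as a lookup in the "map" followed by a sort. Everything concrete below is hypothetical: the `rank_servers` function, the ISP label, the distance and load figures, and the load threshold are illustrative assumptions, not how any real CDN decides.

```python
def rank_servers(resolver_isp, distance_map, load, max_load=0.9):
    """Return CDN servers ordered by estimated distance to the ISP of the
    local DNS resolver, skipping servers whose load is too high."""
    distances = distance_map[resolver_isp]
    candidates = [s for s in distances if load.get(s, 0.0) < max_load]
    return sorted(candidates, key=lambda s: distances[s])


# Hypothetical "map": estimated latency (msec) from a leaf ISP to CDN nodes.
distance_map = {
    "isp-a": {"www1.cdi.ex": 12.0, "www2.cdi.ex": 85.0, "www3.cdi.ex": 40.0},
}
# Hypothetical current load on each server, as a fraction of capacity.
load = {"www1.cdi.ex": 0.95, "www2.cdi.ex": 0.30, "www3.cdi.ex": 0.50}

ranked = rank_servers("isp-a", distance_map, load)
print(ranked)
```

The closest server is skipped here because it is overloaded, so the ordered list starts with the next-closest one. The input is the resolver's ISP rather than the client's own address, which reflects the slide's point that the resolver is only a proxy for the client's location.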