Analysing the impact of CDN based service delivery on traffic engineering Gerd Windisch Chair for Communication Networks Technische Universität Chemnitz Gerd Windisch - Chair for Communication Networks - Technische Universität Chemnitz Page 1
Outline
- Introduction & Motivation
- Distributed Measurement Approach
- YouTube CDN Infrastructure
- Video Server Selection
- Impacts on Traffic Engineering
- Conclusion
Introduction & Motivation
- CDNs account for a large share of the traffic in today's networks
- Server selection strategies of CDNs are usually not aware of ISP-internal traffic congestion, which could negatively affect network performance
- Knowledge about the behaviour of the server selection strategies of major CDNs can therefore provide valuable information to network operators (to adapt the traffic engineering accordingly)
- Targets of the measurement study:
  - get insight into the YouTube CDN infrastructure
  - get insight into the video server selection strategies applied in the YouTube CDN
Distributed Measurement Approach
- Use of openly available HTTP proxy servers located in several ISP networks in Europe
- With this approach YouTube can be observed from the perspective of different ISPs
- Through these proxy servers a set of 20 videos is requested periodically and the HTTP response is analysed
- 5 measurement traces with a duration between 3 and 7 days were obtained, with a time resolution of 15 min
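To give a feeling for the trace sizes implied by these figures, the following minimal sketch computes the number of sampling rounds and video requests per proxy. The function name `samples_per_trace` is hypothetical; only the 15 min resolution, the 3 to 7 day durations and the 20 videos come from the slide, and the sketch assumes that every round requests every video exactly once.

```python
from datetime import timedelta

VIDEOS_PER_ROUND = 20                       # set of 20 videos (from the slide)
SAMPLING_INTERVAL = timedelta(minutes=15)   # time resolution (from the slide)


def samples_per_trace(duration, interval=SAMPLING_INTERVAL):
    """Number of complete sampling rounds that fit into one trace."""
    return duration // interval


# Shortest and longest trace durations mentioned on the slide.
shortest = samples_per_trace(timedelta(days=3))
longest = samples_per_trace(timedelta(days=7))

# Assumption: each round requests all 20 videos once per proxy.
requests_shortest = shortest * VIDEOS_PER_ROUND
requests_longest = longest * VIDEOS_PER_ROUND
```

Under these assumptions a single proxy contributes between 288 and 672 rounds per trace.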
YouTube Infrastructure
- 137 different YouTube locations were found within the measurement traces
- 2 types of YouTube server locations were identified:
  - YouTube-owned data center locations
  - Google Global Cache (GGC) data center locations, located in ISP networks

         YouTube AS Locations   GGC Locations   Total
  EU                       22             107     129
  USA                       8               0       8
  Total                    30             107     137

- 3779 IP addresses were measured, 3005 belonging to YouTube and 774 belonging to other ASes
- Remark: for this analysis all data sets were combined regardless of the ISP and the measurement duration
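The split into addresses belonging to YouTube and addresses belonging to other ASes requires mapping each measured IP address to its origin AS. The slides do not say how this mapping was done; the sketch below assumes a simple longest-prefix lookup in a prefix-to-AS table. The prefixes and AS numbers are documentation placeholders, not real YouTube or ISP data, and all names are hypothetical.

```python
import ipaddress

# Hypothetical prefix-to-AS table; documentation prefixes and AS numbers only.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,     # stands in for the CDN's own AS
    ipaddress.ip_network("198.51.100.0/24"): 64497,  # stands in for an ISP hosting a cache
}
CDN_ASNS = {64496}  # assumption: set of AS numbers operated by the CDN itself


def origin_asn(ip):
    """Return the AS number of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net.prefixlen, asn)
               for net, asn in PREFIX_TO_ASN.items() if addr in net]
    return max(matches)[1] if matches else None


def location_type(ip):
    """Classify a server address as CDN-owned, cache in another AS, or unknown."""
    asn = origin_asn(ip)
    if asn is None:
        return "unknown"
    return "CDN-owned AS" if asn in CDN_ASNS else "cache in other AS"
```

In practice the table would be filled from routing data; the lookup logic stays the same.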
Video Server Selection - Mechanism
- The server selection mechanism directs a user request to the best video server location (data center)
  - multiple selection criteria might be used (e.g. distance, server load)
- Most common approach for CDNs: DNS-based server selection
- Observation: YouTube changed its video server selection from a DNS-based approach to a URL-rewriting-based approach
Video Server Selection - Mechanism
DNS-based approach
Actors: Client, Local DNS Server, YouTube HTTP Frontend Server, YouTube DNS Server, YouTube Video Server
1) HTTP GET request: www.youtube.com/watch?v=...
2) Map watch ID to static video server URL (YouTube HTTP Frontend Server)
3) HTTP GET response: video web site
4) DNS request: v1.lscache1.c.youtube.com
5) Select best matching video server and return IP (YouTube DNS Server)
6) DNS response: YouTube video server IP
7) HTTP GET request: v1.lscache1.c.youtube.com/...
8) HTTP GET response: video file
Video Server Selection - Mechanism
URL-rewriting-based approach
Actors: Client, Local DNS Server, YouTube HTTP Frontend Server, YouTube DNS Server, YouTube Video Server
1) HTTP GET request: www.youtube.com/watch?v=...
2) Select a video server in the best location and embed its URL in the web page (YouTube HTTP Frontend Server)
3) HTTP GET response: video web site
4) DNS request: r1---sn-4g57ln7d.c.youtube.com
5) DNS response: YouTube video server IP
6) HTTP GET request: r1---sn-4g57ln7d.c.youtube.com/...
7) HTTP GET response: video file
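With URL rewriting, the selected video server is already visible in the web page returned for the watch request, so a measurement client can read it from the HTTP response without resolving anything itself. The following minimal sketch extracts such host names from a page body. It assumes that the host names appear literally in the page and end in `.c.youtube.com`, as in the examples on the slides; the function name and the sample page are hypothetical.

```python
import re

# Host names below c.youtube.com, e.g. r1---sn-4g57ln7d.c.youtube.com
VIDEO_HOST_RE = re.compile(r"(?:[A-Za-z0-9-]+\.)+c\.youtube\.com")


def extract_video_hosts(page_body):
    """Return the distinct video server host names found in a page, sorted."""
    return sorted(set(VIDEO_HOST_RE.findall(page_body)))


# Hypothetical page fragment for illustration only.
sample_page = (
    '<a href="http://www.youtube.com/">home</a>'
    '<script>var u = "http://r1---sn-4g57ln7d.c.youtube.com/videoplayback";'
    'var v = "http://r1---sn-4g57ln7d.c.youtube.com/videoplayback";</script>'
)
hosts = extract_video_hosts(sample_page)
```

The same expression also matches the older DNS-based names such as `v1.lscache1.c.youtube.com`, while ignoring `www.youtube.com`.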
Video Server Selection - Mechanism
- Advantages of the URL-rewriting-based server selection mechanism:
  - the server selection can be based on the IP address of the user rather than the IP address of the DNS server, which gives a better geo-location of the user
  - additional criteria (other HTTP header fields) can be applied
- Disadvantage:
  - URL rewriting only works for subsequent requests (the initial request still has to be handled via DNS-based selection mechanisms)
Video Server Selection - Pattern Classification
- Based on the measurement traces, the regularity of the video server selection patterns is analysed
- For a fair comparison, all similar patterns observed in an ISP network on different proxies and in different measurement traces are counted as one observation
- Main result: the majority (166 out of 168) of all pattern observations can be classified into two types:
  - constant pattern
  - daily recurrent pattern
- 16 shifts within the patterns were identified that are not daily recurrent
Video Server Selection - Pattern Classification
Video server selection pattern types:
- Constant pattern
  - no or only small changes of the video server locations
  - this pattern appears most frequently
- Daily recurrent pattern
  - clearly visible daily recurrence
  - usually one server location in off-peak hours; load balancing among a few server locations during peak traffic hours

Results:

                          GGC Locations   YouTube AS Locations   Total
  Constant pattern                   75                     27     102
  Daily recurrent pattern            25                     39      64
  Total                             100                     66     166
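The slides do not state how a trace is assigned to one of the two pattern types. One possible rule, given here only as a sketch, treats a trace as constant if one observation dominates all time slots and as daily recurrent if the trace largely agrees with itself shifted by one day. The 0.9 threshold, the function name and the input representation (one hashable observation per 15 min slot) are assumptions.

```python
from collections import Counter

SLOTS_PER_DAY = 96  # 24 h at the 15 min resolution of the traces


def classify_pattern(slots, threshold=0.9, period=SLOTS_PER_DAY):
    """Classify a trace as 'constant', 'daily recurrent' or 'other'.

    slots: one hashable observation per time slot, e.g. a location code
    or a frozenset of location codes when load balancing is observed.
    """
    if not slots:
        raise ValueError("empty trace")
    # Constant: a single observation covers (almost) all time slots.
    dominant = Counter(slots).most_common(1)[0][1]
    if dominant / len(slots) >= threshold:
        return "constant"
    # Daily recurrent: each slot mostly equals the slot one day later.
    if len(slots) > period:
        pairs = list(zip(slots, slots[period:]))
        agreement = sum(a == b for a, b in pairs) / len(pairs)
        if agreement >= threshold:
            return "daily recurrent"
    return "other"


# One location off-peak, load balancing over two locations at peak hours.
day = [frozenset({"prg"})] * 40 + [frozenset({"prg", "fra"})] * 56
label = classify_pattern(day * 3)
```

Note that a trace with constant load balancing among several locations is still classified as constant, because the per-slot observation itself does not change.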
Video Server Selection - Pattern Classification
Example: constant pattern, single source
[Figure not reproduced]
Video Server Selection - Pattern Classification
Example: constant pattern, load balancing
[Figure not reproduced]
Video Server Selection - Pattern Classification
Example: daily recurrent pattern
[Figure not reproduced]
Video Server Selection - Pattern Classification
Example: daily recurrent pattern, AS35236, Czech Republic
[Figure not reproduced: server locations ord, lga, mia, fra, ams, lhr, par, prg plotted over the hours of one day, 0 h to 23 h]
Video Server Selection - Pattern Classification
Example: neither constant nor daily recurrent
[Figure not reproduced]
Impacts on Traffic Engineering
- Consequences of non-optimal traffic engineering: packet loss and increased delay due to overloaded ISP-internal paths
- Goal: optimized dynamic traffic engineering based on traffic shift prediction
  - traffic shifts are predicted based on observed pattern shifts
  - only those pattern shifts are relevant that lead to traffic load shifts at interconnect points
  - from the predicted pattern shifts the expected shifts of the traffic matrix can be derived, which then feed the path optimization
- Quality metrics:
  - traffic matrix prediction precision
  - optimization performance (speed, small optimality gap)
  - ISP network reconfiguration speed
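The step from a predicted pattern shift to a traffic load shift at the interconnect points can be illustrated with a small sketch. It assumes that each video server location reaches the ISP through exactly one interconnect point and that the CDN demand splits across locations in fixed shares; the mapping, the point names `IX-A` and `IX-B`, the demand value and the function names are hypothetical.

```python
def ingress_load(total_demand, location_share, location_to_ingress):
    """Aggregate the CDN demand per ISP interconnect point."""
    load = {}
    for location, share in location_share.items():
        ingress = location_to_ingress[location]
        load[ingress] = load.get(ingress, 0.0) + total_demand * share
    return load


def load_shift(before, after):
    """Per-interconnect difference between two load distributions."""
    points = set(before) | set(after)
    return {p: after.get(p, 0.0) - before.get(p, 0.0) for p in points}


# Hypothetical mapping: fra and ams enter the ISP at the same point.
location_to_ingress = {"prg": "IX-A", "fra": "IX-B", "ams": "IX-B"}
demand = 100.0  # arbitrary traffic units

off_peak = ingress_load(demand, {"prg": 1.0}, location_to_ingress)
peak = ingress_load(demand, {"prg": 0.5, "fra": 0.5}, location_to_ingress)
relevant = load_shift(off_peak, peak)

# A shift from fra to ams changes the pattern but not the ingress load.
peak_alt = ingress_load(demand, {"prg": 0.5, "ams": 0.5}, location_to_ingress)
irrelevant = load_shift(peak, peak_alt)
```

The second comparison shows why not every pattern shift matters for traffic engineering: only shifts that move load between interconnect points change the traffic matrix.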
Conclusion
- Key findings:
  - YouTube utilizes a high number of GGC video server locations
  - YouTube recently changed its video server selection mechanism to a URL-rewriting-based scheme
  - the majority of all patterns can be classified into two categories:
    - constant pattern
    - daily recurrent pattern
- Next steps:
  - investigation of the server selection strategies of other CDNs such as Akamai and Limelight
  - finishing the development of a pattern prediction model (Markov model)
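The planned prediction model is not described on the slides. Purely as an illustration of what a Markov model over server selection states could look like, the sketch below estimates first-order transition probabilities from an observed state sequence and predicts the most likely next state. The choice of states, the first-order assumption and all names are hypothetical and should not be read as the model under development.

```python
from collections import Counter, defaultdict


def estimate_transitions(states):
    """Estimate first-order transition probabilities from a state sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


def predict_next(transitions, state):
    """Most likely next state, or None if the state was never observed."""
    if state not in transitions:
        return None
    return max(transitions[state], key=transitions[state].get)


# Hypothetical sequence of observed server locations per time slot.
observed = ["prg", "prg", "prg", "fra", "fra", "prg", "prg"]
transitions = estimate_transitions(observed)
prediction = predict_next(transitions, "prg")
```

A daily recurrent pattern would likely need the time of day as part of the state, which this first-order sketch does not capture.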