
ABSTRACT

Acknowledgments

List of Abbreviations

Contents
ABSTRACT
Acknowledgments
List of Abbreviations
List of Figures
List of Tables
1 Introduction
2 Motivation and background
3 Overview of dissertation work
4 User-CDN tension: remote DNS use
5 CDN-ISP tension: CDN model and traffic diversity
6 ISP-User tension: cross-ISP P2P traffic
7 Contributions and Conclusions
References

List of Figures

List of Tables

Chapter 1 Introduction

1.1 Trends and Tensions

The growth in use of remote DNS services creates tension between users and CDNs. Increasing CDN market diversity results in friction between CDNs and ISPs. BitTorrent, the largest P2P content distribution system, causes stress between users and ISPs.

1.2 Roadmap

Chapter 2 Motivation and background

2.1 Growth in content

2.2 Content distribution approaches

2.2.1 Infrastructure-based CDNs

[Figure: % using CDNs among the N most popular sites (N from 10 to 1000); series: Sites, Pageviews.]

2.2.2 Infrastructure-less P2P systems

2.3 Summary

Chapter 3 Overview of dissertation work

3.1 Key perspectives in content delivery

3.1.1 Users and content providers

3.1.2 CDNs

Redirection policies at scale

3.1.3 Eyeball ISPs

3.2 Dissertation argument

This dissertation argues that it is possible to identify technical solutions that alleviate the tensions between users, CDNs and ISPs by sharing readily available information between them.

3.2.1 Economic and technical duality

3.2.2 Scope of dissertation work

I focus on the current incarnation of the Internet. The technical solutions I find may not be optimal. It may not be possible to prove this argument in a general way.

3.3 Trends and tensions between players

3.3.1 Use of remote DNS by users to locate CDN content (Ch. 4)

3.3.2 Diverse CDN traffic patterns on ISP networks (Ch. 5)

3.3.3 Users running P2P systems over ISP networks (Ch. 6)

3.4 Methodology: empirical approach

3.4.1 Leveraging Ono, NEWS, Dasu and Namehelp platforms

3.4.2 Aggregating end-host vantage points

Chapter 4 User-CDN tension: remote DNS use

4.1 Trend: growth in remote DNS use

[Figure: % of users of remote DNS services, May 2010 to Nov 2011; series: Any, OpenDNS, Google, Level3.]

4.2 Trend: industry response, edns-client-subnet DNS extension

Both the DNS service and the CDN must support the ECS extension.

[Figure: % of sites among the N most popular sites (N from 10 to 1000); series: No CDNs support ECS, Some CDNs support ECS.]

4.3 Methodology

4.3.1 Obtaining CDN redirections

DNS extension supported.

DNS extension not supported.

4.3.2 Measuring DNS services

Filtering configured public DNS services.

[Figure: timeline of a web object fetch. Events: send DNS query; get DNS answer and HTTP connect; send request; received headers; received first byte of object; transfer complete. Intervals: DNS latency, HTTP latency, end-to-end latency.]

4.3.3 Measuring CDNs

4.3.4 Baseline for performance comparison

4.4 Tension: user DNS choice vs. CDN performance

[Figure: CCDFs of ISPs by failure rate (%) and by mean time to repair (1 sec to 1 day).]

[Table: DNS service, p(failure), MTTR.]

4.4.1 Benefits of public DNS services

4.4.2 Performance implications of public DNS services

4.4.2.1 Impact on CDN redirections

[Figure: CDFs of cosine similarity (0.0 to 1.0); series: Iterative-Iterative, Iterative-ISP, Iterative-Google, Iterative-OpenDNS; two panels.]

The cosine similarity of two vectors $A$ and $B$ is

$$\mathrm{cos\_sim} = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$$

For non-negative vectors it lies in $[0, 1]$: $\mathrm{cos\_sim} = 0$ when the vectors share no non-zero components, and $\mathrm{cos\_sim} = 1$ when they are proportional.
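As a minimal sketch of the cosine similarity metric, the Python below compares two sets of CDN redirections. Representing each set as a map from a server cluster name to the fraction of redirections it received is an assumption made for illustration, and the cluster names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse non-negative vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


# Hypothetical redirection maps: cluster name -> fraction of redirections.
isp_dns = {"cluster-a": 0.75, "cluster-b": 0.25}
public_dns = {"cluster-c": 1.0}

same = cosine_similarity(isp_dns, isp_dns)         # identical maps
disjoint = cosine_similarity(isp_dns, public_dns)  # no clusters in common
```

Identical maps score 1 and maps with no cluster in common score 0, which matches the endpoints of the similarity axis.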

[Figure: CDF of locations by network latency % difference, public vs. ISP DNS; series: Google DNS, OpenDNS.]

4.4.2.2 Impact on HTTP performance

[Figure: CDFs of HTTP latency % difference; series: Iterative, ISP DNS, Google DNS, OpenDNS; two panels.]

4.4.2.3 Cause of performance impact

[Figure: CDF of locations by fraction of HTTP time waiting for header and start of object; series: Iterative, ISP DNS, Google DNS, OpenDNS.]

[Figure: ping latency (ms) vs. HTTP latency (ms), with correlation coefficient r.]

4.5 Solution: share user location with CDNs

4.5.1 Better CDN performance using client location

[Figure: CDF of locations by HTTP latency % difference; series: ECS whole IP, ECS /16 prefix, ECS /24 prefix.]

[Figure: CDFs of locations by end-to-end latency % difference; series: ISP DNS, Google DNS, Google DNS with ECS; two panels.]

4.5.2 Direct Resolution approach

[Figure: Direct Resolution query sequence between client, recursive DNS and authoritative DNS.]

1. Client: lookup hostname. Recursive DNS: resolve hostname. Authoritative DNS: select CDN server for recursive DNS. Client then has the CNAME mapping to the CDN hostname.
2. Client: lookup nameserver for CDN hostname. Recursive DNS: resolve nameserver for CDN hostname. Client then has the address of the CDN authoritative DNS server.
3. Client: directly request CDN redirection. Authoritative DNS: select CDN server for client. Client then has the CDN redirection for the client's location.

Direct Resolution directly contacts the CDN's authoritative server.

4.5.2.1 CDN redirections using Direct Resolution
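The three Direct Resolution steps can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the namehelp implementation: no DNS traffic is sent, the authoritative server is modeled as a lookup keyed on the querier's region, and every hostname, region and table below is a hypothetical placeholder.

```python
# Toy model of the three Direct Resolution steps. No real DNS queries are made;
# all hostnames, regions and tables are hypothetical placeholders.

CNAMES = {"www.example.com": "www.example.cdn.test"}        # used in step 1
NAMESERVERS = {"example.cdn.test": "ns1.example.cdn.test"}  # used in step 2
REPLICAS = {                                                # used in step 3
    "region-a": "replica-a.example.cdn.test",
    "region-b": "replica-b.example.cdn.test",
}


def authoritative_answer(querier_region):
    """CDN authoritative server: pick a replica near whoever sent the query."""
    return REPLICAS[querier_region]


def recursive_lookup(hostname, resolver_region):
    """Ordinary resolution: the CDN sees the recursive resolver's location."""
    cdn_name = CNAMES[hostname]
    return cdn_name, authoritative_answer(resolver_region)


def direct_resolution(hostname, client_region, resolver_region):
    # Step 1: look up the hostname through the recursive resolver to learn
    # the CNAME that maps it to a CDN hostname.
    cdn_name, _ = recursive_lookup(hostname, resolver_region)
    # Step 2: find the CDN's authoritative nameserver for that CDN hostname.
    zone = cdn_name.split(".", 1)[1]
    nameserver = NAMESERVERS[zone]
    # Step 3: query that nameserver directly, so it sees the client's location.
    return nameserver, authoritative_answer(client_region)


# A client in region-a that uses a recursive resolver located in region-b.
via_resolver = recursive_lookup("www.example.com", "region-b")[1]
via_dr = direct_resolution("www.example.com", "region-a", "region-b")[1]
```

In this model the ordinary lookup returns the replica near the resolver, while the direct query returns the replica near the client.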

4.5.3 Performance evaluation

[Figure: CDFs of end-to-end latency % difference; series: Google DNS, DR, DR (cached); two panels.]

[Figure: CDF of end-to-end latency % difference; series: Google DNS, DR (optimized), Google DNS with ECS.]

4.5.4 Namehelp platform and deployment

4.5.4.1 DNS proxy daemon

4.5.4.2 Performance comparison utility

[Figure: namehelp request-handling flowchart. Nodes: receive request; in local cache? (return cached response); query recursive server; is a CDN?; should do DR? (do probabilistic asynchronous DR); cached NS? (do asynchronous NS lookup); do DR: query NS directly; do asynchronous comparison, DR vs. recursive; return original response or return modified response.]
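One reading of the request-handling flowchart can be sketched in Python as below. The wiring of the branches is an assumption, since the flattened figure lists the nodes but not the arrows between them, and every callable passed in is a hypothetical stand-in rather than part of namehelp.

```python
# One reading of the flowchart; the branch wiring is an assumption and all
# callables passed in are hypothetical stand-ins.

def handle_request(name, cache, ns_cache, query_recursive, is_cdn,
                   should_do_dr, query_ns_directly, schedule):
    if name in cache:                        # In local cache? -> Y
        return cache[name]                   # Return cached response
    original = query_recursive(name)         # Query recursive server
    if not is_cdn(original):                 # Is a CDN? -> N
        return original                      # Return original response
    if not should_do_dr(name):               # Should do DR? -> N
        schedule("probabilistic_dr", name)   # Do probabilistic asynchronous DR
        return original
    if name not in ns_cache:                 # Cached NS? -> N
        schedule("ns_lookup", name)          # Do asynchronous NS lookup
        return original
    modified = query_ns_directly(ns_cache[name], name)  # Do DR: query NS directly
    schedule("compare_dr_vs_recursive", name)           # Asynchronous comparison
    return modified                          # Return modified response


# Usage with stub callables: the nameserver is already cached, so DR is used.
tasks = []
result = handle_request(
    "img.example.test",
    cache={},
    ns_cache={"img.example.test": "ns.cdn.test"},
    query_recursive=lambda n: "far-replica",
    is_cdn=lambda response: True,
    should_do_dr=lambda n: True,
    query_ns_directly=lambda ns, n: "near-replica",
    schedule=lambda kind, n: tasks.append(kind),
)
```

Work that is not needed to answer the current request (nameserver lookups, comparisons) is handed to `schedule` so that, in this reading, the client is not kept waiting on it.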

4.5.4.3 Deployment

4.6 Related Work

4.6.1 CDNs

4.6.2 DNS and CDNs

4.6.3 Solutions

4.7 Summary and Contributions

Chapter 5 CDN-ISP tension: CDN model and traffic diversity

5.1 Trend: diversification of content delivery approaches

Location                    Distance
on-network functionality    D0
dedicated links             D1
shared links                D2
origin server               D3

[Figure: content locations D0 to D3 ordered by distance from end users (near to far), across cities, PoPs, the ISP and external networks.]

5.1.1 CDNs using shared links

5.1.2 Directly-connected CDNs,

5.1.3 Peer-to-peer CDNs and ISP on-network functionality

5.1.4 Summary

5.2 Methodology

5.2.1 Datasets

5.2.1.1 ISP dataset of netflow records

5.2.1.2 End-user dataset of CDN performance

5.2.2 Metrics

5.2.2.1 End-users

5.2.2.2 CDNs

5.2.2.3 ISPs

5.2.2.4 Non-considerations

5.2.3 Emulating content locations

5.3 Tension: CDN traffic diversity vs. ISP traffic management

5.3.1 Users benefit from nearby content

[Figure: CDFs of cities by RTT (ms) and by standard deviation of RTT (ms); series: D0 in ISP network, D1 dedicated links, D2 shared links, D3 origin server.]

5.3.2 Nearby content challenges CDNs with high traffic variability

[Figure: CDFs of aggregation points by CDN traffic variation (%); series: D2 ISP link, D1 ISP PoP, D0 City; three panels.]

5.3.3 Nearby content is preferable for ISPs

[Figure: CDF of links by % difference in traffic variation.]

5.3.3.1 External link traffic

[Figure: % difference in traffic variation vs. CDN fraction of link traffic (%).]

[Figure: CDN fraction of link traffic (%) vs. normalized link traffic volume (%); classes: > 5%, +/- 5%, < -5%.]

[Figure: CDF of cities by % increase in traffic matrix predictability; series: constrain to 1 PoP.]

5.3.3.2 Internal network traffic

[Figure: CDF of links by traffic variation (%); series: D2 existing shared links, D1 new dedicated links.]

5.4 Solution: cooperate to increase predictability and handle traffic bursts

Constraining ingress.

In-network caching functionality.

5.4.1 Finding a compromise

[Figure: traffic variation (%) vs. normalized link traffic volume (%); three panels comparing D1 original with D1, 1 PoP; D1, 1 PoP with D0, city caching, 1 PoP and D0, city caching, 2 PoPs; and D1 original with D0, city caching, 2 PoPs.]

[Figure: CDF of cities by % increase in traffic matrix predictability; series: constrain to 1 PoP; constrain to 1 PoP, caching; constrain to 2 PoPs, caching.]

[Figure: CDF of cities by % increase in traffic matrix predictability; series: 1-2 PoPs, 3 PoPs, 4 PoPs, 5+ PoPs.]

5.4.2 Generalizing to other ISP architectures

5.5 Related Work

5.5.1 Network-integrated CDNs

5.5.2 CDN and ISP federations

5.5.3 Peer-assisted content distribution

5.6 Summary and Contributions

Chapter 6 ISP-User tension: cross-ISP P2P traffic

6.1 BitTorrent dataset

6.1.1 Sampling methodology

6.2 Towards a representative view of BitTorrent

[Figure: two pie charts of continent shares. One: EU 52%, AS 23%, NA 16%, SA 4%, OC 3%, AF 2%. The other: EU 52%, NA 20%, AS 19%, SA 4%, OC 3%, AF 2%.]

6.3 Trend: growth and shifts in BitTorrent use

[Figure: hourly peer traffic volume (MB), 2008 to 2012; two panels.]

6.3.1 Growth in per-user BitTorrent traffic

6.3.2 Increasingly diurnal usage patterns

[Figure: average hourly peers seen by day of week (three panels), and peak-to-trough ratio of hourly peers by connected peer continent (All, EU, NA, AS, SA, OC, AF), 2008 to 2012.]

[Figure: CDF of session duration (30 min to 2 days); series: 2008, 2010, 2012.]

6.3.3 Greater diversity in session time distribution

[Figure: session duration (hours) by vantage point location (All, EU, NA, AS, SA, OC, AF); quartiles Q1 to Q3, 2008 to 2012.]

6.3.4 Growth in system-wide traffic volume

[Figure: relationships among peer download rate (A), unique peers per hour (B), concurrent flows (C), per-flow download rate (D), total flows (E) and total download rate (F).]

6.3.5 Shift toward diurnal usage

[Figure: normalized traffic (%) by local hour of day; series: 2010, 2012; six panels.]

6.4 Tension: user access to content vs. ISP costs

[Figure: CDF of the proportion of mapped traffic; series: BGP + Traceroute, BGP.]

6.4.1 Mapping BitTorrent flows

6.4.2 Where traffic flows

6.4.2.1 Tier-based topology classification

6.4.2.2 Most traffic stays in lower-tier networks

[Figure: CDFs of the proportion of traffic reaching each tier (Tier 1 to Tier 4); multiple panels labeled by tiers T and U.]

6.4.2.3 Origin and destination tier affect the tier traffic reaches

6.4.3 Economic impact study on ISPs

[Figure: CDFs of the proportion of traffic by link type; series: Customer, Peer, Provider; three panels.]

6.4.3.1 Portion of charging traffic

[Figure: CDF of the Customer : Provider ratio; series: Tier 2, Tier 3, Tier 4.]

6.4.3.2 Traffic ratios

[Figure: CDF of (Customer - Provider) / Total; series: Tier 1, Tier 2, Tier 3, Tier 4.]

[Figure: CDF of Customer - Provider (GB); series: Tier 2, Tier 3, Tier 4.]

6.4.3.3 Summary of variable costs analysis

6.4.4 95th-percentile cost contributions of BitTorrent traffic

6.4.4.1 95th-percentile billing and Shapley Value

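Let $v_{95th}(\cdot)$ be the 95th-percentile value of a traffic time series. The marginal cost contributions of BitTorrent (BT) traffic are

$$m_1 = v_{95th}(\mathrm{BT}), \qquad m_2 = v_{95th}(\mathrm{Other} + \mathrm{BT}) - v_{95th}(\mathrm{Other})$$

and its Shapley value is the average marginal cost contribution:

$$SV_{BT} = (m_1 + m_2) / 2$$

The Shapley value is efficient, so BitTorrent's share of the cost is

$$p_{BT} = SV_{BT} / (SV_{BT} + SV_{Other})$$

With $f_{BT}$ the BitTorrent fraction of total traffic, the Relative Cost of BitTorrent is $p_{BT} / f_{BT}$. A value $> 1$ means BitTorrent is comparatively more expensive; a value $< 1$ means it is comparatively less expensive.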
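The two-class Shapley computation can be worked through in a short Python sketch. The nearest-rank percentile rule and the sample traffic series are assumptions chosen for illustration; billing systems may define the 95th percentile differently.

```python
import math


def percentile_95(samples):
    """Nearest-rank 95th percentile (one common convention; an assumption)."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def shapley_two_classes(bt, other):
    """Shapley values of BT and Other traffic when v is the 95th percentile."""
    total = [b + o for b, o in zip(bt, other)]
    v_bt, v_other, v_all = percentile_95(bt), percentile_95(other), percentile_95(total)
    # Average each class's marginal contribution over both arrival orders.
    sv_bt = (v_bt + (v_all - v_other)) / 2
    sv_other = (v_other + (v_all - v_bt)) / 2
    return sv_bt, sv_other


def relative_cost(bt, other):
    """Cost share of BT divided by its share of traffic volume."""
    sv_bt, sv_other = shapley_two_classes(bt, other)
    p_bt = sv_bt / (sv_bt + sv_other)
    f_bt = sum(bt) / (sum(bt) + sum(other))
    return p_bt / f_bt


# Hypothetical series: BT traffic appears only during the other traffic's peak.
bt_peak = [0] * 10 + [2] * 10
other = [2] * 10 + [8] * 10
rc_peak = relative_cost(bt_peak, other)

# Hypothetical flat series: BT is a constant tenth of a constant load.
rc_flat = relative_cost([1] * 20, [9] * 20)
```

In the first series BitTorrent carries one sixth of the volume but is attributed one fifth of the 95th-percentile cost, so its Relative Cost is above 1; in the flat series the two shares coincide and the Relative Cost is 1.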

6.4.4.2 Impact on 95th-percentile transit costs

6.4.4.3 Relative Cost metric and scaling

6.4.4.4 Relative impact of BitTorrent on 95th-percentile costs

[Figure: Relative Cost (0.8 to 1.6) vs. BT % of total traffic (1 to 30); four panels.]

6.4.5 Summary

6.5 Solution: leveraging available locality in swarms

6.5.1 BitTorrent peer discovery mechanisms

6.5.1.1 Random sampling in peer discovery

6.5.2 BitTorrent swarm membership dataset

[Figure: total swarm size vs. UTC time, and % peers in AS5089 (GB) vs. local time.]

6.5.3 Available locality in swarms

[Figure: peers per network vs. percent of networks (log scales); series: Peak Hour, Average, Minimum.]

6.5.3.1 Potential increase in local peer availability

6.5.3.2 Predicting future local peer availability

6.5.4 Maximizing tracker-based peer discovery

[Figure: discovery path in three steps: meta search engines to search engines (1), search engines to trackers (2), trackers to peers (3).]

6.5.4.1 Survey of torrent and tracker listings

[Figure: CDF of torrents by fraction of tracker domains seen per search engine; series: median, maximum. Inset: trackers a to g listed by Search Engine A and Search Engine B.]

6.5.4.2 Improving peer discovery by using more trackers

[Figure: CDF of torrents by % increase in available swarm population; series: minimum, median, maximum.]

[Figure: CDF of torrents by % increase in number of trackers; series: minimum, median, maximum.]

6.5.4.3 Pushing trackers to the limit

[Figure: CDF of trackers by number of peers returned per request (series: default, maximum), and CDF of trackers by tracker-specified interval (5m to 12h).]

I was able to obtain swarm samples every few seconds from all trackers 140x faster

6.5.5 Summary

6.6 Related Work

6.6.1 Economic implications of P2P traffic

6.6.2 Characterizing P2P traffic locality

6.6.3 Biased neighbor selection techniques

6.7 Summary and Contributions

Chapter 7 Contributions and Conclusions

7.1 Open research directions and ongoing projects
