CSCI-1680 CDN & P2P Chen Avin

CSCI-1680 CDN & P2P
Chen Avin
Based partly on lecture notes by Scott Shenker, John Jannotti, and Rodrigo Fonseca, and on Computer Networking: A Top-Down Approach, 6th edition

Last time: DNS & DHT
Today: P2P & CDN
  P2P benefits
  BitTorrent & Skype
  Caching & Content Distribution Networks

Content distribution networks
Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
Option 1: single, large "mega-server"
  single point of failure
  point of network congestion
  long path to distant clients
  multiple copies of video sent over outgoing link
Quite simply: this solution doesn't scale

Content distribution networks
Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
Option 2: store/serve multiple copies of videos at multiple geographically distributed sites (CDN)
  enter deep: push CDN servers deep into many access networks
    close to users
    used by Akamai, 1700 locations
  bring home: smaller number (10's) of larger clusters in POPs near (but not within) access networks
    used by Limelight

CDN: simple content access scenario
Bob (client) requests video http://netcinema.com/6y7b23v
Video stored in CDN at http://kingcdn.com/netc6y&b23v
1. Bob gets URL for video http://netcinema.com/6y7b23v from netcinema.com web page
2. Resolve http://netcinema.com/6y7b23v via Bob's local DNS
3. netcinema's authoritative DNS returns URL http://kingcdn.com/netc6y&b23v
4 & 5. Resolve http://kingcdn.com/netc6y&b23 via KingCDN's authoritative DNS, which returns the IP address of the KingCDN server with the video
6. Request video from KingCDN server, streamed via HTTP

CDN cluster selection strategy
Challenge: how does CDN DNS select a "good" CDN node to stream to client?
  pick CDN node geographically closest to client
  pick CDN node with shortest delay (or min # hops) to client (CDN nodes periodically ping access ISPs, reporting results to CDN DNS)
  IP anycast
Alternative: let client decide; give client a list of several CDN servers
  client pings servers, picks "best"
  Netflix approach

How Akamai works
Akamai has cache servers deployed close to clients
  Co-located with many ISPs
Challenge: make same domain name resolve to a proxy close to the client
Lots of DNS tricks. BestBuy is a customer
  Delegate name resolution to Akamai (via a CNAME)
From Brown:
  dig www.bestbuy.com
  ;; ANSWER SECTION:
  www.bestbuy.com. 3600 IN CNAME www.bestbuy.com.edgesuite.net.
  www.bestbuy.com.edgesuite.net. 21600 IN CNAME a1105.b.akamai.net.
  a1105.b.akamai.net. 20 IN A 198.7.236.235
  a1105.b.akamai.net. 20 IN A 198.7.236.240
  Ping time: 2.53ms
From Berkeley, CA:
  a1105.b.akamai.net. 20 IN A 198.189.255.200
  a1105.b.akamai.net. 20 IN A 198.189.255.207
  Ping time: 3.20ms

DNS Resolution
  dig www.bestbuy.com
  ;; ANSWER SECTION:
  www.bestbuy.com. 3600 IN CNAME www.bestbuy.com.edgesuite.net.
  www.bestbuy.com.edgesuite.net. 21600 IN CNAME a1105.b.akamai.net.
  a1105.b.akamai.net. 20 IN A 198.7.236.235
  a1105.b.akamai.net. 20 IN A 198.7.236.240
  ;; AUTHORITY SECTION:
  b.akamai.net. 1101 IN NS n1b.akamai.net.
  b.akamai.net. 1101 IN NS n0b.akamai.net.
  ;; ADDITIONAL SECTION:
  n0b.akamai.net. 1267 IN A 24.143.194.45
  n1b.akamai.net. 2196 IN A 198.7.236.236
n1b.akamai.net finds an edge server close to the client's local resolver
  Uses knowledge of network: BGP feeds, traceroutes. Their secret sauce
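The CNAME delegation in the dig output can be replayed with a toy lookup table. This is only a sketch: the table holds the records seen from Brown, and the function name `resolve` and the hop limit are assumptions made for illustration. A real resolver queries name servers and honors TTLs.

```python
# Toy record table copied from the dig output above (the view from Brown).
RECORDS = {
    "www.bestbuy.com.": [("CNAME", "www.bestbuy.com.edgesuite.net.")],
    "www.bestbuy.com.edgesuite.net.": [("CNAME", "a1105.b.akamai.net.")],
    "a1105.b.akamai.net.": [("A", "198.7.236.235"), ("A", "198.7.236.240")],
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until A records appear.

    Returns (chain of names visited, list of addresses). Hypothetical helper.
    """
    chain = [name]
    for _ in range(max_hops):
        rrset = records.get(name, [])
        addresses = [value for rtype, value in rrset if rtype == "A"]
        if addresses:
            return chain, addresses
        cnames = [value for rtype, value in rrset if rtype == "CNAME"]
        if not cnames:
            return chain, []  # name has neither A nor CNAME records
        name = cnames[0]
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


chain, addresses = resolve("www.bestbuy.com.")
print(" -> ".join(chain))
print(addresses)
```

The last hop is where Akamai's own name servers take over: the short TTL (20 s) on the A records lets them hand different edge servers to different resolvers.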

What about the content?
Say you are Akamai
  Clusters of machines close to clients
  Caching data from many customers
  Proxy fetches data from origin server first time it sees a URL
Choose cluster based on client network location
How to choose server within a cluster?
If you choose based on client
  Low hit rate: N servers in cluster means N cache misses per URL

Consistent Hashing [Karger et al., 99]
[Figure: hash ring with caches A, B, C and objects 1 to 4, with a table mapping each object to its cache]
URLs and caches are mapped to points on a circle using a hash function
A URL is assigned to the closest cache clockwise
Minimizes data movement on change!
  When a cache is added, only the items in the preceding segment are moved
  When a cache is removed, only the next cache is affected

Consistent Hashing [Karger et al., 99]
[Figure: hash ring with caches A, B, C and objects 1 to 4, with a table mapping each object to its cache]
Minimizes data movement
  If 100 caches, add/remove a proxy invalidates ~1% of objects
  When proxy overloaded, spill to successor
Can also handle servers with different capacities. How?
  Give bigger proxies more random points on the ring
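A minimal sketch of the ring described on these slides, written for illustration and not taken from Akamai. The class name `Ring`, the use of MD5 as the hash function, the default of 100 points per cache, and the example URLs are all assumptions.

```python
import bisect
import hashlib


def point(key: str) -> int:
    """Map a string to a position on the ring (32 bits of MD5, an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class Ring:
    def __init__(self):
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> cache name

    def add(self, cache: str, replicas: int = 100):
        # A bigger proxy can be given more points (replicas) to take a larger share.
        for i in range(replicas):
            p = point(f"{cache}#{i}")
            if p not in self._owner:  # skip the rare position collision
                bisect.insort(self._points, p)
                self._owner[p] = cache

    def remove(self, cache: str):
        self._points = [p for p in self._points if self._owner[p] != cache]
        self._owner = {p: c for p, c in self._owner.items() if c != cache}

    def lookup(self, url: str) -> str:
        # First cache point at or after the URL's point, wrapping around the circle.
        i = bisect.bisect_left(self._points, point(url)) % len(self._points)
        return self._owner[self._points[i]]


ring = Ring()
for name in ("A", "B", "C"):
    ring.add(name)

urls = [f"http://example.com/obj{i}" for i in range(10000)]
before = {u: ring.lookup(u) for u in urls}
ring.add("D")
after = {u: ring.lookup(u) for u in urls}
moved = [u for u in urls if before[u] != after[u]]
print(f"fraction of URLs that moved: {len(moved) / len(urls):.2f}")
```

Every URL that changes owner moves to the new cache D; no URL is shuffled between the existing caches, which is the "minimal data movement" property.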

Pure P2P architecture
  no always-on server
  arbitrary end systems directly communicate
  peers are intermittently connected and change IP addresses
Examples:
  file distribution (BitTorrent)
  streaming (KanKan)
  VoIP (Skype)

Peer-to-Peer Systems
How did it start?
  A killer application: file distribution
  Free music over the Internet! (not exactly legal...)
Key idea: share storage, content, and bandwidth of individual users
  Lots of them
Big challenge: coordinate all of these users
  In a scalable way (not NxN!)
  With changing population (aka churn)
  With no central administration
  With no trust
  With large heterogeneity (content, storage, bandwidth, ...)

3 Key Requirements
P2P systems do three things:
Help users determine what they want
  Some form of search
  P2P version of Google
Locate that content
  Which node(s) hold the content?
  P2P version of DNS (map name to location)
Download the content
  Should be efficient
  P2P form of Akamai

File distribution: client-server vs P2P
Question: how much time to distribute file (size $F$) from one server to $N$ peers?
  peer upload/download capacity is limited resource
$u_s$: server upload capacity
$u_i$: peer $i$ upload capacity
$d_i$: peer $i$ download capacity
[Figure: a server holding a file of size $F$ with upload capacity $u_s$, and peers 1 to $N$ with capacities $u_i$ and $d_i$, connected by a network with abundant bandwidth]

File distribution time: client-server
Server transmission: must sequentially send (upload) $N$ file copies
  time to send one copy: $F/u_s$
  time to send $N$ copies: $NF/u_s$
Client: each client must download file copy
  $d_{min}$ = min client download rate
  download time of the slowest client is at least $F/d_{min}$
Time to distribute $F$ to $N$ clients using client-server approach:
  $D_{c-s} \ge \max\{NF/u_s,\ F/d_{min}\}$
  the $NF/u_s$ term increases linearly in $N$

File distribution time: P2P
Server transmission: must upload at least one copy
  time to send one copy: $F/u_s$
Client: each client must download file copy
  download time of the slowest client is at least $F/d_{min}$
Clients: as aggregate must download $NF$ bits
  max upload rate (limiting max download rate) is $u_s + \sum_i u_i$
Time to distribute $F$ to $N$ clients using P2P approach:
  $D_{P2P} \ge \max\{F/u_s,\ F/d_{min},\ NF/(u_s + \sum_i u_i)\}$
  the numerator $NF$ increases linearly in $N$, but so does $\sum_i u_i$ in the denominator, as each peer brings service capacity

Client-server vs. P2P: example
client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
[Figure: minimum distribution time (0 to 3.5) versus N (0 to 35), one curve each for P2P and client-server]
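The two lower bounds can be evaluated for the example parameters with a short script. This is a sketch under stated assumptions: units are chosen so that u = 1 and F = 1 (hence F/u = 1 hour), d_min is set equal to u_s as the smallest value allowed by the example, and the function names are made up for illustration.

```python
def client_server_time(F, u_s, d_min, N):
    """Lower bound on client-server distribution time: max{NF/u_s, F/d_min}."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F, u_s, d_min, uploads):
    """Lower bound on P2P distribution time: max{F/u_s, F/d_min, NF/(u_s + sum u_i)}."""
    N = len(uploads)
    return max(F / u_s, F / d_min, N * F / (u_s + sum(uploads)))


# Example parameters: u = 1, F = 1 (so F/u = 1 hour), u_s = 10u, d_min = u_s (assumed).
u, F = 1.0, 1.0
u_s = 10 * u
d_min = u_s

for N in (5, 10, 20, 30):
    cs = client_server_time(F, u_s, d_min, N)
    p2p = p2p_time(F, u_s, d_min, [u] * N)
    print(f"N={N:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

With these numbers the client-server bound grows without limit as N grows, while the P2P bound N/(10 + N) stays below one hour.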

Napster (1999)
[Figure, several animation frames: one peer holds xyz.mp3; another peer asks "xyz.mp3?"]

Napster
Search & Location: central server
Download: contact a peer, transfer directly
Advantages:
  Simple, advanced search possible
Disadvantages:
  Single point of failure (technical and legal!)
  The latter is what got Napster killed

Gnutella: Flooding on Overlays (2000)
Search & Location: flooding (with TTL)
Download: direct
An "unstructured" overlay network
[Figure: one peer holds xyz.mp3; another peer asks "xyz.mp3?"]

Gnutella: Flooding on Overlays
[Figure, several animation frames: the query "xyz.mp3?" is flooded across the overlay toward the peer holding xyz.mp3]
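Flooding with a TTL can be sketched as a breadth-first walk over the overlay. This is an illustration only, not the Gnutella wire protocol: the overlay graph, the peer names, and the function `flood_search` are hypothetical, and the message count is simplified (it counts every forwarded copy, including copies sent back toward the peer the query came from).

```python
from collections import deque


def flood_search(overlay, holders, start, ttl):
    """Flood a query from `start` for at most `ttl` hops.

    Returns (set of reached peers that hold the file, number of query copies sent).
    """
    seen = {start}
    hits = set()
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL expired: this peer does not forward the query
        for neighbor in overlay[node]:
            messages += 1
            if neighbor in seen:
                continue  # duplicate copy of the query is dropped
            seen.add(neighbor)
            if neighbor in holders:
                hits.add(neighbor)
            frontier.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical five-peer overlay; only peer E holds xyz.mp3.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
holders = {"E"}

print(flood_search(overlay, holders, "A", ttl=2))  # E is three hops away: not found
print(flood_search(overlay, holders, "A", ttl=3))  # E is reached
```

The TTL bounds the message cost but also hides content that is farther away than the TTL, which is one reason flooding finds popular files more reliably than rare ones.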

KaZaA: Flooding w/ Super Peers (2001)
Well-connected nodes can be installed (KaZaA) or self-promoted (Gnutella)

Voice-over-IP: Skype
Proprietary application-layer protocol (inferred via reverse engineering)
  encrypted msgs
P2P components:
  clients: Skype peers connect directly to each other for VoIP call
  super nodes (SN): Skype peers with special functions
  overlay network: among SNs to locate SCs
  login server
[Figure: Skype login server, Skype clients (SC), supernodes (SN), and the supernode overlay network]

P2P voice-over-IP: Skype
Skype client operation:
1. joins Skype network by contacting SN (IP address cached) using TCP
2. logs in (username, password) to centralized Skype login server
3. obtains IP address for callee from SN, SN overlay, or client buddy list
4. initiates call directly to callee

Skype: peers as relays
Problem: both Alice, Bob are behind "NATs"
  NAT prevents outside peer from initiating connection to inside peer
  inside peer can initiate connection to outside
Relay solution: Alice, Bob maintain open connection to their SNs
  Alice signals her SN to connect to Bob
  Alice's SN connects to Bob's SN
  Bob's SN connects to Bob over open connection Bob initially initiated to his SN

Lessons and Limitations
Client-server performs well
  But not always feasible
Things that flood-based systems do well
  Organic scaling
  Decentralization of visibility and liability
  Finding popular stuff
  Fancy local queries
Things that flood-based systems do poorly
  Finding unpopular stuff
  Fancy distributed queries
  Vulnerabilities: data poisoning, tracking, etc.
  Guarantees about anything (answer quality, privacy, etc.)

P2P file distribution: BitTorrent
  file divided into 256 KB chunks
  peers in torrent send/receive file chunks
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
Alice arrives...
  ...obtains list of peers from tracker
  ...and begins exchanging file chunks with peers in torrent

P2P file distribution: BitTorrent
Peer joining torrent:
  has no chunks, but will accumulate them over time from other peers
  registers with tracker to get list of peers, connects to subset of peers ("neighbors")
While downloading, peer uploads chunks to other peers
Peer may change peers with whom it exchanges chunks
Churn: peers may come and go
Once peer has entire file, it may (selfishly) leave or (altruistically) remain in torrent

BitTorrent: requesting, sending file chunks
Requesting chunks:
  at any given time, different peers have different subsets of file chunks
  periodically, Alice asks each peer for list of chunks that they have
  Alice requests missing chunks from peers, rarest first
Sending chunks: tit-for-tat
  Alice sends chunks to those four peers currently sending her chunks at highest rate
    other peers are choked by Alice (do not receive chunks from her)
    re-evaluate top 4 every 10 secs
  every 30 secs: randomly select another peer, start sending it chunks
    "optimistically unchoke" this peer
    newly chosen peer may join top 4
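Rarest-first selection can be sketched as counting how many neighbors hold each chunk and requesting the least-replicated missing chunk first. The function name `rarest_first`, the peer names, and the chunk sets are hypothetical; a real client also handles in-flight requests, endgame mode, and more.

```python
from collections import Counter


def rarest_first(my_chunks, peer_chunks):
    """Order Alice's missing chunks from rarest to most common among her neighbors.

    Returns a list of (chunk index, sorted list of peers holding that chunk).
    Ties are broken by chunk index to keep the result deterministic.
    """
    counts = Counter(c for chunks in peer_chunks.values() for c in chunks)
    missing = sorted(
        (c for c in counts if c not in my_chunks),
        key=lambda c: (counts[c], c),
    )
    return [
        (c, sorted(p for p, chunks in peer_chunks.items() if c in chunks))
        for c in missing
    ]


# Hypothetical state: Alice holds chunk 0; her neighbors advertise these chunks.
alice = {0}
neighbors = {
    "bob": {0, 1, 2},
    "carol": {1, 2},
    "dave": {2, 3},
}

for chunk, owners in rarest_first(alice, neighbors):
    print(chunk, owners)
```

Requesting chunk 3 first (held only by dave) spreads scarce chunks quickly, so the torrent survives if the sole holder leaves.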

BitTorrent: tit-for-tat
(1) Alice "optimistically unchokes" Bob
(2) Alice becomes one of Bob's top-four providers; Bob reciprocates
(3) Bob becomes one of Alice's top-four providers
Higher upload rate: find better trading partners, get file faster!
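The three steps can be sketched from Alice's side as two small decisions: pick the top four senders, then optimistically unchoke one choked peer at random. The function names, peer names, and rates below are hypothetical, and the 10 s and 30 s timers from the slides are only noted in comments, not simulated.

```python
import random


def top_peers(rates_in, slots=4):
    """Peers currently sending to Alice at the highest rates (re-evaluated every 10 s)."""
    ranked = sorted(rates_in, key=lambda p: (-rates_in[p], p))
    return set(ranked[:slots])


def optimistic_unchoke(rates_in, unchoked, rng):
    """Pick one choked peer at random (done every 30 s); None if nobody is choked."""
    choked = sorted(set(rates_in) - unchoked)
    return rng.choice(choked) if choked else None


# Hypothetical rates (KB/s) at which each neighbor is sending chunks to Alice.
rates = {"bob": 0, "carol": 50, "dave": 40, "erin": 30, "frank": 20, "grace": 10}

unchoked = top_peers(rates)
lucky = optimistic_unchoke(rates, unchoked, random.Random())
print("top four:", sorted(unchoked))
print("optimistically unchoked:", lucky)

# Steps (2) and (3): suppose Bob was the lucky peer and reciprocates at a high rate.
rates["bob"] = 60
print("top four after Bob reciprocates:", sorted(top_peers(rates)))
```

Once Bob's rate exceeds that of Alice's slowest regular partner, he displaces that partner at the next re-evaluation, which is how a higher upload rate earns better trading partners.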