The Spotify Protocol on the Spotify Peer-Assisted Music-on-Demand Streaming System Mikael Goldmann KTH Royal nstitute of Technology Spotify gkreitz@spotify.com P2P 11, September 1 2011 on Spotify
Spotify Protocol overview P2P Overlay What is Spotify? On-demand peer-assisted music streaming Large catalog of music (over 15 million tracks) Available in US and 7 European countries, over 10 million users Over 1.6 million subscribers Fast (median playback latency of 265 ms) Legal on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Business dea More convenient than piracy Spotify Free (ads, 10h/month, invite needed in US) Spotify Unlimited (no ads, on computer) Spotify Premium (no ads, mobile, offline, AP) on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Business dea More convenient than piracy Spotify Free (ads, 10h/month, invite needed in US) Spotify Unlimited (no ads, on computer) Spotify Premium (no ads, mobile, offline, AP) on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Desktop Client on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Smartphone Client on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Hardware Clients on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Speed Photo by moogs http://www.flickr.com/photos/anjin/23460398/, CC BY NC ND 2.0 on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Overview of Spotify Protocol Proprietary protocol Designed for on-demand streaming Only Spotify can add tracks 96 320 kbps audio streams (most are Ogg Vorbis q5, 160 kbps) Relatively simple and straightforward design Photo by opethpainter http://www.flickr.com/photos/opethpainter/3452027651, CC BY 2.0 on Spotify
Spotify Protocol overview P2P Overlay Spotify Protocol (Almost) Everything encrypted (Almost) Everything over TCP Multiplex messages over a single TCP connection Persistent TCP connection to server while logged in on Spotify
Spotify Protocol overview P2P Overlay Caches Player caches tracks it has played Default policy is to use 10% of free space (capped at 10 GB) Caches are large (56% are over 5 GB) Over 50% of data comes from local cache Cached files are served in P2P overlay on Spotify
Spotify Protocol overview P2P Overlay Streaming a Track Request first piece from Spotify servers Meanwhile, search for peers with track Download data in-order When buffers are sufficient, only download from P2P Towards end of a track, start prefetching next one on Spotify
Spotify Protocol overview P2P Overlay P2P Structure Unstructured overlay Nodes have fixed maximum degree (60) Neighbor eviction by heuristic evaluation of utility No overlay routing A user only downloads data she needs on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Downloading in P2P Ask for most urgent pieces first f a peer is slow, re-request from new peers When buffers are low, download from central server as well When doing so, estimate what point P2P will catch up from f buffers are very low, stop uploading on Spotify
Spotify Protocol overview P2P Overlay Music vs. Movies Music Small (5 minutes, 5 MB) Many plays/session Large catalog Active users Main problem: peer discovery Movies Large (2 hours, 1.5 GB) High bit rate Main problem: download strategy on Spotify
Spotify Protocol overview P2P Overlay Music vs. Movies Music Small (5 minutes, 5 MB) Many plays/session Large catalog Active users Main problem: peer discovery Movies Large (2 hours, 1.5 GB) High bit rate Main problem: download strategy on Spotify
The Spotify Protocol Spotify Protocol overview P2P Overlay Finding Peers Partial tracker (BitTorrent style) Only remembers 20 peers per track Returns 10 (online) peers to client on query Broadcast query in small (2 hops) neighborhood in overlay (Gnutella style) LAN peer discovery (cherry on top) Client uses all mechanisms for every track on Spotify
Network Evaluation So, how well does it work? Data both from 2010 study (P2P 10) and this work on Spotify
Network Data Sources (from 2010) % 100 90 80 70 60 50 Weekday Morning Data source - ratio - by week Weekend Night RRDTOOL / TOB OETKER 40 30 20 10 0 Mon Tue Wed Thu Fri Sat Sun Mon Cur: Min: Avg: Server 10.86 6.76 9.62 P2P 33.90 23.78 33.86 Cache 55.24 48.47 56.53 on Spotify
Network Data Sources Somewhat sensitive to churn Better P2P performance on weekends 8.8% from servers 35.8% from P2P 55.4% from caches on Spotify
The Spotify Protocol Network Finding Peers (from 2010) Table: Sources of peers Sources for peers Tracker and overlay Only Tracker Only overlay No Peers Found Fraction of searches 75.1% 9.0% 7.0% 8.9% Each mechanism by itself is fairly effective on Spotify
Network The overlay peer discovery mechanism 100 80 Percentage of requests for which peers were found at distance d in overlay search (weekdays) ES F FR GB NL NO SE 60 % 40 20 0 d=1 d=2 d=1 and/or d=2 on Spotify
Network The weekend effect in peer discovery 100 80 Percentage of requests for which peers were found at distance d in overlay search ES,weekday ES,weekend FR,weekday FR,weekend 60 % 40 20 0 d=1 d=2 d=1 and/or d=2 on Spotify
Network Users become slowly disconnected from overlay 60 Median 50 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
Network Users become slowly disconnected from overlay 60 Median 60th percentile 50 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
Network Users become slowly disconnected from overlay 60 50 Median 60th percentile 70th percentile 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
Network Users become slowly disconnected from overlay 60 50 Median 60th percentile 70th percentile 80th percentile 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
Network Users become slowly disconnected from overlay 60 50 Median 60th percentile 70th percentile 80th percentile 90th percentile 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
Network Users become slowly disconnected from overlay 60 50 Median 60th percentile 70th percentile 80th percentile 90th percentile 95th percentile 40 Number of peers 30 20 10 0 0 500 1000 1500 2000 2500 3000 3500 Minutes since login on Spotify
The Spotify Protocol Network NAT types in the wild Photo by roychung1993 http://www.flickr.com/photos/roychung1993/2856744333/, CC BY NC 2.0 on Spotify
Network NAT types in the wild 100 Percentage of P addresses with various NAT properties No NAT 80 60 % 40 20 0 ES F FR GB NL NO SE on Spotify
Network NAT types in the wild 100 80 Percentage of P addresses with various NAT properties No NAT UPnP worked UPnP mismatch UPnP private P 60 % 40 20 0 ES F FR GB NL NO SE on Spotify
The Spotify Protocol Network How many Ps do users have Photo by feria http://www.flickr.com/photos/feria/141613794/, CC BY NC SA 2.0 on Spotify
Network How many Ps do users have 100 Distinct P addresses a user connects from over a week (CDF) GB 90 80 % 70 60 50 40 1 2 3 4 5 6 7 8 9 10 Number of Ps on Spotify
Network How many Ps do users have 100 90 Distinct P addresses a user connects from over a week (CDF) ES GB 80 % 70 60 50 40 1 2 3 4 5 6 7 8 9 10 Number of Ps on Spotify
Network How many Ps do users have 100 90 Distinct P addresses a user connects from over a week (CDF) ES GB NL 80 % 70 60 50 40 1 2 3 4 5 6 7 8 9 10 Number of Ps on Spotify
Network How many Ps do users have 100 90 80 Distinct P addresses a user connects from over a week (CDF) ES F FR GB NL NO SE % 70 60 50 40 1 2 3 4 5 6 7 8 9 10 Number of Ps on Spotify
The Spotify Protocol Network of a large, deployed system Future work Scaling to more users mprovements of P2P protocol More measurements (what are you interested in?) on Spotify