A Success and Failure Factor Study of Peer-to-Peer File Sharing Systems Christian Lüthold, Marc Weber
Outline Setting Up the Stage Overview of File Sharing Systems Factors & Categories Conclusion Discussion
File Sharing Systems
File Sharing Systems this presentation can be downloaded from: http://thepiratebay.ac/torrent/9331945 (P2PFileSharingSuccessFactors.torrent)
Setting Up the Stage file sharing is highly controversial easy, useful, fast, widespread, etc. copyright infringement leads to legal prosecution emergence of decentralized systems, mainly peer-to-peer (P2P) Investigation: What leads to success/failure of P2P file sharing platforms?
What is Success? durability survival against any harm various perspectives, many needs users developers content industry operators / donors
Our Approach analysis of various successful platforms extraction of essential factors definition of categories based on factors draw conclusions based on categories
Overview of File Sharing Systems Early Napster Gnutella KaZaA FreeNet BitTorrent Wuala
Early Napster - History Shawn Fanning, John Fanning, Sean Parker 1999-2001 focused on exchange of music 26.4 million users at peak lawsuit and injunction issued by RIAA shutdown in 2001 experienced a 2nd incarnation became online music store acquired by Rhapsody in 2011
Architecture central indexing server eased bootstrap problem eased user registration & authentication maintained music file index optimizes index for searches fast & complete fuzzy peers contact server server returns address of file holder requester and holder establish separate connection
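The central index described on this slide can be sketched as follows. This is a minimal illustration, not Napster's actual protocol: the class name `CentralIndex`, its methods, and the peer addresses are hypothetical, and plain substring matching stands in for the fuzzy search mentioned above.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index server: maps file titles to peer addresses."""

    def __init__(self):
        self._holders = defaultdict(set)  # lowercased title -> peer addresses

    def register(self, peer, titles):
        """A peer announces the files it shares when it connects."""
        for title in titles:
            self._holders[title.lower()].add(peer)

    def unregister(self, peer):
        """Remove a peer's entries when it disconnects."""
        for peers in self._holders.values():
            peers.discard(peer)

    def search(self, term):
        """Substring search over the whole index (a stand-in for fuzzy search)."""
        term = term.lower()
        return {
            title: sorted(peers)
            for title, peers in self._holders.items()
            if term in title and peers
        }


index = CentralIndex()
index.register("10.0.0.1:6699", ["Song A.mp3", "Song B.mp3"])
index.register("10.0.0.2:6699", ["Song B.mp3"])
hits = index.search("song b")  # addresses of holders; the download itself is peer-to-peer
```

The server only answers with addresses; the file transfer happens over a separate connection between requester and holder, which is why the index is both the convenient part and the single point of failure.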
Essential Properties first service of its type free easy-to-use client combines best parts of two worlds user & content management system (C/S) distribution & storage of content (P2P) central element single-point-of-failure vulnerable to both technical and legal attacks existence of copyright-infringing material can be proven
Gnutella - History Justin Frankel at Nullsoft (AOL) 2000 - today improvements over Napster no central entities files of any kind initially intended to be closed-source prohibited by AOL reverse-engineered by community many clients in different flavours since then LimeWire: 28 million users at peak
Architecture friend-of-friend concept number of queried nodes grows exponentially at least one node needs to be known optimization ultra peers maintain address books / indices routing leaves content holders concentrate on storing and providing
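The exponential growth under friend-of-friend querying can be made concrete with a small simulation. This is a sketch under stated assumptions: the overlay is a hypothetical complete binary tree, and `flood`, `binary_tree`, and the hop limit `ttl` are illustrative names rather than parts of the Gnutella specification.

```python
from collections import deque


def flood(graph, start, ttl):
    """Return the set of nodes a query reaches when flooded with a hop limit."""
    reached = {start}
    frontier = deque([(start, ttl)])
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # hop limit exhausted: do not forward any further
        for neighbour in graph[node]:
            if neighbour not in reached:
                reached.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return reached


def binary_tree(depth):
    """Hypothetical overlay: a complete binary tree with undirected links."""
    size = 2 ** (depth + 1) - 1
    graph = {n: [] for n in range(size)}
    for n in range(size):
        for child in (2 * n + 1, 2 * n + 2):
            if child < size:
                graph[n].append(child)
                graph[child].append(n)
    return graph


overlay = binary_tree(4)  # 31 nodes
# Number of other nodes contacted from the root for hop limits 1..4.
counts = [len(flood(overlay, 0, ttl)) - 1 for ttl in range(1, 5)]
```

In this toy overlay each extra hop roughly doubles the number of contacted nodes, which is the cost that ultra peers and query routing are meant to reduce.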
Architecture query leaf asks nearest ultra peer ultra peer routing (QRP, dynamic querying) requester and holder establish separate connection download-mesh list of hosts returned hosts get informed about other hosts allows forwarding downloads supports swarming
Essential Properties open-source protocol many software clients (free, proprietary) backwards compatibility (GDF specification) adapted to support any file format more scalable decentralized approach runs even if parts of the system are down flooding as major drawback industry seems to attack popular clients protocol is untouchable new clients emerge (e.g., FrostWire)
KaZaA - History based on the FastTrack protocol created by Estonian programmers from BlueMoon Interactive and sold to Niklas Zennström and Janus Friis [1] 2001 Kazaa Media Desktop (KMD) was bundled with advertisement and malware In 2001, legal action in the Netherlands forced an offshoring of the company to Australia Renaming to Sharman Networks 2009 new version of KaZaA Monthly fee of $19.98 August 2012 end of service
Architecture two-tier hierarchy Ordinary Peer (OP) Super Peer (SP, also labelled super node, SN) - promoted OP connections are highly dynamic SN-SN connections: 11 minutes OP-SN connections: 34 minutes
Architecture content management Each node informs SP about the shared files SP is responsible to maintain an index (not shared) search OP sends query to SP which may forward it to next SP download over content hash
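The content management and search steps on this slide can be sketched as below. All names (`SuperPeer`, `announce`, `query`, the `hops` parameter) are hypothetical, and SHA-1 is used only as a stand-in because the slides do not say which hash function identifies content.

```python
import hashlib


class SuperPeer:
    """Hypothetical super peer that keeps a private index for its ordinary peers."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # other super peers a query may be forwarded to
        self.index = {}       # content hash -> (file name, set of ordinary peers)

    def announce(self, peer, file_name, data):
        """An ordinary peer informs its super peer about a shared file."""
        digest = hashlib.sha1(data).hexdigest()
        entry = self.index.setdefault(digest, (file_name, set()))
        entry[1].add(peer)
        return digest

    def query(self, keyword, hops=1):
        """Answer from the local index and forward to neighbours while hops remain."""
        hits = {}
        for digest, (name, peers) in self.index.items():
            if keyword.lower() in name.lower():
                hits[digest] = (name, set(peers))
        if hops > 0:
            for neighbour in self.neighbours:
                for digest, (name, peers) in neighbour.query(keyword, hops - 1).items():
                    known = hits.setdefault(digest, (name, set()))
                    known[1].update(peers)  # same hash means same content
        return hits


sp_a, sp_b = SuperPeer("A"), SuperPeer("B")
sp_a.neighbours, sp_b.neighbours = [sp_b], [sp_a]
digest = sp_a.announce("op1", "talk.pdf", b"slides")
sp_b.announce("op2", "talk-copy.pdf", b"slides")
sp_b.announce("op3", "notes.txt", b"notes")
result = sp_a.query("talk")
```

Because results are keyed by content hash, two differently named copies of the same file collapse into one entry with several sources, which is what makes downloading "over content hash" useful.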
Essential Properties fully decentralized two-tier architecture of the overlay highly dynamic network (connection shuffling) commercial interests built into the system no flooding for search at the price of longer search durations firewall avoidance and NAT circumvention
FreeNet - History 1999 Ian Clarke described Freenet: A Distributed Anonymous Information Storage and Retrieval System [9] 2000 Ian Clarke's paper was the most cited computer science paper according to Citeseer first version of Freenet was released 2005 idea of a Dark Net emerged 2008 final version (v.0.7.5) including dark net was released
Architecture each node needs to provide disk space bandwidth computation power network is an unstructured graph each node has a random ID between 0 and 1 no search content fully encrypted data retrieval only by key
Retrieving Data
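Key-based retrieval over node IDs between 0 and 1 can be illustrated with a greedy routing sketch. This is an assumption-laden simplification: nodes are identified directly by their ID, the mapping `key_location` via SHA-256 is hypothetical, and the backtracking, caching, and encryption of a real Freenet node are left out.

```python
import hashlib


def key_location(key):
    """Map a key to a point in [0, 1) (hypothetical mapping via SHA-256)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:6], "big") / 2 ** 48


def distance(a, b):
    """Distance between two locations on a circle of circumference 1."""
    d = abs(a - b)
    return min(d, 1 - d)


def route(neighbours, start, target, max_hops=10):
    """Greedy routing: always forward to the neighbour closest to the target."""
    path = [start]
    current = start
    for _ in range(max_hops):
        best = min(neighbours[current], key=lambda n: distance(n, target))
        if distance(best, target) >= distance(current, target):
            break  # no neighbour is closer: the request stops here
        current = best
        path.append(current)
    return path


# Hypothetical five-node ring; each node knows its two ring neighbours.
neighbours = {
    0.1: [0.3, 0.9],
    0.3: [0.1, 0.5],
    0.5: [0.3, 0.7],
    0.7: [0.5, 0.9],
    0.9: [0.7, 0.1],
}
path = route(neighbours, 0.1, 0.52)
```

A request for a key would first compute `key_location(key)` and then call `route` with that value as the target; without knowing the key there is nothing to route towards, which matches the "no search" property.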
Essential Properties two modes to operate: Darknet (only connections to trusted peers) Opennet fully anonymous (provider & requester of data) content is split in chunks and encrypted no search at all content can only be added but not removed (unpopular content will be forgotten after some time by the network)
BitTorrent - History developed by Bram Cohen first release on 2 July 2001 2005 Azureus was released Trackerless torrents using DHT Mainline DHT was released by BitTorrent Inc. 2007 BitTorrent Inc. changed website to online store 2008 final version
Architecture three main components: .torrent file (tracker) clients network is centered around one single content (swarm) node roles seeder leecher
Architecture
Essential Properties separate networks are built for each content (file or directory structure) swarm enables fast download parallel download from several sources at once upload capacity of each node is used fair sharing through tit-for-tat no search included in the system search is provided externally by third parties never targeted by lawsuits business partners and business users
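The tit-for-tat idea can be sketched as a peer-selection rule: upload to the peers that currently upload the most to us, plus one randomly chosen peer so that newcomers get a chance. The function name `choose_unchoked`, the slot count of 3, and the example rates are assumptions for illustration, not values taken from the BitTorrent specification.

```python
import random


def choose_unchoked(download_rates, slots=3, rng=random):
    """Pick peers to upload to: the best uploaders to us, plus one random peer."""
    # Rank peers by how fast we are currently downloading from them.
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:slots]
    rest = ranked[slots:]
    # One extra peer is chosen at random among the remaining ones.
    optimistic = [rng.choice(rest)] if rest else []
    return regular, optimistic


rates = {"a": 50, "b": 10, "c": 80, "d": 0, "e": 30}  # hypothetical rates in kB/s
regular, optimistic = choose_unchoked(rates)
```

A peer that uploads nothing, such as `d` here, is only ever served through the random slot, which is how the rule discourages free-riding without excluding new peers entirely.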
Wuala - History research and development at ETH Zurich 2008 - today focus on P2P-based file system alternative to cloud storage storing, sharing, publishing security & privacy acquisition by LaCie in 2009 changes in underlying system architecture P2P replaced with C/S
Architecture DHT super nodes storage nodes client nodes message routing provide disk space publish/retrieve files redundancy solves availability problem implemented with erasure coding (not replication) dedicated server bootstrapping backup
Architecture putting client-side encryption encoding into multiple fragments distribution of fragments in the DHT routing client node contacts closest super node super node routing to storage node storage nodes contact client nodes directly sharing asymmetric RSA-2048 friendship key Cryptree
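To show why erasure coding gives redundancy without full replication, the sketch below uses the simplest possible scheme: `k` data fragments plus one XOR parity fragment, which tolerates the loss of any single fragment. This is a swapped-in toy code, not the erasure code Wuala used, and the function names `encode` and `recover` are hypothetical.

```python
def encode(data, k):
    """Split data into k equal fragments and add one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for fragment in fragments:
        parity = bytes(x ^ y for x, y in zip(parity, fragment))
    return fragments + [parity]


def recover(fragments):
    """Rebuild the single missing fragment (marked None) from the others."""
    missing = fragments.index(None)
    size = len(next(f for f in fragments if f is not None))
    rebuilt = bytes(size)
    for i, fragment in enumerate(fragments):
        if i != missing:
            rebuilt = bytes(x ^ y for x, y in zip(rebuilt, fragment))
    return fragments[:missing] + [rebuilt] + fragments[missing + 1:]


data = b"peer-to-peer"
pieces = encode(data, 4)           # 4 data fragments + 1 parity fragment
damaged = list(pieces)
damaged[2] = None                  # one storage node went offline
restored = recover(damaged)
original = b"".join(restored[:4])[:len(data)]
```

Here the storage overhead is one extra fragment (25 %), whereas surviving one loss through replication would need a full second copy; production systems use stronger codes that survive several lost fragments.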
Essential Properties file sharing is not the essential mission synchronization, versioning, backup friends, groups, public space central element as fallback focus on security not about anonymity, but privacy slow mechanisms client-side encryption/decryption erasure coding
Factors
Categories
Architectural Properties Architecture Type Central Elements Search (Speed & Completeness) Anonymity Replication Free-Riding Protection
Architectural Properties [table: rows Early Napster, Gnutella, KaZaA, FreeNet, BitTorrent, Wuala; columns Type, Central Elements, Search (Speed, Completeness), Anonymity, Replication, Free-Riding Protection]
Architectural Properties: Conclusion pure P2P is safer from legal issues search important for success exists in many forms (built-in, external, disabled) anonymity not crucial for success individuals seldom get pursued conflicts with search requirements free-riding users mostly do not care about it developers must care
Content Content Size (File Size) Content Diversity Content Quality
Content [table: rows Early Napster, Gnutella, KaZaA, FreeNet, BitTorrent, Wuala; columns Content Size, Content Diversity, Content Quality]
Content: Conclusion content size, diversity and quality all contribute to the overall user satisfaction make system more attractive important in the long-run competition quality measurement for community effort willing to share high-quality content thus implicit factor for success
Law Legal Issues
Law [table: rows Early Napster, Gnutella, KaZaA, FreeNet, BitTorrent, Wuala; column Legal Issues]
Law: Conclusion Provide a target and they will come for you Search mechanisms reveal content Size of the system matters Systems with central elements tend to live shorter than the ones without After a shutdown, the willingness to cooperate with content producers increases (e.g., Napster, KaZaA)
Monetary Aspects Cost (of client) Monetary Revenues Business Partners Business Users
Monetary Aspects [table: rows Early Napster, Gnutella, KaZaA, FreeNet, BitTorrent, Wuala; columns Costs (of client), Monetary Revenues, Business Partners, Business Users]
Monetary Aspects: Conclusion Clients today are free Depending on the system different payment models are available advertisements donation support licence Money made by advertisement KaZaA built-in third parties (BitTorrent sites)
Quantitative Indications Number of Competitors Number of Clients Estimated number of users (@peak) Estimated content (@peak) System lifetime
Quantitative Indications [table: rows Early Napster, Gnutella, KaZaA, FreeNet, BitTorrent, Wuala; columns # Competitors, # Clients, Estimated Users @peak, Estimated Content @peak, System Lifetime]
Quantitative Indications: Conclusion the more competitors the harder different clients on same protocol attract different user groups provide diversity more users more users = more stability & content positive spiral effect
Time Line success of a system also influenced by time / trends some factors are time-dependent technology competition laws / justice events of the past influence systems of the future
Time Line [figure: years 1999, 2000, 2001; labels DHT, SPOF, RIAA, advertisement, bottlenecks]
Time Line [figure: years 2007, 2008... 2012, 2013?; label Cloud Storage]
Desirable Properties user perspective: free of charge, no advertisement; rich & diverse content; privacy, security, anonymity; search & anonymity coexisting; fast & complete (fuzzy) search; no restrictions (file size / content) developer perspective: making money; adaptation to individual resources (heterogeneity, load balancing); user registration / authentication (control, monitoring, protection); using locality (ISPs); fully decentralized; stable, reliable
Desirable Properties: Conflicts user perspective: free of charge, no advertisement; rich & diverse content; privacy, security, anonymity; search & anonymity coexisting; fast & complete (fuzzy) search; no restrictions (file size / content) developer perspective: making money; adaptation to individual resources (heterogeneity, load balancing); user registration / authentication (control, monitoring, protection); using locality (ISPs); fully decentralized; stable, reliable
References [1] Jian Liang, Rakesh Kumar, Keith W. Ross: The KaZaA Overlay: A Measurement Study, Polytechnic University, Brooklyn, USA. [2] Carmen G. López: Measuring Bittorrent Ecosystems, Universidad Carlos III de Madrid, Spain [3] S.M. Lui, Karl R. Lang, S.H. Kwok: Participation Incentive Mechanisms in Peer-to-Peer Subscription Systems, Hong Kong University of Science and Technology, Hong Kong [4] M. Eric Johnson, Dan McGuire, Nicholas D. Willey: The Evolution of the Peer-to-Peer File Sharing Industry and the Security Risks for Users, Dartmouth College, Hanover
References [5] M. Eric Johnson, Dan McGuire, Nicholas D. Willey: Why File Sharing Networks are Dangerous?, Communications of the ACM, Feb. 2009 [6] Jintae Lee: An End-User Perspective on File Sharing Systems, Communications of the ACM, Feb. 2003 [7] [Predicting the Usage of P2P Sharing Software] [8] Wuala - A Distributed File System http://www.youtube.com/watch?v=3xkz4kgkqy8&noredirect=1 [9] Ian Clarke, Oskar Sandberg, Brandon Wiley, Theodore W. Hong: Freenet: A Distributed Anonymous Information Storage and Retrieval System
References This talk can be found under: http://thepiratebay.ac/torrent/9331945
Discussion Who of you is using P2P file sharing systems? Why did you choose system X over system Y?
Discussion Do you think built-in searches were intentionally developed for file sharing and vice versa?
Discussion In the past the major network traffic was caused by P2P - today it is movie streaming - what do we conclude from that?
Discussion Personally, what would you consider a desirable property? Is there still something missing in all/most systems?