The Challenges of Stopping Illegal Peer-to-Peer File Sharing
Kevin Bauer, Dirk Grunwald, Douglas Sicker
Department of Computer Science, University of Colorado

Context: The Rise of Peer-to-Peer
- 1993-2000: The early Internet carried mostly web traffic
- 2000: Peer-to-peer (P2P) protocols like Gnutella, FastTrack, Napster, and BitTorrent became popular for file sharing
- 2006-Present: P2P is now the most common class of traffic
Source: CacheLogic Research 2006

Content Dissemination Models
Traditional client/server model
- Users (clients) contact a centralized server to retrieve content (like a webpage)
Peer-to-peer model
- Users (peers) contact each other to retrieve content
- Advantages: decentralization, fault tolerance, content availability, fast data dissemination
- Common applications: streaming media, voice over IP (VoIP), and file sharing
Figure labels: Client/server model, P2P model

Current P2P Landscape
- P2P is still the most common protocol class today
- BitTorrent dominates P2P worldwide
Source: Ipoque Internet Study 2008/2009

Sharing a File with BitTorrent
To download the file:
1. Download the desired torrent file
2. Contact the tracker and obtain a list of other peers
   Peer: "Who is sharing this file?"
   Tracker server: "128.138.207.2, 182.203.21.4, ..."
3. Request pieces of the file from the other peers
   "I want piece #94" / "Here's piece #94"
   "I want piece #23" / "Here's piece #23"

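The three download steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `Tracker` and `Peer` classes, the `download` helper, the 4-byte piece size, and the downloader address 192.0.2.10 are all hypothetical, and no real network traffic or BitTorrent wire format is involved. The one detail taken from the real protocol is that each piece is checked against a SHA-1 hash listed in the torrent metadata.

```python
import hashlib

PIECE_SIZE = 4  # hypothetical, tiny pieces keep the example readable


def split_pieces(data, size=PIECE_SIZE):
    """Cut the shared file into fixed-size pieces."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class Tracker:
    """Hypothetical in-memory tracker: remembers which peers share a torrent."""

    def __init__(self):
        self.swarms = {}

    def announce(self, torrent_id, peer_addr):
        """Register the caller and return the other peers already known."""
        peers = self.swarms.setdefault(torrent_id, [])
        others = [p for p in peers if p != peer_addr]
        if peer_addr not in peers:
            peers.append(peer_addr)
        return others


class Peer:
    """Hypothetical peer holding some subset of the pieces."""

    def __init__(self, addr, pieces=None):
        self.addr = addr
        self.pieces = dict(pieces or {})

    def request_piece(self, index):
        """Answer 'I want piece #index' with the piece, or None if not held."""
        return self.pieces.get(index)


def download(torrent, tracker, me, network):
    hashes = torrent["piece_hashes"]                       # step 1: torrent metadata
    for addr in tracker.announce(torrent["id"], me.addr):  # step 2: ask the tracker
        for index, expected in enumerate(hashes):          # step 3: request pieces
            if index in me.pieces:
                continue
            piece = network[addr].request_piece(index)
            if piece is not None and hashlib.sha1(piece).hexdigest() == expected:
                me.pieces[index] = piece
    if len(me.pieces) != len(hashes):
        raise RuntimeError("the swarm did not hold every piece")
    return b"".join(me.pieces[i] for i in range(len(hashes)))


content = b"hello swarm!"
pieces = split_pieces(content)
torrent = {"id": "demo",
           "piece_hashes": [hashlib.sha1(p).hexdigest() for p in pieces]}
network = {
    "128.138.207.2": Peer("128.138.207.2", {0: pieces[0], 1: pieces[1]}),
    "182.203.21.4": Peer("182.203.21.4", {2: pieces[2]}),
}
tracker = Tracker()
for addr in network:
    tracker.announce("demo", addr)

me = Peer("192.0.2.10")
network[me.addr] = me
result = download(torrent, tracker, me, network)
```

In this sketch no single peer holds the whole file, yet the downloader still reassembles it, which is the content-availability advantage of the P2P model.
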
What Type of Content is Shared?
Source: Ipoque Internet Study 2008/2009

Copyright Enforcement Monitoring
- Investigations conducted by companies like MediaSentry and BayTSP have tried to identify illegal file sharers
- Investigative techniques:
  - Query the tracker server to obtain peer IP addresses
  - Ping each IP address to ensure that it's alive
  - Forward a DMCA takedown letter to each IP's ISP (or pursue legal action)

Identifying Users is Challenging
- Querying tracker lists can give many false positives
- Trackers use a simple HTTP-based mechanism to register new peers, so it is possible to register any arbitrary IP address
- Peer registration example: a tracker URL that registers an arbitrary IP A.B.C.D (Source: Piatek et al., HotSec '08)
- Improvement: actively participating in the file sharing may be a more accurate way to identify users

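To make the false-positive problem concrete, here is a minimal sketch of what such a registration URL can look like. It relies on the optional `ip` query parameter that the BitTorrent tracker protocol defines for announce requests. The tracker host `tracker.example`, the helper name `build_announce_url`, the dummy hash and peer ID, and the address 192.0.2.55 (standing in for A.B.C.D) are assumptions for illustration; the code only builds a string and sends nothing.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_announce_url(tracker_url, info_hash, peer_id, port, claimed_ip=None):
    """Build a tracker announce URL (hypothetical helper, nothing is sent).

    claimed_ip fills the optional 'ip' parameter. A tracker that honors it
    records that address, whether or not the host ever shares the file.
    """
    params = {
        "info_hash": info_hash,  # 20-byte identifier of the torrent
        "peer_id": peer_id,      # 20-byte identifier chosen by the client
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": 0,
        "event": "started",
    }
    if claimed_ip is not None:
        params["ip"] = claimed_ip
    return f"{tracker_url}?{urlencode(params)}"


url = build_announce_url(
    "http://tracker.example/announce",  # hypothetical tracker
    info_hash=bytes(range(20)),         # dummy value
    peer_id=b"-XX0000-000000000000",    # dummy value
    port=6881,
    claimed_ip="192.0.2.55",            # stand-in for A.B.C.D
)
registered = parse_qs(urlsplit(url).query)["ip"]
```

Because the address in the peer list comes from a request parameter rather than from an observed connection, a tracker listing alone is weak evidence that a host shared anything.
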
Identifying BitTorrent Traffic
- An ISP can identify the BitTorrent application-layer header within packets in the network
- This is easy because BitTorrent typically operates in plaintext (no encryption)
- It is possible to throttle or block BitTorrent traffic (e.g., with Sandvine)
- BitTorrent is easy to identify:
  "I want piece #94" / "Here's piece #94"
  "I want piece #23" / "Here's piece #23"
- Countermeasure: encrypt all peer communication; this is possible with message stream encryption (Diffie-Hellman key exchange + RC4 encryption)

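A rough sketch of both sides of this slide follows. The plaintext peer handshake begins with the byte 19 followed by the string "BitTorrent protocol", so a simple prefix match is enough to flag it. The function names, the dummy hash and peer ID, and the fixed `shared_key` are assumptions; in particular, the key is only a stand-in for a Diffie-Hellman shared secret, and the real message stream encryption handshake (key derivation, padding, keystream discard) is not modeled here.

```python
HANDSHAKE_PREFIX = b"\x13BitTorrent protocol"  # length byte 19 + protocol name


def looks_like_bittorrent(payload):
    """Naive signature match on the first bytes of a connection."""
    return payload.startswith(HANDSHAKE_PREFIX)


def rc4(key, data):
    """Textbook RC4. Encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:  # keystream generation and XOR
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# 68-byte handshake: prefix, 8 reserved bytes, info hash, peer ID (dummy values).
handshake = HANDSHAKE_PREFIX + bytes(8) + b"H" * 20 + b"P" * 20

# Assumption: stands in for the secret both peers derive via Diffie-Hellman.
shared_key = b"stand-in for a Diffie-Hellman shared secret"
encrypted = rc4(shared_key, handshake)

plaintext_flagged = looks_like_bittorrent(handshake)
encrypted_flagged = looks_like_bittorrent(encrypted)
recovered = rc4(shared_key, encrypted)
```

The signature matches the plaintext handshake but not the encrypted one, while the receiving peer recovers the original bytes with the same key. Packet sizes and timing are unchanged by this step.
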
Is Encrypted BitTorrent Common?
- Supported by popular BitTorrent clients like Vuze and µTorrent
Source: Ipoque Internet Study 2008/2009
- Protocol encryption cannot hide the file sharers' identities; this requires anonymity
- Protocol encryption may frustrate ISPs' bandwidth throttling techniques
- However, traffic analysis based on packet sizes and timing may be used to identify traffic despite obfuscation

Tor: Anonymity for TCP
- Tor has become the most popular privacy-enhancing system for enabling anonymous Internet communications
- Used widely to circumvent censorship, enable free speech, and promote democratic ideals worldwide
- Based upon a decentralized architecture
- Users forward their traffic through a set of Tor routers using a layered encryption scheme
- Each Tor router removes a layer of encryption
- At the final Tor router, the message is fully decrypted and can be delivered to its destination

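The layered scheme described above can be illustrated with a toy example. This is not Tor's actual cryptography: the `keystream_xor` cipher (SHA-256 in a counter construction) is an insecure stand-in chosen so the sketch needs only the standard library, and the three hard-coded keys are hypothetical placeholders for the keys a client would negotiate with each router.

```python
import hashlib


def keystream_xor(key, data):
    """Toy stream cipher (illustration only, not secure).

    XORs data with a keystream derived from SHA-256(key || counter).
    Applying it twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message, route_keys):
    """Client side: add one layer per router, exit layer first."""
    cell = message
    for key in reversed(route_keys):
        cell = keystream_xor(key, cell)
    return cell


def peel(cell, key):
    """Router side: remove exactly one layer with this router's key."""
    return keystream_xor(key, cell)


# Hypothetical per-router keys, in route order: entry, middle, exit.
entry_key, middle_key, exit_key = b"entry-key", b"middle-key", b"exit-key"
message = b"GET / HTTP/1.1"

cell = wrap(message, [entry_key, middle_key, exit_key])
after_entry = peel(cell, entry_key)
after_middle = peel(after_entry, middle_key)
after_exit = peel(after_middle, exit_key)
```

Only `after_exit` equals the original message. The entry and middle routers each see data that is still encrypted under the keys of the routers further along the route.
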
Tor's System Architecture
Figure labels: Client (Tor Proxy), Entry Guard, Middle Router, Exit Router, Destination Server, Virtual Circuit, Directory Server, Router list
- Tor provides anonymity for TCP by tunneling traffic through a circuit of three Tor routers using a layered encryption technique
- This could protect illegal file sharers from legal action and DMCA takedown notices

Can BitTorrent Users Hide with Tor?
- We characterized how Tor is used in practice and observed significant BitTorrent traffic over a four-day observation period
- BitTorrent accounted for only 3.33% of connections, but that is over 400,000 connections!

Can BitTorrent Users Hide with Tor?
- BitTorrent accounts for nearly half of all Tor traffic!
- BitTorrent is using a disproportionate amount of bandwidth

Conclusion
- There's an arms race between P2P file sharers and ISPs/copyright holders
- The progression of tactics may reach a state where file sharers become completely anonymous and untraceable
- Anti-piracy strategies should focus on economic incentives (e.g., tiered bandwidth pricing, lower-priced content) to win this race
Thank you
Contact: kevin.bauer@colorado.edu