Privacy-Preserving P2P Data Sharing with OneSwarm
  Presented by Adnan Malik
Privacy
  The protection of information from unauthorized disclosure
  Centralization and the privacy threat: websites, Facebook, Twitter
Peer to Peer (P2P)
  An alternative for file sharing without sharing through a website
  Privacy vs. performance
    BitTorrent: good performance, but users can be monitored
    Tor: good privacy, but poor performance
OneSwarm
  Both performance and privacy
  Privacy
    Default policy on public sharing by the user
    Data shared using disposable, temporary addresses and routed indirectly
  Performance
    Content lookup using multiple overlay paths
    Good performance even for rare objects
  Flexible privacy
    Sharing can be restricted to trusted contacts
  Used by thousands of people worldwide
Data Sharing without OneSwarm
  Downloads: trustworthy sources, e.g. downloading a Linux security patch using BitTorrent
  User trust: friends vs. anonymous peers
  May be divided into three models
    Freenet: for anonymous publication
    Tor: for anonymous downloads
    ?: controlled sharing with friends
Bob and Alice again
Data sharing
  Public distribution
    e.g. sharing a recorded lecture course
  With permissions
    Permissions set per file
    Restricting which users can access a file
  Without attribution
    For sharing sensitive data
    Privacy-preserving keyword search
    Unknown source and destination
Protocol Design
  Topology
    Users define the links by exchanging public keys
    This identifies each user and creates direct encrypted P2P connections
    OneSwarm uses the social graph and community servers for key distribution
    A distributed hash table (DHT) serves as the name resolution service
    Each client maintains encrypted entries advertising its IP address and port to authorized users
  Peers
    Trusted peers are used for sharing among friends and family
    Untrusted peers are used for sharing without attribution, and by users with few trusted friends
  Transport (to enhance privacy)
    Instead of sharing data publicly, each OneSwarm client restricts direct communication to a small number of persistent contacts
    Instead of centralized information about which peers have which object, OneSwarm locates data sources using object lookup through the overlay
    Instead of sources sending data directly to receivers, the reverse search path in the mesh is used
    Congestion-aware, automatic routing protocol
    Multiple paths to each data source for performance
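The transport ideas on this slide (object lookup through the overlay, data returned along the reverse search path) can be pictured with a toy simulation. This is only a minimal sketch: the five-node overlay, the `STORED` placement and the names `OVERLAY` and `flood_search` are hypothetical, and a plain breadth-first flood stands in for the real protocol, which also handles congestion, delays and multiple paths.

```python
from collections import deque

# Hypothetical overlay: each node talks only to its persistent contacts.
OVERLAY = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}
# Hypothetical placement of shared objects.
STORED = {"erin": {"lecture.mp4"}}


def flood_search(origin, wanted):
    """Flood a search through the overlay.

    Returns the hop-by-hop path from the data source back to the origin,
    or None if no reachable node holds the object.
    """
    came_from = {origin: None}  # each node remembers only its previous hop
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if wanted in STORED.get(node, set()):
            # Walk the remembered previous hops: the reverse search path.
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path
        for peer in OVERLAY[node]:
            if peer not in came_from:  # forward each search only once
                came_from[peer] = node
                queue.append(peer)
    return None


reverse_path = flood_search("alice", "lecture.mp4")
print(reverse_path)
```

In this sketch no node needs to know more than its neighbour on the path, which is the property the slide relies on: the source never sends data directly to the receiver.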
Protocol Design: Linking peers with trust relationships
  Public and private keys
    A 1024-bit RSA key pair is generated upon installation
    The public key serves as identity among friends
  Manual key sharing between two users
  Automatic key sharing
    Discovering and exchanging keys over the local area network
    Existing social networks, e.g. Google Talk
    Email invitations to friends
Protocol Design: Managing groups and untrusted peers
  Groups of colleagues
    Private community server
    Registered users
  Public
    Public community servers
  Community server registration
    Helps to avoid Sybil identities
    Each user must have a node identity
    The location of other nodes is not visible
Protocol Design: Identity and connectivity
  Distributed hash table (DHT): IP and port
  Entries for a client are signed by the client and encrypted with the public key
  Each entry is indexed by a 20-byte randomly generated shared secret
  IPs and ports are hashed
  DHT location is hidden
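The indexing step on this slide can be sketched as follows. This is an illustration under stated assumptions, not OneSwarm's implementation: a Python dict stands in for the DHT, the index is assumed to be the SHA-1 digest of the shared secret, and signing and encryption are treated as already done, so the stored value is an opaque blob. All function names are hypothetical.

```python
import hashlib
import secrets

dht = {}  # stand-in for the real distributed hash table


def new_shared_secret():
    """Return 20 random bytes agreed between a client and one authorized peer."""
    return secrets.token_bytes(20)


def dht_key(shared_secret):
    # Assumption: the index is the SHA-1 digest of the shared secret, so the
    # lookup key by itself reveals neither the secret nor who published it.
    return hashlib.sha1(shared_secret).hexdigest()


def publish(shared_secret, sealed_entry):
    """Store an entry that was signed and encrypted elsewhere."""
    dht[dht_key(shared_secret)] = sealed_entry


def lookup(shared_secret):
    """Return the sealed entry for this secret, or None if there is none."""
    return dht.get(dht_key(shared_secret))


secret = new_shared_secret()
publish(secret, b"<signed and encrypted IP:port blob>")
found = lookup(secret)                  # the authorized peer finds the entry
missing = lookup(new_shared_secret())   # anyone without the secret does not
```

Only a peer holding the shared secret can compute the index, and even then it still needs the matching private key to read the entry.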
Protocol Design: Naming and locating data
  Secure Sockets (SSLv3) used for connections
  File list messages
    Exchanged on first connection
    Compressed XML attributes
    Contain name, size and other metadata for a particular peer
    A node with nothing to share sends an empty list
  Naming
    Shared files are named using the 160-bit SHA-1 hash of their name and content
    For public data, the user obtains hashes from email, websites and keyword search
    For private data, the user must obtain both the hash and the key used to decrypt the data
  Congestion-aware search
    Keyword search messages include randomly generated IDs
    A node forwards the search if it does not have the file
    Shortest path
    Alternate paths under high load
  Path setup
    Search reply message: list of content hashes, file metadata and path identifier
    Multiple paths are differentiated by path IDs
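The naming rule on this slide (a 160-bit SHA-1 hash over name and content) can be sketched with the standard library. The exact byte layout is an assumption made for illustration: the slide does not say how name and content are combined, so this sketch hashes the UTF-8 name, a zero separator byte, then the content. The function name `object_name` is hypothetical.

```python
import hashlib


def object_name(file_name, content):
    """Return a 160-bit SHA-1 name as 40 hexadecimal characters."""
    h = hashlib.sha1()
    # Assumed layout: name bytes, one zero byte as separator, content bytes.
    h.update(file_name.encode("utf-8"))
    h.update(b"\x00")
    h.update(content)
    return h.hexdigest()


a = object_name("notes.txt", b"hello")
b = object_name("notes.txt", b"hello")
c = object_name("notes.txt", b"hello!")
print(a == b, a == c)
```

Identical name and content always give the same name, so peers can refer to an object by its hash alone, while any change to the content gives a different name.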
Protocol Design: Swarming data transfer
  Keep-alive messages refresh a path
  A path expires after 30 seconds of inactivity
  Path becomes congested?
  Incentives
    Transfer statistics: uploaded, downloaded, maximum transfer rates, control traffic volume, uptime
    Tit-for-tat policy
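The keep-alive rule above can be sketched as a small table of paths with a last-seen timestamp. Only the 30-second timeout comes from the slide; the class `PathTable`, its method names and the choice to pass the clock in as `now` (which keeps the example deterministic) are assumptions for illustration.

```python
PATH_TIMEOUT_SECONDS = 30  # from the slide: a path expires after 30 s of inactivity


class PathTable:
    """Tracks when each path was last refreshed (hypothetical helper)."""

    def __init__(self):
        self._last_seen = {}  # path_id -> time of last data or keep-alive

    def keep_alive(self, path_id, now):
        """Record activity on a path, creating it if needed."""
        self._last_seen[path_id] = now

    def expire(self, now):
        """Drop paths idle for at least the timeout; return their IDs."""
        dead = sorted(
            path_id
            for path_id, seen in self._last_seen.items()
            if now - seen >= PATH_TIMEOUT_SECONDS
        )
        for path_id in dead:
            del self._last_seen[path_id]
        return dead

    def active(self):
        return sorted(self._last_seen)


table = PathTable()
table.keep_alive("p1", now=0)
table.keep_alive("p2", now=0)
table.keep_alive("p1", now=20)   # p1 is refreshed, p2 stays silent
expired = table.expire(now=35)   # p2 has been idle 35 s, p1 only 15 s
```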
Security Analysis
  Threat model
    An attacker can join with a limited number of nodes
    The attacker can observe the traffic flowing to and from those nodes
    No guarantee against an attacker who can sniff, modify or inject data, or who can seize the hardware, e.g. law enforcement
  Attacks and defenses
    Peers are not assigned dynamically, which limits an attacker's ability to snoop from an arbitrary location
    User-defined trusted and untrusted links keep the information private
    The end-to-end path between users changes rapidly, which helps prevent attacks that rely on historical data
Timing Attacks
  An attacker may measure the round-trip time of search/response to detect the data source
  An attacker may create many virtual nodes and participate in the system to find the directly connected nodes
  Solution
    OneSwarm artificially inflates delays for queries received from untrusted peers
    As a result, the attacker ends up appearing two to three hops away from the source/receiver
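The delay-inflation defense can be sketched as a function that decides how long to hold a query before answering or forwarding it. Everything specific here is an assumption for illustration: the 150 to 300 ms bounds, the zero delay for trusted peers, and the choice to derive the delay deterministically from a per-node secret (so that repeating the same probe does not let an attacker average the delay away). The function name `forwarding_delay_ms` is hypothetical.

```python
import hashlib
import hmac

# Hypothetical bounds for the artificial delay, in milliseconds.
MIN_DELAY_MS = 150
MAX_DELAY_MS = 300


def forwarding_delay_ms(node_secret, search_id, peer_id, trusted):
    """Return the artificial delay applied to a query from one peer."""
    if trusted:
        return 0  # assumption: queries from trusted peers are not inflated
    # Keyed hash of (search, peer): stable for repeated probes, but
    # unpredictable to anyone who does not know this node's secret.
    message = f"{search_id}:{peer_id}".encode("utf-8")
    digest = hmac.new(node_secret, message, hashlib.sha256).digest()
    span = MAX_DELAY_MS - MIN_DELAY_MS + 1
    return MIN_DELAY_MS + int.from_bytes(digest[:4], "big") % span


secret = b"per-node secret (hypothetical)"
d1 = forwarding_delay_ms(secret, 7, "mallory", trusted=False)
d2 = forwarding_delay_ms(secret, 7, "mallory", trusted=False)
d3 = forwarding_delay_ms(secret, 7, "bob", trusted=True)
```

With a delay of this size added at the source, a direct neighbour that holds the data answers no faster than a node a few hops away would, which is what blurs the timing signal.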
PlanetLab RTT (Round-Trip Time) Experiments
  Length of the path (large and small)
  Congested nodes
Collusion attack
Evaluation
  Measure performance, structure and utilization in the real world
  Voluntarily user-reported data
    100,000 distinct users reported over a 10-month period
    Reported: total number of peers, method used for key exchange, aggregate data transfer volumes
  Client running on hundreds of PlanetLab machines
    Measuring the background traffic generated
    Data forwarding and control traffic
Evaluation: Overlay Structure
  Social relationships
  Random matching through public community servers
  Users importing large numbers of keys from websites that maintain lists of active users
Evaluation: Multi-Path transfer
Existing Systems
Overheads
Trace Replay in last.fm Social Graph
Related work (Privacy)
  Crowds: provides anonymous web browsing by randomly tunneling requests via other system participants
  Herbivore: anonymous file sharing through a scalable implementation of DC-nets
  Tor: uses onion routing techniques to anonymize requests via a set of relay nodes
  Tarzan: address rewriting techniques in a P2P context, without using a public key infrastructure
  OneSwarm differs in
    Data sharing model
    Peer trust relationships
    Large-scale deployment and user population
Related work (Trust)
  SybilGuard: uses properties of social networks to limit Sybil identities in social systems
  Friendstore: P2P backup system; data is stored on trusted friends' nodes
  Similarly: Turtle, UIA and Ostra
  OneSwarm
    Adds a variety of untrusted links
    Allows mixtures of peer sources for further privacy enhancement
Conclusion
  Reduces the cost of privacy to the user
  Techniques: efficient, robust and privacy-preserving lookup and data transfer
  Flexible user control over privacy
    Sharing permissions
    Trust at the level of individual data objects and peers
  Publicly available for Windows, Mac OS X and Linux
  Delivers privacy-preserving downloads
Questions?
Discussions
  Have you used OneSwarm?
  Permissions with file sharing (Bob and Alice example)
  How can it be improved?
  In the distributed hash table, IPs and ports are hashed. Is that safe enough?