Measurement Study of Wuala, a Distributed Social Storage Service



Similar documents
From Centralization to Distribution: A Comparison of File Sharing Protocols

TECHNICAL CHALLENGES OF VoIP BYPASS

Realizing a Vision Interesting Student Projects

Lab VI Capturing and monitoring the network traffic

A Critical Investigation of Botnet

Avaya ExpertNet Lite Assessment Tool

TCP Packet Tracing Part 1

EXPLORER. TFT Filter CONFIGURATION

About Firewall Protection

User Manual. Onsight Management Suite Version 5.1. Another Innovation by Librestream

3. Some of the technical measures presently under consideration are methods of traffic shaping, namely bandwidth capping and bandwidth shaping 2.

This Lecture. The Internet and Sockets. The Start If everyone just sends a small packet of data, they can all use the line at the same.

Network Forensics Network Traffic Analysis

Active Management Services

Sync Security and Privacy Brief

Note! The problem set consists of two parts: Part I: The problem specifications pages Part II: The answer pages

Data Deduplication in Tivoli Storage Manager. Andrzej Bugowski Spała

Frequently Asked Questions

1. The Web: HTTP; file transfer: FTP; remote login: Telnet; Network News: NNTP; SMTP.

Computer Networks - CS132/EECS148 - Spring

Network Layer IPv4. Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS. School of Computing, UNF

Lab Exercise SSL/TLS. Objective. Step 1: Open a Trace. Step 2: Inspect the Trace

Networks and Security Lab. Network Forensics

AIMMS The Network License Server

Minimal network traffic is the result of SiteAudit s design. The information below explains why network traffic is minimized.

IBM. Vulnerability scanning and best practices

Introduction to Network Security Lab 1 - Wireshark

A Reliable and Fast Data Transfer for Grid Systems Using a Dynamic Firewall Configuration

Hands-on Network Traffic Analysis Cyber Defense Boot Camp

OpenScape Business V2

NetFlow Tracker Overview. Mike McGrath x ccie CTO mike@crannog-software.com

Advanced Computer Networks Project 2: File Transfer Application

Skype characteristics

Mobile SCTP Transport Layer Mobility Management for the Internet

Common VoIP problems, How to detect, correct and avoid them. Penny Tone LLC 1

Craig Pelkie Bits & Bytes Programming, Inc. craig@web400.com

Monitoring PostgreSQL database with Verax NMS

Dollar Universe SNMP Monitoring User Guide

Skype network has three types of machines, all running the same software and treated equally:

FortKnox Personal Firewall

Tk20 Network Infrastructure

First Semester Examinations 2011/12 INTERNET PRINCIPLES

point to point and point to multi point calls over IP

SNMP Test er Manual 2015 Paessler AG

HoneyBOT User Guide A Windows based honeypot solution

Rapid Assessment Key v2 Technical Overview

Monitor and Manage Your MicroStrategy BI Environment Using Enterprise Manager and Health Center

Lecture 28: Internet Protocols

Using UDP Packets to Detect P2P File Sharing

Large-Scale TCP Packet Flow Analysis for Common Protocols Using Apache Hadoop

and reporting Slavko Gajin

Network sniffing packet capture and analysis

Computer Networking LAB 2 HTTP

SOUTHERN POLYTECHNIC STATE UNIVERSITY. Snort and Wireshark. IT-6873 Lab Manual Exercises. Lucas Varner and Trevor Lewis Fall 2013

SAN/iQ Remote Copy Networking Requirements OPEN iscsi SANs 1

How To Understand Network Performance Monitoring And Performance Monitoring Tools

ΕΠΛ 674: Εργαστήριο 5 Firewalls

Pre Sales Communications

Appendix A: Configuring Firewalls for a VPN Server Running Windows Server 2003

Ethernet. Ethernet. Network Devices

Features Overview Guide About new features in WhatsUp Gold v14

Reliable log data transfer

Parallels Plesk Panel. VPN Module for Parallels Plesk Panel 10 for Linux/Unix Administrator's Guide. Revision 1.0

Linux firewall. Need of firewall Single connection between network Allows restricted traffic between networks Denies un authorized users

Introduction: Why do we need computer networks?

pt360 FREE Tool Suite Networks are complicated. Network management doesn t have to be.

COMP416 Lab (1) Wireshark I. 23 September 2013

NNMi120 Network Node Manager i Software 9.x Essentials

ETM System SIP Trunk Support Technical Discussion

Semester Thesis Traffic Monitoring in Sensor Networks

CHAPTER 1 INTRODUCTION

DMZ Network Visibility with Wireshark June 15, 2010

Introduction to Directory Services

Snoopy. Objective: Equipment Needed. Background. Procedure. Due Date: Nov 1 Points: 25 Points

Controlling Risk, Conserving Bandwidth, and Monitoring Productivity with Websense Web Security and Websense Content Gateway

Life of a Packet CS 640,

Final for ECE374 05/06/13 Solution!!

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

Best Practices for Controlling Skype within the Enterprise. Whitepaper

PRI (T1/E1) Call Recorder User Manual Rev 1.0 (December 2013)

Agenda. Taxonomy of Botnet Threats. Background. Summary. Background. Taxonomy. Trend Micro Inc. Presented by Tushar Ranka

Enabling NetFlow on Virtual Switches ESX Server 3.5

Emerald. Network Collector Version 4.0. Emerald Management Suite IEA Software, Inc.

Note! The problem set consists of two parts: Part I: The problem specifications pages Part II: The answer pages

Bit Chat: A Peer-to-Peer Instant Messenger

2014 ASE BIGDATA/SOCIALCOM/CYBERSECURITY Conference, Stanford University, May 27-31, 2014 ASE 2014 ISBN:

Make a folder named Lab3. We will be using Unix redirection commands to create several output files in that folder.

Protocols. Packets. What's in an IP packet

CS335 Sample Questions for Exam #2

Telematics. 14th Tutorial - Proxies, Firewalls, P2P

Networks & Security Course. Web of Trust and Network Forensics

Internet Protocol: IP packet headers. vendredi 18 octobre 13

System Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks

During your session you will have access to the following lab configuration. CLIENT1 (Windows XP Workstation) /24

Firewall implementation and testing

IPDR vs. DPI: The Battle for Big Data

Transcription:

Measurement Study of Wuala, a Distributed Social Storage Service Thomas Mager - Master Thesis Advisors: Prof. Ernst Biersack Prof. Thorsten Strufe Prof. Pietro Michiardi Illustration: Maxim Malevich 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 1

Agenda 1. Introduction 2. Motivation 3. Basics 4. General Experimental Setup 5. Experiments 6. Conclusion 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 2

Introduction What is Wuala? On-line storage solution Leverages resources of peers to store files Development started at ETH Zürich 2004 Company founded 2007 Launched in 2008 Acquired by LaCie 2009 Pioneer in this area - first widely-used system! Dominik Grolimund, Luzius Meisser 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 3

Features 1 GiB storage for free, else trade local storage or pay Automatic backup Share files with friends All files encrypted locally Mount on-line storage into file system hierarchy File versioning 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 4

Wuala Application Wuala Application 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 5

General Architecture 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 6

Motivation How to make sure that files are always accessible? Crucial because of high churn rate Requirements: Availability Reliability Consistency How do peers / servers contribute to achieve this? 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 7

Redundancy Required Two main approaches: Replication Erasure Coding 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 8

Replication p = node availability r = redundancy factor 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 9

Replication p = node availability r = redundancy factor Availability target of 99.9% presuming a mean node availability of 50%: r = 10 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 10

Erasure Coding d points 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 11

Erasure Coding polynomial of degree d 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 12

Erasure Coding any d points sufficient in order to reconstruct the polynomial 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 13

Erasure Coding 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 14

Erasure Coding p = node availability r = n / m = redundancy factor 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 15

Erasure Coding p = node availability r = n / m = redundancy factor Availability target of 99.9% presuming a mean node availability of 50%: r = 2.49 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 16

How can Wuala be studied? Wuala is a closed source project written in Java Java bytecode obfuscated: No debug information available Name mangling was performed Decompiling not helpful However, observing network traffic should be meaningful! 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 17

General Experimental Setup Peers Servers Internet TCP / UDP Traffic Packet Analyzer (Wireshark respectively Tstat) Wuala Software Status Information Collector System Under Test (SUT) 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 18

Used Packet Analyzer Depending on Purpose Wireshark Tstat Level of detail complete packet dump only summarized data Output binary file text file Content of packets preserved dropped Postprocessing possible limited Amount of data large small 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 19

Wireshark Example Capture 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 20

Tstat Example Output 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 21

Experiment 1: Discover Wuala Servers How many servers are supporting the network? Servers should show a different behaviour Connection endpoints (IP addresses and ports) of particular interest Presumably servers located in special subnets 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 22

Experiment 1: Discover Wuala Servers Test procedure: High level of participation in the network required: 2 machines running as storage node for 2 months 1 machine running mass download of files (20 GiB in total) for 4 days (location of data collected by crawler) All machines connected to the Internet via 100 Mbit/s link Tstat used as packet analyzer Database for ownership lookups of IP addresses 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 23

Experiment 1: Results Number of IP Adresses Country Provider Subnetwork 293 (!) Germany Hetzner Online AG 188.40.0.0/16 178.63.0.0/16 78.46.0.0/15 2 Switzerland Atrila GmbH 188.92.144.0/21 Easy to differentiate between servers and peers: Servers: UDP port 50012 Peers: random UDP port in range 7100-7600 (until specified in the application options) 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 24

Experiment 2: Where Are Files Stored? Who is responsible to store uploaded data? Test procedure: Upload files with random content and different sizes Sharing storage deactivated Use Wireshark to acquire information about endpoints, amount and content of transferred data 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 25

Experiment 2: Results File Size Protocol Stored on Special Characteristic Tiny File 0 4 KiB TCP Servers only Embedded in metadata Small File 4 KiB 292 KiB TCP Servers only Transferred to two servers Medium File 292 KiB 1 MiB UDP Servers only Erasure coding performed Big File 1 MiB 14 GiB UDP Servers and peers Erasure coding performed 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 26

Experiment 2: Results Wuala does not store data of all files on peers! Only data of files bigger than 1 MiB is sent to other peers Small files seem not to be worth the effort distributing them to peers 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 27

Experiment 3: How Are Big Files Transferred? How is fragmentation done? Test procedure: Upload 100 MiB file with random content Sharing storage deactivated Wireshark used to generate dump file Tshark (delivered with Wireshark) to filter out signaling traffic Tstat to scan dump file Observe temporary folders 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 28

Experiment 3: General Result Upload performed in following steps: 1. Encrypt file in temporary folder 2. Generation of Transmission Blocks (usage of erasure coding) 3. Upload to servers 4. Upload seems to be finished (!) 5. Subsequent upload to peers in background 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 29

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 30

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 31

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 32

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 33

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 34

Generation of Transmission Blocks 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 35

Generation of Transmission Blocks Consequences: Interleaving allows streaming of files Download from many sources in parallel Switching to different sources possible Fast downloads 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 36

Experiment 3: Results Server Upload of 100 MiB file First only upload to servers: Upload to 120 distinct servers via UDP Most transfer sessions ~1 MiB traffic, some servers receive this amount multiple times 144 MiB in total redundancy on servers! Upload with full speed Concurrent transfers Upload completed after 50 seconds 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 37

Experiment 3: Results Peer Upload of 100 MiB file Subsequent upload to peers: Upload to 250 distinct peers via UDP Most transfer sessions ~1 MiB traffic 273 MiB in total Upload with limited speed Finished after 40 minutes Overall ~400 MiB uploaded 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 38

Experiment 4: Who is serving data during download? Download 100 MiB random file again right after upload after 1 week Sharing storage deactivated Wireshark used to generate dump file Tshark (delivered with Wireshark) to filter out signaling traffic Tstat to scan dump file 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 39

Experiment 4: Results Right after Upload 21 MiB received from servers 97 MiB received from peers 118 MiB in total Download completed after 72 seconds (~1,6 MiB/s) 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 40

Experiment 4: Results 1 Week Later 100 MiB received from servers 23 MiB received from peers 123 MiB in total Download completed after 43 seconds (~2,8 MiB/s) Faster! 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 41

Experiment 5: Maintenance Since peers may leave the network forever: Are files maintained by a peer over time? add redundancy upload to the network file maintenance reconstruct retrieve from the network 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 42

Experiment 5: Maintenance Test procedure: Trading storage deactivated Upload 100 MiB file File locally deleted Use Tstat to observe the peer over 2 months 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 43

Experiment 5: Maintenance Result: No download within 2 months! However: Subsequent Upload to Peers after Download As soon as data is not well distributed on peers Similar to initial peer upload After one week 132 MiB have been uploaded to peers again Even for big files Wuala is relying on servers only! 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 44

Experiment 6: How Are Sources for a Special File Located? Is there a DHT to locate sources of files? Characteristic of a P2P network Or do servers provide this information? Characteristic of a Client / Server Architecture 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 45

Experiment 6: How Are Sources for a Special File Located? Long term observation revealed no communication to peers except for file transfers no JOIN messages (required by common DHTs!) Use Wireshark to generate a dump file before initiating a download Put focus on first UDP packets received from servers 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 46

Experiment 6: Results Repetition of special byte sequence obvious in datagrams Presumably IP addresses of servers First attempt: XOR would result in constant cipher cheap usage 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 47

Experiment 6: Results Repetition of special byte sequence obvious in datagrams Presumably IP addresses of servers 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 48

Experiment 7: Earned On-line Storage vs. Occupied Local Storage Is there any imbalance, pointing out that users earn more on-line storage than they actually offer? Procedure: Tstat observing a machine operating as storage node Duration of 90 days for data acquisition Log all status information provided by the Wuala application Log amount of used local storage 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 49

Experiment 7: Results First only depending on average on-line time and maximum shared space: After 2 GiB data has been stored: Overall a long-lasting gap between earned storage and occupied storage 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 50

Incoming Traffic over 120 Days Incoming traffic increases over time - indicates rating of peers 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 51

Outgoing Traffic over 120 Days Correlation: More data stored locally more requests for stored data ~20 KiB/s after 4 months 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 52

Conclusion Wuala only peer assisted, far away from pure P2P No DHT used on peers Lots of servers responsible to guarantee access on files in long term Not all data is uploaded to peers Imbalance between offered local storage and on-line storage Fast download / streaming possible Service dependent on company Peers used to save traffic and load on servers 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 53

Thank you! Questions? 15.12.2010 Peer-to-Peer Networks Group Thomas Mager 54