Inside Dropbox: Understanding Personal Cloud Storage Services Corneliu Claudiu Prodescu School of Engineering and Sciences Jacobs University Bremen Campus Ring 1, 28759 Bremen, Germany Monday 22nd April, 2013
Presentation Outline Introduction Dropbox Architecture Measurements and Evaluations Conclusions
Cloud Storage Services Cloud services increasingly prevalent Microsoft, Amazon, Google Cloud storage in particular very popular Dropbox, UbuntuOne, Box.com Non-trivial amount of traffic
Measurement Setup Cloud Storage Services Traffic Measurement Use tstat to monitor traffic from 2 university campuses and 2 points of presence in a major ISP over 42 consecutive days; record client/server IP addresses and TCP metrics; classify traffic by TLS server name and DNS FQDN
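The classification step can be illustrated with a minimal sketch. It assumes each flow record already carries a host name (taken from the TLS server name or from the DNS FQDN) and matches it against a suffix table. The table and the function name `classify_flow` are hypothetical; only dropbox.com and box.com are named in these slides, and this is not how tstat itself is implemented.

```python
# Hypothetical suffix table for illustration; not the rule set used in the study.
SERVICE_SUFFIXES = {
    "dropbox.com": "Dropbox",
    "box.com": "Box.com",
}


def classify_flow(server_name):
    """Map a TLS server name or DNS FQDN to a storage service, or None."""
    if not server_name:
        return None
    name = server_name.lower().rstrip(".")
    for suffix, service in SERVICE_SUFFIXES.items():
        # Match the domain itself or any subdomain. Requiring the leading dot
        # keeps "dropbox.com" from being mistaken for a subdomain of "box.com".
        if name == suffix or name.endswith("." + suffix):
            return service
    return None
```

Matching on whole labels rather than on a plain substring matters here because several service names overlap textually.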
Initial Measurement Figure: Storage Services Usage source: ID:UPCSS Dropbox consistently leads
Dropbox Desktop Client Main characteristics Python native client for Windows / MacOS / Linux Data transfers via librsync
Dropbox Desktop Client Dropbox Servers - DNS Analysis Figure: Domain names used by Dropbox services source: ID:UPCSS Two planes are distinguished: control plane (meta-data, notification, authentication) data plane (storage)
Dropbox Desktop Client Dropbox Servers - Protocol Analysis - Issue Dropbox protocol is not publicly documented Most communication is encrypted under TLS.
Dropbox Desktop Client Dropbox Servers - Protocol Analysis - Solution Route traffic through a Squid proxy Use the Squid SSL Bump module to decrypt TLS At run time, replace the Dropbox Inc. certificate trusted by the client with the one provided to Squid SSL Bump (memory rewriting)
Dropbox Desktop Client Dropbox Servers - Protocol Insight Figure: Dropbox Protocol Example - File upload source: ID:UPCSS Notification protocol via HTTP long polling notifyx.dropbox.com Meta-data and storage protocols via TCP/SSL client-lb, client.dropbox.com
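HTTP long polling, as used for the notification channel, can be sketched generically: the client issues a request that the server holds open until something changes or a timeout expires, and the client then immediately asks again. The sketch below is not the Dropbox protocol; the server is simulated with an in-memory queue, and `long_poll`, `make_fake_server` and the event strings are hypothetical names chosen for illustration.

```python
import queue


def long_poll(wait_for_event, on_change, max_polls):
    """Run a long-polling loop for max_polls requests; return the number of changes seen."""
    changes = 0
    for _ in range(max_polls):
        event = wait_for_event()  # blocks until an event arrives or the poll times out
        if event is None:
            continue  # timeout with no news: simply issue the next poll
        on_change(event)
        changes += 1
    return changes


def make_fake_server(events, timeout_s=0.01):
    """Simulate a server that holds each request open for at most timeout_s seconds."""
    pending = queue.Queue()
    for event in events:
        pending.put(event)

    def wait_for_event():
        try:
            return pending.get(timeout=timeout_s)
        except queue.Empty:
            return None

    return wait_for_event
```

A real client would replace `wait_for_event` with a blocking HTTP request and would typically trigger a meta-data fetch inside `on_change`.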
Dropbox Desktop Client Dropbox Servers - Protocol Insight Data (new files or deltas) is split into 4 MB chunks A namespace is used for each shared folder, and each user has an initial namespace Each device provides a unique ID
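The chunking step can be sketched as follows. The sketch assumes that 4 MB means 4 * 1024 * 1024 bytes and labels each chunk with a SHA-256 digest purely for illustration; the slides state neither detail, and `split_into_chunks` is a hypothetical helper, not the client's code.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumption: 4 MB interpreted as 4 MiB


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for offset in range(0, len(data), chunk_size):
        piece = data[offset:offset + chunk_size]
        # The digest is an illustrative chunk identifier, not a documented format.
        chunks.append((hashlib.sha256(piece).hexdigest(), piece))
    return chunks
```

With these assumptions a 9 MiB file yields three chunks of 4 MiB, 4 MiB and 1 MiB, and two chunks with identical content receive the same identifier.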
Dropbox WebUI Use separate servers for user private and share-by-link files dl-web, dl.dropbox.com Significantly less traffic Not analyzed further
Overall Traffic Traffic Share Figure: Dropbox vs YouTube Traffic Share in Campus 2 source: ID:UPCSS Dropbox traffic amounts to as much as 1/3 of YouTube traffic
Overall Traffic Dropbox server breakdown Figure: Traffic Share of Dropbox Servers source: ID:UPCSS The Dropbox client application accounts for more than 80% of the traffic in all vantage points
Overall Traffic Flow Sizes Figure: Distribution of Flow Sizes source: ID:UPCSS Flow sizes are bound to the range 4 kB - 400 MB A significant share of flows is smaller than 10 kB
Throughput Analysis Throughput Figure: Upload Throughput source: ID:UPCSS 1-chunk flows are bound by TCP slow-start; multi-chunk flows are slowed by sequential per-chunk ACKs; bundled ACKs were introduced in the next version
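Why slow-start caps the throughput of short flows can be shown with a back-of-the-envelope model. The sketch assumes an idealized slow start: the congestion window starts at `init_cwnd` segments and doubles every round trip, with no loss, no receiver-window limit, and handshake time ignored. The MSS of 1460 bytes, the initial window of 3 segments and the function names are illustrative assumptions, not values from the measurements.

```python
import math


def slow_start_rounds(size_bytes, mss=1460, init_cwnd=3):
    """Round trips needed to send size_bytes under idealized slow start."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    segments = math.ceil(size_bytes / mss)
    rounds, cwnd, sent = 0, init_cwnd, 0
    while sent < segments:
        sent += cwnd   # one full window is sent per round trip
        cwnd *= 2      # the window doubles each round during slow start
        rounds += 1
    return rounds


def throughput_bound_bps(size_bytes, rtt_s, mss=1460, init_cwnd=3):
    """Upper bound on throughput in bit/s for a flow that never leaves slow start."""
    rounds = slow_start_rounds(size_bytes, mss, init_cwnd)
    return size_bytes * 8 / (rounds * rtt_s)
```

Under these assumptions and a 100 ms round-trip time, a 10 KiB transfer needs 2 round trips and cannot exceed roughly 0.4 Mbit/s, while a full 4 MiB chunk needs 10 round trips and is bounded at roughly 33.6 Mbit/s, regardless of the available link capacity.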
User Patterns Device Distribution Figure: Devices per user source: ID:UPCSS Most users use Dropbox with a single device
User Patterns Namespace Distribution Figure: Namespaces per user source: ID:UPCSS Most users only have their own namespace Campus users tend to share more
User Patterns Device Daily Activation Figure: Fraction of Active Devices per day source: ID:UPCSS Weekends clearly visible in campus data-sets
User Patterns Download/Upload breakdown 30% occasional users 26% download-only 7% upload-only 37% actively upload and download
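A grouping of this kind can be computed from per-user transfer volumes. The sketch below is one plausible way to do it, not the procedure used in the study: the 10,000-byte activity threshold, the function names and the input format (one `(uploaded, downloaded)` byte pair per user) are assumptions made for illustration.

```python
from collections import Counter


def classify_user(uploaded, downloaded, min_bytes=10_000):
    """Assign a user to one of the four groups; min_bytes is a hypothetical threshold."""
    uploads = uploaded >= min_bytes
    downloads = downloaded >= min_bytes
    if uploads and downloads:
        return "upload and download"
    if downloads:
        return "download-only"
    if uploads:
        return "upload-only"
    return "occasional"


def breakdown(users, min_bytes=10_000):
    """Return the percentage of users in each group for (uploaded, downloaded) pairs."""
    counts = Counter(classify_user(up, down, min_bytes) for up, down in users)
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}
```

The resulting percentages always sum to 100, which matches the four shares listed on the slide (30 + 26 + 7 + 37).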
User Patterns Download/Upload breakdown Figure: User Downloads/Uploads in Home 1 source: ID:UPCSS
Conclusions Dropbox is the most popular cloud storage service Architecture Analysis Usage Patterns Throughput analysis and recommendations
Thank you Questions?