Internet Storage Sync Problem Statement

Similar documents
Background. Personal cloud services are gaining popularity

QuickSync: Improving Synchronization Efficiency for Mobile Cloud Storage Services

A First Look at Mobile Cloud Storage Services: Architecture, Experimentation and Challenge

Inside Dropbox: Understanding Personal Cloud Storage Services

MOBILE APPLICATIONS AND CLOUD COMPUTING. Roberto Beraldi

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

AWS Security & Compliance

Cloud Sync White Paper. Based on DSM 6.0

Reducing Replication Bandwidth for Distributed Document Databases

Cloud Based Tes,ng & Capacity Planning (CloudPerf)

Berkeley Ninja Architecture

Inside Dropbox: Understanding Personal Cloud Storage Services

ViewBox: Integrating Local File System with Cloud Storage Service

Introduction to Cloud Storage GOOGLE DRIVE

Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step. Arbela Technologies

SOLUTIONS CLOUD - DPS JEUDI 19 NOVEMBRE 2015

Understanding and Detec.ng Real- World Performance Bugs

In the Cloud. Scoville Memorial Library February, 2013

Amazon S3 Cloud Backup Solution Contents

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

How To Manage A Mobile Device Management At Harvard

Accelerating Cloud Based Services

12 Key File Sync and Share Advantages of Transporter Over Box for Enterprise

Multi-level Metadata Management Scheme for Cloud Storage System

Gmail Or other POP3

Data Center Evolu.on and the Cloud. Paul A. Strassmann George Mason University November 5, 2008, 7:20 to 10:00 PM

PipeCloud : Using Causality to Overcome Speed-of-Light Delays in Cloud-Based Disaster Recovery. Razvan Ghitulete Vrije Universiteit

How To Synchronize With A Cwr Mobile Crm 2011 Data Management System

Cloudian The Storage Evolution to the Cloud.. Cloudian Inc. Pre Sales Engineering

Chapter 19 Cloud Computing for Multimedia Services

Transforming cloud infrastructure to support Big Data Ying Xu Aspera, Inc

Cloud Panel Service Evaluation Scenarios

Inside Dropbox: Understanding Personal Cloud Storage Services

Computer Networks. Examples of network applica3ons. Applica3on Layer

Two-Level Metadata Management for Data Deduplication System

ESPRESSO: An Encryption as a Service for Cloud Storage Systems

Cloud Computing. Chapter 4 Infrastructure as a Service (IaaS)

ICN based Scalable Video Conferencing on Virtual Edge Service Routers (VSER) Platform

HiDrive Intelligent online storage for private and business users.

Voice. Internet. Apps. Data Center. Wide Area Networks. Business is better in the cloud

2013 USER GROUP CONFERENCE

On-line Storage and Backup Services

SCOPE OF SERVICE Hosted Cloud Storage Service: Scope of Service

Amazon Cloud Storage Options

The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc.

Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More

SureDrop Secure collaboration. Without compromise.

SOP Common service PC File Server

Quanqing XU YuruBackup: A Highly Scalable and Space-Efficient Incremental Backup System in the Cloud

Overview. Timeline Cloud Features and Technology

NAS 259 Protecting Your Data with Remote Sync (Rsync)

Hardware Configuration Guide

Enterprise Private Cloud Storage

Building Backup-to-Disk and Disaster Recovery Solutions with the ReadyDATA 5200

GCM for Android Setup Guide

Storing & Synchronizing Data In The Cloud

Protec'ng Informa'on Assets - Week 8 - Business Continuity and Disaster Recovery Planning. MIS 5206 Protec/ng Informa/on Assets Greg Senko

Computer Backup Issues For Windows 8

PRIVACY-PRESERVING PUBLIC AUDITING FOR SECURE CLOUD STORAGE

Dropbox for Business. Secure file sharing, collaboration and cloud storage. G-Cloud Service Description

Cloud Computing for Education Workshop

Non-Cooperative Computation Offloading in Mobile Cloud Computing

Book of Abstracts CS3 Workshop

The most comprehensive review and comparison of cloud storage services

CommUlinks of Colorado

You will need your District Google Mail username (e.g. and password to complete the activation process.

Redefining Microsoft SQL Server Data Management. PAS Specification

Aspera Direct-to-Cloud Storage WHITE PAPER

Top Ten Questions. to Ask Your Primary Storage Provider About Their Data Efficiency. May Copyright 2014 Permabit Technology Corporation

EonStor DS remote replication feature guide

icloud for Developers

Maginatics Cloud Storage Platform for Elastic NAS Workloads

The Genealogy Cloud: Which Online Storage Program is Right For You Page , copyright High-Definition Genealogy. All rights reserved.

Competitive Analysis Retrospect And Our Competition

Transcription:

Internet Storage Sync Problem Statement draft-cui-iss-problem Zeqi Lai Tsinghua University 1

Outline Background Problem Statement Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Summary 2

The way we store our data 3

Internet Storage Sync Services New data entrance of the Internet Basic func?on: storing, sharing and synchronizing data Large user base: Dropbox has more than 400 million users Significant traffic: Dropbox accounts for approximately 4% of the total Internet traffic [IMC 2012] 4

Internet Storage Sync Services New data entrance of the Internet Major players: Dropbox, Google Drive, One Drive, Box.com, Apple Combining other services via APIs: photo sharing, email azachment, social apps 5

Typical Architecture & Flow Typical architecture of ISS services Control flow: exchanging metadata Storage flow: exchanging contents Sync process with your mul?ple clients 6

Capabilities in Sync Protocol Key storage capabili?es [IMC 13] Chunking: spli_ng a large file into mul?ple units Bundling: mul?ple small chunks as a single one Deduplica2on: avoiding the retransmission of content already available in the server Delta- encoding: upda?ng the modified por?on 7

Outline Background Problem Statement (draa- cui- iss- problem) Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Summary 8

Outline Background Problem Statement Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Summary 9

Using Multiple ISS Services Users may use mul?ple services Performance or func?onality diversity Dropbox works bezer for synchronizing docs Google Drive connects to Gmail and Google Doc BaiDu cloud provides 2TB free space 10

However that is not easy For users Users may install mul?ple similar clients It is unable to synchronize data across services (e.g. sync between a Dropbox user and a Google Drive user) For applica?on developers A developer has to deal with many different APIs in order to connect his app with mul?ple sync services 11

Using a Private ISS Service Enterprise may want their own storage Public ISS services may not be trusted Like what email is doing It is difficult to build and use a private ISS service There is no standard sync protocol Need to start from scratch 12

Outline Background Problem Statement Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Summary 13

Rethinking about Capabilities Ideal With these capabili?es, sync services can efficiently synchronize our data Reality The sync 2me is s2ll much longer than expected with various network condi?ons! Measurement study We measured several sync services to iden?fy and analyze the sync inefficiency problem 14

Impact of Missing Capabilities Bandwidth inefficiency Sync is not efficient for large # of small files in high RTT condi?ons because the client waits for an app- level ACK before sending next chunk Bundling is quite important! 15

Impact of Misusing Capabilities Deduplica?on is NOT always efficient More effec?ve dedup does not work well in good network condi?ons because of its high computa?on overhead Network-aware dedup may be important DER: the ra?o of the deduplicated file size to the original file size 16

Impact of Misusing Capabilities Delta-encoding fails with fixed- size chunking 3 basic file opera?ons (flip bits, insert, delete) Changing 2MB of a 10MB file leads to more than 6MB sync traffic TUO: Traffic data / modified data 17

Impact of Misusing Capabilities Why the delta- encoding fails? A large file is split into mul?ple chunks Delta- encoding is performed between chunks But modifica?ons will move cut points! 18

Measurement Conclusion Missing or Misusing these key capabili?es leads to the sync inefficiency problem Challenges of improving sync efficiency Are these capabili?es enough? Should we combine these storage techniques with network parameters (e.g. delay, loss and etc.)? And how? 19

Outline Background Problem Statement Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Summary 20

Exploration on Capabilities QuickSync [MobiCom15] with 3 techniques Propose network- aware content- defined chunker to iden?fy redundant data Design improved incremental sync approach that correctly performs delta- encoding between similar chunks to reduce sync traffic Delay- batched ACK to improve sync throughput 21

QuickSync Implementation Implementa?on over Dropbox Unable to directly modify Dropbox, so we design a proxy- based architecture built on Amazon EC2 Implementa?on over Seafile The proxy- based architecture adds overhead Full implementa?on with Seafile (open source) 22

Impact of Network-aware Chunker Network- aware Chunker Larger chunks in good network condi?ons, make aggressive chunking in slow networks Performance results 200GB backup; up to 31% speed improvement Network- aware chunker works well 23

Integrated System Performance Setup Prac?cal sync workloads on Windows / Android Performance results (Win / Android) Traffic size reduc?on: up to 80.3% / 63.4% Sync?me reduc?on: up to 51.8% / 52.9% 24

Outline Background Problem Statement Service Usability Protocol Capabili?es Our Explora?on on Protocol Capabili?es Future work 25

Related Work WebDAV [RFC 4918], Git These efforts focus on authoring and versioning Can not well support large files Rsync Delta- encoding algorithm only works well in file granularity Different from ISS ISS focuses on the sync opera?on Other important capabili?es are closely related and required (e.g. chunking, deduplica?on) 26

Future Work Goal: usability & capabili?es Easier to use mul?ple storage sync services Easier to build a private sync service Achieve interoperability Reasonably configure capabili?es Possible solu?on: standard sync protocol Standardize the sync process and capabili?es Want to apply IETF Transport and Security exper?se 27

References Problem Statement: hzp://datatracker.iet.org/doc/draa- cui- iss- problem/ Wiki: hzps://github.com/iss- iet/iss/wiki/internet- Storage- Sync QuickSync [MobiCom2015]: hzp://www.4over6.edu.cn/cuiyong/cindex.html A First Look at Mobile Cloud Storage Services [IEEE Network Magazine]: hzp://www.4over6.edu.cn/cuiyong/cindex.html 28