ViewBox: Integrating Local File System with Cloud Storage Service FAST 2014 Yupu Zhang +, Chris Dragga +*, Andrea Arpaci-Dusseau +, Remzi Arpaci-Dusseau + University of Wisconsin-Madison 1
Outline Introduction Motivation Design and Implementation Evaluation Conclusion 2
Introduction Cloud-based file synchronization services have become enormously popular in recent years Numerous providers: Dropbox, Google Drive, SkyDrive Large user base: Dropbox has more than 100 million users Promising benefit Reliable backup on the cloud Automatic synchronization across clients/devices 3
Motivation-Data Corruption Data Corruption Uploaded from local machine to cloud Propagated to other devices/clients 4
Data Corruption-Experiment Inject corruption into a synchronized file on disk by flipping bits through the device file of the underlying disk Execute both data operations and metadata-only operations on the corrupt file Check whether the corruption is propagated 5
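The bit-flipping step can be illustrated with a minimal sketch. It assumes an ordinary temporary file rather than the raw device file used in the experiment, so unlike the real injection this write is visible to the file system; the helper name `flip_bit` is hypothetical.

```python
import os
import tempfile

def flip_bit(path, byte_offset, bit):
    """Flip a single bit of the file at `path`, in place."""
    with open(path, "r+b") as f:
        f.seek(byte_offset)
        original = f.read(1)
        if not original:
            raise ValueError("offset is past the end of the file")
        f.seek(byte_offset)
        f.write(bytes([original[0] ^ (1 << bit)]))

# Demo on a throwaway file (assumption: a regular file, not a block device).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    demo_path = tmp.name

flip_bit(demo_path, byte_offset=0, bit=0)   # 'h' (0x68) becomes 'i' (0x69)
with open(demo_path, "rb") as f:
    corrupted = f.read()
os.remove(demo_path)
```

A sync client that only watches for file-level change events has no way to tell such a flipped bit from a legitimate edit once the file is next touched.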
Data Corruption Experiment L: corruption remains local LG: corruption is propagated (global) Because ZFS detects local corruption, none of the synchronization clients propagate corruption when running on ZFS 6
Data Corruption Lessons Where do synchronization services fail? Rely on file-level monitoring mechanism, e.g., inotify Cannot tell between legitimate changes and corruption Where do file systems fail? Many file systems do not checksum data 7
Motivation-Crash Inconsistency Crash inconsistency Out-of-sync synchronization 8
Crash Inconsistency-Experiment A file is synchronized at V0 on disk and cloud Update the file from V0 to V1 Inject a crash and observe the sync client's behavior 9
Crash Inconsistency-Experiment OOS: out-of-sync Services on ext4 (ordered mode) produce erratic and inconsistent behavior All three services behave correctly on ZFS and on ext4 with data journaling 10
Crash Inconsistency-Lessons Where do synchronization services fail? Depend on their own metadata tracking Inconsistent with file system metadata upon crash Where do file systems fail? Metadata journaling cannot provide data consistency 11
Motivation-Causal Inconsistency Causal inconsistency Files are uploaded out of order Cloud state does not match a valid FS state 12
Causal Inconsistency-Lessons Where do synchronization services fail? Synchronize files out of order Where do file systems fail? No efficient mechanism to provide a static and consistent view to sync services 13
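A small worked example may make causal inconsistency concrete. It assumes a toy model in which file system state is a mapping from file name to version number; the file names and the helper `states_after` are hypothetical.

```python
def states_after(initial, writes):
    """Return every state reached by applying `writes` one at a time."""
    states = [dict(initial)]
    for name, version in writes:
        nxt = dict(states[-1])
        nxt[name] = version
        states.append(nxt)
    return states

initial = {"a.txt": 0, "b.txt": 0}
# Locally, a.txt is updated before b.txt.
local_writes = [("a.txt", 1), ("b.txt", 1)]
valid_fs_states = states_after(initial, local_writes)

# Cloud states seen when uploads follow the local order, and when they do not.
in_order = states_after(initial, local_writes)
out_of_order = states_after(initial, list(reversed(local_writes)))

def all_valid(cloud_states):
    """True if every cloud state matches some state the file system went through."""
    return all(state in valid_fs_states for state in cloud_states)
```

With out-of-order uploads the cloud briefly holds `b.txt` at version 1 next to `a.txt` at version 0, a combination the local file system never contained.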
Summary The sense of safety provided by synchronization services is largely illusory Both file systems and sync services are responsible for these failures Many file systems lack strong reliability mechanisms: file system state ≠ correct state What sync clients see is different from what local file systems see: cloud state ≠ file system state 14
Design and Implementation-ViewBox Based on ext4, Dropbox and Seafile Goals Integrity Consistency Recoverability Performance 15
ViewBox Overview Local detection (Ext4-cksum) No corruption/inconsistency is spread View-based synchronization (View Manager) Present file system's view to sync service Basis for consistency and correct recovery Cloud-aided recovery (Cloud Helper) Restore file system to correct state upon failure 16
ViewBox Architecture 17
Ext4-cksum - Local Detection Ext4-cksum stores data checksums in a fixed-sized checksum region immediately after the inode table 32-bit CRC checksum per 4KB block 128KB checksum region for a 128MB block group 18
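The sizing above follows from one 4-byte checksum per 4KB block: a 128MB block group holds 32,768 blocks, which need 128KB of checksums. The sketch below checks that arithmetic and shows per-block detection. It uses `zlib.crc32` as a stand-in, since the slide does not say which CRC variant ext4-cksum uses, and the function names are hypothetical.

```python
import zlib

BLOCK_SIZE = 4 * 1024                  # 4KB data block
CHECKSUM_SIZE = 4                      # one 32-bit checksum per block
BLOCK_GROUP_SIZE = 128 * 1024 * 1024   # 128MB block group

def checksum_region_bytes(group_size=BLOCK_GROUP_SIZE):
    """Bytes needed to hold one checksum for every block in a block group."""
    return (group_size // BLOCK_SIZE) * CHECKSUM_SIZE

def block_checksums(data):
    """CRC-32 of each 4KB block of `data` (stand-in for the on-disk region)."""
    return [zlib.crc32(data[off:off + BLOCK_SIZE])
            for off in range(0, len(data), BLOCK_SIZE)]

def find_corrupt_blocks(data, stored):
    """Indices of blocks whose current checksum differs from the stored one."""
    current = block_checksums(data)
    return [i for i, (now, then) in enumerate(zip(current, stored)) if now != then]

# Demo: three zero-filled blocks, then one flipped bit inside the second block.
data = bytes(3 * BLOCK_SIZE)
stored = block_checksums(data)
damaged = bytearray(data)
damaged[BLOCK_SIZE + 10] ^= 0x01
corrupt = find_corrupt_blocks(bytes(damaged), stored)
```

Keeping one checksum per block means a mismatch pinpoints the damaged block, not just the damaged file.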
View Manager Create file system views Upload views to cloud through sync client Challenge 1: How to provide consistency? Challenge 2: How to create views efficiently? 19
How to Guarantee Consistency? Cloud journaling Treat cloud storage as external journal Synchronize local changes to cloud at FS epochs, i.e., when ext4-cksum performs a journal commit Three types of views Active view (local) => Current FS state Frozen view (local) => Last FS snapshot in memory Synced views (on cloud) => Previously uploaded views Roll back to the latest synced view upon failure 20
Synchronizing Frozen Views Create a new frozen view after the previous frozen view is synchronized and when FS reaches an epoch The state of frozen views is always static 21
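The view lifecycle described on the last two slides can be sketched as a small state machine. This is a toy model under stated assumptions: views are plain dictionaries copied in full (the real system snapshots incrementally), and the class and method names are hypothetical.

```python
class ViewLifecycle:
    """Toy model of active, frozen, and synced views."""

    def __init__(self):
        self.active = {}     # current FS state: path -> content
        self.frozen = None   # at most one frozen view, awaiting upload
        self.synced = []     # views already uploaded, oldest first

    def write(self, path, data):
        self.active[path] = data

    def on_epoch(self):
        """At a journal commit, freeze only if the previous frozen view is synced."""
        if self.frozen is not None:
            return False
        self.frozen = dict(self.active)   # static copy; later writes do not touch it
        return True

    def on_upload_complete(self):
        """The sync client finished uploading the frozen view."""
        if self.frozen is None:
            raise RuntimeError("no frozen view to mark as synced")
        self.synced.append(self.frozen)
        self.frozen = None

    def roll_back(self):
        """On failure, return to the latest synced view."""
        self.frozen = None
        self.active = dict(self.synced[-1]) if self.synced else {}
```

Because a new frozen view is created only after the previous one is uploaded, the sync client always reads a state that no longer changes underneath it.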
Multi-client Consistency (a) The client directly applies the changes in view 1 to its frozen view and propagates those changes to the active view. (b) The client downloads view 1 first, then merges the two views. 22
How to Efficiently Freeze a View? A frozen view is short-lived and kept only in memory Incremental snapshotting Dirty table: tracks which files and directories are modified in the active view Operation log: records all successful namespace operations (e.g., create, mkdir, unlink, rmdir, and rename) in the active view 23
Incremental Snapshotting 24
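The two bookkeeping structures used for incremental snapshotting can be sketched as follows. This is a simplified illustration, not the ViewBox implementation: the class `ActiveViewTracker` and its methods are hypothetical, and interactions such as renaming an already dirty file are not modeled.

```python
NAMESPACE_OPS = {"create", "mkdir", "unlink", "rmdir", "rename"}

class ActiveViewTracker:
    """Tracks what changed in the active view since the last freeze."""

    def __init__(self):
        self.dirty = set()   # dirty table: paths modified in the active view
        self.op_log = []     # operation log: successful namespace operations, in order

    def record_write(self, path):
        self.dirty.add(path)

    def record_namespace_op(self, op, *paths):
        if op not in NAMESPACE_OPS:
            raise ValueError("not a namespace operation: " + op)
        self.op_log.append((op,) + paths)

    def freeze(self):
        """Hand over the accumulated changes and start tracking afresh."""
        frozen = (frozenset(self.dirty), tuple(self.op_log))
        self.dirty = set()
        self.op_log = []
        return frozen

tracker = ActiveViewTracker()
tracker.record_write("/docs/a.txt")
tracker.record_namespace_op("rename", "/docs/a.txt", "/docs/b.txt")
frozen_dirty, frozen_log = tracker.freeze()
```

Freezing hands over only the set of changed paths and the ordered operation list, so its cost depends on how much changed, not on the size of the file system.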
Cloud Helper A user-level daemon Talks to local FS through ioctl Communicates with the server through web API Upon data corruption Fetches correct block from cloud After crash, two types of recovery Recovers damaged files Rolls back entire file system to the latest synced view 25
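The corruption path (detect locally, then fetch the correct block from the cloud) might look like the sketch below. It assumes in-memory stand-ins for the local disk, the checksum region, and the cloud copy; the real Cloud Helper uses ioctl and a web API, neither of which is modeled here, and `read_block` is a hypothetical name.

```python
import zlib

def read_block(local_blocks, checksums, cloud_blocks, index):
    """Return (data, status) for one block, repairing it from the cloud if needed."""
    data = local_blocks[index]
    if zlib.crc32(data) == checksums[index]:
        return data, "ok"
    # Local copy fails its checksum: try the cloud copy of the same block.
    candidate = cloud_blocks.get(index)
    if candidate is not None and zlib.crc32(candidate) == checksums[index]:
        local_blocks[index] = candidate   # repair the local copy
        return candidate, "recovered"
    raise IOError("block %d is corrupt and no good cloud copy exists" % index)

# Demo: one 4KB block whose local copy has a corrupted first byte.
good = b"A" * 4096
checksums = [zlib.crc32(good)]
local_blocks = [b"B" + good[1:]]
cloud_blocks = {0: good}
data, status = read_block(local_blocks, checksums, cloud_blocks, 0)
```

Checking the fetched block against the stored checksum before accepting it keeps a stale or damaged cloud copy from replacing the local one.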
Evaluation 3.3GHz Intel Quad Core CPU, 16 GB memory 1TB Hitachi hard drive Linux kernel 3.6.11 (64-bit) Dropbox client 1.6.0 Seafile client and server 1.8.0 26
Cloud Helper Data Corruption D: Detected R: Recovered Crash Consistency Yes: occurred No: did not occur 27
Ext4-cksum The performance overhead is quite minimal 28
View Manager As shown under the After COW column, the overhead is negligible, because no data copying is performed. 29
View Manager Frozen view F1 F2 Active view F1 F2 30
ViewBox with Dropbox and Seafile The runtime of the workload in ViewBox is at most 5% slower, and sometimes faster, than that of the unmodified ext4 setup For iphoto view and iphoto edit, the synchronization time on ViewBox with Dropbox is much greater than that on ext4. This is due to Dropbox's lack of proper interface support for views 31
Conclusion Problem: Cloud storage services and file systems fail to protect data Many copies do NOT always make data safe: cloud state ≠ file system state ≠ correct state Solution: ViewBox Enhance local file systems with data checksumming Present file system's view to sync service cloud state = file system state = correct state 32