Quanqing XU YuruBackup: A Highly Scalable and Space-Efficient Incremental Backup System in the Cloud

2 Outline
- Motivation
- YuruBackup's Architecture
- Backup Client: File Scan, Data De-duplication and Data Transmission
- Metadata Server: Communication with Clients, Global Fingerprint Lookup and Store, and Highly Scalable Cluster of Metadata Servers
- Demo
- Preliminary experimental results
- Development status

3 Motivation
- Yuruware needs incremental backup in the cloud
- Cloud storage providers
  - High reliability and scalability at low cost
  - Ultra large-scale storage space: 905 billion objects in Amazon S3, Q1/2012
- Customers
  - Back up and restore progressive data within a short time
  - Back up up to petabytes of data in total
- To build a large-scale cloud backup system
  - System scalability
  - Storage efficiency
  - Backup and restoration performance
NICTA Copyright [1]

4 The Architecture of YuruBackup
- To increase scalability to accommodate PB-scale data
- To improve space efficiency to reduce costs
- To save bandwidth to adapt to the low bandwidth of WAN
[Figure labels: Backup Agent (source-side de-duplication); Metadata Agent; a cluster of metadata servers holding the metadata of PB-scale data (one master for writes, slaves for reads, target-side de-duplication); Cloud Storage (PB-scale space, snapshots); RPC, parallel transmission, data/metadata separation]

5 Storage Hierarchy
- Snapshot (a virtual file)
- Collection
- Block
- Chunk
[Figure: Snapshot A and Snapshot B mapped onto the Collection, Block and Chunk levels]

6 Mapping blocks from memory to disk
- A block: <collectionuuid, blockno, checksum, start, length>
- Components: Memory Block, Block Proxy and TAR Store
[Figure labels: Memory Blocks (in memory), Block Proxy, TAR Store, Collections (on disk)]

7 The Flow Chart of Backup Process
1. Create DB connection to metadata catalog
2. Initialize the TAR store T
3. Initialize the Metadata Manager
4. Scan a directory to get a file list
5. The file list is empty?
   - Yes: release the Metadata Manager, release the TAR store, close DB connection to metadata catalog
   - No: remove a file from the list and write its incremental backup into T
6. T's size >= a given size?
   - Yes: write T to disk and clear it, then return to step 5
   - No: return to step 5
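The loop at the heart of this flow chart can be sketched in Python as below. This is only an illustration under assumptions: the names `run_backup`, `max_store_bytes` and `flush` are hypothetical, the TAR store is modelled as an in-memory list, the DB connection and Metadata Manager steps are omitted, and the final flush of a partly filled store on release is an assumption that the slide does not state.

```python
def run_backup(files, max_store_bytes, flush):
    """Buffer per-file incremental data and flush once the buffer is large enough.

    files: list of (name, incremental_bytes) pairs (hypothetical input shape).
    flush: callback that persists one batch, standing in for "write T to disk".
    """
    store = []          # stands in for the TAR store T
    store_size = 0
    pending = list(files)

    while pending:                      # "The file list is empty?" -> No
        name, delta = pending.pop(0)    # remove a file from the list
        store.append((name, delta))     # write its incremental backup into T
        store_size += len(delta)
        if store_size >= max_store_bytes:   # "T's size >= a given size?"
            flush(store)                    # write T to disk ...
            store, store_size = [], 0       # ... and clear it

    # Assumption: releasing the store persists whatever is still buffered.
    if store:
        flush(store)
```

With a threshold of 8 bytes, files of 5, 3, 7, 1 and 2 bytes are written as three batches: the first two files, the next two, and the leftover file on release.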

8 Backup Client
- It provides a functional interface to users: backup and restoration
- To reduce I/O requests: Read/Write Buffer
- To locate items: Compressed BF, Berkeley DB
- Source-side dedup: CD Chunking
- Transmission: Batched RPC, parallel uploading

9 Source-side de-duplication
Rabin's Fingerprinting
- Given a string $A = a_m a_{m-1} \cdots a_1$, a k-bit Rabin fingerprint is computed as follows:
- Let $A(t) = a_m t^{m-1} + a_{m-1} t^{m-2} + \cdots + a_1$
- Choose an irreducible polynomial $P(t) = p_k t^k + p_{k-1} t^{k-1} + \cdots + p_0$
- Compute Rabin's fingerprint $f(A) = A(t) \bmod P(t)$
Content-defined Chunking (SOSP 01) [1]
- A chunk boundary is placed where low_order(f, k) = c
[Figure: a sliding window w moves over the data, which is cut into chunks C1, C2, C3, ...]
[1] Muthitacharoen A, Chen B, Mazières D. A low-bandwidth network file system. In: Proc. of the 18th ACM Symp. on Operating System Principles (SOSP 2001). New York: ACM Press,
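To make the fingerprint and the boundary test concrete, here is a small Python sketch. It is a teaching illustration, not YuruBackup's implementation. The polynomial 0x11B (t^8 + t^4 + t^3 + t + 1, irreducible over GF(2)), the 16-byte window, the 4-bit mask and c = 0 are all assumptions chosen for the demo, and the fingerprint of each window is recomputed from scratch rather than rolled, which is slow but easy to check.

```python
def rabin_fingerprint(data: bytes, poly: int = 0x11B) -> int:
    """Return A(t) mod P(t) over GF(2), reading the bits of data as coefficients.

    The first bit is the highest-degree coefficient. poly encodes P(t) with
    bit i set when p_i = 1; its degree k gives a k-bit fingerprint.
    """
    k = poly.bit_length() - 1
    f = 0
    for byte in data:
        for i in range(7, -1, -1):
            f = (f << 1) | ((byte >> i) & 1)   # multiply by t, add next coefficient
            if f >> k:                         # degree reached k: reduce mod P(t)
                f ^= poly
    return f


def cdc_chunks(data: bytes, window: int = 16, mask_bits: int = 4, c: int = 0):
    """Cut data where the low-order bits of the window fingerprint equal c."""
    mask = (1 << mask_bits) - 1
    chunks, start = [], 0
    for end in range(1, len(data) + 1):
        if end - start < window:               # wait until a full window fits
            continue
        f = rabin_fingerprint(data[end - window:end])
        if f & mask == c:                      # low_order(f, mask_bits) == c
            chunks.append(data[start:end])
            start = end
    if start < len(data):                      # trailing bytes form the last chunk
        chunks.append(data[start:])
    return chunks
```

Because a boundary depends only on the bytes inside the window, an insertion early in a file shifts later boundaries along with the content instead of invalidating every fixed-size block after it.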

10 Duplication Detection based on Bloom filter
- Observations
  - Most files are never changed after their creation (ATC 04)
  - Over 2/3 of files have not been modified (FAST 07)
- Index Summary based on Compressed BF (ACM 70 [1], PODC 01 [2])
  - Approximate set membership problem
  - Trade-off between space and false positive probability
- Three functions
  1) Initialize(initElementCount, desiredfpp)
  2) Insert(fingerprint)
  3) Lookup(fingerprint)
[1] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7),
[2] Mitzenmacher. Compressed Bloom Filters. In Twentieth ACM Symposium on Principles of Distributed Computing, August
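The three functions listed on this slide map naturally onto a small class. The sketch below is a plain (uncompressed) Bloom filter in Python, offered as an illustration rather than as YuruBackup's index summary: the compression step from the cited Compressed Bloom Filters paper is not shown, and the SHA-256 based double hashing is an assumption made for the example. The sizing uses the standard formulas m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, init_element_count: int, desired_fpp: float):
        """Initialize: size the bit array for n elements at false-positive rate p."""
        n, p = init_element_count, desired_fpp
        self.m = max(1, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, fingerprint: bytes):
        # Double hashing: derive k bit positions from two 64-bit halves of a digest.
        digest = hashlib.sha256(fingerprint).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def insert(self, fingerprint: bytes) -> None:
        """Insert: set the k bits selected by the fingerprint."""
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def lookup(self, fingerprint: bytes) -> bool:
        """Lookup: False means definitely absent, True means possibly present."""
        return all(
            (self.bits[pos // 8] >> (pos % 8)) & 1
            for pos in self._positions(fingerprint)
        )
```

A negative lookup lets a client skip the remote fingerprint query entirely, while a positive one still has to be confirmed against the real index because of false positives.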

11 Metadata Server
- Communication with Clients
  - A single, batched and asynchronous lookup RPC for n FPs
  - The callback function enqueues the updated request
- Global FP Lookup and Store
  - Global Index Summary
  - Global target-side deduplication
  - FP Lookup
  - FP Store
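The batched, asynchronous lookup with a callback can be pictured with the following Python sketch. Everything in it is hypothetical: `lookup_batch` stands in for the real RPC, `KNOWN_FPS` for the server-side global index, and a thread pool plus a queue replace whatever RPC framework YuruBackup actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

# Hypothetical server-side state: fingerprints the metadata server already stores.
KNOWN_FPS = {"fp1", "fp3"}


def lookup_batch(fps):
    """Stand-in for the lookup RPC: one call answers for n fingerprints."""
    return [fp in KNOWN_FPS for fp in fps]


def batched_lookup(fps, executor, upload_queue):
    """Send one asynchronous lookup for all fps; do not block the caller."""
    future = executor.submit(lookup_batch, fps)

    def on_reply(done):
        # Callback: keep only fingerprints the server has not seen and
        # enqueue that updated request for the upload stage.
        missing = [fp for fp, hit in zip(fps, done.result()) if not hit]
        upload_queue.put(missing)

    future.add_done_callback(on_reply)
    return future
```

Batching n fingerprints into one request amortizes the WAN round trip, and the callback keeps the scanning thread free while the reply is in flight.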

12 Highly Scalable Cluster of MDSs
- Data replication: to make reads scalable
  - MySQL replication
  - Failover
- Data partitioning: to make writes scalable
  - MySQL cluster
- Load balancing: to be aware of which nodes are readable and writable
[Figure labels: YuruBackup Clients, Load Balancer, Masters, Slaves, SQL Nodes with NDB, SQL Nodes with NDB+InnoDB, DataNodes; Read, Write and Replication paths]

13 Demo of YuruBackup
- Chunk Partition
- Duplication Detection

14 An example of a snapshot (5 new blocks)
B1, B3, B5, B7, B12

15 An example of incremental backup
emacs-23.2a, emacs-23.3a

16 Comparison
ReducedRatio = (# BytesSentByRsync - # BytesOfData - # BytesOfMetadata) / # BytesSentByRsync

Table 1.
Dataset | Non-overlap data size (MB) | rsync: transferred data size (MB) | YuruBackup: transferred data (MB) | YuruBackup: transferred metadata (MB) | # old chunks | # new chunks | ReducedRatio (%)
Emacs | 140.2 | 155.7 (155.9) | 60.4 | 1.6 | 15,731 | 11,484 | 61.23
Eclipse | 234.4 | 233.0 (234.9) | 220.3 | 1.1 | 277 | 84,317 | 5.53
GCC | 107.8 | 94.7 (428.6) | … | … | …,386 | 9,… | …
Hadoop-src | … | … (214.1) | … | … | …,365 | 15,… | …
Hadoop-bin | … | … (110.5) | … | … | … | … | …
Lucene-src | … | … (64.8) | … | … | … | … | …
Lucene-bin | … | … (156.4) | … | … | …,191 | 26,… | …
Hive-src | … | … (144.0) | … | … | …,072 | 7,… | …
Hive-bin | … | … (21.7) | … | … | … | … | …
Hbase | … | … (97.5) | … | … | …,462 | 4,… | …
Average | … | … (162.8) | … | … | …,144 | 17,… | …

17 Others
- YuruBackup
  - is deployed atop Amazon S3
  - metadata servers are running in EC2
  - will be deployed in other cloud platforms
- Performance evaluation
  - De-duplication Efficiency
  - De-duplication Overhead
  - Scalability
  - Backup Window
  - Fine-granularity Restoration, etc.

18 Current Development Status
Program directories (~12,000 LOC)
- include: header files, ~1,200 LOC
- src: source files, ~5,200 LOC

19 Thank you! Q&A

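20 Dataset
OverlapRatio = OverlapDataSize / TransferredDataSize

Dataset | Object | # Files | Data size (MB) | # Overlap files | Overlap data size (MB) (%)
Emacs | 23.2a | 4,… | … | … | … (10.09)
Emacs | …a | 4,… | … | |
eclipse | galileo | 2,… | … | … | … (0.21)
eclipse | Helios-SR2 | 2,… | … | |
gcc | … | … | … | 70,… | … (74.86)
gcc | … | … | … | |
Hadoop-src | 0.20.204.0 | 5,811 | 208.0 | 3,… | … (56.56)
Hadoop-src | 0.20.205.0 | 6,004 | 214.1 | |
Hadoop-bin | 0.20.204.0 | 507 | 105.0 | … | … (66.36)
Hadoop-bin | 0.20.205.0 | 538 | 110.… | |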

21 Dataset
Dataset | Objects | # Files | Data size (MB) | # Overlap files | Overlap data size (MB) (%)
Lucene-src | … | … | … | 2,… | … (73.58)
Lucene-bin | … | … | … | … | … (8.51)
Hive-src | … | … | … | 3,… | … (34.10)
Hive-bin | … | … | … | … | … (73.88)
hbase | … | … | … | 1,… | … (50.81)
Linux shell: diff -urNas v1 v2

22 The rsync Algorithm
A holds f.old; B holds f.new.
1. A computes the checksum of each block S_i in file f.old.
2. A sends the checksums to B.
3. B searches the file f.new and finds the difference between f.old and f.new.
4. B tells A how to construct file f.new from f.old and the literal data.
The checksums consist of a rolling 32-bit checksum (Adler-32 checksum) and a 128-bit MD4 checksum.
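What makes B's search over f.new affordable is that the 32-bit weak checksum can be rolled: sliding the window by one byte updates it in constant time. The Python sketch below follows the Adler-32-style definition in the rsync technical report, with two 16-bit sums a and b. The function names are illustrative, and the strong MD4 check and the block-matching logic are left out.

```python
M = 1 << 16  # each half of the 32-bit checksum is kept modulo 2^16


def weak_checksum(block: bytes):
    """Compute (a, b) from scratch for one block."""
    n = len(block)
    a = sum(block) % M
    # The byte at offset i is weighted by its distance from the end of the window.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window one position: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def combine(a: int, b: int) -> int:
    """Pack the two halves into the 32-bit value that is sent and compared."""
    return a + (b << 16)
```

Only when the rolled weak checksum matches one of A's block checksums does B need to compute the more expensive strong checksum for that window.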
