Practical Data Integrity Protection in Network-Coded Cloud Storage Henry C. H. Chen Department of Computer Science and Engineering The Chinese University of Hong Kong
Outline Introduction FMSR in NCCloud FMSR-DIP Publications: (1) Yuchong Hu, Henry C. H. Chen, Patrick P. C. Lee, and Yang Tang. NCCloud: Applying Network Coding for the Storage Repair in a Cloud-of-Clouds. Proceedings of the 10th USENIX Conference on File and Storage Technologies (FAST '12). (2) Henry C. H. Chen and Patrick P. C. Lee. Practical Data Integrity Protection in Regenerating-Coding-Based Storage. To appear in the 31st International Symposium on Reliable Distributed Systems (SRDS '12).
Outline Introduction FMSR in NCCloud FMSR-DIP
Cloud Storage On-demand storage outsourcing Supports RESTful APIs: PUT, GET, DELETE, LIST
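As a rough illustration of what a storage-only interface looks like, the sketch below models a cloud as an in-memory object store that exposes only the four operations named above. The class name `ThinCloud` and its method names are hypothetical and do not correspond to any provider's actual API.

```python
class ThinCloud:
    """Hypothetical in-memory stand-in for a cloud that only stores objects."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        # PUT: create or overwrite an object
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        # GET: return the stored bytes; raises KeyError if the key is absent
        return self._objects[key]

    def delete(self, key: str) -> None:
        # DELETE: remove the object if it exists
        self._objects.pop(key, None)

    def list_keys(self) -> list:
        # LIST: enumerate the stored object names
        return sorted(self._objects)
```

Nothing here runs code on the storage side: every computation has to happen at the client or proxy.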
Problems in the Cloud
Data Integrity Protection Corruption detection Addressed in this work (FMSR-DIP) Fault-tolerance and repair Addressed in NCCloud Desirable properties Minimize cost Works on thin clouds (i.e., clouds with only basic file access semantics)
Related Work Single node, smart clouds PDP [Ateniese et al. 07] POR [Juels et al. 07] Multi-node, different storage schemes MR-PDP [Curtmola et al. 08] HAIL [Bowers et al. 09]
Our Work Build FMSR-DIP, a corruption detection scheme that allows byte sampling Works on thin clouds Works with functional minimum storage regenerating (FMSR) codes Targets long-term archives
Outline Introduction FMSR in NCCloud FMSR-DIP
NCCloud [Figure: users upload and download files through a proxy, which connects to Cloud 1, Cloud 2, Cloud 3, and Cloud 4]
Contributions of NCCloud Propose an implementable design of functional minimum storage regenerating (FMSR) code Support basic read/write operations and the repair function on thin clouds Preserve storage requirements as in optimal erasure codes, while reducing repair traffic Implement and evaluate in real cloud storage
Repairing a Failed Cloud How to repair: [Figure: the proxy reads data from the surviving clouds among Cloud 1 to Cloud 5] Repair traffic = sum of the data read from each surviving cloud Goal: minimize repair traffic
Reed-Solomon Codes Conventional repair: n = 4, k = 2. File of size M is split into A and B. Node 1: A; Node 2: B; Node 3: A+B; Node 4: A+2B. (n, k) MDS code: any k out of n storage nodes (clouds) can rebuild the original file. Repair: the proxy reconstructs the whole file and generates the data for the new node. Repair traffic = M
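The "any k out of n" property of the layout A, B, A+B, A+2B can be checked with a small sketch. The arithmetic below is over GF(2^8) with the reduction polynomial 0x11B, which is an assumption made for illustration; the function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1 (assumed polynomial)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a: int) -> int:
    # Brute-force inverse; acceptable for a 256-element field
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

# Each row gives one node's content as coefficients of (A, B)
ROWS = [(1, 0), (0, 1), (1, 1), (1, 2)]  # A, B, A+B, A+2B

def encode(a: bytes, b: bytes) -> list:
    return [bytes(gf_mul(ca, x) ^ gf_mul(cb, y) for x, y in zip(a, b))
            for ca, cb in ROWS]

def decode(i: int, j: int, ci: bytes, cj: bytes):
    """Rebuild (A, B) from the contents of nodes i and j."""
    (p, q), (r, s) = ROWS[i], ROWS[j]
    det_inv = gf_inv(gf_mul(p, s) ^ gf_mul(q, r))
    # Cramer's rule; subtraction is XOR in characteristic 2
    a = bytes(gf_mul(det_inv, gf_mul(s, u) ^ gf_mul(q, v)) for u, v in zip(ci, cj))
    b = bytes(gf_mul(det_inv, gf_mul(p, v) ^ gf_mul(r, u)) for u, v in zip(ci, cj))
    return a, b
```

Calling `decode` on any two of the four node contents returns the original halves, which is why a conventional repair can always rebuild the file but has to download M bytes to do so.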
FMSR in NCCloud n = 4, k = 2. File of size M is split into native chunks A, B, C, D. Node 1: P1, P2; Node 2: P3, P4; Node 3: P5, P6; Node 4: P7, P8. Code chunk P_i = linear combination of the original native chunks. Repair in FMSR: download one code chunk from each surviving node (P3, P5, P7 in the figure), then reconstruct new code chunks (via random linear combination) in the new node. Repair traffic = 0.75M
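The 0.75M figure follows from the chunk sizes: the file is split into k(n-k) native chunks, each code chunk has the size of one native chunk, and a repair reads one code chunk from each of the n-1 surviving nodes. A small sketch of that arithmetic, with hypothetical function names:

```python
from fractions import Fraction

def rs_repair_fraction() -> Fraction:
    # Conventional repair rebuilds the whole file, so the traffic is M
    return Fraction(1)

def fmsr_repair_fraction(n: int, k: int) -> Fraction:
    # Each code chunk has size M / (k(n-k)); one chunk is read
    # from each of the n-1 surviving nodes
    return Fraction(n - 1, k * (n - k))

print(fmsr_repair_fraction(4, 2))  # 3/4
```

For (n, k) = (4, 2) this gives 3/4 of M, against M for the conventional repair.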
FMSR Property n = 4, k = 2. The proxy partitions the file into k(n-k) native chunks A, B, C, D, encodes them into n(n-k) code chunks P1, ..., P8, and distributes the code chunks to the storage nodes. Encoding: $P_i = c_{i,1}A + c_{i,2}B + c_{i,3}C + c_{i,4}D$ for $i = 1, \dots, 8$, that is, $\begin{pmatrix} c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\ \vdots & \vdots & \vdots & \vdots \\ c_{8,1} & c_{8,2} & c_{8,3} & c_{8,4} \end{pmatrix} \begin{pmatrix} A \\ B \\ C \\ D \end{pmatrix} = \begin{pmatrix} P_1 \\ \vdots \\ P_8 \end{pmatrix}$ (encoding matrix × native chunks = code chunks). Encoding matrix rank = k(n-k)
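A minimal sketch of the encoding step, assuming byte-wise arithmetic over GF(2^8) with the reduction polynomial 0x11B. The function names are hypothetical, and the random matrix below is not checked for the "any k nodes can rebuild the file" property, which a real encoder would have to verify.

```python
import os

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1 (assumed polynomial)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def fmsr_encode(native: list, matrix: list) -> list:
    """Return code chunks P_i = sum_j matrix[i][j] * native[j], byte by byte."""
    size = len(native[0])
    code = []
    for row in matrix:
        chunk = bytearray(size)
        for coeff, nat in zip(row, native):
            for pos in range(size):
                chunk[pos] ^= gf_mul(coeff, nat[pos])  # addition is XOR
        code.append(bytes(chunk))
    return code

def random_matrix(n: int, k: int) -> list:
    # n(n-k) rows and k(n-k) columns of random coefficients;
    # the MDS check is deliberately omitted in this sketch
    return [list(os.urandom(k * (n - k))) for _ in range(n * (n - k))]
```

With n = 4 and k = 2, `random_matrix` yields an 8 × 4 matrix and `fmsr_encode` turns the four native chunks into eight code chunks of the same size.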
NCCloud: Experiments Testbed environment Local cloud: OpenStack Swift 1.4.2, 1 proxy node connected to 15 storage nodes (LAN), NCCloud deployed on proxy node Commercial cloud: Microsoft Azure Storage schemes: (4,2)-Reed-Solomon vs. (4,2)-FMSR
Response time: Local Cloud [Figure: response time (s) vs. file size (1 to 500 MB) for UPLOAD, DOWNLOAD, and REPAIR; series: RS and FMSR, with RS repair shown separately for native chunk repair and code chunk repair] FMSR has higher response time due to encoding/decoding overhead FMSR has slightly lower response time in repair, due to less data download
Response time: Commercial Cloud [Figure: response time (s) vs. file size (1 to 10 MB) for UPLOAD, DOWNLOAD, and REPAIR; series: RS and FMSR, with RS repair shown separately for native chunk repair and code chunk repair] No distinct response time difference, as network fluctuations play a bigger role in actual response time
Outline Introduction FMSR in NCCloud FMSR-DIP
FMSR-DIP: Design Goals Preserves advantage of FMSR Works on thin clouds Supports sampling to minimize cost Works against a Byzantine, mobile adversary Exhibits arbitrary behaviors Corrupts different subsets of servers over time
FMSR-DIP: Overview [Figure: users upload and download files through a proxy with FMSR-DIP, which connects to Cloud 1, Cloud 2, Cloud 3, and Cloud 4] Four operations: Upload, Check, Download, and Repair
FMSR-DIP: Upload 8 FMSR code chunks, 3 bytes each
FMSR-DIP: Upload Apply error-correcting code (ECC) to each chunk individually
FMSR-DIP: Upload XOR each byte with a pseudorandom value
FMSR-DIP: Upload For each chunk, calculate the MAC of the first 3 bytes
FMSR-DIP: Upload Upload the chunks to clouds Encrypt the metadata from NCCloud (which contains the encoding matrix) Append all MACs to metadata Replicate metadata on all nodes
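The masking and MAC steps of the upload can be sketched as follows. This is only an illustration under several assumptions: HMAC-SHA256 stands in for both the pseudorandom function and the MAC, the MAC is taken over the masked bytes (following the order of the steps above), and the ECC step is left out, so the input chunk is assumed to be ECC-encoded already. All function names are hypothetical.

```python
import hashlib
import hmac

def prf_stream(key: bytes, chunk_id: int, length: int) -> bytes:
    """Pseudorandom bytes from HMAC-SHA256 in counter mode (assumed PRF)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        msg = chunk_id.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])

def mask(key: bytes, chunk_id: int, data: bytes) -> bytes:
    # XOR is its own inverse, so the same call also removes the mask
    return bytes(d ^ p for d, p in zip(data, prf_stream(key, chunk_id, len(data))))

def chunk_mac(key: bytes, chunk_id: int, head: bytes) -> bytes:
    return hmac.new(key, chunk_id.to_bytes(4, "big") + head, hashlib.sha256).digest()

def protect(prf_key, mac_key, chunk_id, ecc_chunk, native_len):
    """Mask an ECC-encoded chunk, then MAC its first native_len bytes."""
    stored = mask(prf_key, chunk_id, ecc_chunk)
    return stored, chunk_mac(mac_key, chunk_id, stored[:native_len])

def recover(prf_key, mac_key, chunk_id, stored, tag, native_len):
    """Verify the MAC, then strip the mask and return the original code chunk."""
    expected = chunk_mac(mac_key, chunk_id, stored[:native_len])
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC mismatch")
    return mask(prf_key, chunk_id, stored)[:native_len]
```

`recover` mirrors what a download would do with a whole chunk: check the MAC first, then remove the pseudorandom values before handing the chunk back for decoding.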
FMSR-DIP: Check Pick a row to check
FMSR-DIP: Check XOR with the previous pseudorandom values, and check their consistency
Recall: FMSR Encoding $\begin{pmatrix} c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\ c_{2,1} & c_{2,2} & c_{2,3} & c_{2,4} \\ \vdots & \vdots & \vdots & \vdots \\ c_{8,1} & c_{8,2} & c_{8,3} & c_{8,4} \end{pmatrix} \begin{pmatrix} A \\ B \\ C \\ D \end{pmatrix} = \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_8 \end{pmatrix}$ (encoding matrix × native chunks = code chunks) Encoding matrix rank = k(n-k)
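Because every code chunk is a linear combination of the native chunks, the bytes at one offset across all code chunks must be consistent with the encoding matrix: appending them as an extra column must not raise the rank. The sketch below shows one way to test that, assuming GF(2^8) with polynomial 0x11B and bytes whose pseudorandom masks have already been removed. The function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1 (assumed polynomial)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a: int) -> int:
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def gf_rank(rows) -> int:
    """Rank over GF(2^8) by Gauss-Jordan elimination on a copy of rows."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = gf_inv(m[rank][col])
        m[rank] = [gf_mul(inv, x) for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                factor = m[i][col]
                m[i] = [x ^ gf_mul(factor, y) for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def check_row(matrix, row_bytes) -> bool:
    """True if the unmasked bytes at one offset agree with the encoding matrix."""
    augmented = [list(r) + [b] for r, b in zip(matrix, row_bytes)]
    return gf_rank(augmented) == gf_rank(matrix)
```

A row that fails this test contains at least one corrupted byte, although the test alone does not say which one.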
FMSR-DIP: Download Download chunks from any 2 nodes and verify with their MACs
FMSR-DIP: Download Remove pseudorandom values and pass to NCCloud for decoding
FMSR-DIP: Repair
FMSR-DIP: Repair Download 1 chunk from all other nodes and verify with their MACs
FMSR-DIP: Repair Remove pseudorandom values and pass to NCCloud
FMSR-DIP: Repair NCCloud generates new chunks
FMSR-DIP: Repair Process the newly generated chunks as before
FMSR-DIP: Repair Upload chunks and update metadata on all nodes
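The chunk-generation step of the repair can be sketched as a random linear combination of the downloaded chunks, with the same coefficients applied to their encoding-matrix rows so that the metadata stays in sync. The sketch assumes GF(2^8) with polynomial 0x11B and uses hypothetical function names; it also leaves out the check that the updated matrix still lets any k nodes rebuild the file, which a real implementation would need.

```python
import os

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1 (assumed polynomial)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def combine(coeffs, vectors) -> list:
    """Element-wise linear combination sum_i coeffs[i] * vectors[i]."""
    out = [0] * len(vectors[0])
    for c, vec in zip(coeffs, vectors):
        for pos, v in enumerate(vec):
            out[pos] ^= gf_mul(c, v)
    return out

def regenerate(downloaded_chunks, downloaded_rows, new_count):
    """Build new code chunks and their matrix rows from the surviving chunks."""
    new_chunks, new_rows = [], []
    for _ in range(new_count):
        coeffs = list(os.urandom(len(downloaded_chunks)))
        new_chunks.append(bytes(combine(coeffs, downloaded_chunks)))
        # Same coefficients applied to the rows keep chunk and metadata aligned
        new_rows.append(combine(coeffs, downloaded_rows))
    return new_chunks, new_rows
```

By linearity, each new chunk equals its new matrix row applied to the native chunks, so the new chunks can be processed and uploaded like freshly encoded ones.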
FMSR-DIP: Experiments Testbed environment: OpenStack Swift 1.4.2, 1 proxy node connected to 15 storage nodes (LAN), NCCloud and FMSR-DIP deployed on proxy node, NCCloud uses a RAM disk as storage Storage scheme: (4,2)-FMSR
Running Time vs. File Size [Figure: time taken (s) for UPLOAD, DOWNLOAD, and REPAIR at file sizes 1MB, 5MB, 10MB, 20MB, 50MB, and 100MB, broken down into Transfer-Up, Transfer-Down, DIP-Encode, DIP-Decode, and FMSR] FMSR-DIP overhead comparable to network transfer time in a LAN environment
The Check Operation [Figure: time taken (s) of a 1% check for download block sizes from 256B to 256KB, and time taken (s) with a 256KB download block size for checking percentages from 100% down to 1%; both broken down into PRF, Rank Checking, Transfer-Down, and Misc.] Bottleneck in network transfer
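A checking percentage can be turned into a concrete set of rows to download with a few lines of code. The sketch below is only one possible way to do it; the function name and the choice of always checking at least one row are assumptions.

```python
import math
import random

def sample_rows(num_rows: int, percentage: float, rng=None) -> list:
    """Pick distinct row offsets to check; at least one row is always chosen."""
    # SystemRandom draws from the OS entropy source, so the rows that
    # will be checked are hard to predict in advance
    rng = rng or random.SystemRandom()
    count = max(1, math.ceil(num_rows * percentage / 100))
    return sorted(rng.sample(range(num_rows), count))
```

For example, `sample_rows(1000, 1)` returns 10 distinct offsets, while a percentage of 100 returns every row.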
Conclusions Propose a design for efficient data integrity protection using FMSR on thin clouds Implement and evaluate the efficiency of the design Source code: NCCloud http://ansrlab.cse.cuhk.edu.hk/software/nccloud/ FMSRDIP http://ansrlab.cse.cuhk.edu.hk/software/fmsrdip/
Thank You!
Error Localization Assume each byte is correct in turn
Error Localization Form a system with bytes from k other nodes
Error Localization Mark all involved bytes as correct if system is consistent
Error Localization Try all subsets
Error Localization And so on
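One possible reading of the localization steps above, for a single row of bytes, is sketched below: each byte is combined in turn with the bytes held by every subset of k other nodes, and all bytes in a consistent system are marked correct. This is an interpretation of the slides rather than the authors' exact procedure; it assumes GF(2^8) with polynomial 0x11B, n-k code chunks per node stored in node order, and hypothetical function names.

```python
from itertools import combinations

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1 (assumed polynomial)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a: int) -> int:
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def gf_rank(rows) -> int:
    """Rank over GF(2^8) by Gauss-Jordan elimination on a copy of rows."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = gf_inv(m[rank][col])
        m[rank] = [gf_mul(inv, x) for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                factor = m[i][col]
                m[i] = [x ^ gf_mul(factor, y) for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def consistent(rows, values) -> bool:
    augmented = [list(r) + [v] for r, v in zip(rows, values)]
    return gf_rank(augmented) == gf_rank(rows)

def localize(matrix, row_bytes, n: int, k: int) -> list:
    """Return the indices of bytes never confirmed by a consistent system."""
    per_node = n - k
    correct = set()
    for idx in range(len(row_bytes)):
        home = idx // per_node  # node that stores this byte
        others = [node for node in range(n) if node != home]
        for subset in combinations(others, k):
            members = [idx] + [node * per_node + c
                               for node in subset for c in range(per_node)]
            if consistent([matrix[m] for m in members],
                          [row_bytes[m] for m in members]):
                correct.update(members)
    return sorted(set(range(len(row_bytes))) - correct)
```

The sketch can be fooled when several corrupted bytes happen to form a consistent system, so its output is best treated as a list of suspects.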
Cloud Storage Pricing (in US dollars, as of May 2012)
                              S3        Rackspace   Azure
Storage (per GB)              $0.125    $0.15       $0.125
Data transfer in (per GB)     free      free        free
Data transfer out (per GB)    $0.12     $0.18       $0.12
PUT (per 10,000 requests)     $0.10     free        $0.01
GET (per 10,000 requests)     $0.01     free        $0.01