Reducing Replication Bandwidth for Distributed Document Databases
1 Reducing Replication Bandwidth for Distributed Document Databases
Lianghong Xu (1), Andy Pavlo (1), Sudipta Sengupta (2), Jin Li (2), Greg Ganger (1)
(1) Carnegie Mellon University, (2) Microsoft Research
2 Document-oriented Databases
{
  "_id" : "55ca4cf7bad4f75b8eb5c25c",
  "pageid" : "46780",
  "revid" : "41173",
  "timestamp" : " T20:06:22",
  "sha1" : "6i81h1zt22u1w4sfxoofyzmxd",
  "text" : "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta ]] in two acts just as predicting, The fairy Queen, however, appears to all live happily ever after. "
}
Update
{
  "_id" : "55ca4cf7bad4f75b8eb5c25d",
  "pageid" : "46780",
  "revid" : "128520",
  "timestamp" : " T20:11:12",
  "sha1" : "q08x58kbjmyljj4bow3e903uz",
  "text" : "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta ]] in two acts just as predicted, The fairy Queen, on the other hand, is ''not'' happy, and appears to all live happily ever after. "
}
Update: reading a recent doc and writing back a similar one.
3 Replication Bandwidth
[Figure: the primary database ships its operation logs over the WAN to two secondaries. The operation logs carry both full versions of the document shown on slide 2 (revid 41173 and revid 128520).]
4 Replication Bandwidth
[Figure: same diagram as slide 3. The primary database ships operation logs containing both full document versions over the WAN to two secondaries.]
Goal: reduce WAN bandwidth for geo-replication.
5 Why Deduplication?
- Why not just compress? Oplog batches are small and do not contain enough overlap.
- Why not just use diff? It needs application guidance to identify the source.
- Dedup finds and removes redundancies in the entire data corpus.
6 Traditional Dedup: Ideal
[Figure: incoming data (byte stream) and the resulting deduped data. Legend: chunk boundary, modified region, duplicate region.]
Send deduped data to replicas.
7 Traditional Dedup: Reality
[Figure: incoming data and the resulting deduped data. Legend: chunk boundary, modified region, duplicate region. Label: "4".]
8 Traditional Dedup: Reality
[Figure: same diagram as slide 7.]
Send almost the entire document.
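The effect on slides 7 and 8 can be tried out with a minimal sketch. It is not the authors' code: it assumes a toy content-defined chunker in which CRC32 over a sliding window stands in for Rabin fingerprinting, and the names `chunks`, `bytes_to_send`, `WINDOW`, and `MASK` are hypothetical. The document contents are synthetic bytes, with one byte flipped every 100 bytes to mimic small edits scattered through a document.

```python
import hashlib
import zlib

WINDOW = 16   # bytes hashed when deciding on a chunk boundary (assumed value)
MASK = 0x3F   # boundary when the low 6 bits of the window hash are zero (assumed)


def chunks(data):
    """Split data at content-defined boundaries (CRC32 stands in for Rabin)."""
    out, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        # Keep every chunk at least WINDOW bytes long.
        if end - start >= WINDOW and zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            out.append(data[start:end])
            start = end
    if start < len(data):
        out.append(data[start:])  # trailing remainder
    return out


def bytes_to_send(old, new):
    """Bytes of `new` that remain after dropping chunks already present in `old`."""
    known = {hashlib.sha1(c).digest() for c in chunks(old)}
    return sum(len(c) for c in chunks(new)
               if hashlib.sha1(c).digest() not in known)


# Synthetic 2048-byte "document" and a revision with small scattered edits.
old = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(64))
revised = bytearray(old)
for pos in range(0, len(revised), 100):
    revised[pos] ^= 0xFF
new = bytes(revised)

print("document size:", len(new))
print("bytes still sent after chunk-level dedup:", bytes_to_send(old, new))
```

Any chunk that contains even one modified byte no longer matches a known chunk and has to be sent whole. When edits are spread out at a spacing close to the chunk size, most chunks are affected.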
9 Similarity Dedup (sdedup)
[Figure: incoming data and the resulting deduped data, labeled "Delta!". Legend: chunk boundary, modified region, duplicate region.]
Only send delta encoding.
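To make "only send delta encoding" concrete, here is a small sketch of encoding a new document against a similar source document. The slides do not describe the delta algorithm sdedup uses, so Python's `difflib.SequenceMatcher` is swapped in purely for illustration. The function names and the copy/insert operation format are hypothetical. The two strings are the `text` fields from slide 2.

```python
from difflib import SequenceMatcher


def encode_delta(source, target):
    """Describe `target` as copy-from-source and insert-literal operations."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))        # reuse source[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))    # new literal text
        # "delete" needs no operation: that source range is simply not copied.
    return ops


def decode_delta(source, ops):
    """Rebuild the target from the source document and the delta."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            parts.append(source[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)


source = ("The Peer and the Peri is a comic [[Gilbert and Sullivan]] "
          "[[operetta ]] in two acts just as predicting, The fairy Queen, "
          "however, appears to all live happily ever after. ")
target = ("The Peer and the Peri is a comic [[Gilbert and Sullivan]] "
          "[[operetta ]] in two acts just as predicted, The fairy Queen, "
          "on the other hand, is ''not'' happy, and appears to all live "
          "happily ever after. ")

delta = encode_delta(source, target)
literal_chars = sum(len(op[1]) for op in delta if op[0] == "insert")
print("target length:", len(target), "literal characters in delta:", literal_chars)
```

The receiver needs the same source document to decode. On slide 11, the secondary obtains the source documents for this step.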
10 Compress vs. Dedup
[Figure: chart comparing compression and dedup; data values not captured in the transcript.]
Setup: 20GB sampled Wikipedia dataset, MongoDB v2.7, 4MB oplog batches.
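11 sdedup Integration
[Figure: data flow between a primary node and a secondary node.]
- Primary node: the client sends insertions and updates to the database, which records them in the oplog. The sdedup encoder takes unsynchronized oplog entries, together with source documents from the source document cache, and produces deduped oplog entries.
- Secondary node: the oplog syncer receives the deduped oplog entries. The sdedup decoder uses source documents from the database to re-construct the oplog entries, which go into the oplog and are replayed into the database.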
12 sdedup Encoding Steps
- Identify similar documents
- Select the best match
- Delta compression
13 Identify Similar Documents
[Figure: target document -> Rabin chunking -> consistent sampling -> similarity sketch -> feature index table -> candidate documents.]
Candidate documents and similarity scores:
  Candidate   Similarity score
  Doc #1      1
  Doc #2      2
  Doc #3      2
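The pipeline on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation. CRC32 over a sliding window stands in for Rabin chunking. "Consistent sampling" is modeled as keeping the largest chunk hashes, so that similar documents tend to select the same features. The score is assumed to be the number of sketch features a candidate shares with the target. `FeatureIndex`, `sketch`, `chunk`, and all constants are hypothetical names and values, and the documents are synthetic.

```python
import hashlib
import zlib
from collections import Counter, defaultdict

WINDOW = 16       # sliding-window width used to pick chunk boundaries (assumed)
MASK = 0x1F       # boundary when the low 5 bits of the window hash are zero (assumed)
SKETCH_SIZE = 4   # number of features kept per document (assumed)


def chunk(data):
    """Content-defined chunking; CRC32 stands in for a Rabin fingerprint."""
    out, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if end - start >= WINDOW and zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            out.append(data[start:end])
            start = end
    if start < len(data):
        out.append(data[start:])
    return out


def sketch(data):
    """Consistent sampling: keep the SKETCH_SIZE largest chunk hashes."""
    features = {hashlib.sha1(c).hexdigest()[:16] for c in chunk(data)}
    return sorted(features, reverse=True)[:SKETCH_SIZE]


class FeatureIndex:
    """Maps each feature to the documents whose sketch contains it."""

    def __init__(self):
        self.table = defaultdict(set)

    def add(self, doc_id, data):
        for feature in sketch(data):
            self.table[feature].add(doc_id)

    def candidates(self, data):
        """Score = number of sketch features shared with the target document."""
        scores = Counter()
        for feature in sketch(data):
            for doc_id in self.table.get(feature, ()):
                scores[doc_id] += 1
        return scores


def synthetic(seed, blocks=32):
    """Deterministic pseudo-random bytes, used only as stand-in document content."""
    return b"".join(hashlib.sha256(f"{seed}-{i}".encode()).digest()
                    for i in range(blocks))


doc_a = synthetic("a")
doc_b = doc_a[:500] + b"edited" + doc_a[500:]   # near-duplicate of doc_a
doc_c = synthetic("c")                          # unrelated content

index = FeatureIndex()
index.add("Doc #1", doc_a)
index.add("Doc #2", doc_b)
index.add("Doc #3", doc_c)

target = doc_a[:900] + b"another edit" + doc_a[900:]
print(index.candidates(target).most_common())
```

Only the sketch is indexed, not every chunk, which keeps the index much smaller than a full chunk index.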
14 Select the Best Match
Initial ranking:
  Rank   Candidate   Score
  1      Doc #2      2
  1      Doc #3      2
  2      Doc #1      1
Is the doc cached in the source document cache? If yes, reward +2.
Final ranking:
  Rank   Candidate   Cached?   Score
  1      Doc #3      Yes       4
  2      Doc #1      Yes       3
  3      Doc #2      No        2
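The re-ranking on this slide is simple arithmetic, sketched below with the slide's own numbers. The function name `rank_candidates` and the tie-break by document name are assumptions made for illustration. The +2 reward and the scores come from the slide.

```python
CACHE_REWARD = 2  # bonus from the slide for candidates in the source document cache


def rank_candidates(scores, cached):
    """Add the cache reward, then sort by final score, highest first."""
    final = {doc: score + (CACHE_REWARD if doc in cached else 0)
             for doc, score in scores.items()}
    # Tie-break by document name (an assumption; the slide does not specify one).
    return sorted(final.items(), key=lambda item: (-item[1], item[0]))


initial_scores = {"Doc #1": 1, "Doc #2": 2, "Doc #3": 2}
cached_docs = {"Doc #1", "Doc #3"}

ranking = rank_candidates(initial_scores, cached_docs)
print(ranking)  # [('Doc #3', 4), ('Doc #1', 3), ('Doc #2', 2)]
```

The reward lets a cached document overtake an equally or slightly more similar candidate that is not in the cache. Here Doc #1 overtakes Doc #2.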
15 Evaluation
- MongoDB setup (v2.7): 1 primary, 1 secondary node, 1 client
- Node config: 4 cores, 8GB RAM, 100GB HDD storage
- Datasets: Wikipedia dump (20GB out of ~12TB); additional datasets evaluated in the paper
16 Compression
[Figure: compression ratio of sdedup vs. trad-dedup by chunk size (tick labels as transcribed: "KB", 1KB, 256B, 64B); 20GB sampled Wikipedia dataset. Data values not captured in the transcript.]
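17 Memory
[Figure: memory (MB) of sdedup vs. trad-dedup by chunk size (tick labels as transcribed: "KB", 1KB, 256B, 64B); one y-axis tick, 800, was transcribed; 20GB sampled Wikipedia dataset. Data values not captured in the transcript.]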
18 Other Results (See Paper)
- Negligible client performance overhead
- Failure recovery is quick and easy
- Sharding does not hurt compression rate
- More datasets: Microsoft Exchange, Stack Exchange
19 Conclusion & Future Work
- sdedup: similarity-based deduplication for replicated document databases
  - Much greater data reduction than traditional dedup
  - Up to 38x compression ratio for Wikipedia
  - Resource-efficient design with negligible overhead
- Future work
  - More diverse datasets
  - Dedup for local database storage
  - Different similarity search schemes (e.g., super-fingerprints)