Object Storage, Cloud Storage, and High Capacity File Systems




A Primer on Object Storage, Cloud Storage, and High Capacity File Systems
Presented by: Chris Robertson, Sr. Solution Architect, Cambridge Computer
Copyright 2010-2011, Cambridge Computer Services, Inc. All Rights Reserved
www.cambridgecomputer.com 781-250-3000

About Your Lecturer: Chris Robertson
- SA at Cambridge Computer
  - 25% of my time I do what industry analysts do
  - 75% of my time is client-facing, solving problems and reconciling to budgets
- Cambridge Computer
  - Expertise in storage networking, data protection, and data life cycle management
  - Founded in 1991
  - Based in Boston with regional teams spread around the country
  - Unique business model with no costs or commitments to our clients (ask us how this is possible)
- Clients of all shapes and sizes
  - Museums, K12, Defense Contractors, Banks, etc.
  - Everyone has data. No one wants to lose it!

A Unique Business Model: Combining the Best of All Worlds...

What is Cloud Storage?
- The Cloud has the same challenges that any other enterprise has
  - The design challenges of cloud storage are relevant to private users with large private data collections.
- Cloud storage has three major incarnations
  - Enterprise storage for applications that are hosted in the cloud
    - Dynamic provisioning of storage with careful attention to balancing capacity and performance.
  - Hosted backups
    - Granular, efficient backups with data automatically stored off site
  - Redundant object storage
    - Geographically dispersed, redundant storage for data that does not change much.

A Typical Cloud Service: 3 Copies of Each Object Stored Somewhere

The Cloud is Accessed Through SOAP/REST Software Interface
- File or block interface
- On-Ramp: dedicated appliance and/or software app
- SOAP/REST interface
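
To make the REST side of this concrete, here is a minimal sketch of the requests an on-ramp might issue on behalf of a file or block client. The endpoint, bucket, and key names are hypothetical, the requests are only built and never sent, and a real service would also require authentication headers that are not shown.

```python
import urllib.request

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "https://objects.example.com"


def build_put(bucket: str, key: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that would store one object."""
    return urllib.request.Request(
        url=f"{ENDPOINT}/{bucket}/{key}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_get(bucket: str, key: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP GET that would fetch one object."""
    return urllib.request.Request(url=f"{ENDPOINT}/{bucket}/{key}", method="GET")


put_request = build_put("archive", "report.doc", b"file contents")
get_request = build_get("archive", "report.doc")
print(put_request.get_method(), put_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

The point of the sketch is that the whole object travels in a single request addressed by name, rather than as a stream of block reads and writes.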

Doubling is Serious Business

Traditional Storage Models Don't Scale
- Data accumulates over time
  - If your primary storage capacity doubles, then BOTH the CAPACITY and the SPEED of your backup system must double.
  - Backups take too long. Restores take too long.
- Storage devices become BRITTLE as they get bigger and bigger
  - The bigger they are, the harder they fall
- Wholesale data migration between storage devices is impractical.
  - Massive storage systems must allow for in-place upgrades.

Moving a PB is Heavy Lifting

Data Rate      Example                                                           Total Time (Approximate)
140 MB/sec     LTO-5 tape drive at full tilt without factoring in compression    82.5 days
1 GB/sec       A beefy Virtual Tape Library; a dedicated 10Gb Ethernet           11 days
1.5 Mb/sec     A dedicated T-1                                                   176 years
156 Mb/sec     An OC3                                                            640 days
2488 Mb/sec    An OC48                                                           40 days
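
The arithmetic behind these figures can be sketched as follows. This assumes a decimal petabyte (10^15 bytes), a fully saturated link, and no protocol overhead, none of which the slide states, so the results land close to, but not exactly on, the slide's rounded numbers.

```python
PETABYTE = 10**15          # bytes; decimal PB is an assumption
SECONDS_PER_DAY = 86_400


def megabits(rate_mbit: float) -> float:
    """Convert a line rate in megabits per second to bytes per second."""
    return rate_mbit * 1e6 / 8


def transfer_days(size_bytes: float, bytes_per_sec: float) -> float:
    """Days needed to move size_bytes at a constant, fully used rate."""
    return size_bytes / bytes_per_sec / SECONDS_PER_DAY


LINKS = {
    "LTO-5 tape drive (140 MB/sec)": 140e6,
    "VTL or 10Gb Ethernet (1 GB/sec)": 1e9,
    "T-1 (1.5 Mb/sec)": megabits(1.5),
    "OC3 (156 Mb/sec)": megabits(156),
    "OC48 (2488 Mb/sec)": megabits(2488),
}

for name, rate in LINKS.items():
    days = transfer_days(PETABYTE, rate)
    print(f"{name:34s} {days:12.1f} days ({days / 365:8.1f} years)")
```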

What is Wrong with RAID?

Bigger Hard Drives: Friend or Foe?
- The Good News
  - As drives grow bigger we can achieve more capacity with fewer devices
  - Fewer devices = higher density, lower power consumption, fewer device failures
- The Bad News
  - MTBF not growing as fast
  - Bandwidth into device not growing as fast
- Consequences
  - Unreliability (per bit) growing
  - Accessibility of data (per bit) shrinking
  - Drive rebuild times are longer, which increases overall risk of data loss
  - Rebuilding failed drives has a heavier impact on performance

RAID Rebuilds Take Too Long
- RAID 5 rebuilds take too long
  - On the order of 36 hours per TB
  - A 4TB drive could take a week to rebuild
- RAID 6 (double parity) offers some protection
  - But what happens when we have 8TB drives?
- The more stuff you have, the higher the chance of failures.
  - If you have 1PB or more, something will always be broken
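
A quick worked check of the rebuild figures, assuming rebuild time scales linearly with drive size at the slide's rough rate of 36 hours per TB (real rebuild rates depend on the controller and on how busy the array is):

```python
HOURS_PER_TB = 36  # the slide's order-of-magnitude figure


def rebuild_days(drive_tb: float, hours_per_tb: float = HOURS_PER_TB) -> float:
    """Estimated rebuild time in days, assuming linear scaling with capacity."""
    return drive_tb * hours_per_tb / 24


for size_tb in (1, 4, 8):
    print(f"{size_tb} TB drive: about {rebuild_days(size_tb):.1f} days")
```

Under that assumption a 4TB drive needs about six days, which is the "week" on the slide, and an 8TB drive needs about twelve.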

Redundancy Between Cabinets: Can You Have Too Much Redundancy?
- Is this really a good idea?
- How long will it take to re-mirror a 14TB RAID 6 stripe?
- Is there a better way to protect against a device failure?
  - Replication?
  - Backup?
  - Mirroring at a different level of abstraction?

How Big is the Building Block?

What Are You Building?              What Size Building Block?
An outhouse?                        Brick
The foundation for a new house?     Cinder Block
A pyramid?                          Boulder
A parking garage?                   Grains of Sand (Concrete)

Object Storage: More than Just the Cloud

Objects Represent a Different Way to Address Data
- Block: blocks are addressed by Device ID and sequential block number.
- File: files are addressed by UNC paths: \\MyServer\MyFolder\MyFile.doc
- Object: objects are addressed by an ID that is unique to the storage system.
  - Sequentially assigned number
  - Randomly assigned number
  - A hash derived as a function of the object's content
  - A combination of things
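
The four ID schemes for objects can be illustrated with a short sketch. The function names, the choice of SHA-256, and the ID formats are all assumptions made for illustration; the slides do not say which any particular product uses.

```python
import hashlib
import itertools
import uuid

_counter = itertools.count(1)  # hypothetical per-system sequence


def sequential_id() -> str:
    """Sequentially assigned number."""
    return f"{next(_counter):012d}"


def random_id() -> str:
    """Randomly assigned number (a random UUID rendered as hex)."""
    return uuid.uuid4().hex


def content_id(data: bytes) -> str:
    """Hash derived as a function of the object's content."""
    return hashlib.sha256(data).hexdigest()


def combined_id(namespace: str, data: bytes) -> str:
    """A combination of things: here, a namespace plus the content hash."""
    return f"{namespace}:{content_id(data)}"


payload = b"MyFile.doc contents"
print(sequential_id())
print(random_id())
print(content_id(payload))
print(combined_id("tenant-a", payload))
```

Only the content-derived ID is reproducible from the data alone; the sequential and random IDs need an index that maps them back to the object.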

What is an Object?
- An object is a chunk of data that can be individually addressed and manipulated
  - A file is a chunk of data
  - A zip file containing many files is a chunk of data
  - A file can be made up of several chunks of data
  - A block is a chunk of data
  - A volume (a range of blocks) is made up of chunks of data
  - Pages, extents, chunks, chunklets are objects consisting of multiple blocks
- Email?
  - An email message is a chunk of data
  - An email attachment is a chunk of data
  - An email message along with its attachments could be treated as a single chunk of data.
- Often objects have associated metadata
  - Descriptive information or tags
  - Provenance

Content Addressing
- Content addressing calculates a hash of the data that makes up the object and uses the hash as an address
- Locality independence
  - An object can live in multiple locations for:
    - Redundancy
    - Parallelism
    - Local processing affinity
- Data integrity
  - The object can be compared against its hash for integrity checking
  - If the hash test fails, simply retrieve a copy of the object and repair the corrupt object
- Deduplication
  - Two objects with the same name (hash) are actually the same object
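
A minimal in-memory sketch of these properties, assuming SHA-256 as the hash and two storage locations. The class and function names are hypothetical and do not describe any specific product.

```python
import hashlib


def digest(data: bytes) -> str:
    """Address of an object: the hash of its content (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """One storage location holding objects keyed by their content hash."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        address = digest(data)
        self.objects[address] = data  # identical content lands in the same slot
        return address

    def is_intact(self, address: str) -> bool:
        data = self.objects.get(address)
        return data is not None and digest(data) == address


def read_with_repair(address: str, primary: ContentStore, replica: ContentStore) -> bytes:
    """Return the object, repairing the primary copy from the replica if needed."""
    if not primary.is_intact(address):
        if not replica.is_intact(address):
            raise IOError("no intact copy available")
        primary.objects[address] = replica.objects[address]
    return primary.objects[address]


site_a, site_b = ContentStore(), ContentStore()
address = site_a.put(b"scan-0001")
site_b.put(b"scan-0001")           # same content, same address at both sites
site_a.put(b"scan-0001")           # deduplicated: still one object at site A
site_a.objects[address] = b"bit rot"  # simulate silent corruption
print(read_with_repair(address, site_a, site_b))
```

Because the address is derived from the content, the integrity check needs no separate checksum table, and any location holding an intact copy can serve as the repair source.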

Self Healing and Data Protection in Object Stores

Basic Object-Level Redundancy: An Alternative to RAID and Mirroring

Redundant Objects Propagate on Device Failure
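
The slide itself is a diagram, so the following is only a sketch of the general idea: when a device fails, every object that lost a copy is re-copied to a surviving device until the target replica count is restored. The placement rule (least-full device first) and all names here are assumptions, not a description of any product.

```python
REPLICAS = 3  # assumption taken from the "3 Copies of Each Object" slide


def least_full(cluster, exclude=()):
    """Device names ordered by how few objects they hold (ties broken by name)."""
    return sorted((d for d in cluster if d not in exclude),
                  key=lambda d: (len(cluster[d]), d))


def store(cluster, object_id, copies=REPLICAS):
    """Place an object on the least-full devices, one copy per device."""
    for device in least_full(cluster)[:copies]:
        cluster[device].add(object_id)


def fail_device(cluster, failed, copies=REPLICAS):
    """Drop a device, then re-copy its objects onto surviving devices."""
    lost = cluster.pop(failed)
    for object_id in sorted(lost):
        holders = [d for d in cluster if object_id in cluster[d]]
        missing = max(0, copies - len(holders))
        for device in least_full(cluster, exclude=holders)[:missing]:
            cluster[device].add(object_id)


def copies_of(cluster, object_id):
    """Number of devices currently holding the object."""
    return sum(object_id in held for held in cluster.values())


cluster = {f"dev{i}": set() for i in range(5)}
for n in range(10):
    store(cluster, f"obj{n}")
fail_device(cluster, "dev0")
print({f"obj{n}": copies_of(cluster, f"obj{n}") for n in range(10)})
```

Unlike a RAID rebuild, the repair work is spread across all surviving devices rather than funneled into one replacement drive.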

Object Mirroring Across a WAN

Erasure-Coded Data Protection: An Alternative to Parity-Based RAID

You Can Lose X% of Your Storage Without Losing Data
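
The "X%" can be worked out for any given layout. The sketch below assumes an ideal erasure code in which an object is split into n fragments and any k of them are enough to rebuild it (Reed-Solomon codes behave this way). The 10-of-16 layout is a hypothetical example, not a figure from the slides.

```python
def erasure_profile(data_fragments: int, total_fragments: int) -> dict:
    """Loss tolerance and raw-capacity cost for a k-of-n erasure code."""
    k, n = data_fragments, total_fragments
    if not 0 < k <= n:
        raise ValueError("need 0 < data_fragments <= total_fragments")
    return {
        "tolerated_losses": n - k,        # fragments that may disappear
        "loss_fraction": (n - k) / n,     # the slide's "X%" as a fraction
        "raw_per_usable": n / k,          # raw capacity bought per usable unit
    }


# Three full copies are the special case k=1, n=3.
for label, k, n in (("3 copies", 1, 3), ("10-of-16 (hypothetical)", 10, 16)):
    p = erasure_profile(k, n)
    print(f"{label}: survive {p['tolerated_losses']} losses "
          f"({p['loss_fraction']:.1%}), {p['raw_per_usable']:.2f}x raw capacity")
```

The comparison shows the trade: three copies survive two losses at 3x raw capacity, while the hypothetical 10-of-16 layout survives six losses at 1.6x, in exchange for the compute and network cost of encoding and reassembly.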

Some Real-World Examples of Object-Based Storage

Splitting SAN I/O into a Block Stream and an Object Stream

Object-Based File System with Erasure Coding and Global Dedupe

SharePoint with External Blob Storage (Gateway of some sort)

Shared File System Leveraging a Cloud-based Object Store

Object-Based Archive File System: Automatic Back up to Tape

The Mwah Hah Hah Plan to Conquer the World

Summary of What We Have Today
- Application software that manages files on CIFS and NFS volumes for a single location
  - Out of band
  - Respects ACLs and UIDs/GUIDs
- Basic support for cloud stores (S3) and object stores
- Key-value metadata
- MySQL back end
  - Support for 500K to 1B files
- Admin GUI
- User GUI
- REST-based API
- Multi-threaded crawler
- Policy-based multi-threaded data mover
- Backup copies with versioning
- Reporting with duplicate file detection