The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc.




The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms Abhijith Shenoy Engineer, Hedvig Inc. @hedviginc

The need for new architectures
- Business innovation, time-to-market, flexible infrastructure
- Business executives, developers, IT infrastructure / DevOps

Modern apps need...
- Scale
- Flexibility
- Self-service
- Automation
To achieve this, the world is moving to a software-defined, distributed systems approach.

Software-defined Storage

Software-defined storage
Commodity servers + software = software-defined storage

Common software-defined storage elements
- Storage software: forms an elastic storage cluster with commodity servers or cloud infrastructure
- Proxy/client: provides storage access to the application compute environment via common protocols

Modern software-defined storage architectures
- Hyperconverged
- Hyperscale

Spanning multiple DCs and clouds
- Hyperconverged
- Hyperscale

Data protection with SDS: RAID, Erasure Coding & Replication

Protecting stored data: RAID
Redundant Array of Independent Disks
- Divides or replicates data across multiple drives to deliver performance and fault tolerance
- Commonly used: RAID 0, RAID 1, RAID 5, RAID 10
Pros
- Trusted protection solution in the traditional array world
- Known performance delivery
Cons
- High-capacity drive (8TB+) rebuilds can take days or even weeks
- RAID controllers add complexity for requisite performance
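The rebuild-time concern can be made concrete with a rough lower bound: a rebuild has to rewrite the full capacity of the replacement drive, so time is at least capacity divided by sustained rebuild throughput. The sketch below is only back-of-the-envelope arithmetic. The throughput figures are illustrative assumptions, not numbers from the slides, and real rebuilds also depend on RAID level, controller, and production load.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Lower bound on rebuild time: the whole drive is rewritten once."""
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB
    return capacity_mb / rebuild_mb_per_s / 3600


# Assumed sustained rebuild rates (MB/s): idle array, moderate load, heavy load.
for rate in (150, 50, 10):
    hours = rebuild_hours(8, rate)
    print(f"{rate:>4} MB/s -> {hours:6.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions an 8 TB drive needs roughly 15 hours at 150 MB/s, but more than nine days if production traffic throttles the rebuild to 10 MB/s.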

Protecting stored data: Erasure coding
A parity-based protection technique
- Data is broken into fragments and encoded
- Stored across different locations with a configurable number of redundant pieces
Pros
- Consumes less storage than replication; good for cheap/deep storage
- Allows for the failure of two or more elements of a storage system
Cons
- Parity calculation is CPU-intensive
- Increased latency can slow production writes and rebuilds

How erasure coding works
Split a file into n chunks and code into m parity blocks
(Diagram: file A is split into X1 and X2, which are encoded into A1, A2, A3, and A4.)

How erasure coding works
Tolerate m erasures (failures)
A1 = X1
A2 = X2
A3 = X1 + X2
A4 = X1 + 2*X2

How erasure coding works
In a distributed system, chunks are spread across nodes. In this example, any 2 of the 4 nodes can fail and the data can still be rebuilt.
Node 1: A1 = X1
Node 2: A2 = X2
Node 3: A3 = X1 + X2
Node 4: A4 = X1 + 2*X2
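The four-chunk example can be sketched in a few lines of Python. This toy version uses ordinary integer arithmetic exactly as the equations A1 = X1, A2 = X2, A3 = X1 + X2, A4 = X1 + 2*X2 are written. Production erasure codes typically do the same algebra over a finite (Galois) field so that chunks stay fixed-size; that detail is not covered in the slides, and the function names here are illustrative only.

```python
from itertools import combinations

# Coefficients (a, b) such that chunk = a*X1 + b*X2, matching the example.
COEFFS = [(1, 0), (0, 1), (1, 1), (1, 2)]


def encode(x1: int, x2: int) -> list[int]:
    """Produce the four chunks A1..A4 from the two data pieces."""
    return [a * x1 + b * x2 for a, b in COEFFS]


def decode(survivors: dict[int, int]) -> tuple[int, int]:
    """Recover (X1, X2) from any two surviving chunks, keyed by index 0..3."""
    (i, ci), (j, cj) = list(survivors.items())[:2]
    (a, b), (c, d) = COEFFS[i], COEFFS[j]
    det = a * d - b * c  # nonzero for every pair of rows in COEFFS
    # Cramer's rule; the divisions are exact because X1 and X2 are integers.
    x1 = (ci * d - b * cj) // det
    x2 = (a * cj - c * ci) // det
    return x1, x2


chunks = encode(7, 11)
# Lose any two nodes: every remaining pair still recovers the original data.
for i, j in combinations(range(4), 2):
    assert decode({i: chunks[i], j: chunks[j]}) == (7, 11)
```

Any two of the four equations form a solvable 2x2 linear system, which is why two node failures are survivable in this layout.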

Erasure coding use case: Archival storage
Goal
- Need long-term storage of PBs of files
- Minimizing storage costs is critical to business profitability
Solution
- Software-defined storage + erasure coding
Results
- Store and protect archival data in 1.5x disk space
- Performance adequate for workload
- Rebuilds slower than desired, but capacity savings outweigh latency
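The 1.5x figure follows from the ratio of total chunks to data chunks. The slides do not say which coding scheme was deployed, so the 4+2 and 8+4 layouts below are assumptions chosen only because they yield 1.5x; the helper names are illustrative.

```python
def ec_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Raw space consumed per unit of user data with erasure coding."""
    return (data_chunks + parity_chunks) / data_chunks


def replication_overhead(copies: int) -> float:
    """Raw space consumed per unit of user data with full copies."""
    return float(copies)


# Assumed layouts; each tolerates at least two lost chunks or copies.
print("EC 4+2:        ", ec_overhead(4, 2))
print("EC 8+4:        ", ec_overhead(8, 4))
print("3x replication:", replication_overhead(3))
```

For comparison, three full copies, which also survive two failures, consume 3x the space.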

Protecting stored data: Replication
The creation of data copies across different locations of the storage system
- Typically 2 or 3 copies, configurable based on accepted risk level
- If a drive fails, data is recreated on another drive from replica(s)
Pros
- Less CPU-intensive = faster write performance
- Simple restores = faster rebuild performance
Cons
- Requires 2x or more the original storage space

How replication works with software-defined storage
Data is broken into chunks and n copies are made across server nodes in a cluster.
(Diagram: an app host writing to Node 1, Node 2, and Node 3, located in Data Center 1, Data Center 2, and Data Center 3.)
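A minimal sketch of chunking and replica placement is shown below. The round-robin placement rule and the node names are assumptions made for illustration; the slides do not describe how any particular platform chooses replica locations.

```python
def split(data: bytes, chunk_size: int) -> list[bytes]:
    """Break data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place(num_chunks: int, nodes: list[str], copies: int) -> dict[int, list[str]]:
    """Assign each chunk to `copies` distinct nodes, round-robin (assumed rule)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    return {
        chunk: [nodes[(chunk + r) % len(nodes)] for r in range(copies)]
        for chunk in range(num_chunks)
    }


chunks = split(b"abcdefgh", 3)
placement = place(len(chunks), ["node1", "node2", "node3", "node4"], copies=3)
for chunk_id, replicas in placement.items():
    print(chunk_id, chunks[chunk_id], replicas)
```

Because each chunk's replicas land on distinct nodes, losing one node leaves two intact copies of every chunk it held.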

Offsetting replication overhead
- Compression: ~2:1 reduction
- Deduplication: ~5:1 or higher reduction
- Low disk cost: HDD and flash costs are declining
- Overhead of replication is more tolerable
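The effect of data reduction on replication overhead can be estimated with simple arithmetic. The sketch below assumes the compression and deduplication ratios combine multiplicatively, which is a simplification; actual ratios vary by data set, and the function name is illustrative.

```python
def raw_capacity_tb(
    logical_tb: float,
    copies: int,
    compression_ratio: float = 1.0,
    dedup_ratio: float = 1.0,
) -> float:
    """Raw capacity needed once data is reduced and then replicated."""
    reduced_tb = logical_tb / (compression_ratio * dedup_ratio)
    return reduced_tb * copies


# 100 TB of logical data kept as 3 copies.
print(raw_capacity_tb(100, 3))                                      # no reduction
print(raw_capacity_tb(100, 3, compression_ratio=2))                 # compression only
print(raw_capacity_tb(100, 3, compression_ratio=2, dedup_ratio=5))  # both
```

With the slide's approximate 2:1 and 5:1 ratios, three copies of 100 TB of logical data fit in about 30 TB of raw capacity, less than the unreplicated original.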

Doesn't have to be one-size-fits-all
- Modern solutions provide per-volume choice
- Choose protection type based on workload
Options shown: Block (iSCSI), NFS; 512 bytes, 64k; Agnostic, Rack-aware, Datacenter-aware; 2-6 copies
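Per-volume choice can be pictured as a small policy record attached to each volume. The class below is a hypothetical illustration, not any product's API: the field names are invented, and reading "512 bytes" and "64k" as the bounds of a block-size range is an assumption about what the slide's option list means.

```python
from dataclasses import dataclass

PROTOCOLS = {"block", "nfs"}                                 # Block (iSCSI), NFS
PLACEMENTS = {"agnostic", "rack-aware", "datacenter-aware"}
MIN_BLOCK, MAX_BLOCK = 512, 64 * 1024                        # assumed range
MIN_COPIES, MAX_COPIES = 2, 6


@dataclass(frozen=True)
class VolumePolicy:
    """Hypothetical per-volume protection settings."""
    name: str
    protocol: str
    block_size: int
    placement: str
    copies: int

    def __post_init__(self) -> None:
        if self.protocol not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {self.protocol}")
        if self.placement not in PLACEMENTS:
            raise ValueError(f"unknown placement: {self.placement}")
        if not MIN_BLOCK <= self.block_size <= MAX_BLOCK:
            raise ValueError("block size out of range")
        if not MIN_COPIES <= self.copies <= MAX_COPIES:
            raise ValueError("copies must be between 2 and 6")


# Two volumes on the same cluster with different protection choices.
database = VolumePolicy("db01", "block", 4096, "datacenter-aware", 3)
scratch = VolumePolicy("scratch", "nfs", 64 * 1024, "agnostic", 2)
print(database)
print(scratch)
```

Validating the policy at creation time keeps an out-of-range setting, such as seven copies, from ever reaching the cluster.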

Replication use case: Primary data storage
Company: Large financial organization
Situation
- Hosting 500TB of data across four datacenters in two countries
- Want maximum availability and recoverability
Solution
- Deployed software-defined storage with 4-way replication
Results
- Achieve high performance, high availability, and quick rebuilds
(Diagram: Data Center A, Data Center B, Data Center C, and Data Center D, with sites labeled Active.)

Summary
- Protection technologies are evolving along with architectures
- RAID has reached its limits with large-capacity drives
- Erasure coding is a good option for latency-tolerant, large-capacity stores
- Replication provides protection in environments with demanding performance and availability requirements
- Software-defined storage offers choice and flexibility to deploy each protection technology where it makes sense

Thank You!