With Verified Erasure Coding




Superior Availability, Integrity, Performance, Economy

Critical Big Data requirements are:
- High Data Availability (survive multiple drive failures, even during rebuild)
- Perfect Data Integrity (eliminate silent data corruption, even during rebuild)
- High Performance (minimize RAID rebuild time impacts)
- High Economy (both money and time savings)

Big Parity meets these requirements:
- Extends traditional RAID5 (N+1) and RAID6 (N+2) into N+(3:127)
- Eliminates Silent Data Corruption and Silent Data Corruption Amplification
- 7-30x the performance of Open Source Erasure Coding libraries
- Extends traditional Erasure Coding with Verified Erasure Coding

Big Parity is a C language Erasure Coding library:
- Compatible with Linux (GCC), Mac OS X (GCC) and Windows (Intel)
- Implements Verified Erasure Coding
- Fully tested, verified and supported, with multiple pending patents

Early adopters of Erasure Coding systems include:
- Oracle/Sun and Dell/Compellent ZFS: 3 parity drives
- NEC HYDRAstor: 3 parity drives
- EMC/Isilon: 4 parity drives
- Amplidata: 4 parity drives
- Cleversafe: 6 parity drives

Big Parity compared with these systems:
- Big Parity offers 7-30x the performance of any competing Erasure Coding system
- Big Parity is the only library to offer Verified Erasure Coding (patent pending)
- Big Parity is the only library to eliminate Silent Data Corruption Amplification

Compared with leading Open Source solutions:
- Published results are from highly respected academic sources
- Comparisons include Jerasure, Luby, Zooko and Cleversafe
- Multiple runs to verify stability of results

Results: Big Parity offers a 7-30x performance advantage

[Chart: Erasure Coding Relative Performance. Decoding throughput (MB/sec, axis 0 to 4000) by parity configuration (data drives, parity drives): 14,2; 12,4; 10,6. Series: Big Parity, Jerasure CRS, Jerasure RS, Luby, Cleversafe, Zooko.]

- Each additional Parity drive increases data availability by an order of magnitude
- System can survive additional failures seamlessly
- Much more reliable than hot spares
  - Unlike hot spares, pre-computed Parity drives are not contingent on successful reconstruction
- Parity drive requirements increase as a log function of data drives, not a linear function
- Bigger RAID groups have fewer components and are more reliable than multiple small groups
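How much each extra parity drive helps depends on the failure rate and group size, which the slide does not give. As a rough sketch only, the function below assumes independent drive failures with a fixed per-drive probability `p` in one exposure window and returns the chance that more drives fail than the group has parity. The function name and the failure model are illustrative assumptions, not part of Big Parity.

```c
#include <assert.h>

/*
 * Probability that more than `parity` of `drives` drives fail in one
 * window, ASSUMING independent failures with per-drive probability p
 * (0 <= p < 1). This is a binomial tail, summed term by term.
 */
double loss_probability(unsigned drives, unsigned parity, double p)
{
    assert(p >= 0.0 && p < 1.0);

    /* term starts as P(exactly 0 failures) = (1 - p)^drives */
    double term = 1.0;
    for (unsigned i = 0; i < drives; i++)
        term *= 1.0 - p;

    double tail = 0.0;
    for (unsigned k = 0; k < drives; k++) {
        /* step from P(k failures) to P(k + 1 failures) */
        term *= (double)(drives - k) / (double)(k + 1) * p / (1.0 - p);
        if (k + 1 > parity)
            tail += term;   /* this many failures exceeds the parity */
    }
    return tail;
}
```

Under this model, with 16 drives and `p = 0.01`, going from 2 to 3 tolerated failures lowers the loss probability by more than a factor of ten. Other values of `p` and group size give different ratios.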

- Silent Data Corruption is a well-documented occurrence in Big Data systems
  - Network Appliance found ~1% of disk drives had Silent Data Corruption
  - http://www.usenix.org/event/fast08/tech/full_papers/bairavasundaram/bairavasundaram.pdf
- Additional Parity drives mean Silent Data Corruption can be eliminated
  - Mathematical sums are used to validate data and correct errors
  - 100% reliable detection and correction
  - Even during reconstruction
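The slide does not say how the sums are used. The sketch below shows the standard two-parity technique, not Big Parity's patent-pending method. It keeps a plain XOR sum P and a weighted sum Q over GF(2^8) for each byte column, and the two syndromes together reveal which single drive returned a wrong byte and what the right byte is. The function names, the field polynomial 0x11d and the return codes are assumptions chosen for illustration. The check only handles one bad byte per column with all drives present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator g = 2 in GF(2^8), polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/* P = XOR of d[i]; Q = XOR of g^i * d[i], for one byte column. */
void pq_generate(const uint8_t *d, size_t n, uint8_t *p, uint8_t *q)
{
    uint8_t pp = 0, qq = 0;
    /* Horner's rule: start at the highest drive index */
    for (size_t i = n; i-- > 0; ) {
        pp ^= d[i];
        qq = (uint8_t)(gf_mul2(qq) ^ d[i]);
    }
    *p = pp;
    *q = qq;
}

/*
 * Verify one byte column and repair a single silent error in place.
 * Returns -1 if consistent, 0..n-1 if that data drive was repaired,
 * n if P was repaired, n+1 if Q was repaired, -2 if not repairable.
 */
int pq_scrub(uint8_t *d, size_t n, uint8_t *p, uint8_t *q)
{
    assert(n <= 255);   /* g^i must be distinct for every drive */

    uint8_t p2, q2;
    pq_generate(d, n, &p2, &q2);
    uint8_t sp = (uint8_t)(*p ^ p2);   /* P syndrome */
    uint8_t sq = (uint8_t)(*q ^ q2);   /* Q syndrome */

    if (!sp && !sq) return -1;
    if (sp && !sq) { *p = p2; return (int)n; }
    if (!sp && sq) { *q = q2; return (int)n + 1; }

    /* A bad data byte on drive z gives sq == g^z * sp. */
    uint8_t probe = sp;
    for (size_t z = 0; z < n; z++) {
        if (probe == sq) {
            d[z] ^= sp;   /* the error pattern equals the P syndrome */
            return (int)z;
        }
        probe = gf_mul2(probe);
    }
    return -2;
}
```

Calling `pq_scrub` on every byte column during a read or scrub pass would catch and repair one silent error per column. Handling errors while drives are also missing needs more parity than this two-sum example has.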

- Assume a RAID6 system has two failed drives that are reconstructing
  - Any Silent Data Corruption (SDC) error from any drive will be amplified and recorded on BOTH reconstructing drives permanently, without the possibility of detection or recovery
- Big Parity eliminates both Silent Data Corruption and Silent Data Corruption Amplification
- Patent-pending technique extends Erasure Coding into Verified Erasure Coding
- All other Erasure Coding systems suffer from Silent Data Corruption Amplification
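The amplification effect can be reproduced in the simplest possible setting. The sketch below uses single XOR parity with one failed drive rather than the RAID6 case on the slide, purely to keep the arithmetic short. The function names are illustrative. When a surviving drive silently returns a wrong byte, the rebuilt drive receives wrong data too, and the stripe still passes a parity check afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR parity of one byte column across n data drives. */
uint8_t xor_parity(const uint8_t *d, size_t n)
{
    uint8_t p = 0;
    for (size_t i = 0; i < n; i++)
        p ^= d[i];
    return p;
}

/*
 * Rebuild the byte of drive `failed` from the stored parity and the
 * surviving drives. Nothing here can tell whether a survivor lied.
 */
void xor_rebuild(uint8_t *d, size_t n, size_t failed, uint8_t parity)
{
    assert(failed < n);

    uint8_t v = parity;
    for (size_t i = 0; i < n; i++)
        if (i != failed)
            v ^= d[i];
    d[failed] = v;
}
```

If drive 1 flips one bit and drive 3 is then rebuilt, drive 3 ends up with the same bit flipped, and `xor_parity` over the rebuilt stripe still equals the stored parity. One corrupt drive has become two, with no remaining evidence.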

Using additional parity drives means reconstruction can be deferred:
- No need to load the system with reconstruction while critical applications are running
- N extra parity drives means 1/(N+1) as many reconstructions required
  - For example, 1 extra parity drive means ½ as many reconstructions

Using additional Parity drives means delays can be eliminated:
- Drives often have long delays during recovery operations, which can accumulate over time and delay applications
- Additional Parity drives can eliminate those delays by reconstructing delayed data
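One way to read the 1/(N+1) figure is that each reconstruction is deferred until N+1 drives have failed, and then all of them are rebuilt in one pass. The sketch below counts rebuild passes under that batching assumption. The function name and the batching policy are illustrative assumptions, not taken from the slides.

```c
#include <assert.h>

/*
 * Number of rebuild passes needed for `failures` drive failures when
 * each pass is deferred until `extra_parity + 1` drives have failed
 * and then rebuilds all of them together (assumed policy).
 */
unsigned long reconstruction_events(unsigned long failures,
                                    unsigned extra_parity)
{
    unsigned long batch = (unsigned long)extra_parity + 1;
    assert(batch != 0);   /* guards against wrap-around to zero */

    /* ceiling division without risk of overflow */
    return failures / batch + (failures % batch != 0);
}
```

With 12 failures over some period, this gives 12 passes with no extra parity, 6 with one extra parity drive and 3 with three.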

- Larger RAID groups mean fewer total disk components
  - Saves power, packaging, cooling and interconnect for each saved disk
- Additional parity disks mean fewer service events
  - N extra parity drives means 1/(N+1) as many disk service events
  - For example, 1 extra parity drive means ½ the number of disk service events

- Authored by a veteran RAID designer specifically as a Verified Erasure Coding solution for RAID systems
  - Not an academic work, but a meticulously engineered, production-level solution
- Simultaneously supports older (Vandermonde) based Erasure Codes as well as newer (Lagrange) based Erasure Codes
- Easy to extend existing RAID6 (N+2) systems into N+3
  - No need to rewrite any existing data or parity information
  - Seamless upgrade in place for older customers needing additional protection
- Full support to migrate RAID6 (N+2) in place into N+(3:127)
  - Single library can read and verify old codes and then write new codes without disrupting user data
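To illustrate why an N+2 group can grow to N+3 without rewriting anything, the sketch below computes a third parity value R for one byte column from the existing data alone. It assumes a Vandermonde-style layout in which P, Q and R use multipliers 1, g and g^2 per drive position, with g = 2 in GF(2^8) and polynomial 0x11d. That layout and the function name are assumptions for illustration. The slides do not describe Big Parity's actual code construction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator g = 2 in GF(2^8), polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/*
 * Third parity for one byte column: R = XOR of (g^2)^i * d[i].
 * The data is read-only here, and the existing P and Q are never
 * touched, so R can be added to a live N+2 group.
 */
uint8_t r_generate(const uint8_t *d, size_t n)
{
    assert(n <= 255);   /* keeps the per-drive multipliers distinct */

    uint8_t r = 0;
    /* Horner's rule with multiplier g^2 = 4 */
    for (size_t i = n; i-- > 0; )
        r = (uint8_t)(gf_mul2(gf_mul2(r)) ^ d[i]);
    return r;
}
```

Running `r_generate` over every byte column fills a new parity drive while the array stays online, since the function only reads the data drives.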

- Higher data availability
  - Orders of magnitude more protection than existing RAID5/6 strategies
- Increased data integrity with Verified Erasure Coding
  - Elimination of Silent Data Corruption in all forms
- Improved performance
  - Fewer reconstructions required
  - Errant drive latencies eliminated
- Increased Economy
  - Fewer total components
  - Fewer service events
  - Backwards compatible

- 2 Parity Drives for Data Integrity
  - Used to eliminate Silent Data Corruption, especially during reconstruction
- 1 Parity Drive for Unrecoverable Read Errors
  - Likely to occur in large systems with large drives
- 1 Parity Drive for Performance
  - Large systems are likely to have drives with high latencies
- 3 Parity Drives to reduce Reconstruction Events
  - Fewer service events required
- 7 Parity Drives in Total
  - Guaranteed data integrity, predictable performance and economical service
- More parity drives decrease service requirements even further
- Which costs more, an additional disk drive or N additional service events?

- Software or Hardware solution
  - Both are highly accelerated and mathematically proven correct
- C language solution for all major Operating Systems
  - Linux, Windows, OS X and Solaris
  - Kernel or User level
- C and Verilog solution for FPGA/ASIC
- Backwards compatible with older RAID6 codes
  - Allows update in place of older codes to newer codes
- Very simple interface, only 4 total functions required
  - Solve, Generate, Regenerate, Update
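The slides name the four functions but give no signatures or semantics, so nothing below is Big Parity's API. As a hedged sketch of what an incremental update step conventionally means in a plain P+Q scheme, the hypothetical helper `pq_update` folds a change to one data byte into both parity bytes without reading any other drive. The GF(2^8) polynomial 0x11d and generator g = 2 are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator g = 2 in GF(2^8), polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/*
 * Hypothetical helper (not the library's interface): apply the change
 * old_data -> new_data on data drive `index` to parity bytes P and Q.
 *   P' = P ^ delta
 *   Q' = Q ^ g^index * delta,  where delta = old_data ^ new_data
 */
void pq_update(uint8_t *p, uint8_t *q, size_t index,
               uint8_t old_data, uint8_t new_data)
{
    assert(index < 255);

    uint8_t delta = (uint8_t)(old_data ^ new_data);
    *p ^= delta;

    /* scale delta by g^index for the Q contribution */
    for (size_t i = 0; i < index; i++)
        delta = gf_mul2(delta);
    *q ^= delta;
}
```

A small write then costs one read and one write on the data drive and on each parity drive, regardless of how many data drives are in the group.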

- Developed by world-leading mathematicians and engineers over a period of 4 years
  - Formal mathematical proofs of correctness
  - Extensive test validation matrix
- Up to 127 Data drives and 127 Parity drives
  - Larger versions in development
- Multi-Gigabyte/second performance with near-linear scaling using standard x86 cores
- Patent-pending technology tested at >25x the performance of ZFS RAIDZ3 and >30x the performance of Jerasure
- Integration and support resources available
- Both object and source code licensing available
  - Per Unit, Per Site, Per Year or One-Time Fee

- Reduce your Time to Market
  - Leverage 4 calendar years and >50 man-years of development
- Offer an important reliability upgrade to your existing customers
  - Be first to deliver best-of-breed data protection to your customers
- Eliminate risk to your development engineering schedule
  - License this fully tested solution with source code and documentation
  - Mathematical proofs of correctness vetted by world-class professors
  - Experienced engineering talent ready to support integration
  - Ongoing support, improvements and product updates
- Arm your salespeople with the latest technology
  - RAID5 and RAID6 are already being displaced

Contact us today:
- Sales: dmcdonell@streamscale.com
- Engineering: manderson@streamscale.com
