Building Storage Clouds for Online Applications A Case for Optimized Object Storage


Agenda
- Introduction: storage facts and trends
- Call for more online storage!
- AmpliStor: Optimized Object Storage
- Cost Reduction through Erasure Coding
- Use Case: Massive Media
- Questions
Amplidata Confidential

Introduction: storage facts and trends

Introduction, facts and trends
- Studies show that data storage capacities will likely increase by over 30X in the coming decade, to over 35 zettabytes
[Chart: storage consumption over time, growing 30X to 35 ZB by 2020; labels: high-capacity drives, less staff per TB, unstructured data]

Introduction, facts and trends
- The number of qualified people to manage this data will stay nearly flat (~1.5X)
- Efficiency: automate & reduce overhead
[Chart: capacity and storage budget over time]

Introduction, facts and trends
- Much of that growth (80%) is driven by unstructured data: billions of files
  - Active archives
  - Online images
  - Large files
  - Medical images
  - Online storage
  - Online movies

Introduction, facts and trends
- Traditional storage technologies require too much overhead, power and management
  -> There is a growing interest in Object Storage
  -> Erasure coding is the proclaimed successor of RAID

Introduction, facts and trends
- Most current storage technologies require >200% overhead to provide five 9s availability
  - RAID 6 + replication
  - 3 copies in the cloud

Introduction, facts and trends
- Storage currently accounts for 37-40% of overall data center energy consumption from hardware
- Energy consumption will influence technology procurement criteria
[Chart: data center power usage]

Introduction, facts and trends
- Storage maintenance processes need to be more automated
  - E.g., data migration will soon take longer than the lifetime of the media
  - It's like painting the Golden Gate Bridge, but the bridge is continuously getting longer

A Call for more online storage

A Call for more online storage
- The public cloud industry is far ahead in the storage growth statistics:
  - AWS S3 will soon have 800 billion objects stored
  - Facebook has over 250 million photos uploaded per day, which is over 7 billion per month
  - YouTube receives over 24 hours of new video every minute

A Call for more online storage
- Backup and recovery is increasingly moved to the cloud
- Document sharing is HOT
- Archives are moved back off tape; online archives are BIG
- Big Data is taking many shapes
- Social-Local-Mobile will keep stimulating digital data growth

A Call for more online storage
- 800 billion objects: Don't you want some of that????

A Call for more online storage
- 800 billion objects: So how would you store that?

Object Storage for Online Applications

Object Storage
- What are the requirements?
  - Data has to be always available online
  - Direct interface to the applications
  - Petabyte scalability
  - Extreme reliability, integrity
  - Cost-efficient
  - Security
- Commodity-HDD Storage + REST API, Cloud-enabled + Erasure Coding = Optimized Object Storage

Storage Clouds
- Storage Cloud infrastructures
  - Private or public setup
  - Provide highest availability
- Applications
  - File systems are obsolete
  - Use REST API
[Diagram: multiple applications -> REST API -> massively scalable storage pool]
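To illustrate what "applications use a REST API instead of a file system" can look like, here is a minimal sketch that builds (but does not send) HTTP requests for storing and fetching one object. The endpoint `storage.example.com`, the `/<namespace>/<object-name>` URL layout, and the helper names are assumptions made for illustration; the slides do not describe the actual AmpliStor REST interface.

```python
from urllib.request import Request

# Hypothetical endpoint; not taken from the slides.
BASE_URL = "http://storage.example.com"


def build_put(namespace, name, payload):
    """Build (but do not send) an HTTP PUT that would store one object."""
    url = f"{BASE_URL}/{namespace}/{name}"  # assumed URL layout
    return Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_get(namespace, name):
    """Build (but do not send) an HTTP GET that would fetch one object."""
    url = f"{BASE_URL}/{namespace}/{name}"  # assumed URL layout
    return Request(url, method="GET")


put_req = build_put("photos", "img-0001.jpg", b"...image bytes...")
get_req = build_get("photos", "img-0001.jpg")
print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

The point of the sketch is that the application addresses an object by name in a namespace over HTTP, with no mount point, directory tree, or file handle involved.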

Petabyte Scalability
- Object Storage systems will scale:
  - Beyond petabytes of data
  - Beyond billions of data objects
- Systems should scale uniformly
  - Add resources incrementally
  - Scale performance and capacity separately

Petabyte Scalability
- Scalable metadata repository (capacity & performance)
- Lightweight metadata, designed to scale up to billions of objects
- Flat namespace

Data Integrity
- To ensure the integrity of long-term unstructured data, new data protection algorithms are required to:
  - Address the increasing capacity of disk drives
  - Solve issues related to long RAID rebuild windows
- Object storage systems based on erasure coding can protect data not only from higher numbers of drive failures, but also against the failure of entire storage modules.

Cost-efficient
- Power, cooling and floor-space requirements are paramount concerns: erasure coding drastically reduces overhead numbers
- Systems need to be self-managing
- The system needs to be hardware independent: data migration needs to be an automatic, continuous background process.

Cost-efficient
- Eliminate the need for manual disk swaps: move to higher-level container management tasks.
- The system should automatically manage allocation to the underlying disks

Security
- Multi-tenant authentication/authorisation
  - Read
  - Read/Write
  - List
- Auditing & logging
- Secure protocols/encryption (HTTPS)
- Individual disks cannot be misused
  - Data is encoded and spread

So, what is this erasure encoding?

Erasure Coding, simply explained
- BitSpread
  - Encodes data in linear equations
  - Distributes the equations across disks, storage nodes, racks, data centers
  - Original data can always be uniquely determined from a subset of the equations
  - BitSpread uses 4K variables independent of object size
  - Extra blocks can be generated without knowing what is missing
- Simplified mathematics:
  - Original object: 75
  - Decomposed object: X = 7, Y = 5
  - BitSpread series of equations:
    X + Y = 12
    X - Y = 2
    2X + Y = 19
  - Any 2 out of 3 equations uniquely determine the object (7, 5)
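The simplified example can be checked numerically. The following is a minimal sketch, not BitSpread itself: it stores the three equations from the slide as coefficient triples and recovers (X, Y) from every pair of them using Cramer's rule. The helper name `solve_pair` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each equation a*X + b*Y = c from the slide, stored as (a, b, c).
EQUATIONS = [
    (1, 1, 12),   # X + Y = 12
    (1, -1, 2),   # X - Y = 2
    (2, 1, 19),   # 2X + Y = 19
]


def solve_pair(eq1, eq2):
    """Solve two linear equations in X and Y using Cramer's rule."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equations are not independent")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# Try every 2-out-of-3 subset, as if the third equation had been lost.
recovered = [solve_pair(e1, e2) for e1, e2 in combinations(EQUATIONS, 2)]
for x, y in recovered:
    print(f"X = {x}, Y = {y}")
```

Each of the three pairs yields X = 7 and Y = 5, so losing any single equation (for example, the disk that held it) does not prevent reconstruction of the original object.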

AmpliStor System
- Controller Nodes (3+)
  - Dual, quad-core Xeon processors, 16GB RAM, 2 x 200GB SSD, 2 x 10 Gigabit Ethernet network interfaces
  - Object-based interfaces: HTTP/REST API, C API, Python CLI, WebDAV
  - 3 controllers per system (minimum); can be scaled up for performance (fully shared metadata & storage pool)
- AS20 Low Power Storage Nodes (8+)
  - 1U rack mount chassis with 20TB capacity
  - 2 x 1 Gigabit network interfaces
  - Low-power processor (Intel Atom)
  - 10 x 2 TB low-power Green SATA disk drives
  - Low power: 65-140 watts per node utilization (3.5-7 watts per TB)

Core Software Technology Components
- BitSpread: Distributed Encoder/Decoder
  - RAID replacement technology based on a unique variant of erasure coding
  - Dial-in fault tolerance through namespace-level policies
    - Namespace1: 16/4 policy protects against any 4 failures in 16 disks
    - Namespace2: 18/6 policy protects against any 6 failures in 18 disks
  - Provides availability and reliability even during failures
  - Policies can be dynamically changed
- BitDynamics: Maintenance & Self-Healing Agent
  - Out-of-band operations agent for disk monitoring, integrity verification & object self-healing
  - Performs automated tasks: scrubs, verifies, self-heals, repairs & optimizes data on disk
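The storage cost of such policies can be estimated with simple arithmetic. This is a sketch under an assumption the slides do not spell out: that an n/k policy writes n fragments of equal size, any k of which may be lost, so that n - k fragments' worth of space holds the original data. The function name `policy_stats` is hypothetical.

```python
def policy_stats(spread, safety):
    """Return (data_fragments, overhead_percent) for a spread/safety policy.

    Assumes `spread` equal-sized fragments are written in total and any
    `safety` of them may be lost without losing the object.
    """
    data_fragments = spread - safety
    overhead_percent = 100.0 * safety / data_fragments
    return data_fragments, overhead_percent


for spread, safety in [(16, 4), (18, 6)]:
    data, overhead = policy_stats(spread, safety)
    print(f"{spread}/{safety}: {data} data fragments, {overhead:.1f}% overhead")

# For comparison: three full copies keep 2 extra copies per original.
three_copies_overhead = 100.0 * (3 - 1) / 1
print(f"3 copies: {three_copies_overhead:.1f}% overhead")
```

Under this assumption the 16/4 policy costs about a third extra capacity and the 18/6 policy half again as much, versus 200% for three full copies.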

AmpliStor for Big Unstructured Data
- Turnkey storage solution for BIG unstructured data
  - System scales beyond petabytes with a Global Object Namespace
  - Throughput scales with the amount of resources
- Policy-driven storage durability
  - Ten 9s of durability (99.99999999%) and beyond through policies
  - Eliminates the reliability exposures of RAID on high-density disk drives
  - Eliminates data corruption or loss due to bit errors
- 50-70% improvement in storage efficiency
  - 70% reduction in storage footprint compared to three copies in the cloud
  - 50% reduction in storage footprint compared to mirrored RAID
  - Drives proportional reductions in data center floor space & power
- Automated management
  - Self-healing design manages data integrity assurance and auto-repairs data
- 50-70% reduction in TCO
  - Storage footprint (Capex), power, data center space & management costs
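How a policy translates into "nines" depends on failure rates the slides do not give. As a rough sketch only, the following computes the chance that more disks fail than a policy tolerates, assuming independent failures and a made-up per-disk failure probability `p` over one repair window. It is a simple binomial model, not Amplidata's durability calculation.

```python
from math import comb


def loss_probability(spread, safety, p_fail):
    """Probability that more than `safety` of `spread` disks fail together.

    Assumes independent disk failures, each with probability p_fail over one
    repair window; p_fail is a hypothetical input, not a figure from the slides.
    """
    return sum(
        comb(spread, k) * p_fail**k * (1 - p_fail) ** (spread - k)
        for k in range(safety + 1, spread + 1)
    )


p = 0.001  # hypothetical per-disk failure probability per repair window
for spread, safety in [(16, 4), (18, 6)]:
    prob = loss_probability(spread, safety, p)
    print(f"{spread}/{safety}: loss probability ~ {prob:.3e}")
```

In this model, raising the safety parameter lowers the loss probability by several orders of magnitude, which is the sense in which durability can be "dialed in" through policies.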

AmpliStor Use Case: Massive Media

Montreux Jazz, an invaluable research asset
- Most successful social media group in EMEA
  - Social networking
  - Gaming
  - Dating
- Massive Media Storage Cloud
  - Half a billion objects
  - 80 million users
  - Highest level of data availability
- Storage requirements
  - High availability without copying data: 250% overhead is unacceptable
  - Low power: 3.5 Watt/TB
  - Fast migration: REST API

Thank You! Tom Leyden, Director of Alliances & Marketing Twitter.com/tomme