Object Storage A New Architectural Partitioning In Storage




Object Storage: A New Architectural Partitioning in Storage
IEEE 802 Plenary Tutorial #3
Sponsored by: David Law, 802.3 Chair
Presented by: Thomas Skaar, Seagate; Martin Czekalski, HGST

Abstract
Object Storage Drives represent a new architectural partitioning in storage solutions. The typical storage node architecture includes low-cost enclosures with IP networking, CPU, memory and Direct Attached Storage (DAS). While inexpensive to deploy, these solutions become harder to manage over time, and the power and space requirements of data centers are difficult to meet with this type of solution. Object Drives further partition these object systems, allowing storage to scale up and down by single drive increments. This tutorial will discuss the current state and future prospects for object drives, and their relevance to IEEE 802.3 Ethernet and other networks that support Object Storage access over IP.
IEEE 802 November 2015 Plenary Tutorial #3 Object Storage

Presentation Agenda
- Object Storage Introduction and Market: Thomas Skaar, Seagate
- Object Drive - A New Architectural Partitioning*: Martin Czekalski, HGST
*Note: IEEE-SA copyright permission letter on file.

Object Storage Tutorial: Introduction and Market
Tom Skaar, 11/9/2015

Object Storage - Meaning
- Traditional storage access has been done through block access schemes.
- Legacy CHS (cylinder-head-sector) physical mapping/addressing proved to be unfriendly and not scalable.
- Thus a logical abstraction, LBA (Logical Block Address), was defined to offer easier access.
- The LBA scheme assumes host access to sequential logical blocks on a given storage device.
- In SCSI (or SAS), 512-byte blocks/sectors ("chunks") are directly mapped to SCSI commands for a given drive.
- Object Storage takes the next step in the storage abstraction: from LBA-oriented access to object access.

Benefits of Object Storage
- Serves the needs of cloud storage: low-cost and deep cold storage that is easy to scale, add, move, and change.
- Fault tolerance, redundancy, remote storage, etc., come from Ethernet system access.

SAS Block Storage vs Object Storage

SAS Block Storage Drive
- Standard form factor
- 2 SAS ports
- SCSI command set
- data = read (LBA, count)
- write (LBA, count, data)
- LBA :: [0, max]
- data :: count * 512 bytes

Ethernet Object Storage Drive
- Standard form factor
- 2 Ethernet ports, Eth0 and Eth1 (re-use SAS connector)
- Object key/value API
- value = get (key)
- put (key, value)
- delete (key)
- key :: 1 byte to 4 KB
- value :: 0 bytes to 1 MB

LBA: Logical Block Address (abstraction of CHS, cylinder-head-sector)
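The two command sets on this slide can be contrasted with a small in-memory sketch. This is only an illustration under stated assumptions: the class names `BlockDrive` and `ObjectDrive` are hypothetical, the media is a Python buffer or dictionary rather than a real device, and only the size limits (512-byte blocks, keys of 1 byte to 4 KB, values of 0 bytes to 1 MB) come from the slide.

```python
from typing import Dict

BLOCK_SIZE = 512                 # bytes per logical block (from the slide)
MAX_KEY_BYTES = 4 * 1024         # key :: 1 byte to 4 KB
MAX_VALUE_BYTES = 1024 * 1024    # value :: 0 bytes to 1 MB


class BlockDrive:
    """Toy model of LBA addressing; not a real SCSI implementation."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._media = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, lba: int, count: int) -> None:
        if lba < 0 or count < 0 or lba + count > self.num_blocks:
            raise ValueError("LBA range outside [0, max]")

    def read(self, lba: int, count: int) -> bytes:
        self._check(lba, count)
        return bytes(self._media[lba * BLOCK_SIZE:(lba + count) * BLOCK_SIZE])

    def write(self, lba: int, count: int, data: bytes) -> None:
        self._check(lba, count)
        if len(data) != count * BLOCK_SIZE:
            raise ValueError("data must be exactly count * 512 bytes")
        self._media[lba * BLOCK_SIZE:(lba + count) * BLOCK_SIZE] = data


class ObjectDrive:
    """Toy model of the key/value API; the host never sees an LBA."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        if not 1 <= len(key) <= MAX_KEY_BYTES:
            raise ValueError("key must be 1 byte to 4 KB")
        if len(value) > MAX_VALUE_BYTES:
            raise ValueError("value must be 0 bytes to 1 MB")
        self._store[key] = bytes(value)

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        del self._store[key]
```

With the block drive the host chooses where data lives (`write(2, 1, data)`); with the object drive it only names the data (`put(b"photo-1", data)`) and placement is left to the drive.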

Ethernet Drives Yesterday and Tomorrow Object Drive Transition

Use Cases: Cloud Object Storage
[Figure: rack of blade drives with a ToR switch. Each blade holds Ethernet drives (OSED 0 through OSED 14, HDD) behind Ethernet bridges, with 2 x Ethernet to the ToR switch.]

The Ethernet Object Storage Market
[Figure: Object Storage Drives (millions of units) by year, 2015 to 2022, vertical axis 0 to 40. Series: SAS Object Drives and Ethernet Object Drives. Data from IDC and Seagate+BRCM estimates.]
Expecting the object storage growth to be on Ethernet drives.
Notes:
- Source: IDC Object Storage market (~2017) plus market estimates by Seagate and Broadcom (2018~2022).
- Each storage drive is dual-homed (2 Ethernet ports and companion switch ports).
- With standardization, more Ethernet-based drives are likely to be adopted (orange colored region), because of reduced complexity and lower system cost compared to SAS-based object storage drives.

Ethernet Drive Interface Market Estimate
[Figure: two charts titled "Storage Ethernet Port Speed Trend (Bridge)", showing number of ports (millions) by year, 2015 to 2022, broken down by port speed: 1000 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps, 25 Gbps, 40 Gbps.]
Notes:
- The number of switch ports is 2x the number of drive units.
- Market estimates by Seagate and Broadcom (2018~2022).

Market Driver Summary
- Object storage has emerged.
- Low-cost and deep cloud cold storage will adopt object HDD storage on Ethernet.
- The bandwidth need for an HDD (regardless of SAS/SATA I/O speed) is ~2x the sustained bandwidth, which is now around 175 MBytes/s.
- 2.5G and 5G Ethernet speeds will serve mainstream HDD needs for the long term.
- Native Ethernet I/O on object storage is expected to be 5~8% or more of overall Ethernet switch connections over time.
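The bandwidth claim can be checked with a little arithmetic. The sketch below assumes 1 MByte = 10^6 bytes and ignores Ethernet and TCP/IP framing overhead; the function names and the candidate link list are illustrative, while the 175 MBytes/s figure and the 2x factor come from the slide.

```python
SUSTAINED_MBYTES_PER_S = 175     # sustained HDD bandwidth quoted on the slide
HEADROOM_FACTOR = 2              # "~2x the sustained bandwidth"
LINK_SPEEDS_GBPS = [1.0, 2.5, 5.0, 10.0]  # candidate Ethernet port speeds


def mbytes_to_gbps(mbytes_per_s: float) -> float:
    """Convert MBytes/s to Gbit/s (decimal units, no protocol overhead)."""
    return mbytes_per_s * 8 / 1000


def smallest_sufficient_link(required_gbps: float):
    """Return the slowest listed link that still covers the requirement."""
    for speed in sorted(LINK_SPEEDS_GBPS):
        if speed >= required_gbps:
            return speed
    return None


sustained_gbps = mbytes_to_gbps(SUSTAINED_MBYTES_PER_S)
required_gbps = mbytes_to_gbps(SUSTAINED_MBYTES_PER_S * HEADROOM_FACTOR)

print(f"sustained: {sustained_gbps:.1f} Gbps, "
      f"fits in {smallest_sufficient_link(sustained_gbps)} Gbps")
print(f"with 2x headroom: {required_gbps:.1f} Gbps, "
      f"fits in {smallest_sufficient_link(required_gbps)} Gbps")
```

Under these assumptions the sustained rate (about 1.4 Gbps) already exceeds a 1 Gbps port but fits in 2.5 Gbps, and the 2x figure (about 2.8 Gbps) fits in 5 Gbps, which is consistent with the slide's view that 2.5G and 5G cover mainstream HDD needs.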

THANK YOU

Object Drives: A New Architectural Partitioning
Mark Carlson/Toshiba

SNIA Legal Notice
- The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted.
- Member companies and individual members may use this material in presentations and literature under the following conditions:
  - Any slide or slides used must be reproduced in their entirety without modification.
  - The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
- This presentation is a project of the SNIA Education Committee.
- Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.
- The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.

Abstract
A number of scale out storage solutions, as part of open source and other projects, are architected to scale out by incrementally adding and removing storage nodes. Example projects include:
- Hadoop's HDFS
- Ceph
- Swift (OpenStack object storage)
The typical storage node architecture includes inexpensive enclosures with IP networking, CPU, memory and Direct Attached Storage (DAS). While inexpensive to deploy, these solutions become harder to manage over time, and the power and space requirements of data centers are difficult to meet with this type of solution. Object Drives further partition these object systems, allowing storage to scale up and down by single drive increments. This talk will discuss the current state and future prospects for object drives. Use cases and requirements will be examined and best practices will be described.

Learning Objectives
- Definition of object drives
- Learn the value that they provide
- Discover where they are best deployed

What are Object Drives?
- Key/Value semantics (object store), among others
- Hosted software in some cases
- Interface changed from SCSI based to IP based (TCP/IP, HTTP)
- Channel (FC/SAS/SATA) interconnect moves to Ethernet network

Object Drive use Interface
- Key Value example is Kinetic
  - Metadata is not part of the interface
  - Metadata understood by higher levels would be stored as values
  - Not expected to be used directly by end user applications (similar to the existing block interface in that regard)
- Higher level example is CDMI
  - Includes rich metadata (both user and system)
  - Expected to be used by end user applications (and is)
- Object Drives, because of their abstraction, allow many more degrees of freedom for innovation in the implementations behind this abstraction
  - Hosts needn't manage the mapping of objects to a block storage device
  - An object drive may manage how it places objects on the media without reference to host-specified locations
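As a rough illustration of "metadata understood by higher levels would be stored as values", the sketch below shows a higher-level layer keeping an object's data and its metadata as two ordinary values on a key/value drive. Everything here is an assumption made for the example: the `KeyValueDrive` stand-in, the `data/` and `meta/` key prefixes, and the JSON encoding are hypothetical and are not taken from Kinetic or CDMI.

```python
import json
from typing import Dict, Tuple


class KeyValueDrive:
    """Stand-in for a key/value drive; it stores opaque bytes only."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]


def put_object(drive: KeyValueDrive, name: str, data: bytes,
               metadata: dict) -> None:
    """Higher-level layer: data and metadata become two plain values."""
    drive.put(b"data/" + name.encode(), data)
    # The drive does not interpret this value; only the layer above does.
    drive.put(b"meta/" + name.encode(),
              json.dumps(metadata, sort_keys=True).encode())


def get_object(drive: KeyValueDrive, name: str) -> Tuple[bytes, dict]:
    data = drive.get(b"data/" + name.encode())
    metadata = json.loads(drive.get(b"meta/" + name.encode()).decode())
    return data, metadata


drive = KeyValueDrive()
put_object(drive, "report.pdf", b"%PDF...", {"owner": "alice", "tier": "cold"})
data, metadata = get_object(drive, "report.pdf")
```

The drive interface stays minimal, while a richer interface can layer user and system metadata on top of it.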

What is driving the market?
- A number of scale out storage solutions expand by adding identical storage nodes incrementally
- Typically use an Ethernet interface and may be connected directly to the Internet
- Open source examples include:
  - Scale out file systems
  - Hadoop's HDFS
  - Lustre
  - Ceph
  - Swift (OpenStack object storage)
- Commercial examples also exist

Who would buy Object Drives?
- System vendors and integrators
  - Enables simplification of the software stack
- Hyperscale Data Centers
  - Using commodity hardware and open source software
- Enterprise IT
  - Following the Hyperscale folks

Problem Statement (Current Solutions)
- For these solutions, typically a commodity server is used as a storage node with Direct Attached Storage, CPU, memory, and networking.
- These generalized solutions for specific use cases utilize multiple commodity servers and therefore consume more power and are more complex to manage.
- Although less expensive to acquire than previous solutions, they do not reduce long term ownership costs.

How do Object Drives Solve this?
- CPU power is moved to the drive, where it can be optimized for the task (I/O) and does not need to be general purpose
  - Similar to what happened with (dumb) cell phone processors
  - For smartphones, as desired applications increase their needs, cell phone resources have grown
- CPU/Memory/Network/Storage is one component, managed as such
  - Volume shipments drive the cost of this down
  - Enables the resources to be matched/tuned with each other and sized appropriately

What do Object Drives NOT Solve? (today)
- Management complexity
  - 10x or more things to manage
  - Still no management software included with the drives
  - Perhaps the best place to do scale out management is above the object abstraction
- End to end security
  - Data is secured on the drive
  - End user applications don't do this securing of the data
  - No secure multi-tenancy in reality
  - Higher level software will typically be trusted for authentication and access control
  - Drive has basic message security support (private key)
- End to end integrity
  - Ensure that data is not altered during I/O, e.g. T10 DIF
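The two ideas on this slide, keyed message security and end to end integrity, can be sketched with the Python standard library. This is a sketch under assumptions, not any drive's actual protocol: it assumes the key is a secret shared between client and drive and uses HMAC-SHA256 for message authentication, and it uses a SHA-256 digest stored with the value as a stand-in for a protection field such as T10 DIF (which is a different, per-block mechanism). The function names are hypothetical.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def sign_message(secret: bytes, message: bytes) -> bytes:
    """Keyed tag so the receiver can tell who sent the message."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify_message(secret: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(sign_message(secret, message), tag)


def seal_value(value: bytes) -> bytes:
    """Prefix the value with its digest before handing it to storage."""
    return hashlib.sha256(value).digest() + value


def unseal_value(blob: bytes) -> bytes:
    """Recompute the digest on read; fail if the value was altered."""
    digest, value = blob[:DIGEST_LEN], blob[DIGEST_LEN:]
    if not hmac.compare_digest(hashlib.sha256(value).digest(), digest):
        raise ValueError("integrity check failed")
    return value


secret = b"shared-secret-for-illustration"
request = b"PUT key=photo-1"
tag = sign_message(secret, request)
stored = seal_value(b"payload bytes")
```

If the application computes the digest itself and verifies it on read, the check covers the whole path (network, drive firmware, media), which is the sense of "end to end" on the slide.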

Example: Ceph Architecture

Ceph with Object Drives

Traditional Hard Drive
[Diagram: Traditional Drive with a SATA/SAS interface, T10 SCSI, and disk and/or flash]
- Interconnect via SATA/SAS
  - Limited routability
  - High development costs
  - Typically single host
- T10 SCSI Protocol
  - Low-level storage interface
  - Not designed for lossy network connectivity
  - Not typically used in multi-client concurrent access use cases

Kinetic Key Value Drive
[Diagram: Kinetic KV Drive with an Ethernet interface, Object API, and disk and/or flash]
- Interconnect via Dual Ethernet
  - Full routability
  - Lower development costs
  - Intended for multi-client access
- Kinetic Protocol
  - Higher-level Key Value interface
  - Designed for lossy network connectivity (TCP/IP)
  - Typically used in multi-client concurrent access use cases

Pre-Configured In-Storage Compute Drive
[Diagram: Pre-configured In-Storage Compute. Layers: Ethernet; Custom App (e.g. Object API); Linux + App Containers; Block Interface (or Flash KV API); SATA/SAS/NVMe; disk and/or flash]
- Interconnect via Ethernet
  - Full routability
- Custom applications installed at factory or provisioning time
  - No pre-determined client-facing interface
  - Example is Ceph, or Kinetic API
  - Custom application can run in a container in an embedded Linux environment
  - Custom application accesses storage via standard Linux interfaces

Provisionable In-Storage Compute Drive
[Diagram: User-configured Embedded Compute. Layers: Ethernet; Provisionable Application(s); Linux + App Containers; Block Interface (or Flash KV API); SATA/SAS/NVMe; disk and/or flash]
- Interconnect via Ethernet
  - Full routability
- Custom applications installable at any time

Interposers
[Diagram: three interposer stacks. (1) Ethernet, Kinetic API, Linux, Block Interface, SATA/SAS/NVMe. (2) Ethernet, Custom Application, Linux, Block Interface, SATA/SAS/NVMe. (3) Ethernet, Custom Application, Linux, Object Interface, Ethernet.]
- Allows existing drives to be used as Ethernet-connected Object Drives
- Allows collections of drives to be virtualized as Ethernet-connected Object Drives
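A minimal sketch of what an interposer does: it presents key/value operations while storing the bytes on an ordinary block device, so the host no longer manages the object-to-LBA mapping. The names `BlockDevice` and `Interposer`, the append-only placement policy, and the in-memory index are all assumptions chosen to keep the example short; a real interposer would persist its index, reclaim freed space, and speak a network protocol.

```python
from typing import Dict, Tuple

BLOCK_SIZE = 512  # bytes per logical block


class BlockDevice:
    """Toy block device standing in for an existing SATA/SAS/NVMe drive."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._media = bytearray(num_blocks * BLOCK_SIZE)

    def read(self, lba: int, count: int) -> bytes:
        return bytes(self._media[lba * BLOCK_SIZE:(lba + count) * BLOCK_SIZE])

    def write(self, lba: int, data: bytes) -> None:
        self._media[lba * BLOCK_SIZE:lba * BLOCK_SIZE + len(data)] = data


def blocks_needed(length: int) -> int:
    """Number of whole blocks required to hold `length` bytes."""
    return -(-length // BLOCK_SIZE)  # ceiling division


class Interposer:
    """Key/value front end over a block device (append-only placement)."""

    def __init__(self, device: BlockDevice) -> None:
        self._device = device
        self._index: Dict[bytes, Tuple[int, int]] = {}  # key -> (LBA, length)
        self._next_lba = 0

    def put(self, key: bytes, value: bytes) -> None:
        count = blocks_needed(len(value))
        if self._next_lba + count > self._device.num_blocks:
            raise OSError("device full")
        # Pad to a whole number of blocks; the true length is kept in the index.
        self._device.write(self._next_lba, value.ljust(count * BLOCK_SIZE, b"\0"))
        self._index[key] = (self._next_lba, len(value))
        self._next_lba += count

    def get(self, key: bytes) -> bytes:
        lba, length = self._index[key]
        return self._device.read(lba, blocks_needed(length))[:length]

    def delete(self, key: bytes) -> None:
        del self._index[key]  # blocks are not reclaimed in this sketch
```

The caller only ever uses `put`, `get`, and `delete`; where the bytes land on the media is decided behind the abstraction.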

Types of Object Drives
- Key Value Protocol (Object Drive)
  - Minimal incremental CPU/Memory requirements
  - Simple mapping to underlying storage
- In-Storage Compute (Object Drive)*
  - Enough CPU/Memory for Object Node software to be embedded on the drive
  - General purpose download or factory installed
  - May have additional requirements such as solid state media and more/higher bandwidth networking connections
- In both cases, the interface abstracts the recording technology
*Jim Gray memorial

Key Value Protocol Object Drives
- Eliminates existing parts of the usual storage stack
  - Block drivers, logical volume manager, file system
  - And the associated bugs and maintenance costs
- Existing applications need to be re-written or adapted
  - Mainly used by green field developed applications
- Firmware is upgraded as an entire image
- Hyperscale customers are already doing this
  - Using open source software and creating their own apps
  - Facebook, Google, etc.
- Key Value organization of data is growing in popularity
  - Examples: Cassandra, NoSQL

In-Storage Compute Object Drives
- Same advantages as Key Value protocol, plus:
  - No need for a separate server to run the Object Node service (other services still need a server but scale separately)
  - Scaling is smoother: only drives are added
  - Additional features of the Object Node software can be deployed independently
  - Fewer hardware types need to be maintained (for selected use cases)
  - Failure domains are more fine grained, thus overall data availability is enhanced
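The point about finer-grained failure domains can be made concrete with simple arithmetic. The cluster size and drives-per-server figures below are hypothetical numbers chosen for illustration, and the sketch assumes equally sized drives and that a failure takes the whole failure domain offline.

```python
def fraction_unavailable(total_drives: int, drives_per_failure_domain: int) -> float:
    """Share of raw capacity that goes offline when one failure domain fails."""
    return drives_per_failure_domain / total_drives


TOTAL_DRIVES = 120          # hypothetical cluster size
DRIVES_PER_SERVER = 12      # hypothetical storage node with DAS

# Storage node as failure domain: a server fault takes all its drives offline.
server_loss = fraction_unavailable(TOTAL_DRIVES, DRIVES_PER_SERVER)

# Object drive as failure domain: a fault takes one drive offline.
drive_loss = fraction_unavailable(TOTAL_DRIVES, 1)

print(f"server failure: {server_loss:.1%} of capacity offline")
print(f"object drive failure: {drive_loss:.1%} of capacity offline")
```

With these example numbers a single fault removes a twelfth as much capacity when the drive, rather than the server, is the unit of failure, so less data has to be served from replicas or rebuilt.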

In-Storage Compute Future
- As data on the drive becomes colder, CPU/Memory becomes less utilized
- Possible to host software that then uses this spare resource and works against the cold data
  - Extracting metadata
  - Performing preservation tasks
  - Other data services: advanced data protection, archiving, retention, deduplication, etc.
  - Data analysis off-load

Ethernet Connectivity
- Ethernet speeds are standardized at 1 Gbps and 10 Gbps
- Object Drives are high volume, low margin devices
- 10 Gbps is too expensive for the drive form factor for the foreseeable future
- Desire to have standardized speeds of 2.5 Gbps and 5 Gbps, with auto-negotiation among them
- SFF 8601 effort to specify auto-negotiation using existing silicon implementations (switch firmware upgrade done in a few months)
- 802.3 proposed effort to standardize single lane 2.5 and 5 Gbps speeds in silicon (multi-year effort)
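The goal of auto-negotiation among 1, 2.5, 5 and 10 Gbps can be illustrated by its outcome: both link partners end up at the highest speed they have in common. The sketch below models only that resolution step, assuming each side advertises a set of speeds in Mbps; it does not represent the signaling defined by SFF 8601 or IEEE 802.3, and the function name is hypothetical.

```python
from typing import Optional, Set


def negotiate_speed(local_mbps: Set[int], partner_mbps: Set[int]) -> Optional[int]:
    """Pick the highest speed advertised by both ends, or None if no overlap."""
    common = local_mbps & partner_mbps
    return max(common) if common else None


# Hypothetical capabilities: a drive limited to 5 Gbps, a switch port up to 10 Gbps.
drive_port = {1000, 2500, 5000}
switch_port = {1000, 2500, 5000, 10000}
legacy_switch_port = {1000, 10000}

print(negotiate_speed(drive_port, switch_port))         # highest common speed
print(negotiate_speed(drive_port, legacy_switch_port))  # falls back to 1000
```

The second call shows why the intermediate speeds matter: against a port that offers only 1 and 10 Gbps, a drive that cannot afford 10 Gbps falls back to 1 Gbps.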

Other FMS Tutorials
After lunch, attend these tutorials. Check out SNIA Tutorials:
- Flash Storage for Backup, Recovery and DR
- Utilizing VDBench to Perform IDC AFA Testing
- NVDIMM Cookbook: A Guide to NVDIMM Integration

Attribution & Feedback
The SNIA Education Committee thanks the following individuals for their contributions to this Tutorial.
- Authorship History: Mark Carlson and the Object Drive TWG
- Additional Contributors: David Slik, Robert Qiuxin, Paul Suhler
Please send any questions or comments regarding this SNIA Tutorial to tracktutorials@snia.org