Introduction to OpenStack Swift CloudOpen Japan 2014

Introduction to OpenStack Swift
CloudOpen Japan 2014
Yuji Hagiwara (hagiwarayuj@nttdata.co.jp)
Platform Engineer, NTT DATA Corp.
Copyright 2014 NTT DATA Corporation

Agenda
1. What is Swift?
2. Swift's Latest Information
3. Swift's Future

Who am I
Yuji Hagiwara, Platform Engineer, NTT DATA Corp.
Since 2011: using OpenStack
Since 2013: developing search on Swift
Demo app for searching on Swift

Background: Data Explosion in the Enterprise
The amount of unstructured data keeps growing, so we need storage with scalability, durability, and availability.
Amount of unstructured data: PB to EB scale, growing exponentially.
[Chart: amount of unstructured data over time, 2004 to 2016]
Examples of unstructured data: media (images, video, audio), web content, documents, backups/archives.
Where should we store this data? One solution is Swift.

What is Swift?
Swift is:
A storage system with scalability, durability, and availability.
A RESTful distributed object store, similar to Amazon S3.
One of the OpenStack core components.
Implemented in Python.
Open source software.
[Diagram: OpenStack storage components: (1) Block Storage (Cinder), (2) Object Storage (Swift)]

Usage is simple.
$ curl -XPUT --data-binary @mydoc.txt http://swift.example.com:8080/v1/account/container/object
$ curl -XGET http://swift.example.com:8080/v1/account/container/object
$ curl -XDELETE http://swift.example.com:8080/v1/account/container/object
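The same three calls can be prepared from Python. This is a minimal sketch using only the standard library. It builds the requests without sending them, since swift.example.com is the slide's placeholder host. The helper names object_url and build_request are hypothetical, and the optional X-Auth-Token header is an assumption about the deployment (the slide's curl commands send no token).

```python
from urllib.parse import quote
from urllib.request import Request

# Endpoint copied from the slide; swift.example.com is a placeholder host.
BASE_URL = "http://swift.example.com:8080/v1"


def object_url(account, container, obj):
    """Build /v1/<account>/<container>/<object>, percent-encoding each part."""
    return "/".join([BASE_URL, quote(account), quote(container), quote(obj)])


def build_request(method, account, container, obj, data=None, token=None):
    """Prepare (but do not send) an HTTP request for one object."""
    # Assumption: clusters with authentication expect an X-Auth-Token header.
    headers = {"X-Auth-Token": token} if token else {}
    return Request(object_url(account, container, obj),
                   data=data, headers=headers, method=method)


put_req = build_request("PUT", "account", "container", "object", data=b"hello")
get_req = build_request("GET", "account", "container", "object")
del_req = build_request("DELETE", "account", "container", "object")
```

Against a real cluster, each prepared request would be sent with urllib.request.urlopen.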

Use cases of Swift

Swift as storage for a variety of applications
[Diagram: system backup, CMS, FTP-like use, digital distribution, web apps, and Cyberduck all reach Swift through its REST API]

OpenStack Swift deployments and use cases

Name of enterprise | Product/service | Description
Rackspace (USA) | Cloud Files | Cloud file share service run by Rackspace itself. It uses the same code as the OSS version except for features such as authentication, accounting, and CDN (<500PB).
Korea Telecom (South Korea) | ucloud storage service | Object storage service using OpenStack Swift (16PB+).
Sina (China) | Sina App Engine (SAE) | Public storage service. They moved to OpenStack from another technology, MongoDB, in 2012.
San Diego Supercomputer Center (USA) | SDSC Cloud Storage Services | Cloud storage service at SDSC. Users can select the Amazon S3 or Rackspace Swift API.
SME Storage (USA) | SMEStorage Open Cloud Platform | Cloud storage service based on Rackspace Cloud Files.
SoftLayer (USA) | SoftLayer Object Storage | Public object storage service. SoftLayer was acquired by IBM.
SwiftStack (USA) | SwiftStack | Provides professional services and an operations and management product.
HP (USA) | HP Cloud | Private cloud storage service that uses OpenStack.
Wikimedia (USA) | Wikimedia storage | Media file store for Wikipedia.
NII (Japan) | Academic Cloud service | Academic cloud service by the National Institute of Informatics in Japan (NII), integrated and supported by NTT DATA.

Inside Swift

Architecture: Nodes
Swift consists of two types of nodes: proxy nodes and storage nodes.
[Diagram: Application -> HTTP -> Load balancer -> Proxy Nodes (forward data to a storage node) -> Storage Nodes (store data)]

Architecture: The Ring
The Ring (a static table for data allocation on storage nodes) decides the optimal storage nodes for an object from its name.
[Diagram: Application -> HTTP -> Load balancer -> Proxy Nodes -> Storage Nodes; every proxy node and every storage node holds a copy of the Ring]

Architecture: The role of the Ring (store)
When you request to store data A, three replica nodes store data A.
[Diagram: the Ring on the proxy nodes says that A must be located at nodes 1, 2, and 4, so Storage Nodes 1, 2, and 4 each store a copy of Data A]

Architecture: The role of the Ring (get)
When you request to get data A, one of the replica nodes returns data A.
[Diagram: the Ring on the proxy nodes says that A must be located at nodes 1, 2, and 4, so the proxy fetches Data A from one of those storage nodes and returns it to the application]
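The idea of a static table that maps an object name to its replica nodes can be sketched in a few lines of Python. This is a toy illustration, not Swift's actual ring code: the partition count, the round-robin assignment in build_table, and all function names are assumptions chosen for brevity. It only shows the principle that the name is hashed to a partition and the partition is looked up in a precomputed table.

```python
import hashlib
import struct

PART_POWER = 4   # 2**4 = 16 partitions (illustrative value)
REPLICAS = 3     # three replicas, as on the slides
NODES = ["node1", "node2", "node3", "node4"]


def build_table(nodes, part_power, replicas):
    """Static table: partition number -> list of distinct replica nodes.

    Round-robin assignment is an assumption made for this sketch.
    """
    return [[nodes[(part + r) % len(nodes)] for r in range(replicas)]
            for part in range(2 ** part_power)]


def partition_for(name, part_power):
    """Hash the object name and keep the top part_power bits."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return struct.unpack(">I", digest[:4])[0] >> (32 - part_power)


def nodes_for(name, table, part_power):
    """Look up the replica nodes for an object name."""
    return table[partition_for(name, part_power)]


TABLE = build_table(NODES, PART_POWER, REPLICAS)
replica_nodes = nodes_for("/account/container/object", TABLE, PART_POWER)
```

Because every proxy and storage node holds the same table, each of them computes the same replica nodes for a given name without asking a central server.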

Scalability
(1) Add proxy servers for more throughput.
(2) Add storage servers or disks for more volume.
[Diagram: an expanded proxy tier gives more throughput; an expanded storage tier gives more volume]

Many processes working together
Proxy: proxy-server
Account: account-server, account-replicator, account-auditor, account-reaper
Container: container-server, container-replicator, container-auditor, container-updater, container-sync
Object: object-server, object-replicator, object-auditor, object-updater, object-expirer

Disk states across Nodes 1 to 4
Normal:
(1) Each node checks the data on the other nodes.
Failure:
(2) A disk fails.
(3) The disk trouble is detected.
(4) The data is copied to another node.
Recovery:
(5) The disk is recovered.
(6) The data is restored to the original node.
(7) The temporary data is deleted.


Normal state
Each piece of data is replicated.
[Diagram: Application -> HTTP -> Load balancer -> Proxy Nodes -> Storage Nodes; each storage node runs a server and an auditor, holds a copy of the Ring, and stores replicas of Data A and Data B]


Failure state
If a disk is broken...
[Diagram: same cluster as in the normal state, with the disk of one storage node marked as broken]

Failure state
Swift detects the lost data and replicates it to another node as temporary data.
[Diagram: same cluster with one broken disk; a temporary copy of the lost data appears on another storage node]


Recovery state
When the broken disk is replaced with a fresh disk...
[Diagram: same cluster; the replaced disk is empty and the temporary copy is still on another storage node]

Recovery state
Swift replicates the data to the correct node and removes the temporary data.
[Diagram: same cluster; the data is back on its original node and the temporary copy is marked as removed]
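The normal, failure, and recovery states described on these slides can be replayed with a toy simulation. This is a sketch under stated assumptions, not Swift's replicator or auditor: the placement of Data B, the choice of the temporary node, and all function names are hypothetical. It only shows that the replica count stays at three during the failure and that the cluster returns to its original layout afterwards.

```python
# Illustrative placement: each item lives on three of four nodes.
placement = {"A": ["node1", "node2", "node4"],
             "B": ["node1", "node3", "node4"]}
disks = {"node1": {"A", "B"}, "node2": {"A"},
         "node3": {"B"}, "node4": {"A", "B"}}
failed = set()


def find_missing(disks, placement, failed):
    """Steps (1)-(3): list (item, node) pairs whose replica is unavailable."""
    return [(item, node) for item, nodes in placement.items()
            for node in nodes if node in failed or item not in disks[node]]


def copy_to_temporary(disks, failed, missing):
    """Step (4): copy each missing replica to a healthy node lacking it."""
    temp = []
    for item, _home in missing:
        for candidate in sorted(disks):
            if candidate not in failed and item not in disks[candidate]:
                disks[candidate].add(item)
                temp.append((item, candidate))
                break
    return temp


def recover(disks, failed, node, missing, temp):
    """Steps (5)-(7): bring the node back, restore data, delete temp copies."""
    failed.discard(node)
    for item, home in missing:
        if home == node:
            disks[home].add(item)
    for item, holder in temp:
        disks[holder].discard(item)


# Step (2): the disk on node4 fails and loses its contents.
failed.add("node4")
disks["node4"] = set()
missing = find_missing(disks, placement, failed)
temp = copy_to_temporary(disks, failed, missing)
live_copies = {item: sum(item in disks[n] for n in disks if n not in failed)
               for item in placement}
recover(disks, failed, "node4", missing, temp)
```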


Latest Information

History and Trend of the Community
History of OpenStack:
2010.6   OpenStack project starts
2010.10  First release, "Austin"
2013.4   Grizzly
2013.10  Havana
2014.4   Icehouse
(now)
2014.10  Juno
Development trend in Swift:
Hot topics now: Erasure Coding, Storage Policy
[Timeline for each function (Erasure Coding, Storage Policy, Fundamental, Global Cluster), marked as developing or supported]

Latest info: Erasure Coding
Replication: the data is distributed as three full copies. Size: 3x the original.
Erasure Coding: the data is split into two pieces of partial data, and Parity 1 and Parity 2 are added before distribution. Size: 2x the original.
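The 3x and 2x figures follow from simple arithmetic, sketched here in Python. The function names are hypothetical. The fragment counts (two data fragments, two parity fragments) are read off the slide's diagram, and real erasure-coding schemes may use other values.

```python
def replication_overhead(replicas):
    """Stored size relative to the original when keeping full copies."""
    return float(replicas)


def erasure_coding_overhead(data_fragments, parity_fragments):
    """Stored size relative to the original for k data + m parity fragments.

    Each fragment is 1/k of the object, and k + m fragments are stored.
    """
    return (data_fragments + parity_fragments) / data_fragments


def stored_bytes(object_bytes, overhead):
    """Raw capacity consumed by one object."""
    return object_bytes * overhead


replica_cost = replication_overhead(3)       # three full copies
ec_cost = erasure_coding_overhead(2, 2)      # 2 partial data + 2 parity
```

With these values a 1 GB object consumes 3 GB under replication and 2 GB under this erasure-coding layout.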

Latest info: Storage Policy
Before: the same policy applies to the whole cluster.
After: a variety of policies can coexist on one cluster.
[Diagram: before, all data is stored as full copies; after, some data is stored as full copies and other data as partial data plus Parity 1 and Parity 2]

Future Direction of Swift
Two concepts:
1. Integrated Searchable Storage
2. Intelligent Resource Management

Integrated Searchable Storage
Swift should be integrated with search. This means the search function needs to be as scalable, durable, and available as Swift itself.
[Diagram: users, operators, and managers store, get, and search data in the Swift storage system]

Use cases of Search
1. Content search
2. Detection for de-duplication: compare the hash of incoming data with the hashes of stored data to check whether it is already stored.
3. Tiered storage: separate hot content (modified more often) from cold content (modified less often), which can go to cheap storage. Example query: is the modified date older than 05/20/2014?
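Use cases 2 and 3 can be illustrated with a short Python sketch. The function names, the in-memory index, and the choice of SHA-256 are assumptions made for illustration. The slides do not say which hash or index structure the demo application uses.

```python
import hashlib
from datetime import date


def content_hash(data):
    """Hash of the object body (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def already_stored(data, hash_index):
    """De-duplication check: is an object with the same hash indexed?"""
    return content_hash(data) in hash_index


def cold_objects(last_modified, cutoff):
    """Tiering query: names of objects last modified before the cutoff."""
    return sorted(name for name, modified in last_modified.items()
                  if modified < cutoff)


# Hypothetical index contents.
hash_index = {content_hash(b"report-v1")}
last_modified = {"old.jpg": date(2013, 1, 10),
                 "new.jpg": date(2014, 5, 21)}

is_duplicate = already_stored(b"report-v1", hash_index)
is_new = not already_stored(b"report-v2", hash_index)
cold = cold_objects(last_modified, date(2014, 5, 20))
```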

Future: Integrated Searchable Storage
How do we implement it?
Internal: Swift contains both the storage feature (store) and the search feature (index, search).
External: a hook in Swift's storage feature passes stored data to a pre-existing search engine, which indexes and searches it.

Item | Internal | External
Where the search runs | Inside Swift, with a search library (such as Lucene) | External search engine (such as Solr)
Redundancy | High | Depends on the search engine
Availability | High | Depends on the search engine
Scalability | High | Depends on the search engine
Difficulty of implementation | Hard | Easy

Our Implementation
Internal approach: hack Swift to embed the search library.
[Diagram: the application reaches the proxy node through Swift's ordinary API and a new search API. Indexing and search queries are distributed by the Ring. Data goes to the object-server; metadata goes to the account-server and container-server (SQLite, with a Container A DB and a Container B DB). Lucene keeps a Container A index and a Container B index.]

Future: Intelligent Resource Management
Swift has more and more different functions.
Existing processes: proxy-server; account-server, account-replicator, account-auditor, account-reaper; container-server, container-replicator, container-auditor, container-updater, container-sync; object-server, object-replicator, object-auditor, object-updater, object-expirer
Storage Policy: multi-ring support
Erasure Coding: ec-reconstructor, ec-auditor, ec-stripe-auditor
Other arbitrary processes: Search...? Compression...? Encryption...?

Future: Intelligent Resource Management
Resources (IOPS, CPU, network, memory) are drained!
The performance priorities of these functions differ by requirement.
Ex. 1) Store performance vs. search performance
Ex. 2) Service level during business hours vs. outside business hours
[Clock diagram with hours 0, 6, 12, 18: during business hours, high priority goes to processing requests; outside business hours, high priority goes to checking durability]
More intelligent resource management, with cgroups, is necessary.
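The time-based priority switch in Ex. 2 could be expressed as a small policy function. This Python sketch only computes the weights. The business-hour window, the group names, and the numeric values are all assumptions, and the slide does not describe how the values would be applied. One option, under cgroups v1, would be writing them to each group's cpu.shares file, which is deployment-specific and not shown here.

```python
# Assumed business-hour window; the slide's clock only marks 0, 6, 12, 18.
BUSINESS_START = 9
BUSINESS_END = 18

HIGH = 1024   # hypothetical weight for the favored group
LOW = 256     # hypothetical weight for the other group


def is_business_hour(hour):
    """True when the hour (0-23) falls inside the assumed window."""
    return BUSINESS_START <= hour < BUSINESS_END


def cpu_weights_for(hour):
    """Return a weight per process group for the given hour.

    'requests' stands for request-serving processes (e.g. proxy-server),
    'durability' for background checkers (e.g. auditors, replicators).
    """
    if is_business_hour(hour):
        return {"requests": HIGH, "durability": LOW}
    return {"requests": LOW, "durability": HIGH}


daytime = cpu_weights_for(12)
night = cpu_weights_for(2)
```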

Summary
1. What is Swift? Swift is great OSS for storing unstructured data.
2. Swift's latest information: Erasure Coding, Storage Policy
3. Swift's future: Integrated Searchable Storage, Intelligent Resource Management

PR: Demonstration Now Available!
We are exhibiting a demo application (a content delivery system) built with Swift.
It delivers, on demand, a large amount of content (pictures and movies) stored in Swift.
It implements search on Swift (our original implementation).
(map for demo booth)

Q&A: Do you have any questions?
Thank you for your attention! Please contact hagiwarayuj@nttdata.co.jp if you have any questions or comments.

Challenges and Questions
How to integrate Swift with cgroups?
  How to use cgroups? What is the best toolset for cgroups: VFS, libcgroup, or systemd?
  How to control multiple hosts with cgroups dynamically?
How to integrate Swift with search?
  What is the best way to implement it?
  What is the best search middleware?
  How to search multilingual content?