Erasure Coding in Windows Azure Storage
|
|
- Nickolas Briggs
- 7 years ago
- Views:
Transcription
1 Erasure Coding in Windows Azure Storage Cheng Huang Microsoft Corporation Joint work with Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin
2 Outline Introduction to Windows Azure Storage (WAS) Conventional Erasure Coding in WAS Local Reconstruction Coding in WAS 2
3 North America Region Europe Region Asia Pacific Region Major datacenter CDN PoPs 3
4 Windows Azure Storage Abstractions Blobs File store in the cloud CDN High performance file delivery through proximity caching Drives Durable NTFS volumes for Windows Azure applications Tables Massively scalable NoSQL storage Queues Reliable storage and delivery of messages Easy client access Easy to use REST APIs and Client Libraries Existing NTFS APIs for Windows Azure Drives 4
5 Massive Distributed Storage Systems in Cloud Failures are norm rather than exception As the number of components increase, so does the probability of failure MTTFFirst = MTTFOne / Redundancy is necessary to cope with failures Replication vs. Erasure Coding? n 5
6 Replication vs. Erasure Coding replication a=2 erasure coding a=2 a=2 b=3 b=3 b=3 a=2 a+b=5 a=2 b=3 6
7 Replication vs. Erasure Coding replication a=2 erasure coding a=2 a=2 b=3 b=3 b=3 a=2 a+b=5 b=3 2a+b=7 a=2 b=3 a=2, b=3 7
8 WAS Stream Layer Append-only distributed file system Provides intra-stamp redundancy Streams are very large files Has file system like namespace Ordered list of pointers to extents Extents Unit of replication Sequence of blocks Partition Layer Stream Layer Intra-stamp replication Storage Stamp Storage Location Service Inter-stamp (Geo) replication Partition Layer Stream Layer Intra-stamp replication Storage Stamp 8
9 Replication and Lazy Erasure Coding Extents triple replicated when first created while being appended 9
10 Replica Placement in a Storage Stamp 10
11 Replication and Lazy Erasure Coding Extents triple replicated when first created while being appended Extents lazily erasure coded in the background extents sealed at around 3GB when erasure coding finishes, full replicas are deleted policies to choose between replication, erasure coding, or a mix 11
12 Conventional Erasure Coding RS 6+3 sealed extent ( 3 GB ) p 1 d 0 d 1 d 2 d 3 d 4 d 5 p 2 Reed-Solomon 6+3 p 3 12
13 Erasure Coding Flow Paxos ack with metadata SM SM SM start erasure coding an extent EN 1 EN 2 EN 3 EN 13
14 Space Savings (RS 6+3 over 3-replication) Savings Replicated Data Fragments Code Fragments 14
15 How to Further Reduce Storage Cost? sealed extent ( 3 GB ) overhead (6+3)/6 = 1.5x d 0 d 1 d 2 d 3 d 4 d 5 p 0 p 1 p 2 15
16 How to Further Reduce Storage Cost? sealed extent ( 3 GB ) overhead (6+3)/6 = 1.5x d 0 d 1 d 2 d 3 d 4 d 5 p 0 p 1 p 2 (12+4)/12 = 1.33x d 0 d 1 d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 10 d 11 p 0 p 1 p 2 p 3 16
17 Challenge p 0 d 0 d 1 d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 10 d 11 p 1 reconstruction read p 2 p 3 17
18 Reconstruction Read Partition Layer Paxos SM SM SM read data reconstruction read EN 1 EN 2 EN 3 EN 18
19 Challenge p 0 d 0 d 1 d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 10 d 11 p 1 reconstruction read is expensive reading d 0 during unavailability requiring 12 fragments (12 disk I/Os, 12 net transfers) p 2 p 3 19
20 Reconstruction Read When? Load balancing avoid hot storage node serve reads via reconstruction Rolling upgrade Transient unavailability and permanent failures can we achieve1.33x overhead while requiring only 6 fragments for reconstruction? 20
21 Opportunity Conventional EC all failures are equal same reconstruction cost, regardless of failure # Cloud storage Prob(1 failure) >> Prob(2 or more failures) optimize erasure coding for cloud storage making single failure reconstruction most efficient 21
22 Local Reconstruction Code sealed extent ( 3 GB ) x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 LRC : 12 data fragments, 2 local parities and 2 global parities storage overhead: ( ) / 12 = 1.33x Local parity: reconstruction requires only 6 fragments 22
23 One More Thing Ensuring Reliability in LRC LRC needs to recover arbitrary 3 failures as many 4 failures as possible 23
24 Recover 3 Failures 1 st example q 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q 1 p x p y recover y 1 from p y (group y) 24
25 Recover 3 Failures 1 st example q 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q 1 p x p y recover y 1 from p y (group y) recover x 0 and x 2 from q 0 and q 1 25
26 Recover 3 Failures 2 nd example q 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q 1 p x p y 26
27 Recover 3 Failures 2 nd example q ' 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q ' 1 p x p y cancel all the y i s from global parities q 0 and q 1 27
28 Recover 4 Failures 1 st example q 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q 1 p x p y p y is redundant 3 parities cannot recover 4 failures 28
29 Recover 4 Failures 2 nd example q 0 x 0 x 1 x 2 x 3 x 4 x 5 y 0 y 1 y 2 y 3 y 4 y 5 q 1 p x p y Maximal Recoverability swap failure with local parity count remaining failures and global parities 29
30 Properties of LRC Achieving maximal recoverability LRC : arbitrary 3 failures and 86% of 4 failures reliability: RS 12+4 > LRC > RS 6+3 technical paper published at USENIX ATC 2012 Requiring minimum storage overhead, given reconstruction cost fault tolerance technical paper to appear in IEEE Trans. on Information Theory 30
31 Cost & Performance Analysis Vary LRC parameters trade-off points in 3D space storage overhead reconstruction cost reliability (MTTDL) Reliability is a hard requirement set MTTDL 3-replication as target reduce trade-off space to 2D 31
32 LRC vs. Reed-Solomon RS LRC (12+2+2) LRC (12+4+2) Reconstruction Read Cost 10 RS 10+4 same cost 1.5x 1.33x 8 RS 6+3 same overhead 6 half cost (6 3) Storage Overhead Reed-Solomon LRC RS 10+4 : HDFS-RAID at Facebook RS 6+3 : GFS II (Colossus) at Google 32
33 Choice of Windows Azure Storage RS (6 + 3) reconstruction cost = 6 RS (14 + 4) reconstruction cost = 14 LRC ( ) reconstruction cost = 7 14% savings 33
34 Encoding/Decoding Complexity Optimized erasure coding implementation homomorphic transformation convert complex finite field matrix operations to XOR ing large data blocks XOR scheduling further improves throughput Encoding/decoding overhead negligible, compared with network/disk overheads 34
35 Additional Lessons I/O scheduling reconstruction/recovery/on-demand traffic prioritized and throttled carefully erasure encoding rate kept up with data injection rate Data consistency checksum handoff and verification between all levels scrub periodically 35
36 Summary Erasure coding enables significant storage cost savings in Windows Azure Storage with higher reliability than 3-replication LRC achieves additional 14% savings without compromising performance Windows Azure Storage Team Blog 36
Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft
Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft Agenda: Windows Azure Storage Overview Architecture Key Design Points 2 Overview Windows Azure Storage Cloud Storage - Anywhere and
More informationCloud Scale Storage Systems. Sean Ogden October 30, 2013
Cloud Scale Storage Systems Sean Ogden October 30, 2013 Evolution P2P routing/dhts (Chord, CAN, Pastry, etc.) P2P Storage (Pond, Antiquity) Storing Greg's baby pictures on machines of untrusted strangers
More informationWindows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency
Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency Brad Calder, Ju Wang, Aaron Ogus, Niranjan Nilakantan, Arild Skjolsvold, Sam McKelvie, Yikang Xu, Shashwat Srivastav,
More informationHow To Manage A Network On A Server On A Microsoft Server On An Ipad Or Ipad (For A Supermicroserve) On A Network (For An Ipa) On An Uniden (For Microsoft) Server On Your
Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency Brad Calder, Ju Wang, Aaron Ogus, Niranjan Nilakantan, Arild Skjolsvold, Sam McKelvie, Yikang Xu, Shashwat Srivastav,
More informationA Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster
A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster K. V. Rashmi 1, Nihar B. Shah 1, Dikang Gu 2, Hairong Kuang
More informationINDIA 28-30 September 2011 virtual techdays
Building highly Available Services on Windows Azure Platform Pooja Singh Technical Architect, Accenture Aakash Sharma Technical Lead, Accenture Laxmikant Bhole Senior Architect, Accenture Assumptions You
More informationPractical Data Integrity Protection in Network-Coded Cloud Storage
Practical Data Integrity Protection in Network-Coded Cloud Storage Henry C. H. Chen Department of Computer Science and Engineering The Chinese University of Hong Kong Outline Introduction FMSR in NCCloud
More informationCSE-E5430 Scalable Cloud Computing P Lecture 5
CSE-E5430 Scalable Cloud Computing P Lecture 5 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 12.10-2015 1/34 Fault Tolerance Strategies for Storage
More informationGoogle File System. Web and scalability
Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might
More informationCloud Storage over Multiple Data Centers
Cloud Storage over Multiple Data Centers Shuai MU, Maomeng SU, Pin GAO, Yongwei WU, Keqin LI, Albert ZOMAYA 0 Abstract The increasing popularity of cloud storage services has led many companies to migrate
More informationDisk Array Data Organizations and RAID
Guest Lecture for 15-440 Disk Array Data Organizations and RAID October 2010, Greg Ganger 1 Plan for today Why have multiple disks? Storage capacity, performance capacity, reliability Load distribution
More informationwww.basho.com Technical Overview Simple, Scalable, Object Storage Software
www.basho.com Technical Overview Simple, Scalable, Object Storage Software Table of Contents Table of Contents... 1 Introduction & Overview... 1 Architecture... 2 How it Works... 2 APIs and Interfaces...
More informationWindows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com
Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com David Pallmann, Neudesic Windows Azure Windows Azure is the foundation of Microsoft s Cloud Platform It is an Operating
More informationINTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) PERCEIVING AND RECOVERING DEGRADED DATA ON SECURE CLOUD
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- ISSN 0976 6367(Print) ISSN 0976 6375(Online) Volume 4,
More informationData Protection Technologies: What comes after RAID? Vladimir Sapunenko, INFN-CNAF HEPiX Spring 2012 Workshop
Data Protection Technologies: What comes after RAID? Vladimir Sapunenko, INFN-CNAF HEPiX Spring 2012 Workshop Arguments to be discussed Scaling storage for clouds Is RAID dead? Erasure coding as RAID replacement
More informationSharePoint & Azure: Digital Asset Management
SharePoint & Azure: Digital Asset Management Project Leadership Microsoft Solutions Provider Proven Results www.attunix.com Introduction Attunix Corporation: A Bellevue, WA based business & technology
More informationDesigning a Cloud Storage System
Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes
More informationLecture 36: Chapter 6
Lecture 36: Chapter 6 Today s topic RAID 1 RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f. one large disk) Parallelism improves performance Plus extra disk(s) for
More informationTECHNICAL WHITE PAPER: ELASTIC CLOUD STORAGE SOFTWARE ARCHITECTURE
TECHNICAL WHITE PAPER: ELASTIC CLOUD STORAGE SOFTWARE ARCHITECTURE Deploy a modern hyperscale storage platform on commodity infrastructure ABSTRACT This document provides a detailed overview of the EMC
More informationReliability and Fault Tolerance in Storage
Reliability and Fault Tolerance in Storage Dalit Naor/ Dima Sotnikov IBM Haifa Research Storage Systems 1 Advanced Topics on Storage Systems - Spring 2014, Tel-Aviv University http://www.eng.tau.ac.il/semcom
More informationStorage Codes: Managing Big Data with Small Overheads
Storage Codes: Managing Big Data with Small Overheads Anwitaman Datta, Frédérique Oggier Nanyang Technological University, Singapore {anwitaman,frederique}@ntu.edu.sg WWW: http://sands.sce.ntu.edu.sg/codingfornetworkedstorage/
More informationHDFS: Hadoop Distributed File System
Istanbul Şehir University Big Data Camp 14 HDFS: Hadoop Distributed File System Aslan Bakirov Kevser Nur Çoğalmış Agenda Distributed File System HDFS Concepts HDFS Interfaces HDFS Full Picture Read Operation
More informationOnline Code Rate Adaptation in Cloud Storage Systems with Multiple Erasure Codes
Online Code Rate Adaptation in Cloud Storage Systems with Multiple Erasure Codes Rui Zhu 1, Di Niu 1, Zongpeng Li 2 1 Department of Electrical and Computer Engineering, University of Alberta, Edmonton,
More informationHadoop Scalability at Facebook. Dmytro Molkov (dms@fb.com) YaC, Moscow, September 19, 2011
Hadoop Scalability at Facebook Dmytro Molkov (dms@fb.com) YaC, Moscow, September 19, 2011 How Facebook uses Hadoop Hadoop Scalability Hadoop High Availability HDFS Raid How Facebook uses Hadoop Usages
More informationDistributed Systems. Tutorial 12 Cassandra
Distributed Systems Tutorial 12 Cassandra written by Alex Libov Based on FOSDEM 2010 presentation winter semester, 2013-2014 Cassandra In Greek mythology, Cassandra had the power of prophecy and the curse
More informationSnapshots in Hadoop Distributed File System
Snapshots in Hadoop Distributed File System Sameer Agarwal UC Berkeley Dhruba Borthakur Facebook Inc. Ion Stoica UC Berkeley Abstract The ability to take snapshots is an essential functionality of any
More informationCS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction
More informationCS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction
More informationApache Hadoop. Alexandru Costan
1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open
More informationDistributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
More informationCompare Cost and Performance of Replication and Erasure Coding
Hitachi Review Vol. 63 (July 2014) 304 WHITE PAPER Compare Cost and Performance of Replication and Erasure Coding John D. Cook Robert Primmer Ab de Kwant OVERVIEW: Data storage systems are more reliable
More informationData Corruption In Storage Stack - Review
Theoretical Aspects of Storage Systems Autumn 2009 Chapter 2: Double Disk Failures André Brinkmann Data Corruption in the Storage Stack What are Latent Sector Errors What is Silent Data Corruption Checksum
More informationThe last 18 months. AutoScale. IaaS. BizTalk Services Hyper-V Disaster Recovery Support. Multi-Factor Auth. Hyper-V Recovery.
Offline Operations Traffic ManagerLarge Memory SKU SQL, SharePoint, BizTalk Images HDInsight Windows Phone Support Per Minute Billing HTML 5/CORS Android Support Custom Mobile API AutoScale BizTalk Services
More informationHypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
More informationHow To Build Cloud Storage On Google.Com
Building Scalable Cloud Storage Alex Kesselman alx@google.com Agenda Desired System Characteristics Scalability Challenges Google Cloud Storage What does a customer want from a cloud service? Reliability
More informationRAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead
RAID # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead Tiffany Yu-Han Chen (These slides modified from Hao-Hua Chu National Taiwan University) RAID 0 - Striping
More informationDesign and Evolution of the Apache Hadoop File System(HDFS)
Design and Evolution of the Apache Hadoop File System(HDFS) Dhruba Borthakur Engineer@Facebook Committer@Apache HDFS SDC, Sept 19 2011 Outline Introduction Yet another file-system, why? Goals of Hadoop
More informationCloud Computing Is In Your Future
Cloud Computing Is In Your Future Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com http://www.reliablesoftware.com/dasblog/default.aspx Cloud Computing is Utility Computing Illusion
More informationDistributed Systems (5DV147) What is Replication? Replication. Replication requirements. Problems that you may find. Replication.
Distributed Systems (DV47) Replication Fall 20 Replication What is Replication? Make multiple copies of a data object and ensure that all copies are identical Two Types of access; reads, and writes (updates)
More informationTHE HADOOP DISTRIBUTED FILE SYSTEM
THE HADOOP DISTRIBUTED FILE SYSTEM Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Presented by Alexander Pokluda October 7, 2013 Outline Motivation and Overview of Hadoop Architecture,
More informationHadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at the Storage Developer Conference, Santa Clara September 15, 2009 Outline Introduction
More informationThe Hadoop Distributed File System
The Hadoop Distributed File System Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Yahoo! Sunnyvale, California USA {Shv, Hairong, SRadia, Chansler}@Yahoo-Inc.com Presenter: Alex Hu HDFS
More informationGrowth of Unstructured Data & Object Storage. Marcel Laforce Sr. Director, Object Storage
Growth of Unstructured Data & Object Storage Marcel Laforce Sr. Director, Object Storage Agenda Unstructured Data Growth Contrasting approaches: Objects, Files & Blocks The Emerging Object Storage Market
More informationHDFS 2015: Past, Present, and Future
Apache: Big Data Europe 2015 HDFS 2015: Past, Present, and Future 9/30/2015 NTT DATA Corporation Akira Ajisaka Copyright 2015 NTT DATA Corporation Self introduction Akira Ajisaka (NTT DATA) Apache Hadoop
More informationBuilding Storage Clouds for Online Applications A Case for Optimized Object Storage
Building Storage Clouds for Online Applications A Case for Optimized Object Storage Agenda Introduction: storage facts and trends Call for more online storage! AmpliStor: Optimized Object Storage Cost
More informationOverview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics
Overview Big Data in Apache Hadoop - HDFS - MapReduce in Hadoop - YARN https://hadoop.apache.org 138 Apache Hadoop - Historical Background - 2003: Google publishes its cluster architecture & DFS (GFS)
More informationCloud Computing with Windows Azure. beat schwegler microsoft western europe beatsch@microsoft.com
Cloud Computing with Windows Azure beat schwegler microsoft western europe beatsch@microsoft.com why? cheaper. risk mitigation. agility. what? elastic compute. scalable storage. network topology. how?
More informationThe Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc.
The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms Abhijith Shenoy Engineer, Hedvig Inc. @hedviginc The need for new architectures Business innovation Time-to-market
More informationIBM System x GPFS Storage Server
IBM System x GPFS Storage Server Schöne Aussicht en für HPC Speicher ZKI-Arbeitskreis Paderborn, 15.03.2013 Karsten Kutzer Client Technical Architect Technical Computing IBM Systems & Technology Group
More informationPelican: A Building Block for Exascale Cold Data Storage
Pelican: A Building Block for Exascale Cold Data Storage Austin Donnelly Microsoft Research and: Shobana Balakrishnan, Richard Black, Adam Glass, Dave Harper, Sergey Legtchenko, Aaron Ogus, Eric Peterson,
More informationCloud Computing with Microsoft Azure
Cloud Computing with Microsoft Azure Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com http://www.reliablesoftware.com/dasblog/default.aspx Azure's Three Flavors Azure Operating
More informationData Management in the Cloud
Data Management in the Cloud Ryan Stern stern@cs.colostate.edu : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server
More informationService Description Cloud Storage Openstack Swift
Service Description Cloud Storage Openstack Swift Table of Contents Overview iomart Cloud Storage... 3 iomart Cloud Storage Features... 3 Technical Features... 3 Proxy... 3 Storage Servers... 4 Consistency
More informationDSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group
DSS High performance storage pools for LHC Łukasz Janyst on behalf of the CERN IT-DSS group CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Introduction The goal of EOS is to provide a
More informationCloud Service Model. Selecting a cloud service model. Different cloud service models within the enterprise
Cloud Service Model Selecting a cloud service model Different cloud service models within the enterprise Single cloud provider AWS for IaaS Azure for PaaS Force fit all solutions into the cloud service
More informationAt-Scale Data Centers & Demand for New Architectures
Allen Samuels At-Scale Data Centers & Demand for New Architectures Software Architect, Software and Systems Solutions August 12, 2015 1 Forward-Looking Statements During our meeting today we may make forward-looking
More informationReliability of Data Storage Systems
Zurich Research Laboratory Ilias Iliadis April 2, 25 Keynote NexComm 25 www.zurich.ibm.com 25 IBM Corporation Long-term Storage of Increasing Amount of Information An increasing amount of information is
More informationThe Design and Implementation of the Zetta Storage Service. October 27, 2009
The Design and Implementation of the Zetta Storage Service October 27, 2009 Zetta s Mission Simplify Enterprise Storage Zetta delivers enterprise-grade storage as a service for IT professionals needing
More informationIntroduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and
Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and Jaliya Ekanayake Range in size from edge facilities
More informationBig Data Processing in the Cloud. Shadi Ibrahim Inria, Rennes - Bretagne Atlantique Research Center
Big Data Processing in the Cloud Shadi Ibrahim Inria, Rennes - Bretagne Atlantique Research Center Data is ONLY as useful as the decisions it enables 2 Data is ONLY as useful as the decisions it enables
More informationLearning Management Redefined. Acadox Infrastructure & Architecture
Learning Management Redefined Acadox Infrastructure & Architecture w w w. a c a d o x. c o m Outline Overview Application Servers Databases Storage Network Content Delivery Network (CDN) & Caching Queuing
More informationFiling Systems. Filing Systems
Filing Systems At the outset we identified long-term storage as desirable characteristic of an OS. EG: On-line storage for an MIS. Convenience of not having to re-write programs. Sharing of data in an
More informationHow To Encrypt Data With A Power Of N On A K Disk
Towards High Security and Fault Tolerant Dispersed Storage System with Optimized Information Dispersal Algorithm I Hrishikesh Lahkar, II Manjunath C R I,II Jain University, School of Engineering and Technology,
More informationPARALLELS CLOUD STORAGE
PARALLELS CLOUD STORAGE Performance Benchmark Results 1 Table of Contents Executive Summary... Error! Bookmark not defined. Architecture Overview... 3 Key Features... 5 No Special Hardware Requirements...
More informationApache Hadoop FileSystem Internals
Apache Hadoop FileSystem Internals Dhruba Borthakur Project Lead, Apache Hadoop Distributed File System dhruba@apache.org Presented at Storage Developer Conference, San Jose September 22, 2010 http://www.facebook.com/hadoopfs
More informationDeploying Flash- Accelerated Hadoop with InfiniFlash from SanDisk
WHITE PAPER Deploying Flash- Accelerated Hadoop with InfiniFlash from SanDisk 951 SanDisk Drive, Milpitas, CA 95035 2015 SanDisk Corporation. All rights reserved. www.sandisk.com Table of Contents Introduction
More informationLong term retention and archiving the challenges and the solution
Long term retention and archiving the challenges and the solution NAME: Yoel Ben-Ari TITLE: VP Business Development, GH Israel 1 Archive Before Backup EMC recommended practice 2 1 Backup/recovery process
More informationFile System Reliability (part 2)
File System Reliability (part 2) Main Points Approaches to reliability Careful sequencing of file system opera@ons Copy- on- write (WAFL, ZFS) Journalling (NTFS, linux ext4) Log structure (flash storage)
More informationMassive Data Storage
Massive Data Storage Storage on the "Cloud" and the Google File System paper by: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung presentation by: Joshua Michalczak COP 4810 - Topics in Computer Science
More informationarxiv:1302.3344v2 [cs.dc] 6 Jun 2013
CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems arxiv:1302.3344v2 [cs.dc] 6 Jun 2013 Runhui Li, Jian Lin, Patrick Pak-Ching Lee Department
More informationNetwork Coding for Distributed Storage
Network Coding for Distributed Storage Alex Dimakis USC Overview Motivation Data centers Mobile distributed storage for D2D Specific storage problems Fundamental tradeoff between repair communication and
More informationOceanStor UDS Massive Storage System Technical White Paper Reliability
OceanStor UDS Massive Storage System Technical White Paper Reliability Issue 1.1 Date 2014-06 HUAWEI TECHNOLOGIES CO., LTD. 2013. All rights reserved. No part of this document may be reproduced or transmitted
More informationTo Provide Security & Integrity for Storage Services in Cloud Computing
To Provide Security & Integrity for Storage Services in Cloud Computing 1 vinothlakshmi.s Assistant Professor, Dept of IT, Bharath Unversity, Chennai, TamilNadu, India ABSTRACT: we propose in this paper
More informationSWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL davidh@il.ibm.com 2012 IBM Corporation
Page:1 Openstack Swift Object Store Cloud built from the grounds up David Hadas Swift ATC HRL davidh@il.ibm.com Page:2 Object Store Cloud Services Expectations: PUT/GET/DELETE Huge Capacity (Scale) Always
More informationA Hitchhiker s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers
A Hitchhiker s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers K V Rashmi 1, Nihar B Shah 1, Dikang Gu 2, Hairong Kuang 2, Dhruba Borthakur 2, and Kannan Ramchandran 1 ABSTRACT
More informationJournal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
Journal of science e ISSN 2277-3290 Print ISSN 2277-3282 Information Technology www.journalofscience.net STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) S. Chandra
More informationScaling Analysis Services in the Cloud
Our Sponsors Scaling Analysis Services in the Cloud by Gerhard Brückl gerhard@gbrueckl.at blog.gbrueckl.at About me Gerhard Brückl Working with Microsoft BI since 2006 Windows Azure / Cloud since 2013
More informationThe Google File System
The Google File System By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Presented at SOSP 2003) Introduction Google search engine. Applications process lots of data. Need good file system. Solution:
More informationHSS: A simple file storage system for web applications
HSS: A simple file storage system for web applications Abstract AOL Technologies has created a scalable object store for web applications. The goal of the object store was to eliminate the creation of
More informationChristian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg
DAAD Summerschool Curitiba 2011 Aspects of Large Scale High Speed Computing Building Blocks of a Cloud Storage Networks 2: Virtualization of Storage: RAID, SAN and Virtualization Christian Schindelhauer
More informationhttp://cloud.dailymotion.com July 2014
July 2014 Dailymotion Cloud Positioning Two video platforms based on one infrastructure Dailymotion.com DELIVER, SHARE AND MONETIZE YOUR VIDEO CONTENT Online sharing videos platform Dailymotion Cloud CONCRETIZE
More informationErasure Coding for Cloud Storage Systems: A Survey
TSINGHUA SCIENCE AND TECHNOLOGY ISSNll1007-0214ll06/11llpp259-272 Volume 18, Number 3, June 2013 Erasure Coding for Cloud Storage Systems: A Survey Jun Li and Baochun Li Abstract: In the current era of
More informationApache Hadoop FileSystem and its Usage in Facebook
Apache Hadoop FileSystem and its Usage in Facebook Dhruba Borthakur Project Lead, Apache Hadoop Distributed File System dhruba@apache.org Presented at Indian Institute of Technology November, 2010 http://www.facebook.com/hadoopfs
More informationScality RING. Software Defined Storage for the 21st Century. Philippe Nicolas Director of Product Strategy
Scality RING Software Defined Storage for the 21st Century Philippe Nicolas Director of Product Strategy Table of Contents 1 Executive Summary The Need for Massively Scalable Storage... 4 2 Requirements
More informationWindows Azure and private cloud
Windows Azure and private cloud Joe Chou Senior Program Manager China Cloud Innovation Center Customer Advisory Team Microsoft Asia-Pacific Research and Development Group 1 Agenda Cloud Computing Fundamentals
More informationPetabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013
Petabyte Scale Data at Facebook Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Agenda 1 Types of Data 2 Data Model and API for Facebook Graph Data 3 SLTP (Semi-OLTP) and Analytics
More informationFinding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems
Finding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems 1 Some Numbers (2010) Over 260 Billion images (20 PB) 65 Billion X 4 different sizes for each image. 1 Billion
More informationSolbox Cloud Storage Acceleration
DATA SHEET Solbox Cloud Storage Acceleration Today s ongoing and rapidly-accelerating growth in data comes at the same time that organizations of all sizes are focused on cost deduction. Cloud storage
More informationDFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing
DFSgc Distributed File System for Multipurpose Grid Applications and Cloud Computing Introduction to DFSgc. Motivation: Grid Computing currently needs support for managing huge quantities of storage. Lacks
More informationModule: Business Continuity
Upon completion of this module, you should be able to: Describe business continuity and cloud service availability Describe fault tolerance mechanisms for cloud infrastructure Discuss data protection solutions
More informationHDFS and Availability Data Retragement
마스터 제목 스타일 편집 마스터 부제목 Availability 스타일 편집 and Data durability in HDFS 3 Jun 2011 nfracatals, 고등기술 연구소 / 이문수 moon@nfractals.com Company Profile and Business 1 Who we are? Since 2009 Consulting Solution
More informationDiagram 1: Islands of storage across a digital broadcast workflow
XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,
More informationBig Data Storage Architecture Design in Cloud Computing
Big Data Storage Architecture Design in Cloud Computing Xuebin Chen 1, Shi Wang 1( ), Yanyan Dong 1, and Xu Wang 2 1 College of Science, North China University of Science and Technology, Tangshan, Hebei,
More informationIntroduction to Azure: Microsoft s Cloud OS
Introduction to Azure: Microsoft s Cloud OS DI Andreas Schabus Technology Advisor Microsoft Österreich GmbH aschabus@microsoft.com www.codefest.at Version 1.0 Agenda Cloud Computing Fundamentals Windows
More informationGPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " 4 April 2013"
GPFS Storage Server Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " Agenda" GPFS Overview" Classical versus GSS I/O Solution" GPFS Storage Server (GSS)" GPFS Native RAID
More informationBig Data Benchmark. The Cloudy Approach. Dhruba Borthakur Nov 7, 2012 at the SDSC Industry Roundtable on Big Data and Big Value
Big Data Benchmark The Cloudy Approach Dhruba Borthakur Nov 7, 2012 at the SDSC Industry Roundtable on Big Data and Big Value My Asks for a Cloudy Benchmark 1 BigData system is not a Traditional Database
More informationReferences. Introduction to Database Systems CSE 444. Motivation. Basic Features. Outline: Database in the Cloud. Outline
References Introduction to Database Systems CSE 444 Lecture 24: Databases as a Service YongChul Kwon Amazon SimpleDB Website Part of the Amazon Web services Google App Engine Datastore Website Part of
More informationIntroduction to Database Systems CSE 444
Introduction to Database Systems CSE 444 Lecture 24: Databases as a Service YongChul Kwon References Amazon SimpleDB Website Part of the Amazon Web services Google App Engine Datastore Website Part of
More informationHadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela
Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance
More informationDistributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
More information