Cooling Down Ceph. Exploration and Evaluation of Cold Storage Techniques
|
|
|
- Arthur Morrison
- 9 years ago
- Views:
Transcription
1 Cooling Down Ceph Exploration and Evaluation of Cold Storage Techniques
2 Cold Storage Ceph Cooling Down Ceph Object Stubs Striper Prefix Hashing
3 Facebook photos [1]: 82% reads to 8% stored data Scientific data system [2]: 50% reads to 5% stored data [1] T. P. Morgan. Facebook Rolls Out New Web and Database Server Designs. _facebook_servers/, 2013 [2] M. Grawinkel, L. Nagel, M. Mäsker, F. Padua, A. Brinkmann, and L. Sorth. Analysis of the ECMW Storage Landscape. Proc. of the 13th USENIX Conference on File and Storage Technologies (FAST),
4 Cold storage is an operational mode or a method operation of a data storage device or system for inactive data where an explicit trade-off is made, resulting in data retrieval response times beyond what may be considered normally acceptable to online or production applications in order to archive significant capital and operational savings IDC Technology Assessment: Cold Storage Is Hot Again - Finding the Frost Point (2013) 4
5
6 Distributed storage system No single point of failure Horizontal scaling Run on commodity hardware 6
7 7
8 MON MON MON Client 8
9 Monitor Keeps the Cluster Map Distributed Consensus MON MON Not in data path MON Client 9
10 Client Computes placement based on Cluster Map Directly accesses s and MONs MON MON MON Client 10
11 Stores Objects Manages replication MON MON Placement MON Client 11
12 Pool PG PG 12
13 Pool s Buckets Pool Type PG Rack, Server, Disk, Type PG Replicated Erasure Coded Placement Groups 13
14 Placement Groups Pool Abstraction for placement computation PG ~ 100 per PG 14
15 Object Pool Data 4 MiB PG Name PG Matters Object Map 15
16 Pool Object Storage Device PG ~ 1 per Disk Replication PG Erasure Coding 16
17 Placement Object PG Hash CRUSH Cluster Map PG 17
18 Cooling Down Ceph
19 Ceph s Cold Storage Features Cache Tiering Erasure Coding Metadata-aware clients Semantic Pool Selection Metadata for Later Placement Striper Prefix Hashing Extra Placement Information Redirection and Stubbing Object Redirects Object Stubs Object Store Backend to Archive System Journal Cache 19
20 Ceph s Cold Storage Features Cache Tiering Erasure Coding Metadata-aware clients Semantic Pool Selection Metadata for Later Placement Striper Prefix Hashing Extra Placement Information Redirection and Stubbing Object Redirects Object Stubs Object Store Backend to Archive System Journal Cache 20
21 Object Stubs stub stub Archive System 21
22 Implementation As part of the Transparent to the clients New RADOS Operations Stub Unstub Implicit unstub 22
23 New RADOS Ops: Stub, Unstub Stub Server Stub Server stub IN stub OUT SET XAttrs RM XAttrs - URL - URL 23
24 Implementation: Stub, Unstub UNSTUB STUB Read Object HTTP PUT Get Xattr "stub_uri" HTTP GET url HTTP DELETE url TRUNCATE 0 SETXATTR "stub_uri" url WRITEFULL ect RMXATTR "stub_uri" 24
25 Operation B.5 operations that need ect data 59 zero X setxattrs Table B.1: Ceph operations Implicit Unstub Needs Object Data? Operation read X cache-flush X stat cache-evict X mapext X cache-try-flush X masktrunc X tmap2omap X sparse-read X set-alloc-hint notify notify-ack redirect unredirect assert-version clonerange X list-watchers list-snaps assert-src-version src-cmpxattr sync_read X getxattr write X getxattrs writefull X cmpxattr truncate X setxattr zero X setxattrs delete resetxattrs append X rmxattr startsync pull X settrunc push X trimtrunc X balance-reads X Needs Object Data? delete resetxattrs append X rmxattr startsync pull X settrunc push X trimtrunc X balance-reads X tmapup X unbalance-reads X tmapput X scrub X tmapget X scrub-reserve X create X scrub-unreserve X rollback X scrub-stop B.6Xstub watch scrub-map X omap-get-keys omap-get-vals omap-get-header Operation omap-get-vals-by-keys omap-set-vals omap-set-header omap-clear omap-rm-keys omap-cmp wrlock Table B.1: Ceph wrunlock operations Needs Object Data? rdlock Operation rdunlock uplock Needs Object Data? dnlock Continued on next pa call pgls pgls-filter copy-from X pg-hitset-ls copy-get-classic X pg-hitset-get undirty isdirty copy-get X pgnls pgnqls-filter
26 Benefits Supports links to external storage systems Stubbed Snapshots => Backup 26
27 Striper Prefix Hashing File File 27
28 converts an ect ID (ect_t oid) and ect locator (ect_locator_t Listing C.1: Original hash_key implementation loc, contains the pool ID) to a placement group (pg_t pg). uint32_t pg_pool_t::hash_key(const string& key, Full source code of the implementation, including the simulator, are pub- Implementation lished on github 1. Listing C.1 const showsstring& the original ns) const implementation; Listing { C.2 shows the implementation with Striper Prefix Hashing. Listing C.1: Original hash_key implementation } uint32_t pg_pool_t::hash_key(const string& key, { } string n = make_hash_str(key, ns); return ceph_str_hash(ect_hash, n.c_str(), n.length()); The changed method cuts off const the ect s string& name ns) const at the first dot found in the string inkey, which happens to be the sequence number part for ect string n = make_hash_str(key, ns); names generated by the Ceph Striper. return ceph_str_hash(ect_hash, n.c_str(), n.length()); Listing C.2: Changed hash_key implementation uint32_t The changed pg_pool_t::hash_key(const method cuts off the ect s string& name inkey, at the first dot found in the string inkey, which happens const to bestring& the sequence ns) const number part for ect names generated by the Ceph Striper. { string key(inkey); Listing C.2: Changed hash_key implementation if (flags & FLAG_HASHPSONLYPREFIX) { uint32_t pg_pool_t::hash_key(const string& inkey, string::size_type n = inkey.find("."); const string& ns) const { if (n!= string::npos) { string key(inkey); key = inkey.substr(0, n) ; } if (flags & FLAG_HASHPSONLYPREFIX) { } string::size_type n = inkey.find("."); } string n = make_hash_str(key, ns); if (n!= string::npos) { return ceph_str_hash(ect_hash, n.c_str(), n.length()); key = inkey.substr(0, n) ; } 28
29 Methodology ECMWF ECFS HPSS dump [1] 137 million files 14.8 PiB 10% random sample Simulator 29 [1] M. Grawinkel, L. Nagel, M. Mäsker, F. Padua, A. Brinkmann, and L. Sorth. Analysis of the ECMW Storage Landscape. Proc. of the 13th USENIX Conference on File and Storage Technologies (FAST), 2015
30 Table 5.1: ECFS HPSS Dump Properties Property Value Unit Workload Total #files 14,950,112 Total used capacity PiB Max file size 32 GiB ECFS HPSS 10% random sample 5.2 striper prefix hashes Size of Objects 6 4 MiB PiB Size of Objects > 4 MiB PiB Table 5.1: ECFS HPSS Dump Properties Property Value Unit Total #files 14,950,112 Total used capacity PiB Max file size 32 GiB Size of Objects 6 4 MiB PiB Size of Objects > 4 MiB PiB #Files MiB MiB-8MiB MiB-16MiB MiB-32MiB MiB-64MiB MiB-128MiB MiB-256MiB MiB-512MiB MiB-1GiB GiB-2GiB GiB-4GiB GiB-8GiB GiB-16GiB GiB-32GiB Figure 5.2: Histogram: File size distribution. Each bucket inclusive e.g 0<x6 4 MiB 30
31 Simulator 5.2 striper prefix ha 600 s PGs 45 minutes on 32 cores Simulator Table 5.2: Benchmark settings Hash Algorithm 1 RJenkins No 2 Linux No 3 RJenkins Yes 4 Linux Yes Prefix Hash Enabled? Simulating Ceph s placement decision by running a whole Ceph with Ceph FS was quickly found to be infeasible in terms of run For this test s, MONs and MDS components ran on a single s s ran on top of a FUSE overlay filesystem called blackholefs 2 cially designed to intercept writes. A special 31 simulator running only the bare minimum of Ceph and
32 pendix A contains more detailed data and values for the plots shown here. Distinct s per File Distinct s per File Table 5.3 shows statistics for the number of distinct primary s of all files in the workload or: that Does are larger than it 4work? MiB, because the number of s per file is always 1 for files with only one ect. Table 5.3: Distinct primary s per file for files larger than 4 MiB Statistic RJenkins Linux dcache RJenkins+Prefix Linux+Prefix Min Q Median Q Max As evident from the table the Striper Prefix Hash runs result in 1 per file. 32
33 In into anaccount. ideal balanced Ceph cluster all s store the same amount of data and ects. Figure 5.4 and 5.5 illustrate the number of ects per primary Balance and the size of data per primary respectively. Both measures are similar, but the size plot also takes ects not filled to the maximum into account. 1.2M 1.1M 1.0M 900.0k #Objects 800.0k 700.0k 600.0k 500.0k 400.0k 300.0k RJenkins Linux dcache RJenkins+Prefix Figure 5.4: Number of ect per Linux+Prefix Combined ect size / Byte 4.0TiB 3.8TiB 3.6TiB 3.4TiB 3.2TiB 3.0TiB 2.8TiB 2.6TiB 2.4TiB 2.2TiB 2.0TiB Figure 5.4: Number of ect per RJenkins Linux dcache 33 RJenkins+Prefix Linux+Prefix
34 Racap Ceph Cooling Down Ceph File Implementation and Evaluation Object Stubs Striper Prefix Hashing stub Archive System stub 34
35 Cooling Down Ceph Exploration and Evaluation of Cold Storage Techniques
36 Bonus Slides
37 Striper File, Block Device Image, etc. Blocks Stripes Object Set Objects
Ceph. A complete introduction.
Ceph A complete introduction. Itinerary What is Ceph? What s this CRUSH thing? Components Installation Logical structure Extensions Ceph is An open-source, scalable, high-performance, distributed (parallel,
Ceph. A file system a little bit different. Udo Seidel
Ceph A file system a little bit different Udo Seidel Ceph what? So-called parallel distributed cluster file system Started as part of PhD studies at UCSC Public announcement in 2006 at 7 th OSDI File system
Product Spotlight. A Look at the Future of Storage. Featuring SUSE Enterprise Storage. Where IT perceptions are reality
Where IT perceptions are reality Product Spotlight A Look at the Future of Storage Featuring SUSE Enterprise Storage Document # SPOTLIGHT2013001 v5, January 2015 Copyright 2015 IT Brand Pulse. All rights
Building low cost disk storage with Ceph and OpenStack Swift
Background photo from: http://edelomahony.com/2011/07/25/loving-money-doesnt-bring-you-more/ Building low cost disk storage with Ceph and OpenStack Swift Paweł Woszuk, Maciej Brzeźniak TERENA TF-Storage
Snapshots in Hadoop Distributed File System
Snapshots in Hadoop Distributed File System Sameer Agarwal UC Berkeley Dhruba Borthakur Facebook Inc. Ion Stoica UC Berkeley Abstract The ability to take snapshots is an essential functionality of any
Engineering a NAS box
Engineering a NAS box One common use for Linux these days is as a Network Attached Storage server. In this talk I will discuss some of the challenges facing NAS server design and how these can be met within
ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective
ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 1: Distributed File Systems Finding a needle in Haystack: Facebook
SUSE Enterprise Storage Highly Scalable Software Defined Storage. Gábor Nyers Sales Engineer @SUSE [email protected]
SUSE Enterprise Storage Highly Scalable Software Defined Storage Gábor Nyers Sales Engineer @SUSE [email protected] Setting the Stage Enterprise Data Capacity Utilization 1-3% 15-20% 20-25% Tier 0 Ultra
Google File System. Web and scalability
Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might
A simple object storage system for web applications Dan Pollack AOL
A simple object storage system for web applications Dan Pollack AOL AOL Leading edge web services company AOL s business spans the internet 2 Motivation Most web content is static and shared Traditional
RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters
RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters Sage Weil, Andrew Leung, Scott Brandt, Carlos Maltzahn {sage,aleung,scott,carlosm}@cs.ucsc.edu University of California,
Understanding Storage Virtualization of Infortrend ESVA
Understanding Storage Virtualization of Infortrend ESVA White paper Abstract This white paper introduces different ways of implementing storage virtualization and illustrates how the virtualization technology
HDFS 2015: Past, Present, and Future
Apache: Big Data Europe 2015 HDFS 2015: Past, Present, and Future 9/30/2015 NTT DATA Corporation Akira Ajisaka Copyright 2015 NTT DATA Corporation Self introduction Akira Ajisaka (NTT DATA) Apache Hadoop
Sep 23, 2014. OSBCONF 2014 Cloud backup with Bareos
Sep 23, 2014 OSBCONF 2014 Cloud backup with Bareos OSBCONF 23/09/2014 Content: Who am I Quick overview of Cloud solutions Bareos and Backup/Restore using Cloud Storage Bareos and Backup/Restore of Cloud
Overview of RD Virtualization Host
RD Virtualization Host Page 1 Overview of RD Virtualization Host Remote Desktop Virtualization Host (RD Virtualization Host) is a Remote Desktop Services role service included with Windows Server 2008
Intel Virtual Storage Manager (VSM) 1.0 for Ceph Operations Guide January 2014 Version 0.7.1
Intel Virtual Storage Manager (VSM) 1.0 for Ceph Operations Guide January 2014 Version 0.7.1 Copyright Intel Corporation 2014, 2015 Page 2 Change History 0.31 Barnes - Based on VSM Product Specification
1. Comments on reviews a. Need to avoid just summarizing web page asks you for:
1. Comments on reviews a. Need to avoid just summarizing web page asks you for: i. A one or two sentence summary of the paper ii. A description of the problem they were trying to solve iii. A summary of
Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000
Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000 Clear the way for new business opportunities. Unlock the power of data. Overcoming storage limitations Unpredictable data growth
BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything
BlueArc unified network storage systems 7th TF-Storage Meeting Scale Bigger, Store Smarter, Accelerate Everything BlueArc s Heritage Private Company, founded in 1998 Headquarters in San Jose, CA Highest
BookKeeper overview. Table of contents
by Table of contents 1 BookKeeper overview...2 1.1 BookKeeper introduction... 2 1.2 In slightly more detail...2 1.3 Bookkeeper elements and concepts...3 1.4 Bookkeeper initial design... 3 1.5 Bookkeeper
Understanding Microsoft Storage Spaces
S T O R A G E Understanding Microsoft Storage Spaces A critical look at its key features and value proposition for storage administrators A Microsoft s Storage Spaces solution offers storage administrators
Forschungszentrum Karlsruhe in der Helmholtz-Gemeinschaft. dcache Introduction
dcache Introduction Forschungszentrum Karlsruhe GmbH Institute for Scientific Computing P.O. Box 3640 D-76021 Karlsruhe, Germany Dr. http://www.gridka.de What is dcache? Developed at DESY and FNAL Disk
The Design and Implementation of the Zetta Storage Service. October 27, 2009
The Design and Implementation of the Zetta Storage Service October 27, 2009 Zetta s Mission Simplify Enterprise Storage Zetta delivers enterprise-grade storage as a service for IT professionals needing
Understanding IBM Lotus Domino server clustering
Understanding IBM Lotus Domino server clustering Reetu Sharma Software Engineer, IBM Software Group Pune, India Ranjit Rai Software Engineer IBM Software Group Pune, India August 2009 Copyright International
Best Practices for Increasing Ceph Performance with SSD
Best Practices for Increasing Ceph Performance with SSD Jian Zhang [email protected] Jiangang Duan [email protected] Agenda Introduction Filestore performance on All Flash Array KeyValueStore
HDFS Under the Hood. Sanjay Radia. [email protected] Grid Computing, Hadoop Yahoo Inc.
HDFS Under the Hood Sanjay Radia [email protected] Grid Computing, Hadoop Yahoo Inc. 1 Outline Overview of Hadoop, an open source project Design of HDFS On going work 2 Hadoop Hadoop provides a framework
Testing of several distributed file-system (HadoopFS, CEPH and GlusterFS) for supporting the HEP experiments analisys. Giacinto DONVITO INFN-Bari
Testing of several distributed file-system (HadoopFS, CEPH and GlusterFS) for supporting the HEP experiments analisys. Giacinto DONVITO INFN-Bari 1 Agenda Introduction on the objective of the test activities
OPTIMIZING PRIMARY STORAGE WHITE PAPER FILE ARCHIVING SOLUTIONS FROM QSTAR AND CLOUDIAN
OPTIMIZING PRIMARY STORAGE WHITE PAPER FILE ARCHIVING SOLUTIONS FROM QSTAR AND CLOUDIAN CONTENTS EXECUTIVE SUMMARY The Challenges of Data Growth SOLUTION OVERVIEW 3 SOLUTION COMPONENTS 4 Cloudian HyperStore
The Use of Flash in Large-Scale Storage Systems. [email protected]
The Use of Flash in Large-Scale Storage Systems [email protected] 1 Seagate s Flash! Seagate acquired LSI s Flash Components division May 2014 Selling multiple formats / capacities today Nytro
Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
Cassandra A Decentralized, Structured Storage System
Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922
Understanding Enterprise NAS
Anjan Dave, Principal Storage Engineer LSI Corporation Author: Anjan Dave, Principal Storage Engineer, LSI Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA
Big Data Analytics on Object Storage -- Hadoop over Ceph* Object Storage with SSD Cache
Big Data Analytics on Object Storage -- Hadoop over Ceph* Object Storage with SSD Cache David Cohen ([email protected] ) Yuan Zhou ([email protected]) Jun Sun ([email protected]) Weiting Chen ([email protected])
GPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " 4 April 2013"
GPFS Storage Server Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " Agenda" GPFS Overview" Classical versus GSS I/O Solution" GPFS Storage Server (GSS)" GPFS Native RAID
<Insert Picture Here> Btrfs Filesystem
Btrfs Filesystem Chris Mason Btrfs Goals General purpose filesystem that scales to very large storage Feature focused, providing features other Linux filesystems cannot Administration
EqualLogic PS Series Load Balancers and Tiering, a Look Under the Covers. Keith Swindell Dell Storage Product Planning Manager
EqualLogic PS Series Load Balancers and Tiering, a Look Under the Covers Keith Swindell Dell Storage Product Planning Manager Topics Guiding principles Network load balancing MPIO Capacity load balancing
CS514: Intermediate Course in Computer Systems
: Intermediate Course in Computer Systems Lecture 7: Sept. 19, 2003 Load Balancing Options Sources Lots of graphics and product description courtesy F5 website (www.f5.com) I believe F5 is market leader
From Internet Data Centers to Data Centers in the Cloud
From Internet Data Centers to Data Centers in the Cloud This case study is a short extract from a keynote address given to the Doctoral Symposium at Middleware 2009 by Lucy Cherkasova of HP Research Labs
Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com
Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...
Large Scale Storage. Orlando Richards, Information Services [email protected]. LCFG Users Day, University of Edinburgh 18 th January 2013
Large Scale Storage Orlando Richards, Information Services [email protected] LCFG Users Day, University of Edinburgh 18 th January 2013 Overview My history of storage services What is (and is not)
Data Storage in Clouds
Data Storage in Clouds Jan Stender Zuse Institute Berlin contrail is co-funded by the EC 7th Framework Programme 1 Overview Introduction Motivation Challenges Requirements Cloud Storage Systems XtreemFS
CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT
SS Data & Storage CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT HEPiX Fall 2012 Workshop October 15-19, 2012 Institute of High Energy Physics, Beijing, China SS Outline
Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! [email protected]
Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! [email protected] 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind
Getting performance & scalability on standard platforms, the Object vs Block storage debate. Copyright 2013 MPSTOR LTD. All rights reserved.
Getting performance & scalability on standard platforms, the Object vs Block storage debate 1 December Webinar Session Getting performance & scalability on standard platforms, the Object vs Block storage
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High
COSC 6397 Big Data Analytics. Distributed File Systems (II) Edgar Gabriel Spring 2014. HDFS Basics
COSC 6397 Big Data Analytics Distributed File Systems (II) Edgar Gabriel Spring 2014 HDFS Basics An open-source implementation of Google File System Assume that node failure rate is high Assumes a small
Increase Database Performance by Implementing Cirrus Data Solutions DCS SAN Caching Appliance With the Seagate Nytro Flash Accelerator Card
Implementing Cirrus Data Solutions DCS SAN Caching Appliance With the Seagate Nytro Technology Paper Authored by Rick Stehno, Principal Database Engineer, Seagate Introduction Supporting high transaction
Web-Based Data Backup Solutions
"IMAGINE LOSING ALL YOUR IMPORTANT FILES, IS NOT OF WHAT FILES YOU LOSS BUT THE LOSS IN TIME, MONEY AND EFFORT YOU ARE INVESTED IN" The fact Based on statistics gathered from various sources: 1. 6% of
09'Linux Plumbers Conference
09'Linux Plumbers Conference Data de duplication Mingming Cao IBM Linux Technology Center [email protected] 2009 09 25 Current storage challenges Our world is facing data explosion. Data is growing in a amazing
Hadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1
XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1 The XtreemOS Project Research project funded by the
File Systems for Flash Memories. Marcela Zuluaga Sebastian Isaza Dante Rodriguez
File Systems for Flash Memories Marcela Zuluaga Sebastian Isaza Dante Rodriguez Outline Introduction to Flash Memories Introduction to File Systems File Systems for Flash Memories YAFFS (Yet Another Flash
Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at
Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at distributing load b. QUESTION: What is the context? i. How
Data storage services at CC-IN2P3
Centre de Calcul de l Institut National de Physique Nucléaire et de Physique des Particules Data storage services at CC-IN2P3 Jean-Yves Nief Agenda Hardware: Storage on disk. Storage on tape. Software:
ZooKeeper. Table of contents
by Table of contents 1 ZooKeeper: A Distributed Coordination Service for Distributed Applications... 2 1.1 Design Goals...2 1.2 Data model and the hierarchical namespace...3 1.3 Nodes and ephemeral nodes...
www.basho.com Technical Overview Simple, Scalable, Object Storage Software
www.basho.com Technical Overview Simple, Scalable, Object Storage Software Table of Contents Table of Contents... 1 Introduction & Overview... 1 Architecture... 2 How it Works... 2 APIs and Interfaces...
Storage Class Memory Support in the Windows Operating System Neal Christiansen Principal Development Lead Microsoft nealch@microsoft.
Storage Class Memory Support in the Windows Operating System Neal Christiansen Principal Development Lead Microsoft [email protected] What is Storage Class Memory? Paradigm Shift: A non-volatile storage
Object storage in Cloud Computing and Embedded Processing
Object storage in Cloud Computing and Embedded Processing Jan Jitze Krol Systems Engineer DDN We Accelerate Information Insight DDN is a Leader in Massively Scalable Platforms and Solutions for Big Data
Alternatives to Big Backup
Alternatives to Big Backup Life Cycle Management, Object- Based Storage, and Self- Protecting Storage Systems Presented by: Chris Robertson Solution Architect Cambridge Computer Copyright 2010-2011, Cambridge
StorReduce Technical White Paper Cloud-based Data Deduplication
StorReduce Technical White Paper Cloud-based Data Deduplication See also at storreduce.com/docs StorReduce Quick Start Guide StorReduce FAQ StorReduce Solution Brief, and StorReduce Blog at storreduce.com/blog
Moving Virtual Storage to the Cloud. Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage
Moving Virtual Storage to the Cloud Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage Table of Contents Overview... 1 Understanding the Storage Problem... 1 What Makes
Using SNMP to Obtain Port Counter Statistics During Live Migration of a Virtual Machine. Ronny L. Bull Project Writeup For: CS644 Clarkson University
Using SNMP to Obtain Port Counter Statistics During Live Migration of a Virtual Machine Ronny L. Bull Project Writeup For: CS644 Clarkson University Fall 2012 Abstract If a managed switch is used during
Distributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
The Hadoop Distributed File System
The Hadoop Distributed File System The Hadoop Distributed File System, Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler, Yahoo, 2010 Agenda Topic 1: Introduction Topic 2: Architecture
Finding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems
Finding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems 1 Some Numbers (2010) Over 260 Billion images (20 PB) 65 Billion X 4 different sizes for each image. 1 Billion
Flexible Storage Allocation
Flexible Storage Allocation A. L. Narasimha Reddy Department of Electrical and Computer Engineering Texas A & M University Students: Sukwoo Kang (now at IBM Almaden) John Garrison Outline Big Picture Part
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
Panasas at the RCF. Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory. Robert Petkus Panasas at the RCF
Panasas at the RCF HEPiX at SLAC Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory Centralized File Service Single, facility-wide namespace for files. Uniform, facility-wide
High Availability Databases based on Oracle 10g RAC on Linux
High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN, June 2006 Luca Canali, CERN IT Outline Goals Architecture of an HA DB Service Deployment at the CERN Physics Database
Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution
Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Jonathan Halstuch, COO, RackTop Systems [email protected] Big Data Invasion We hear so much on Big Data and
Hadoop Distributed File System. Dhruba Borthakur June, 2007
Hadoop Distributed File System Dhruba Borthakur June, 2007 Goals of HDFS Very Large Distributed File System 10K nodes, 100 million files, 10 PB Assumes Commodity Hardware Files are replicated to handle
Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, UC Berkeley, Nov 2012
Petabyte Scale Data at Facebook Dhruba Borthakur, Engineer at Facebook, UC Berkeley, Nov 2012 Agenda 1 Types of Data 2 Data Model and API for Facebook Graph Data 3 SLTP (Semi-OLTP) and Analytics data 4
Improving Scalability Of Storage System:Object Storage Using Open Stack Swift
Improving Scalability Of Storage System:Object Storage Using Open Stack Swift G.Kathirvel Karthika 1,R.C.Malathy 2,M.Keerthana 3 1,2,3 Student of Computer Science and Engineering, R.M.K Engineering College,Kavaraipettai.
Tushar Joshi Turtle Networks Ltd
MySQL Database for High Availability Web Applications Tushar Joshi Turtle Networks Ltd www.turtle.net Overview What is High Availability? Web/Network Architecture Applications MySQL Replication MySQL Clustering
Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
Storage Systems Autumn 2009. Chapter 6: Distributed Hash Tables and their Applications André Brinkmann
Storage Systems Autumn 2009 Chapter 6: Distributed Hash Tables and their Applications André Brinkmann Scaling RAID architectures Using traditional RAID architecture does not scale Adding news disk implies
A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems*
A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* Junho Jang, Saeyoung Han, Sungyong Park, and Jihoon Yang Department of Computer Science and Interdisciplinary Program
Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
Cloud and Big Data initiatives. Mark O Connell, EMC
Object storage PRESENTATION systems: TITLE GOES the underpinning HERE of Cloud and Big Data initiatives Mark O Connell, EMC SNIA Legal Notice The material contained in this tutorial is copyrighted by the
Scientific Computing Data Management Visions
Scientific Computing Data Management Visions ELI-Tango Workshop Szeged, 24-25 February 2015 Péter Szász Group Leader Scientific Computing Group ELI-ALPS Scientific Computing Group Responsibilities Data
Chapter 3 Operating-System Structures
Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual
Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR
Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR December, 2012 Peter McGonigal [email protected] Abstract Dynamic Disk Pools (DDP) offer an exciting new approach to traditional RAID sets by substantially
Application Brief: Using Titan for MS SQL
Application Brief: Using Titan for MS Abstract Businesses rely heavily on databases for day-today transactions and for business decision systems. In today s information age, databases form the critical
Monitoring DoubleTake Availability
Monitoring DoubleTake Availability eg Enterprise v6 Restricted Rights Legend The information contained in this document is confidential and subject to change without notice. No part of this document may
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011
BookKeeper Flavio Junqueira Yahoo! Research, Barcelona Hadoop in China 2011 What s BookKeeper? Shared storage for writing fast sequences of byte arrays Data is replicated Writes are striped Many processes
Moving Virtual Storage to the Cloud
Moving Virtual Storage to the Cloud White Paper Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage www.parallels.com Table of Contents Overview... 3 Understanding the Storage
Two Parts. Filesystem Interface. Filesystem design. Interface the user sees. Implementing the interface
File Management Two Parts Filesystem Interface Interface the user sees Organization of the files as seen by the user Operations defined on files Properties that can be read/modified Filesystem design Implementing
COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel I/O (I) I/O basics Fall 2012 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network card 1 Network card
Violin: A Framework for Extensible Block-level Storage
Violin: A Framework for Extensible Block-level Storage Michail Flouris Dept. of Computer Science, University of Toronto, Canada [email protected] Angelos Bilas ICS-FORTH & University of Crete, Greece
POSIX and Object Distributed Storage Systems
1 POSIX and Object Distributed Storage Systems Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph by Michael Poat, Dr. Jerome
CSE-E5430 Scalable Cloud Computing P Lecture 5
CSE-E5430 Scalable Cloud Computing P Lecture 5 Keijo Heljanko Department of Computer Science School of Science Aalto University [email protected] 12.10-2015 1/34 Fault Tolerance Strategies for Storage
21 st Century Storage What s New and What s Changing
21 st Century Storage What s New and What s Changing Randy Kerns Senior Strategist Evaluator Group Overview New technologies in storage - Continued evolution - Each has great economic value - Differing
Choosing Storage Systems
Choosing Storage Systems For MySQL Peter Zaitsev, CEO Percona Percona Live MySQL Conference and Expo 2013 Santa Clara,CA April 25,2013 Why Right Choice for Storage is Important? 2 because Wrong Choice
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 [email protected] San Diego Supercomputer Center
