
The Data Storage Services (DSS) Strategy at CERN
Jakub T. Moscicki
(Input from J. Iven, M. Lamanna, A. Pace, A. Peters and A. Wiebalck)
HEPiX Spring 2012 Workshop, Prague, April 2012

The big picture

Situation today and short-/mid-term

AFS - low-latency, general-purpose file system
- 75 TB, 1E9 files, 30 kHz, 10 ms
- daily backups (6 months)
- global namespace
- evolve and grow x 10
- provide large user workspace, possibly for end-user analysis

CASTOR - long-term bulk storage
- 60 PB (tape), 14 PB (disk), 340M files, 200 Hz, 1 s..24 h
- focus on T0, T0-T1 transfer
- focus on data integrity/durability
- become the archive for physics at CERN

EOS - low-latency storage for data analysis
- 9.3 PB (disk), 17M files, 200 Hz, 20 ms
- data replication for QoS control: reliability, availability, performance
- fix, improve and scale up
- become the disk pool at CERN (storage for the analysis facility)

TSM - high-latency, pure backup
- ~5 PB (tape)
- extreme number of files (1E12)
- follow and adapt to tape technology

Other storage providers: CERNVMFS, DBs, NetApps, Experiments, ...

Storage Services at a glance: Fast & Reliable, Reliable & Cheap, Fast & Cheap

HW Technology

Disk
- JBOD, NL-SAS storage (inexpensive) + SSD caches
- scale up but maintain service quality
- see next talk on AFS

Tape
- try multiple vendors (SpectraLogic), avoid/reduce vendor lock-in
- lots of performance/storage headroom in tape technology: LTO, faster drives, buffered tape marks, ...
- see next talks on TSM and CASTOR

[Chart: total volume on TSM tape]

CASTOR Tape

Ensure we can read back the data
- check all data at least once every 2 years
- ~64 PB verified so far
- unrecoverable data: 37.2 GB (62 ppm)

[Chart: data written to tape in 2011, in PB]

- 60 PB of data, 208M files on tape
- avg. file size 260 MB, 120 tape drives
- peak writing speed: 6.9 GiB/s (Heavy Ion run, 2011)
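As a rough sanity check (illustrative, not from the slide), scanning the full archive at least once every two years implies a sustained read rate of about 1 GB/s:

```python
# Back-of-the-envelope check: what sustained read rate does
# "verify all data at least once every 2 years" imply?
total_bytes = 60e15                 # ~60 PB of data on tape
period_s = 2 * 365 * 24 * 3600      # two years in seconds
rate_gbs = total_bytes / period_s / 1e9
print(f"required sustained scan rate: {rate_gbs:.2f} GB/s")   # -> ~0.95 GB/s
```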

Areas of investigation

Appliances (cloud storage): Huawei (CERN openlab)
- reliable key-value storage: S3, Linux NBD, FUSE, ...
- test h/w: 768 TB / 384 nodes, 2.5 racks
- reduce total cost of ownership? just power off disks, replace the whole storage unit after 3 years, intervention-less operation

SWIFT/OpenStack
- a drop-in replacement for S3
- private/public cloud federation on demand?

Role of reduced-feature protocols/interfaces in physics?
- S3: large-scale stress-tests with experiment apps (see the sketch below)
- is it really needed / would it really work?
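As an illustration of the kind of reduced-feature interface under evaluation, the sketch below issues a basic S3 PUT/GET pair with the boto3 client; the endpoint, bucket, key names and credentials are placeholders, not CERN values.

```python
import boto3

# Point a standard S3 client at a (hypothetical) private S3-compatible endpoint.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.cern.ch",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Write one object and read it back: the basic unit of a PUT/GET stress test.
s3.put_object(Bucket="test-bucket", Key="events/run001.dat", Body=b"\x00" * 1024)
obj = s3.get_object(Bucket="test-bucket", Key="events/run001.dat")
payload = obj["Body"].read()
print(len(payload), "bytes read back")
```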

Physics storage model and...

...its evolution

End of HSM
- specialized storage pools, data placement managed by the experiments' frameworks
- split: CASTOR (T0, TAPE) and EOS (Analysis Facility, DISK)

Analysis facility disk pool

EOS is the key to our strategy for analysis
- although new AFS workspaces may also provide an alternative for end-user analysis

EOS comes with the features required for a very large Analysis Facility
- integrated with a protocol popular in the experiments (xrootd); see the access sketch below
- fast metadata access (10x compared to CASTOR): few-ms read/write/open latency
- designed for minimal operation cost and scalability
- data replicated at the file level across independent storage units (JBODs)
- automatic rebalancing of data replicas on hardware removal/addition (see example later)
- lost replicas are automatically recreated by the system, with no loss of service for the client
- operations simplified: a disk or machine can be powered off at any time without loss of service (service availability)
- known issue: in-memory metadata server (not a problem for now, on the to-do list)

EOS is freely available (GPL) but not packaged/supported off-the-shelf by us
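For reference, a minimal sketch of reading a file from EOS over the xrootd protocol using the XRootD Python bindings; the host name and path below are placeholders, not a real CERN instance.

```python
from XRootD import client
from XRootD.client.flags import OpenFlags

# Open a remote file via the xrootd redirector (placeholder URL).
f = client.File()
status, _ = f.open("root://eos.example.cern.ch//eos/project/user/file.root",
                   OpenFlags.READ)
if not status.ok:
    raise IOError(status.message)

# Read the first kilobyte and report how much came back.
status, data = f.read(offset=0, size=1024)
print(len(data), "bytes read")
f.close()
```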

EOS: Current Status

[Table: usable capacity per EOS instance]

Rebalancing example

February 2012: automatic rebalancing in EOSCMS. 1.6 PB added and rebalanced internally in 1 week (avg. ~2.7 GB/s), disk usage under 5% (~2,800 disks).
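A quick arithmetic check (not from the slide) confirms that 1.6 PB moved in one week matches the quoted average rate:

```python
# Average internal rebalancing rate implied by the quoted numbers.
volume_bytes = 1.6e15            # 1.6 PB moved
duration_s = 7 * 24 * 3600       # one week in seconds
print(f"{volume_bytes / duration_s / 1e9:.2f} GB/s")   # -> ~2.65 GB/s
```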

Future: overhead, reliability, performance

Future: m+n block replication
- from any set of m chunks chosen among the m+n, you can reconstruct the missing n chunks
- chunks stored on independent storage units (disks/servers)
- different algorithms possible: double/triple parity, LDPC, Reed-Solomon (space-optimal, at the theoretical limit), ...

This will make it possible to define/adjust quality-of-service parameters per file container (directory)
- tune storage overhead vs reliability vs performance
- simple replication: 3 copies, 200% overhead, 3x streaming performance
- 10+3 block replication: can lose any 3 chunks, the remaining 10 are enough to reconstruct, 30% storage overhead
- more possibilities: different QoS classes (see the overhead sketch below)

This will be phased into production carefully, adding block-replicas to existing file-replicas first
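The overhead figures quoted above follow directly from the layout parameters; a small sketch of the trade-off (illustrative only, not EOS code):

```python
# Compare storage overhead and tolerated chunk losses for the layouts above.
def overhead(data_chunks: int, parity_chunks: int) -> float:
    """Extra raw capacity used, as a fraction of the stored data."""
    return parity_chunks / data_chunks

layouts = {
    "3 replicas":             (1, 2),    # 1 data chunk + 2 extra copies
    "10+3 Reed-Solomon-like": (10, 3),   # any 10 of 13 chunks reconstruct the file
}

for name, (m, n) in layouts.items():
    print(f"{name}: survives {n} lost chunks, {overhead(m, n):.0%} overhead")
# -> 3 replicas: survives 2 lost chunks, 200% overhead
# -> 10+3 Reed-Solomon-like: survives 3 lost chunks, 30% overhead
```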

Summary

We expect all major systems to be there in 2015... and much improved!
- AFS, CASTOR, EOS, TSM

Paradigm shift at CERN is a fact
- analysis disk pools separated from the archive pools (no HSM)

Evolution is constrained
- many external drivers influence the strategy
- large data volumes create inertia
- rich feature sets create inertia

Cloud storage: we are ready
- no pressure from the users (yet?)
- price tag not there yet (?)