Research Data Storage Infrastructure (RDSI) Project. DaSh Straw-Man


1 Research Data Storage Infrastructure (RDSI) Project DaSh Straw-Man

2 Recap from the Node Workshop (Cherry-picked)
* Higher Tiered DCs cost roughly twice as much as Lower Tiered DCs.
* However, a robust, Higher-Tier-like service can still be provided:
  * using co-operating Lower Tiered DCs,
  * with distributed and/or replicated mechanisms;
  * if a service (partially) fails, another DC can temporarily provide it;
  * if a DC fails, other DCs can provide its services temporarily.
* Loss of service is pardonable. Loss of data is unforgivable.
* Need to provide concrete assurances to the end user.

3 What's DaSh all about?
* Developing sufficient elements of potential technical architectures for data interoperability and sharing,
  * so that they can be appropriately specified in the call-for-nodes proposal.
* A mile-high view of the technical architectures needed to get data into and out of the RDSI node(s).
* Ensure (meta)data durability and curation.
  * Loss of (meta)data is a capital offence.
* Ensure data scalability.
  * Storage capacity; moving data into and out of the node(s).
* Ensure end-user usability.
  * Provide a good end-user experience.
* The DaSh straw-man seeks community opinion on the various possible architectures.

4 Building Blocks (diagram)
* Re-exported FS: NFS, CIFS, WebDAV, FUSE.
* HSM, tiers, storage classes; protocol negotiation: SRM.
* Wide-area transfers: gsiftp, https, dcap, DPM, xrootd.
* Clouds: REST, S3. GRIDs.

5 iRODS and Federation
* Federation is a feature whereby separate iRODS Zones (iRODS instances) can be integrated.
  * When zones 'A' and 'B' are federated, they work together.
  * Each zone continues to be separately administered.
  * Users in the federated zones, if given permission, can access data and metadata in the other zones.
  * No user passwords are exchanged.
  * Zone admins set up trust relationships to other zones.

6 ARCS Data Fabric (diagram): a single iCAT hosted on the NeCTAR NSP, fronting seven iRODS servers, four of them with tape back-ends.

7 Node's Eye View (N=6). No Federation.

8 Node's Eye View (N=6). Too much Federation. Too much confusion!

9 Node's Eye View (N=6). Just-right Federation (diagram): one Master iCAT with six Slave iCATs.

10 Distributed vs Federated (diagram): a Distributed Fault-Tolerant Parallel FS spanning the N=6 nodes, behind the same building blocks as before (re-exported FS via NFS, CIFS, WebDAV, FUSE; HSM, tiers, storage classes; SRM protocol negotiation; wide-area transfers via gsiftp, https, dcap, DPM, xrootd; REST/S3 clouds; GRIDs).

11 Distributed Pros and Cons
* Distributed over a larger number of nodes.
  * Geographic scaling as well as node scaling.
  * Inherent data replication.
* Fault tolerant.
  * A storage brick takes a lickin' but the service keeps on tickin'.
  * A node takes a lickin' but the service keeps on tickin'.
* Parallel I/O.
  * All nodes can participate to move data. High aggregate bandwidth.
* Single global namespace.
  * Rather than separate logical namespaces.
* Cost effective.
  * Use cheap hardware. Big disks over fast disks.
  * Design to expect failures.

12 File Replication
* Whole file
  * Duplicated and stored on multiple bricks.
* Slices of a file
  * File sliced and diced; slices stored on multiple bricks.
  * A single brick may not contain the whole file.
* Erasure codes
  * Parity blocks (as used in RAID).
  * Reed-Solomon: an over-sampled polynomial constructed from the data.
* Add erasure codes and slice the file
  * Need any M of N pieces to recover the file (M < N).
  * Can store a slice on multiple bricks for extra redundancy.
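The M-of-N idea above can be sketched with the simplest possible erasure code: one XOR parity slice, so M = N - 1 and any single lost brick is recoverable (Reed-Solomon generalises this to arbitrary M). This is an illustrative toy, not any particular product's scheme; all names here are invented:

```python
import functools

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, m: int) -> list[bytes]:
    """Split data into m data slices plus one XOR parity slice (N = m + 1 bricks)."""
    size = -(-len(data) // m)                      # ceil(len / m)
    padded = data.ljust(m * size, b"\0")
    slices = [padded[i * size:(i + 1) * size] for i in range(m)]
    parity = functools.reduce(xor_bytes, slices)
    return slices + [parity]

def recover(bricks: dict[int, bytes], m: int, orig_len: int) -> bytes:
    """Rebuild the file from any m of the m + 1 bricks."""
    assert len(bricks) >= m, "need at least M pieces"
    missing = [i for i in range(m + 1) if i not in bricks]
    if missing and missing[0] < m:                 # a data slice is gone:
        rebuilt = functools.reduce(xor_bytes, bricks.values())
        bricks = {**bricks, missing[0]: rebuilt}   # XOR of the survivors restores it
    return b"".join(bricks[i] for i in range(m))[:orig_len]

data = b"Loss of data is unforgivable."
bricks = encode(data, m=3)                         # 4 bricks, one per storage node
survivors = {i: b for i, b in enumerate(bricks) if i != 1}   # brick 1 failed
assert recover(survivors, m=3, orig_len=len(data)) == data
```

With real Reed-Solomon the parity overhead and the number of tolerated failures are both tunable, which is why the slide's "M of N" formulation matters at RDSI scale.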

13 SURFnet Survey of Wide Area Distributed Storage (Circa 2010) [1/4] Requirements:
* Scalable.
  * Capacity, performance and concurrent access.
  * Expandable storage without degrading performance.
* High availability.
  * Keeps data available to apps and clients,
  * even in the event of a malfunction
  * or a system reconfiguration.
  * Needs to replicate data to multiple locations.

14 SURFnet Survey of Wide Area Distributed Storage (Circa 2010) [2/4]
* Durability.
  * No data is lost from a single software or hardware failure.
  * Automatically maintain a minimum number of replicas.
  * Support backup to tape.
* Performance at traditional SAN/NAS level.
  * Comparable performance to a traditional non-distributed SAN/NAS.
* Dynamic operation.
  * Availability, durability and performance configurable per application.
  * Reduce costs by not running at the highest support level all the time.
  * Allow users, apps and sysadmins to balance cost vs features.
  * The system should be self-configurable and self-tunable.
  * Support data movement between different storage technologies.
  * Tiered functionality; classes of storage.

15 SURFnet Survey of Wide Area Distributed Storage (Circa 2010) [3/4]
* Cost effective.
  * Must be possible to build, configure, run and maintain in a cost-effective manner.
  * Must work with commodity hardware,
    * which may not be as reliable as high-end hardware.
  * Configuration of the system and its maintenance must be easy and straightforward.
  * Operation of the system is energy efficient.
  * Licence fees for software, where applicable, must be limited.
* Generic interfaces.
  * The system offers generic interfaces to apps and clients:
    * POSIX interface; POSIX/NFSv4.1 semantics.
    * Block device (iSCSI, etc.).

16 SURFnet Survey of Wide Area Distributed Storage (Circa 2010) [4/4]
* Protocols based on open standards.
  * System built using open protocols.
  * Reduces vendor lock-in.
  * More economical in the long run.
* Multi-party access.
  * The system must support access by multiple geographically dispersed parties at the same time.
  * Promotes collaboration between these parties.

17 SURFnet Survey of Wide Area Distributed Storage (Circa 2010)
* Candidates: Lustre, GlusterFS, GPFS, Ceph (+ dCache).
* Non-candidates: XtreemFS, MogileFS, NFSv4.1 (pNFS), ZFS, VERITAS FS, Parascale, CAStor, Tahoe-LAFS, DRBD.

18 Nordic DataGrid Facility (dCache)

19 The DEISA Global File System at European Scale (Multi-Cluster General Parallel File System)

20 TeraGrid (GPFS & Lustre)

21 SURFnet Survey of Wide Area Distributed Storage + dCache
* Owner — Lustre: Oracle; GlusterFS: Gluster; GPFS: IBM; Ceph: Newdream; dCache: dcache.org.
* Licence — Lustre: GNU GPL; GlusterFS: GNU GPL; GPFS: commercial; Ceph: GNU GPL; dCache: DESY.
* Data primitive — Lustre: object (file); GlusterFS: object (file); GPFS: block; Ceph: object (file); dCache: object (file).
* Data placement — Lustre: round robin + free-space heuristics; GlusterFS: different strategies via modules; GPFS: distribute over storage servers; Ceph: placement groups, random mappings; dCache: policy based.
* Metadata — Lustre: max 2 metadata servers; GlusterFS: stored with file; GPFS: unknown; Ceph: multiple metadata servers; dCache: pnfs (postgresql).
* Storage tiers — Lustre: pools of object targets; GlusterFS: policy based; GPFS: policy defined; Ceph: CRUSH rules; dCache: policy defined.

22 SURFnet Survey of Wide Area Distributed Storage + dCache (cont.)
* Failure handling — Lustre: assumes reliable nodes; GlusterFS: assumes unreliable nodes; GPFS: assumes reliable nodes, failure groups; Ceph: assumes unreliable nodes; dCache: assumes reliable nodes.
* Replication — Lustre: server side (failover pairs); GlusterFS: client side; GPFS: server side; Ceph: server side; dCache: server side.
* WAN deployment example — Lustre: TeraGrid; GlusterFS: City Cloud (Swedish IaaS provider); GPFS: TeraGrid, DEISA; Ceph: unknown; dCache: Fermilab, Swegrid, NDGF.
* Client interface — Lustre: native client, FUSE, CIFS, NFS; GlusterFS: native client, FUSE; GPFS: native client, exports NFSv3, CIFS, pcifs, WebDAV, SRM (StoRM); Ceph: native client, FUSE; dCache: NFSv4.1, HTTP, WebDAV, GridFTP, Xrootd, SRM, dcap.
* Node types — Lustre: clients, metadata, objects; GlusterFS: client, data; GPFS: client, data; Ceph: clients, metadata, objects; dCache: clients, metadata, objects.

23 WAN Data Caching and Performance. Bringing data closer to where it is consumed.
* Researchers are naturally distributed over the city and country.
* Some may not benefit from the high-speed networks provided by AARNet and the NRN due to their location.
* Can RDSI help these spatially disenfranchised? Yes (sort of).
* Take the model of Content Delivery Networks,
  * i.e. Akamai, Amazon CloudFront, etc.
  * Web content, videos etc. are cached close to the end user.
* But focus on data caching rather than content caching.
* This may not provide the same experience as the spatially franchised enjoy,
  * but every bit helps!

24 WAN Data Caching with GPFS.

25 WAN Data Caching Continued
* dCache is a distributed cache system.
  * Locate a dCache pool close to the spatially disenfranchised.
  * The dCache admin can push the required data collections out to them using standard SRM processes.
  * Potentially a (reasonably) fast parallel transfer.
* BioTorrents
  * Allows scientists to rapidly share their results, datasets and software using the popular BitTorrent file-sharing technology.
  * All data is open-access, and illegal file sharing is not allowed on BioTorrents.
  * Alternatively, RDSI can run BitTorrent seeders from its own nodes.
  * Ignoring the bad press, BitTorrent is very good at what it does.

26 Data Durability. Things that go bump in the night (or not!)
* Data durability is an absolute necessity.
* RDSI must provide a safe and enduring home for research data.
  * This is more difficult than it appears!
* The enemy is physics.
  * The world is a complex quantum/probabilistic system,
  * and so is all of your computing and storage infrastructure.
* Random events in your infrastructure will create bit rot and silent corruptions.
* But you can engineer around the laws of physics.

27 Data Durability. Sources of bit rot and silent corruptions (diagram).
Every layer of the I/O path contributes: user space, VM memory, filesystems, the block layer, the SCSI layer, low-level drivers, controller and storage firmware, disk mechanics and the physical magnetic media, plus all interconnecting cables. Failure modes include ECC errors, flipped bits (cosmic rays/sun spots, EM radiation, etc.), bugs in firmware, inter-op issues, wear-out, latent sector errors, corrupted data and metadata, and lost, torn or misdirected writes. From Silent Corruptions, Peter Kelemen, CERN.

28 Data Durability. Expected background Bit Error Rate (BER)
* NIC/Link/HBA: 1 bit in ~1.1 GB.
  * Check-summed; retransmit if necessary.
* Memory: 1 bit in ~116 GB.
  * ECC.
* SATA disk: 1 bit in ~11.3 TB.
  * Various error-correction codes.
* Enterprise disk: 1 bit in ~113 TB.
  * Various error-correction codes.
* Tape: 1 bit in ~1.11 PB.
  * Various error-correction codes.
* Data may be encoded five or more times as it travels between user space and physical disk/tape.
* At petascale, incredibly infrequent events happen all the time.
From Silent Corruptions, Peter Kelemen, CERN.
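To make "infrequent events happen all the time" concrete, a back-of-envelope calculation of the raw (pre-correction) bit errors each layer would see per petabyte moved, using the rates quoted on this slide (the layer names and loop are just illustrative scaffolding):

```python
# Bytes of traffic per expected raw bit error, from the rates quoted above.
RATES_BYTES_PER_BIT_ERROR = {
    "NIC/Link/HBA": 1.1e9,        # 1 bit in ~1.1 GB
    "Memory": 116e9,              # 1 bit in ~116 GB
    "SATA disk": 11.3e12,         # 1 bit in ~11.3 TB
    "Enterprise disk": 113e12,    # 1 bit in ~113 TB
    "Tape": 1.11e15,              # 1 bit in ~1.11 PB
}

def expected_bit_errors(bytes_moved: float, bytes_per_error: float) -> float:
    """Expected raw flipped bits before the layer's own correction kicks in."""
    return bytes_moved / bytes_per_error

PB = 1e15
for layer, rate in RATES_BYTES_PER_BIT_ERROR.items():
    errs = expected_bit_errors(PB, rate)
    print(f"{layer:16s} ~{errs:12,.0f} raw bit errors per PB moved")
```

Even the best layer (tape) still sees roughly one raw flipped bit per petabyte, so at the multi-petabyte scale RDSI targets, corruption is a statistical certainty rather than a freak event.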

29 Data Durability. The errors you know. The errors you don't know.
There are known errors; there are errors we know we know. We also know there are known unknown errors; that is to say, we know there are some things we do not know. But there are also unknown unknown errors; the ones we don't know we don't know. Paraphrased from Donald Rumsfeld. From Silent Corruptions, Peter Kelemen, CERN.

30 Data Durability. The errors you know. The errors you don't know.
* There are data errors that you will know about.
  * Log messages.
  * SMART messages.
  * Detection: SW/HW level, with error messages.
  * Correction: SW/HW level, with warnings.
  * If you're really lucky your kernel will panic, so you'll know something happened.
* There are data errors that you will never know about.
  * As far as your storage infrastructure knows, that write/read was executed perfectly.
  * In reality you will probably never know the data has been corrupted
  * (unless you design for this eventuality).
From Silent Corruptions, Peter Kelemen, CERN.

31 Data Durability. How to discover the unknown unknowns.
* Checksums (CRC32, MD5, SHA1, ...).
  * Checksum the (meta)data.
  * Transport the checksum with the (meta)data for later comparison.
* Error detection and correction codings.
  * Detect errors caused by noise, etc. (see checksums).
  * Correct detected errors and reconstruct the original, error-free data.
  * Backward error correction: automatic retransmit on error detection.
  * Forward error correction: encode extra redundant data; regenerate data from the forward error codes.
* Multiple copies with quorum.
From Silent Corruptions, Peter Kelemen, CERN.
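A minimal sketch of "transport the checksum with the (meta)data for later comparison"; the field and function names are illustrative, not from any RDSI component:

```python
import hashlib

def ingest(data: bytes, metadata: dict) -> dict:
    """At write time, store a checksum alongside the object's metadata."""
    return {**metadata, "sha1": hashlib.sha1(data).hexdigest()}

def verify(data: bytes, metadata: dict) -> bool:
    """On every read, recompute the checksum and compare with the stored one."""
    return hashlib.sha1(data).hexdigest() == metadata["sha1"]

record = ingest(b"survey results", {"owner": "rdsi"})
assert verify(b"survey results", record)       # clean read passes
assert not verify(b"survey resu1ts", record)   # a single flipped byte is caught
```

Note this only *detects* the unknown unknowns; correction then comes from a retransmit, a second replica, or forward error codes, as the bullets above describe.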

32 Data Durability. Silent corruptions and CERN
* Scale: petabytes of tape, 4 PB of disk across thousands of drives, 1200 RAID arrays.
* Probabilistic storage integrity check (fsprobe) on 4000 nodes:
  * write a known bit pattern,
  * read it back,
  * compare, and alert when a mismatch is found.
* 6 cycles over 1 hour each.
* Low I/O footprint for background operation, on a 2 GB file.
* Keep complexity to the minimum.
  * Use static buffers.
* Attempt to preserve details about detected corruptions for further analysis.
From Silent Corruptions, Peter Kelemen, CERN.

33 Data Durability. Silent corruptions and CERN
* 2000 incidents reported over 97 PB of traffic.
  * 6/day observed on average!
  * 192 MB of silently corrupted data.
* 320 nodes affected, over 27 hardware types.
* Multiple types of corruption.
* Some corruptions are transient.
* Overall BER, considering all the links in the chain: ~3x10^-7.
  * Not the spec'd rates.
From Silent Corruptions, Peter Kelemen, CERN.

34 Data Durability. Types of silent corruption
* Type I
  * Single/double bit-flip errors. Usually persistent.
  * Usually bad memory (RAM, cache, etc.).
  * Happens with expensive ECC memory too.
* Type II
  * Small, 2^n-sized random chunks ( bytes) of unknown origin.
  * Usually transient.
  * Possibly the OOM killer or a corrupted SLAB/SLUB allocator.
* Type III
  * Multiple large chunks of 64K of old file data; I/O command timeouts.
  * Usually persistent.
* Type IV
  * Various-sized chunks of zeros.
From Silent Corruptions, Peter Kelemen, CERN.

35 Data Durability. What can be done?
* Self-examining/healing hardware.
* WRITE-READ cycles before ACK.
* Check-summing, though not necessarily enough on its own.
* End-to-end check-summing.
* Store multiple copies.
* Regular scrubbing of RAID arrays.
* Data refresh: re-read cycles on tapes.
* Generally, accept and prepare for corruptions.
From Silent Corruptions, Peter Kelemen, CERN.

36 Data Durability. The solutions. ZFS: the Good.
* Developed by Sun (now Oracle) on Solaris.
* Designed from the ground up with a focus on data integrity.
* Combined filesystem and logical volume manager.
* RAID-Z, RAID-Z2, RAID-Z3, or mirrored.
* Copy-on-write; transactional operation.
* Built-in end-to-end data integrity.
  * Data/metadata checksums all the way to the root.
* Always consistent on disk: no fsck or journaling.
* Automatic self-healing.
* Intelligent online scrubbing and resilvering.
* Very large filesystem limits: max. 256 ZB per FS.
* Deduplication. Snapshots. And much, much more.
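"Checksums all the way to the root" is a Merkle-tree idea: each parent block stores the checksums of its children, so one trusted root hash vouches for every byte below it. A sketch of the principle only (this is an illustration, not ZFS's actual on-disk format):

```python
import hashlib

def sha(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

def root_hash(blocks: list[bytes]) -> str:
    """Hash the data blocks, then hash pairs of hashes upward to a single root."""
    level = [sha(b) for b in blocks]
    while len(level) > 1:
        pairs = [level[i] + (level[i + 1] if i + 1 < len(level) else "")
                 for i in range(0, len(level), 2)]
        level = [sha(p.encode()) for p in pairs]
    return level[0]

blocks = [b"data block 0", b"data block 1", b"data block 2"]
trusted_root = root_hash(blocks)
blocks[1] = b"data blOck 1"                 # a silently flipped bit...
assert root_hash(blocks) != trusted_root    # ...changes the root, so it is caught
```

On a verification mismatch, ZFS can then self-heal by re-reading the block from a mirror or reconstructing it from RAID-Z parity.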

37 Data Durability. The solutions. ZFS: the Bad.
* Fully supported on Solaris only.
  * OpenSolaris is no more.
* Kernel ports for FreeBSD and NetBSD,
  * using OpenSolaris kernel source code.
* Linux port via ZFS-FUSE.
  * Kernel space good; user space not so good.
* ZFS on Linux.
  * Supported by Lawrence Livermore National Laboratory.
  * Issues with CDDL and GPL licence compatibility in the kernel.
  * The Solaris Portability Layer/shim to the rescue.
  * Currently v0.6.0-rc4. It worked for me, but it is not production grade yet.

38 Data Durability. The solutions. ZFS for Lustre.
* 1999: Peter Braam of CMU creates Lustre,
  * a GPL massively parallel distributed file system.
* 2003: Braam creates Cluster File Systems Inc to continue the work.
* 2007: Sun acquires Cluster File Systems Inc,
  * and works to combine ZFS and Lustre:
  * a high-performance parallel FS with end-to-end data integrity,
  * but only supported on Solaris.
* 2009: LLNL starts porting the ZFS kernel code to Linux. Oracle acquires Sun.
* 2010: Oracle announces ZFS/Lustre for Solaris only.
* 2011: LLNL starts the ZFS/Lustre port for Linux.
* Late 2011: LLNL plans a ZFS/Lustre FS:
  * 50 PB, with 512 GB/s to 1 TB/s of bandwidth.

39 Data Durability. The solutions. DataDirect Networks S2A technology.
* SATA storage with:
  * enterprise-class performance,
  * reliability and data integrity,
  * automatic self-healing:
    * detects anomalies and begins journaling all writes while recovering operations.
* Dynamic MAID (D-MAID).
  * Saves additional power and cooling by powering down the platters,
  * where over 80% of the power is consumed.
  * DC friendly.

40 Community Input Time.
* Are we barking up the right tree?
* Are we barking up the wrong tree?
* Is there even a tree in the first place?
* You decide.

41 Building Blocks
* Are the base building blocks sufficient?
  * If not, what should be added?
* Is there a need for additional data transfer protocols?
  * If so, what should be added?
* Is there a need for additional file system protocols?
  * If so, what should be added?
* What additional public cloud storage infrastructure should RDSI consider?
* What additional private cloud storage infrastructure should RDSI consider?

42 Federated vs Distributed.
* Should RDSI continue to embrace the federated iRODS model?
* Should RDSI embrace the distributed FS model?
* Should RDSI embrace both the federated and the distributed model?

43 Distributed Fault-Tolerant Parallel Filesystems.
* If RDSI chooses to use a distributed fault-tolerant parallel filesystem component, are there such systems that we have not yet considered?

44 WAN Data Caching
There are always going to be researchers who may not be able to benefit from the high-speed networks provided by AARNet and the NRN. WAN data caching may partially eliminate their disadvantage, but at a cost.
* Should RDSI consider the use of WAN data caches?
* If so, what sites would benefit from these data caches?

45 Data Durability.
Data durability is one of the foremost challenges for RDSI. However, it seems impossible to entirely eliminate the various issues of bit rot and silent corruption.
* Given this fact of nature, what level of data durability is the research community willing to accept?

Reference: Survey of Technologies for Wide Area Distributed Storage. GigaPort3 project, 2010. Project manager: Rogier Spoor; authors: Arjan Peddemors, Christiaan. Completion date 2010-06-29, version 1.0.


More information

The Design and Implementation of the Zetta Storage Service. October 27, 2009

The Design and Implementation of the Zetta Storage Service. October 27, 2009 The Design and Implementation of the Zetta Storage Service October 27, 2009 Zetta s Mission Simplify Enterprise Storage Zetta delivers enterprise-grade storage as a service for IT professionals needing

More information

Michael Thomas, Dorian Kcira California Institute of Technology. CMS Offline & Computing Week

Michael Thomas, Dorian Kcira California Institute of Technology. CMS Offline & Computing Week Michael Thomas, Dorian Kcira California Institute of Technology CMS Offline & Computing Week San Diego, April 20-24 th 2009 Map-Reduce plus the HDFS filesystem implemented in java Map-Reduce is a highly

More information

Sawmill Log Analyzer Best Practices!! Page 1 of 6. Sawmill Log Analyzer Best Practices

Sawmill Log Analyzer Best Practices!! Page 1 of 6. Sawmill Log Analyzer Best Practices Sawmill Log Analyzer Best Practices!! Page 1 of 6 Sawmill Log Analyzer Best Practices! Sawmill Log Analyzer Best Practices!! Page 2 of 6 This document describes best practices for the Sawmill universal

More information

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance

More information

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems Fault Tolerance & Reliability CDA 5140 Chapter 3 RAID & Sample Commercial FT Systems - basic concept in these, as with codes, is redundancy to allow system to continue operation even if some components

More information

Linux Powered Storage:

Linux Powered Storage: Linux Powered Storage: Building a Storage Server with Linux Architect & Senior Manager [email protected] June 6, 2012 1 Linux Based Systems are Everywhere Used as the base for commercial appliances Enterprise

More information

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System CS341: Operating System Lect 36: 1 st Nov 2014 Dr. A. Sahu Dept of Comp. Sc. & Engg. Indian Institute of Technology Guwahati File System & Device Drive Mass Storage Disk Structure Disk Arm Scheduling RAID

More information

Scalable filesystems boosting Linux storage solutions

Scalable filesystems boosting Linux storage solutions Scalable filesystems boosting Linux storage solutions Daniel Kobras science + computing ag IT-Dienstleistungen und Software für anspruchsvolle Rechnernetze Tübingen München Berlin Düsseldorf Motivation

More information

Scientific Storage at FNAL. Gerard Bernabeu Altayo Dmitry Litvintsev Gene Oleynik 14/10/2015

Scientific Storage at FNAL. Gerard Bernabeu Altayo Dmitry Litvintsev Gene Oleynik 14/10/2015 Scientific Storage at FNAL Gerard Bernabeu Altayo Dmitry Litvintsev Gene Oleynik 14/10/2015 Index - Storage use cases - Bluearc - Lustre - EOS - dcache disk only - dcache+enstore Data distribution by solution

More information

Home storage and backup options. Chris Moates Head of Lettuce

Home storage and backup options. Chris Moates Head of Lettuce Home storage and backup options Chris Moates Head of Lettuce Who I Am Lead Systems Architect/Administrator for Gaggle Previously employed as Staff Engineer by EarthLink/MindSpring Linux hobbyist since

More information

The Panasas Parallel Storage Cluster. Acknowledgement: Some of the material presented is under copyright by Panasas Inc.

The Panasas Parallel Storage Cluster. Acknowledgement: Some of the material presented is under copyright by Panasas Inc. The Panasas Parallel Storage Cluster What Is It? What Is The Panasas ActiveScale Storage Cluster A complete hardware and software storage solution Implements An Asynchronous, Parallel, Object-based, POSIX

More information

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group DSS High performance storage pools for LHC Łukasz Janyst on behalf of the CERN IT-DSS group CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Introduction The goal of EOS is to provide a

More information

Google File System. Web and scalability

Google File System. Web and scalability Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might

More information

Big Data Storage Options for Hadoop Sam Fineberg, HP Storage

Big Data Storage Options for Hadoop Sam Fineberg, HP Storage Sam Fineberg, HP Storage SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations

More information

Long term retention and archiving the challenges and the solution

Long term retention and archiving the challenges and the solution Long term retention and archiving the challenges and the solution NAME: Yoel Ben-Ari TITLE: VP Business Development, GH Israel 1 Archive Before Backup EMC recommended practice 2 1 Backup/recovery process

More information

Scala Storage Scale-Out Clustered Storage White Paper

Scala Storage Scale-Out Clustered Storage White Paper White Paper Scala Storage Scale-Out Clustered Storage White Paper Chapter 1 Introduction... 3 Capacity - Explosive Growth of Unstructured Data... 3 Performance - Cluster Computing... 3 Chapter 2 Current

More information

Netapp @ 10th TF-Storage Meeting

Netapp @ 10th TF-Storage Meeting Netapp @ 10th TF-Storage Meeting Wojciech Janusz, Netapp Poland Bogusz Błaszkiewicz, Netapp Poland Ljubljana, 2012.02.20 Agenda Data Ontap Cluster-Mode pnfs E-Series NetApp Confidential - Internal Use

More information

The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc.

The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc. The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms Abhijith Shenoy Engineer, Hedvig Inc. @hedviginc The need for new architectures Business innovation Time-to-market

More information

Hadoop & its Usage at Facebook

Hadoop & its Usage at Facebook Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System [email protected] Presented at the Storage Developer Conference, Santa Clara September 15, 2009 Outline Introduction

More information

General Parallel File System (GPFS) Native RAID For 100,000-Disk Petascale Systems

General Parallel File System (GPFS) Native RAID For 100,000-Disk Petascale Systems General Parallel File System (GPFS) Native RAID For 100,000-Disk Petascale Systems Veera Deenadhayalan IBM Almaden Research Center 2011 IBM Corporation Hard Disk Rates Are Lagging There have been recent

More information

Large Scale Storage. Orlando Richards, Information Services [email protected]. LCFG Users Day, University of Edinburgh 18 th January 2013

Large Scale Storage. Orlando Richards, Information Services orlando.richards@ed.ac.uk. LCFG Users Day, University of Edinburgh 18 th January 2013 Large Scale Storage Orlando Richards, Information Services [email protected] LCFG Users Day, University of Edinburgh 18 th January 2013 Overview My history of storage services What is (and is not)

More information

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance. Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.

More information

Maxta Storage Platform Enterprise Storage Re-defined

Maxta Storage Platform Enterprise Storage Re-defined Maxta Storage Platform Enterprise Storage Re-defined WHITE PAPER Software-Defined Data Center The Software-Defined Data Center (SDDC) is a unified data center platform that delivers converged computing,

More information

High Availability with Windows Server 2012 Release Candidate

High Availability with Windows Server 2012 Release Candidate High Availability with Windows Server 2012 Release Candidate Windows Server 2012 Release Candidate (RC) delivers innovative new capabilities that enable you to build dynamic storage and availability solutions

More information

WHITE PAPER. QUANTUM LATTUS: Next-Generation Object Storage for Big Data Archives

WHITE PAPER. QUANTUM LATTUS: Next-Generation Object Storage for Big Data Archives WHITE PAPER QUANTUM LATTUS: Next-Generation Object Storage for Big Data Archives CONTENTS Executive Summary....................................................................3 The Limits of Traditional

More information

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation Overview of I/O Performance and RAID in an RDBMS Environment By: Edward Whalen Performance Tuning Corporation Abstract This paper covers the fundamentals of I/O topics and an overview of RAID levels commonly

More information

EMC DATA DOMAIN OPERATING SYSTEM

EMC DATA DOMAIN OPERATING SYSTEM EMC DATA DOMAIN OPERATING SYSTEM Powering EMC Protection Storage ESSENTIALS High-Speed, Scalable Deduplication Up to 58.7 TB/hr performance Reduces requirements for backup storage by 10 to 30x and archive

More information

Trends in Enterprise Backup Deduplication

Trends in Enterprise Backup Deduplication Trends in Enterprise Backup Deduplication Shankar Balasubramanian Architect, EMC 1 Outline Protection Storage Deduplication Basics CPU-centric Deduplication: SISL (Stream-Informed Segment Layout) Data

More information

ovirt and Gluster hyper-converged! HA solution for maximum resource utilization

ovirt and Gluster hyper-converged! HA solution for maximum resource utilization ovirt and Gluster hyper-converged! HA solution for maximum resource utilization 31 st of Jan 2016 Martin Sivák Senior Software Engineer Red Hat Czech FOSDEM, Jan 2016 1 Agenda (Storage) architecture of

More information

EMC DATA DOMAIN OPERATING SYSTEM

EMC DATA DOMAIN OPERATING SYSTEM ESSENTIALS HIGH-SPEED, SCALABLE DEDUPLICATION Up to 58.7 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability DATA INVULNERABILITY ARCHITECTURE Inline write/read

More information

High Availability Databases based on Oracle 10g RAC on Linux

High Availability Databases based on Oracle 10g RAC on Linux High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN, June 2006 Luca Canali, CERN IT Outline Goals Architecture of an HA DB Service Deployment at the CERN Physics Database

More information

ZFS In Business. Roch Bourbonnais Sun Microsystems [email protected]

ZFS In Business. Roch Bourbonnais Sun Microsystems Roch.Bourbonnais@sun.com ZFS In Business Roch Bourbonnais Sun Microsystems [email protected] 1 What is ZFS Integrated Volume and Filesystem w no predefined limits Volume Management > pooling of disks, luns... in raid-z

More information

EMC XTREMIO EXECUTIVE OVERVIEW

EMC XTREMIO EXECUTIVE OVERVIEW EMC XTREMIO EXECUTIVE OVERVIEW COMPANY BACKGROUND XtremIO develops enterprise data storage systems based completely on random access media such as flash solid-state drives (SSDs). By leveraging the underlying

More information

SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS

SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS Sean Lee Solution Architect, SDI, IBM Systems SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS Agenda Converging Technology Forces New Generation Applications Data Management Challenges

More information

Alternatives to Big Backup

Alternatives to Big Backup Alternatives to Big Backup Life Cycle Management, Object- Based Storage, and Self- Protecting Storage Systems Presented by: Chris Robertson Solution Architect Cambridge Computer Copyright 2010-2011, Cambridge

More information

Globus and the Centralized Research Data Infrastructure at CU Boulder

Globus and the Centralized Research Data Infrastructure at CU Boulder Globus and the Centralized Research Data Infrastructure at CU Boulder Daniel Milroy, [email protected] Conan Moore, [email protected] Thomas Hauser, [email protected] Peter Ruprecht,

More information

Designing a Cloud Storage System

Designing a Cloud Storage System Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes

More information

Implementing Enterprise Disk Arrays Using Open Source Software. Marc Smith Mott Community College - Flint, MI Merit Member Conference 2012

Implementing Enterprise Disk Arrays Using Open Source Software. Marc Smith Mott Community College - Flint, MI Merit Member Conference 2012 Implementing Enterprise Disk Arrays Using Open Source Software Marc Smith Mott Community College - Flint, MI Merit Member Conference 2012 Mott Community College (MCC) Mott Community College is a mid-sized

More information

Cloud Optimize Your IT

Cloud Optimize Your IT Cloud Optimize Your IT Windows Server 2012 The information contained in this presentation relates to a pre-release product which may be substantially modified before it is commercially released. This pre-release

More information

Chapter 12: Mass-Storage Systems

Chapter 12: Mass-Storage Systems Chapter 12: Mass-Storage Systems Chapter 12: Mass-Storage Systems Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management Swap-Space Management RAID Structure

More information

High Performance Computing OpenStack Options. September 22, 2015

High Performance Computing OpenStack Options. September 22, 2015 High Performance Computing OpenStack PRESENTATION TITLE GOES HERE Options September 22, 2015 Today s Presenters Glyn Bowden, SNIA Cloud Storage Initiative Board HP Helion Professional Services Alex McDonald,

More information

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features Solaris For The Modern Data Center Taking Advantage of Solaris 11 Features JANUARY 2013 Contents Introduction... 2 Patching and Maintenance... 2 IPS Packages... 2 Boot Environments... 2 Fast Reboot...

More information

Reliability and Fault Tolerance in Storage

Reliability and Fault Tolerance in Storage Reliability and Fault Tolerance in Storage Dalit Naor/ Dima Sotnikov IBM Haifa Research Storage Systems 1 Advanced Topics on Storage Systems - Spring 2014, Tel-Aviv University http://www.eng.tau.ac.il/semcom

More information

Analisi di un servizio SRM: StoRM

Analisi di un servizio SRM: StoRM 27 November 2007 General Parallel File System (GPFS) The StoRM service Deployment configuration Authorization and ACLs Conclusions. Definition of terms Definition of terms 1/2 Distributed File System The

More information