High Performance Storage System (HPSS) Overview




Transcription:

Overview. Please see the disclaimer on the last slide.

HPSS is hierarchical storage software developed in collaboration with five US Department of Energy laboratories since 1992. It provides stewardship for hundreds of millions of files spanning hundreds of petabytes of tape for the HPC community, and is licensed and supported by IBM Global Business Services in Houston, Texas. Highest performance computing, highest performance storage: LANL Roadrunner (Top 500 #1), ORNL Jaguar (Top 500 #2), LLNL Blue Gene/L (Top 500 #5), ANL Blue Gene/P (Top 500 #7), ECMWF Power 575, IU Big Red, LBNL NERSC (Top 500 #7), LLNL ASC Purple, NCEP Power 575, and NCSA Blue Waters.

Disk and tape file repository with hierarchical storage management (HSM): automatic migration and recall, highly scalable for high-end computing and storage customers. A single instance of HPSS is capable of concurrently accessing hundreds of tapes for extremely high aggregate data transfer rates. The user sees HPSS as a single Unix file system: classic HPSS presents its own file system, while the new HPSS for GPFS extends IBM's most scalable file system to tape. Automatic migration copies file data from disk down to tape and automatic recall stages them back; this is hierarchical storage management, also called space management.

The HPSS Collaboration (copyright holders and developers): IBM Global Services in Houston, Texas; Lawrence Berkeley National Laboratory; Lawrence Livermore National Laboratory; Los Alamos National Laboratory; Oak Ridge National Laboratory; and Sandia National Laboratories. Advantages of collaborative development since 1992: the developers are users, so they focus on what is needed and what works. Source code is available to collaboration members and US stakeholders (though HPSS is not open source in the usual sense). Customization is allowed, subject to collaboration review. Historically, all truly high-end software worldwide has required significant means other than retail sales to spread costs among interested parties, and our unique collaborative method has worked well for HPSS through seven major releases, with a game-changing eighth in development.

Profile of an HPSS customer:
Already has experience with tape; is not a tape novice.
Understands the case for tape, including acquisition cost, volumetric density, long-term archive, and energy and cooling benefits.
Has capable staff with tape experience.
Already has experience with tape software and understands the difference between file system backup and HSM.
Has done enough homework to know that one or more easy solutions, and probably their existing solution, won't meet their requirements.
Has one or more difficult requirements, now or in the future: tens to hundreds of tape drives; tens of thousands or more tape cartridges; many small, tape-unfriendly files affecting tape performance; multiple service classes such as single tape, striped tape, mirrored files, and remote mirroring, with a need to group like file classes on tape; a need to keep track of hundreds of millions to billions of files in a single namespace; a need for a single solution for file system backup, space management, and archive.
HPSS generally exceeds all others in all of these difficult requirements.

Sizes of some of the larger HPSS sites, updated 10/31/09. Each system shown is a single HPSS instance and namespace.

System | Petabytes (10^15 bytes) | Million files | Avg file (MB)
European Centre for Medium-Range Weather Forecasts (ECMWF) | 15.69 | 63 | 238
Lawrence Livermore National Lab (LLNL) Secure Computing Facility (SCF) | 15.45 | 113 | 131
Los Alamos National Lab (LANL) Secure Computing Facility (SCF) | 14.58 | 126 | 111
Brookhaven National Lab (BNL) | 11.75 | 67 | 167
National Centers for Environmental Prediction (NCEP) | 10.32 | 9 | 1,117
Deutsches Klimarechenzentrum GmbH (DKRZ) | 9.53 | 24 | 378
LLNL Open Computing Facility (OCF) | 8.88 | 117 | 72
Oak Ridge National Laboratory (ORNL) | 8.06 | 16 | 474
Commissariat à l'Energie Atomique/Division des Applications Militaires (CEA/DAM) | 7.27 | 2 | 3,233
Institut National de Physique Nucléaire et de Physique des Particules (IN2P3) | 7.02 | 26 | 258
Lawrence Berkeley Lab (LBL) National Energy Research Scientific Computing Center (NERSC) | 5.80 | 79 | 70
Stanford Linear Accelerator Center (SLAC) | 5.22 | 7 | 760
San Diego Supercomputer Center (SDSC) | 4.46 | 58 | 74
Indiana University (IU) | 3.25 | 25 | 124
LBL NERSC Backup System | 3.09 | 13 | 224
LANL Open Computing Facility (OCF) | 2.12 | 30 | 68
NASA Langley (LaRC) | 1.97 | 6 | 304
RIKEN (Japan) | 1.73 | 5 | 346

HPSS follows the convention used by most enterprise disk and tape manufacturers and the SNIA that one petabyte = 1000^5 bytes. To convert to binary petabytes, where one petabyte = 1024^5 bytes, multiply by 0.888.
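A minimal C sketch of the footnote's conversion, using the ECMWF row purely as an example value:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    /* One decimal PB = 1000^5 bytes; one binary PB = 1024^5 bytes.
     * The conversion factor is therefore (1000/1024)^5, about 0.888. */
    double decimal_pb = 15.69;                 /* example: the ECMWF row */
    double factor = pow(1000.0 / 1024.0, 5.0);
    printf("%.2f decimal PB = %.2f binary PB (factor %.3f)\n",
           decimal_pb, decimal_pb * factor, factor);
    return 0;
}
```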

New HPSS sites. Recently in production: Argonne National Laboratory (ANL), High Performance Computing Center Stuttgart (HLRS), Japan Aerospace Exploration Agency (JAXA), and the United Kingdom Met Office (UKMO). Just starting or developing applications: Library of Congress (LoC), National Center for Atmospheric Research (NCAR), Pacific Northwest National Laboratory (PNNL), and Northrop Grumman. In a class by itself: the University of Illinois National Center for Supercomputing Applications (NCSA) and the NSF Blue Waters supercomputer project, due in production in 2011 and probably the world's largest storage system, with tens to hundreds of billions of files and at least 500 tape drives.

HPSS architecture. HPSS has a clustered architecture that is unique among tape-based products and scales horizontally almost without limit; horizontal scaling is as easy as adding cluster components. The architecture diagram shows a client cluster (direct or SAN attached), a core server with its metadata, and a mover cluster with its mover resources (disk and tape, direct or SAN attached), all connected over a SAN or network.

HPSS metadata architecture. Under the hood, HPSS is powered by IBM's DB2 database engine, which scales to your performance needs. DB2 metadata completely characterizes all files, whether on disk, single tape, striped tape, mirrored tape, shelf tape, or multi-level hierarchies of disk and tape. DB2 provides its own utilities for copying, mirroring, backup, restore, and verification. DB2 itself is highly scalable and reliable, bringing those capabilities to HPSS and enabling fast restore after a disruption of service. DB2 is bundled with HPSS for this purpose at no extra charge.

HPSS tape architecture. HPSS supports striping to disk and tape. At 100 MB/s, it takes almost 3 hours to write a 1 TB file to a single tape; using an 8-way tape stripe, that time is cut to under 25 minutes. Software RAIT is being developed jointly by IBM and NCSA to add up to 8+2 reliability to striping. HPSS also supports small-file tape aggregation: writing a tape mark and repositioning for each small file prevents tape drives from streaming data to the tape, giving poor tape performance. By aggregating many small files into a single object, tape drives are kept moving for longer periods, greatly improving tape drive performance. For very high data rates and very high-volume tape throughput, both striping and file aggregation are required.
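The timing claim is straightforward arithmetic. The small C sketch below reproduces the slide's numbers from its stated figures (a 1 TB file at 100 MB/s per drive), assuming constant streaming rates and ignoring mount and positioning time:

```c
#include <stdio.h>

int main(void) {
    /* Slide figures: 1 TB file (10^12 bytes), 100 MB/s per tape drive. */
    double file_bytes = 1e12;
    double drive_rate = 100e6;                          /* bytes/second */

    double single_s  = file_bytes / drive_rate;         /* one drive    */
    double striped_s = file_bytes / (8.0 * drive_rate); /* 8-way stripe */

    printf("single tape:  %.0f s (about %.1f hours)\n",
           single_s, single_s / 3600.0);
    printf("8-way stripe: %.0f s (about %.0f minutes)\n",
           striped_s, striped_s / 60.0);
    return 0;
}
```

This prints roughly 2.8 hours for the single tape and about 21 minutes for the stripe, matching the slide's "almost 3 hours" and "under 25 minutes".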

HPSS hierarchical architecture. HPSS implements hierarchical storage management (HSM) using:
Storage classes
Storage hierarchies
Classes of service
Migration and purge policies
Automatic staging
File data are automatically migrated (copied) to lower levels of the hierarchy, and typically staged back to the top level when a user accesses them. The slide's diagram shows an example of a dual-copy class of service: a disk storage class at the top of the hierarchy migrating a duplicate copy to two tape storage classes, with staging back to disk.
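As an illustration only, the relationships among these concepts can be sketched in code. The C types and names below are invented for explanation and are not HPSS's actual configuration schema:

```c
/* Illustrative sketch only: invented types showing how the HSM concepts
 * on this slide relate. Not HPSS's real configuration structures. */
#include <stdio.h>

typedef enum { MEDIA_DISK, MEDIA_TAPE } media_t;

typedef struct {
    const char *name;            /* hypothetical storage class name      */
    media_t     media;
} storage_class;

typedef struct {
    const char   *cos_name;      /* a class of service names a hierarchy */
    storage_class levels[3];     /* level 0 = top of the hierarchy       */
    int           nlevels;
} class_of_service;

int main(void) {
    /* The slide's dual-copy example: disk on top, with duplicate copies
     * migrated to two tape storage classes below it. */
    class_of_service dual = {
        "dual-copy",
        { {"disk", MEDIA_DISK},
          {"tape-copy-1", MEDIA_TAPE},
          {"tape-copy-2", MEDIA_TAPE} },
        3
    };
    for (int i = 0; i < dual.nlevels; i++)
        printf("COS %s, level %d: %s (%s)\n", dual.cos_name, i,
               dual.levels[i].name,
               dual.levels[i].media == MEDIA_DISK ? "disk" : "tape");
    return 0;
}
```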

HPSS native interfaces:
Mountable POSIX-compliant file system: Linux users can mount HPSS through the Linux virtual file system (VFS) interface. Used when an application needs a file system interface to HPSS, for example DB2, Apache, or Samba.
Standard FTP and parallel FTP: files are copied with get and put using standard ftp, or with pget and pput using pftp (parallel FTP).
Application programming interface (API): POSIX-like (for example, read() becomes hpss_read()). Supports parallel files, parallel servers, and parallel clients.
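Because the VFS interface makes HPSS look like an ordinary file system, a client program needs no HPSS-specific code. The C sketch below reads a file through a hypothetical /hpss mount point using plain POSIX calls; the path is invented for illustration, and the native hpss_read()-style API mentioned above (with HPSS-specific signatures, documented separately) is the alternative when parallel I/O is needed:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[4096];
    ssize_t n;

    /* Hypothetical HPSS VFS mount point and file, for illustration. */
    int fd = open("/hpss/projects/demo/results.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Reads may block while HPSS stages the file from tape to disk. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    return 0;
}
```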

HPSS third-party interfaces. Interfaces available through the Linux virtual file system interface: NFS v3, open-source Samba, Apache, and secure FTP.

HPSS third-party interfaces, continued.
GridFTP, enabled for HPSS: an extension of the FTP protocol for the grid computing environment; an open-source Globus add-on developed by ANL.
HSI and HTAR: two interfaces developed by Gleicher Enterprises. HSI is a widely used Unix-like third-party interface shell; HTAR is a stand-alone utility that aggregates files directly into HPSS.
NFS and Lustre-HSM bindings: two interfaces developed by CEA/DAM (France). GANESHA is a multi-usage, large-cache NFSv4 server; Lustre-HSM bindings are in development (http://arch.lustre.org/index.php?title=hsm_migration). Contact: Jacques-Charles LaFoucriere, jc.lafoucriere@cea.fr.

HPSS interface to IBM GPFS. GPFS is IBM's General Parallel File System. HPSS can provide a highly scalable tape pool for GPFS, delivering two tightly coupled services: hierarchical space management of the GPFS file space, and disaster recovery backup of GPFS file systems.

How HPSS works: example of an hpss_read.
1. The client issues a READ to the core server.
2. The core server accesses DB2 metadata on disk.
3. The core server commands a mover to stage the file from tape to disk.
4. The mover stages the file from tape to disk.
5. The core server sends a lock and ticket back to the client.
6. The client reads the data from shared disk over the SAN, or from the mover over the LAN.

HPSS at multiple sites. A single HPSS system is split into two subsystems, one at the home site and one at a remote site, connected by routers (1 GbE and 10 GbE) over a WAN. Each site has its own core servers, metadata disks, movers, disk arrays, and FC SAN-attached tape libraries. Tape data are mirrored at the other site, and all metadata are backed up at the other site. Both sites have active local client computers. In the event of the loss of a site, both core server images can be brought up at the remaining site, allowing access to all data.

Contacts: Jim A. Gerry (jgerry@us.ibm.com), Harry Hulen (hulen@us.ibm.com), Patrick Schaefer (pschaef@us.ibm.com), Bob Coyne (coyne@us.ibm.com).

Disclaimer. Please obtain and read product documentation before deciding to acquire HPSS. Documentation includes the license agreement, the statement of work, and other product manuals. In case of conflict between information herein and product documentation, the documentation takes precedence. Forward-looking information, including schedules and future product capabilities, reflects current planning that may change and should not be taken as a commitment by IBM. IBM may at its sole discretion discontinue, add, or change features and functions without notice.