Data Management Best Practices

Ryan Mokos
December 4, 2013

Outline
- Overview of Nearline system (HPSS)
  - Hardware
  - File system structure
- Data transfer on Blue Waters
  - Globus Online (GO) interface
    - Web GUI
    - Command-Line Interface (CLI)
  - Optimizing data transfers
    - Transfer parameters
    - Transfer rates
    - Transfer errors

[System diagram: Blue Waters Cray XE6/XK7, 276 cabinets, NPCF]
- Gemini fabric (HSN) connecting DSL (48 nodes), MOM (64 nodes), boot, SDB, RSIP, network gateway, and unassigned nodes
- XE6 compute nodes: 5,688 blades, 22,640 nodes, 362,240 cores
- XK7 GPU nodes: 768 blades, 3,072 nodes, 24,576 cores, 3,072 GPUs
- LNET routers: 582 nodes
- External infrastructure over InfiniBand fabric and 10/40/100 Gb Ethernet: 4 eslogin nodes, 28 import/export nodes, 50 HPSS data mover nodes, cyber protection IDPS, SMW/boot RAID and boot cabinet, management node, NCSAnet esservers
- Sonexion online storage: 25+ PB, 144+144+1440 OSTs
- Near-line storage: 300+ PB

Blue Waters: 11-Petaflop System
- 36 x Sonexion 6000 running Lustre 2.1: > 25 PB @ > 1 TB/s, connected via FDR IB
- 28 x Dell R720 IE (import/export) nodes: 2 x 2.1 GHz CPUs with 8 cores each, 1 x 40 GbE
- GridFTP access only

Nearline System (HPSS) Hardware
- Core servers (1 GbE): 2 x X3580 X5, 8 x 8-core Nehalems, RHEL 6.3
- HPSS disk cache: 4 x DDN 12K, 2.4 PB @ 100 GB/s, 16 Gb FC
- Mover nodes (GridFTP, RAIT): 50 x Dell R720, 2 x 2.9 GHz CPUs with 8 cores each, 2 x 40 GbE (bonded), RHEL 6.3, GridFTP access only
- Tape: 6 x Spectra Logic T-Finity libraries, 12 robotic arms, 360 PB in 95,580 slots, 366 TS1140 (Jaguar) drives @ 240 MB/s

HPSS File System Structure
- Your home directory: /u/sciteam/<username> (same as Lustre)
  - Default quota of 5 TB; cannot be increased
- Your project directories: /projects/sciteam/<PSN (e.g., jn0)> (same as Lustre)
  - Default quota of 50 TB; can be increased with a request through the Blue Waters ticket system
- No purge policy: data stays for the life of your project

Data Transfer on Blue Waters
- BW Lustre ↔ HPSS
  - Use GO (Globus Online)
  - scp and sftp cannot be used
- BW (Lustre, HPSS) ↔ outside world
  - Use GO
  - scp, sftp, and rsync work, but don't use them: they impact login node performance and are slower than GO
- BW Lustre ↔ BW Lustre
  - Using cp is OK
  - GO is faster for multiple large files
  - Example: copying 50 1-GB files from /scratch to /home took 244 s with cp and 39 s with GO (see the sketch below)
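As a rough illustration of the Lustre-to-Lustre case, such a copy can be issued through the GO CLI introduced later; this is a minimal sketch with a placeholder file name, assuming the home path is reachable under the ncsa#bluewaters endpoint just like /scratch:

  ssh cli.globusonline.org "transfer -- \
    ncsa#bluewaters/scratch/sciteam/<username>/my_file \
    ncsa#bluewaters/u/sciteam/<username>/my_file"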

Using Globus Online
- BW Portal documentation: https://bluewaters.ncsa.illinois.edu/data-transfer-doc
- GO access: https://bluewaters.ncsa.illinois.edu/data
- Use Globus Connect to create local endpoints for your own computer or cluster

Globus Online Web GUI
- BW endpoints: ncsa#bluewaters, ncsa#nearline
- Advantages
  - Easy transfers: select source/destination, select files/directories, click the arrow
  - Simple option selection
- Limitations
  - Some parameters are inaccessible
  - Directory listings are limited to 100k files
  - Sometimes delivers less than full concurrency

GO CLI (Command-Line Interface)
- Advantages
  - Powerful access to all features and parameters
  - Commands can be used in scripts
  - Full concurrency
- Disadvantages
  - Takes a little time to learn
  - Verbose
- Transfer example:
  ssh cli.globusonline.org "transfer -- \
    ncsa#bluewaters/scratch/sciteam/<username>/a_file \
    ncsa#nearline/u/sciteam/<username>/a_file"

CLI Usage
- Either ssh into cli.globusonline.org or prefix each command with ssh cli.globusonline.org
- Transfers
  - Use the transfer command on individual files, or on entire directories with -r
  - Check transfers with the status command
  - Use cancel to stop a transfer
- Basic file system commands: ls, mkdir
- For examples, see the BW Portal (a short sketch also follows below)
- For a complete listing and man pages, ssh into cli.globusonline.org and type help
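A short sketch of an interactive session after ssh-ing into cli.globusonline.org; the results/ directory and <task-id> are placeholders, and the exact placement of the recursive flag and the argument to cancel are assumptions to verify against the CLI help:

  ls ncsa#nearline/u/sciteam/<username>/                      # list your Nearline home directory
  transfer -- ncsa#bluewaters/scratch/sciteam/<username>/results/ \
              ncsa#nearline/u/sciteam/<username>/results/ -r  # copy a whole directory
  status                                                      # check active transfers
  cancel <task-id>                                            # stop a transfer
  help                                                        # list commands and man pages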

Moving HPSS Files
- Important note: transfer commands (both GUI- and CLI-based) only copy files
- To move files, use the CLI rename command (example on the BW Portal)
- Files cannot be moved using the GO GUI
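The exact rename syntax is shown on the BW Portal; purely as an illustration, assuming rename takes an old and a new path on the same endpoint (both paths here are placeholders):

  ssh cli.globusonline.org "rename \
    ncsa#nearline/u/sciteam/<username>/old_dir/big_file \
    ncsa#nearline/u/sciteam/<username>/new_dir/big_file"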

Optimizing Transfers
- The GUI does pretty well, but the CLI can sometimes get better results
- Transfer large files (GB+ range)
- But also transfer lots of files to take advantage of concurrency
- Maximum concurrency: 20 files per transfer x at most 3 active transfers = up to 60 files in flight

CLI-Only Transfer Parameters
Format: ssh cli.globusonline.org transfer <parameters> -- <src> <dest>
- --perf-p <num>: parallelism level (data streams per control channel); valid values: 1-16
- --perf-cc <num>: concurrency (number of control channels, i.e., number of files in flight); valid values: 1-16; the default for BW to HPSS is 20, but only ~12 are typically observed
- --perf-pp <num>: pipeline depth (files in flight per control channel); valid values: 1-32
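A sketch of where these flags sit in a CLI transfer, using placeholder paths and example values:

  ssh cli.globusonline.org "transfer --perf-cc 8 --perf-pp 2 -- \
    ncsa#bluewaters/scratch/sciteam/<username>/file_01 \
    ncsa#nearline/u/sciteam/<username>/file_01"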

Recommendations for BW ↔ HPSS Parameters for GB-Sized Files
- Don't set --perf-p (parallelism)
- Set --perf-cc 16 (concurrency = files in flight)
- Set --perf-pp 1 (pipeline depth)
- Important note: there is a minimum queue length of 2 events, so you need at least 2x your concurrency in files or you won't get full concurrency
  - E.g., you need >= 32 files to get 16 files in flight with --perf-cc set to 16
- Experiment with these settings for remote sites
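Putting the recommendations together, a hedged sketch of a BW-to-HPSS transfer of a directory holding at least 32 GB-sized files (the directory name is a placeholder, and the recursive-flag placement is an assumption):

  ssh cli.globusonline.org "transfer --perf-cc 16 --perf-pp 1 -- \
    ncsa#bluewaters/scratch/sciteam/<username>/run_output/ \
    ncsa#nearline/u/sciteam/<username>/run_output/ -r"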

Transfer Rates
- Rates reported by GO cover the entire transfer, including initialization and checksum verification (if applicable)
  - Checksumming approximately halves the total rate: the whole file is transferred, then the checksum is computed
- BW ↔ HPSS for GB+ files
  - Single-file transfer rate: ~2-3 Gbit/s raw (1-1.5 Gbit/s with checksum enabled)
  - Aggregate rates of up to ~36 Gbit/s raw (18 Gbit/s with checksum) have been observed with 16 files in flight, each tens of GB
- Other sites for GB+ files
  - BW ↔ Kraken and BW ↔ Gordon: ~0.9-1.3 Gbit/s with checksum

Transfer Errors
- Using checksums is highly recommended; they are on by default for both the GUI and the CLI
- Errors are infrequent but do occur
  - My testing: 1,352 50-GB transfers produced 20 errors
  - Errors tend to occur in bursts

Other Notes
- Lustre striping
  - When transferring to BW, files inherit the stripe settings of the directory in which they are placed (unless a file is so big that it requires a higher stripe setting, in which case it is adjusted upward); see the sketch below
- Slow staging from HPSS tape
  - Intelligent staging is in the works
  - One observed case: concurrency of only 2 when transferring from tape (files in the tens of GB) versus 16 when transferring from HPSS disk
  - Lesson: avoid writing very many files to HPSS
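The slides don't name the commands, but on Lustre the destination directory's stripe settings can typically be inspected and changed with the standard lfs tool; a minimal sketch with a placeholder directory:

  lfs getstripe /scratch/sciteam/<username>/incoming       # show the current stripe count and size
  lfs setstripe -c 8 /scratch/sciteam/<username>/incoming  # new files here will be striped over 8 OSTs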

Summary
- Use GO for all transfers to and from both BW and HPSS (not scp, sftp, or rsync)
- The GO web GUI is simple; the CLI is more powerful
- Balance large file sizes against a large number of files to optimize transfers
  - Try to transfer files of at least 1 GB
- Store large files on HPSS; avoid many small files
  - Tar up files if necessary (a sketch follows below)
  - Single-compute-node jobs are recommended for large tar tasks
- Use checksums
- Ask for support: help+bw@ncsa.illinois.edu
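As a rough sketch of the tar-then-archive advice, with placeholder file and directory names:

  cd /scratch/sciteam/<username>
  tar cf small_outputs.tar small_outputs/        # bundle many small files into one archive
  ssh cli.globusonline.org "transfer -- \
    ncsa#bluewaters/scratch/sciteam/<username>/small_outputs.tar \
    ncsa#nearline/u/sciteam/<username>/small_outputs.tar"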