Research Technologies: Data Storage for HPC
Supercomputing for Everyone, February 17-18, 2014
Research Technologies, High Performance File Systems
hpfs-admin@iu.edu
Indiana University
Intro to HPC on Big Red II Workshop: Data Storage
Overview of presentation
1. Data Workflow - thinking about the lifecycle and types of data you use and create
2. Storage Resources - how to efficiently store and use data; policies, best practices, and optimizing performance
3. Getting your data in and out of storage systems
Questions welcome any time! There will also be time at the end for discussion.
Consider the types of data you may have
Source code
Documentation
Reports
Scientific data
Computational data - intermediate steps, output
Results of computation
Data Storage Requirements
Source Code & Documents - easily accessible, backed up
Computational Data - matches the potential of BR2: FAST! (parallel); stores lots of input/output (large capacity); ability to work together (collaboration)
Results - stored safely for a long time; potentially very large amounts of data; ability to share data with collaborators
Simple Workflow
Input instructions
Read, compute, write data in parallel
Archive results
Home Directory - default location, for small files, data is backed up - /N/u/username/BigRed2
Data Capacitor II (DC2) - large capacity, high throughput, for compute data, data is not backed up - /N/dc2/
Scholarly Data Archive (SDA) - disk to tape archiving, distributed copies, for long term storage
Big Red II Storage Resource Analogies
Home Directories - the family station wagon: daily trips to school; backed up / used always
Data Capacitor II (DC2) - the race car: VERY FAST, wear a seatbelt; not backed up / workspace
Scholarly Data Archive (SDA) - the all-terrain vehicle: extremely reliable; keeps your critical data safe
Home Directories
Input instructions
Home Directory - The Family Station Wagon
Default storage location for your account - available as soon as you log in
The place to store source code, shell scripts, and other small files
Not meant for computational data! Do not compute against data in home directories!
- Don't take your station wagon to the race track
More information: https://kb.iu.edu/d/avkm#home
Home Directory Quota
10 GB quota for home directory
Use the command `quota -s` to see current usage
- the `-s` flag gives output in MB; otherwise it is in 1 KB blocks
jupmille@login1:~> quota -s
Filesystem         blocks  quota  limit   grace  files  quota  limit  grace
bl-nas2:/vol/hd03  184M    9900M  10240M         3169   4295m  4295m
Home Directory Snapshots
Hourly and nightly snapshots are made of your home directory
Snapshots are in a hidden `.snapshot` directory within the source directory
jupmille@login1:~> cd .snapshot
jupmille@login1:~/.snapshot> ls
hourly.0 hourly.1 hourly.2 hourly.3 hourly.4 hourly.5 nightly.0 nightly.1
Snapshots are taken daily at 8am, 12pm, 4pm, 8pm, and midnight
For more information see the Knowledge Base document https://kb.iu.edu/d/azvw
Home Directory Shared Across Systems
Home directory space is shared across RT systems
System / Path:
Big Red 2 - /N/u/username/BigRed2
Mason - /N/u/username/Mason
Quarry - /N/u/username/Quarry
jupmille@login1:~> pwd
/N/u/jupmille/BigRed2
jupmille@login1:~> cd ..
jupmille@login1:/N/u/jupmille> ls
BigRed2 Mason Quarry
Data Capacitor II
Compute against data
Data Capacitor II - The Race Car
Parallel high-speed storage based on the Lustre file system
3.5 PB total size, 50 GB/s throughput to BR2
Store input and output application data
- intended as a temporary workspace for computation, not for indefinite storage of data
DC2 is not backed up
Available on IU's HPC resources - Big Red II, Mason, Quarry
More information available on the Knowledge Base https://kb.iu.edu/d/avvh
Data Capacitor II - Lustre File System
Linux + Cluster = Lustre
Lustre is a parallel distributed file system
High performance file system used by many Top500 supercomputers (>50%)
POSIX compliant - behaves like other file systems
Open source software under GPLv2
Guided by non-profit Open Scalable File Systems, Inc. - http://www.opensfs.org
Intel maintains the canonical tree - http://hpdd.intel.com
Active development - IU contributes code
Data Capacitor II Usage
Lustre is designed for high-speed data access, not for metadata speed
This intentional design decision comes with a tradeoff
- file metadata lookups can be relatively slow
- the metadata server must ask each object server for file sizes; otherwise metadata would be constantly updating with size info
Tips to improve interactive performance:
Avoid more than 10K files in one directory
- separating input, output, and final results, and deleting unneeded data, helps with data management as well
Limit the amount of metadata actions you perform
- reduce file and directory operations, stat-ing files
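One simple way to stay under a per-directory file count is to bucket files into subdirectories by a component of their name. A minimal sketch, using a hypothetical `result_N.dat` naming scheme and a local `demo/` directory in place of your scratch space:

```shell
#!/bin/sh
# Sketch: spread files from one flat directory into per-prefix
# subdirectories so no single directory accumulates too many entries.
# Filenames and the bucketing rule are hypothetical; adapt to your data.
set -e
mkdir -p demo

# Simulate a flat directory of output files.
for i in 0 1 2 3 4 5 6 7 8 9; do
    touch "demo/result_${i}.dat"
done

# Bucket each file by the index in its name: result_3.dat -> bucket_3/.
for f in demo/result_*.dat; do
    base=$(basename "$f" .dat)            # e.g. result_3
    bucket="demo/bucket_${base#result_}"  # e.g. demo/bucket_3
    mkdir -p "$bucket"
    mv "$f" "$bucket/"
done

ls -d demo/bucket_*
```

In practice you might bucket by run number, date, or a hash prefix; the point is to keep any one directory well under the 10K-entry guideline.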
Data Capacitor II - Scratch Directories
A scratch directory is available to every Big Red II user
The path to your scratch space is /N/dc2/scratch/username/
Intended as a temporary workspace for your data, not for sharing
Files not accessed in 60 days may be purged
The name "scratch" is from the phrase "scratch paper"
- a piece of paper used while performing calculations, separate from the answer sheet
- you may know it as scrap paper
- implies an impermanence of the data
Data Capacitor II - Project Directories
Project directories are available by application for users, groups, or labs with special needs
Makes it possible to share data amongst users (Unix groups)
Files not accessed in 180 days may be purged
Project space can be applied for by submitting an application at this URL
- http://rt.uits.iu.edu/systems/hpfs/allocation-request-form.php
Longer storage time than scratch space, but still not forever
Data Capacitor II - Purging Old Data
Administrators of Data Capacitor II routinely purge old data
Data not accessed in a certain amount of time will be deleted
- scratch = 60 days, project = 180 days
You will be notified before and after any action is taken against your data
An email will be sent to you listing the eligible files
A file will be placed in the root of your scratch or project directory
- Will-Purge-These-uid648424-jupmille-Files-On-2014-06-30.txt
You will have seven (7) days to take action
Afterwards, a file listing the actions taken will be created
See https://kb.iu.edu/d/avvh#usage
Data Capacitor II - Find Old Files
You can be proactive about managing your data to prevent purging
Use this command to list the files in your directory sorted by age
- it may take a while, depending on the number of files you have
find /N/dc2/scratch/<username> -type f -exec stat --format="%n %x" '{}' \; | sort -k2,3
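You can also ask `find` directly for files whose last access falls outside the purge window. A small sketch, run here against a local demo directory with an artificially backdated file (on DC2 you would point it at /N/dc2/scratch/<username> instead):

```shell
#!/bin/sh
# Sketch: list files not accessed within the 60-day scratch purge window.
set -e
mkdir -p agedemo
touch agedemo/fresh.dat
touch agedemo/stale.dat
# Backdate the access time of one file by roughly 90 days (GNU touch).
touch -a -d "90 days ago" agedemo/stale.dat

# -atime +60 matches files last accessed more than 60 days ago.
find agedemo -type f -atime +60   # prints agedemo/stale.dat
```

Files matched this way are the ones to archive to the SDA or delete before the purge does it for you.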
Data Capacitor II - Space Quota
There is no strict limit on the amount of space you can use
Space available for all users varies depending on system use
- `df -h` gives you the current space available
Please don't use any more space than strictly necessary
- Data Capacitor II is a shared resource intended for computation
Use this command to find the total amount of space you're using:
du -hc /N/dc2/scratch/<username>
Data Capacitor II - HIPAA and ePHI
DC2 is HIPAA aligned, but you are responsible for ensuring the privacy and security of ePHI data
Technical safeguards
Set directory permissions to restrict read and write access
- the most secure method is to allow access only to you
jupmille@login1:/N/dc2/scratch/jupmille> chmod 700 ephi_file
jupmille@login1:/N/dc2/scratch/jupmille> ls -l ephi_file
-rwx------ 1 jupmille uits 0 Jan 23 14:25 ephi_file
Use `umask` to ensure all new files are created with safe permissions
- add `umask 077` to your shell profile
- see https://kb.iu.edu/d/acge
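The effect of `umask 077` is easy to verify for yourself: any file created afterwards is readable and writable only by the owner. A minimal sketch using a local demo directory:

```shell
#!/bin/sh
# Sketch: show that umask 077 makes newly created files private (mode 600).
set -e
mkdir -p umaskdemo
umask 077
touch umaskdemo/private.dat

# Default file mode 666 masked by 077 leaves owner rw only.
stat -c %a umaskdemo/private.dat   # prints 600
```

Note that `umask` only affects files created after it is set; existing files keep their permissions and need `chmod` as shown above.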
Data Capacitor II - Job Scheduling
Specify DC2 as a requirement for your batch job
- add the dc2 file system property to the nodes directive in your TORQUE job script
For example, if your job requires two nodes, thirty-two processors per node, and the Data Capacitor II file system (/N/dc2), the resource specification line in your TORQUE job script would look like:
#PBS -l nodes=2:ppn=32:dc2
Specifying the dc2 property in your script directs TORQUE to dispatch your job only to those compute nodes with the Data Capacitor II file system mounted. If DC2 is down, your job won't run.
More information at: https://kb.iu.edu/d/baht
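Putting that resource line in context, a minimal TORQUE job script might look like the following sketch. The walltime, job name, and `my_app` executable are hypothetical placeholders; only the `nodes=2:ppn=32:dc2` line comes from the example above.

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=32:dc2    # 2 nodes, 32 cores each, require the DC2 mount
#PBS -l walltime=01:00:00     # hypothetical one-hour limit
#PBS -N dc2-example           # hypothetical job name

# Compute against DC2 scratch space, not your home directory.
cd /N/dc2/scratch/$USER
aprun -n 64 ./my_app          # my_app is a placeholder executable
```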
Data Capacitor II - Reporting Issues
If you encounter any problems using Data Capacitor II, please include these details when reporting the issue:
Date and time the event occurred
Which system your job was running on
The directory being used
A brief description of what was happening when the issue occurred
All Data Capacitor II issues should be reported to hpfs-admin@iu.edu
Scholarly Data Archive
Archive results
Scholarly Data Archive
Massive near-line and archival data storage
Disk cache front end - ~600 TB
Magnetic tape storage - 15 PB (uncompressed)
Hierarchical storage management (HSM)
- data migrates from disk to tape over time
- retrieval from tape is a small cost for safety
Data integrity
- geographically replicated: IUB and IUPUI each get a copy
- checksums and error detection
Scholarly Data Archive Details
An account can be applied for easily - https://itaccounts.iu.edu/
Default quota is 50 TB
- the replicated copy of data is not counted
- additional storage is available
HIPAA aligned, but you must secure the data
Group or department accounts are available
Data can be shared with Access Control Lists
More information available on the Knowledge Base https://kb.iu.edu/d/aiyi
Scholarly Data Archive Usage
Best uses
- files of at least 1 MB (a single file can be up to 10 TB)
- archive files: rarely updated, need to be kept a long time
- files that are read often: frequently accessed files tend to stay on the disk cache
Poor uses
- small files: small files should be aggregated with a tool like WinZip or tar
- files that will frequently change
Do not edit files in place
- if you need to edit: copy -> edit -> reupload
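Before uploading, it can help to identify which files fall under the 1 MB guideline and should be aggregated first. A small sketch against a local demo directory with hypothetical file names (`-size -1048576c` uses an exact byte count, which avoids the rounding behavior of `find`'s `M` suffix):

```shell
#!/bin/sh
# Sketch: find files under 1 MB that should be tarred up before going
# to the SDA, rather than stored individually.
set -e
mkdir -p sdademo
# One 2 MB file and two tiny files.
dd if=/dev/zero of=sdademo/big.dat bs=1M count=2 2>/dev/null
printf 'a' > sdademo/tiny1.txt
printf 'b' > sdademo/tiny2.txt

# Files smaller than 1,048,576 bytes are aggregation candidates.
find sdademo -type f -size -1048576c | sort
```

The matched files could then be bundled with `tar -cf small-files.tar ...` and the single archive uploaded instead.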
Scholarly Data Archive - Helpful Tip
Data stored on the SDA can be kept for a long time
- so long that you might even forget what it is, or the people who did know have left
Do your future self a favor and document the data
Create a manifest or annotation of the data
Keep it at the top of your storage directory, and keep it up to date
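A manifest can be as simple as a generated text file stored at the top of the directory you archive. A minimal sketch (the `MANIFEST.txt` name and the fields in it are a suggested convention, not an SDA requirement):

```shell
#!/bin/sh
# Sketch: generate a simple manifest to archive alongside the data.
set -e
mkdir -p project/results
echo "42" > project/results/run1.out   # stand-in for real output data

{
    echo "Project: example (hypothetical)"
    echo "Created: $(date -u +%Y-%m-%d)"
    echo "Contact: your-email@iu.edu (placeholder)"
    echo "Contents:"
    ls -lR project
} > project/MANIFEST.txt

head -3 project/MANIFEST.txt
```

Regenerate the manifest whenever the directory changes, then archive the whole directory (manifest included) together.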
Transferring Data In and Out of RT Storage
Preparing to Transfer Data
It is recommended to bundle your data before transferring
- easier to manage a single file
- preserves layout and permissions
- transferring one large file is often faster than transferring many small files
jupmille@login1:/N/dc2/scratch/jupmille> ls
input output results
jupmille@login1:/N/dc2/scratch/jupmille> tar -cvf archive.tar input/ output/ results/
input/
output/
results/
jupmille@login1:/N/dc2/scratch/jupmille> ls
archive.tar input output results
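Before deleting the originals or starting a long transfer, it is worth verifying the archive. A sketch mirroring the directory names above, built in a local `bundle/` directory for demonstration:

```shell
#!/bin/sh
# Sketch: bundle directories, then verify the archive against the
# on-disk files before trusting it.
set -e
mkdir -p bundle/input bundle/output bundle/results
echo data > bundle/input/in.txt        # stand-in for real input data

tar -cf bundle.tar -C bundle input output results

# List the archive contents...
tar -tf bundle.tar

# ...and compare the archive against the files byte for byte (GNU tar).
tar -df bundle.tar -C bundle && echo "archive verified"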
Getting Data In and Out of Home Directories
scp is the easiest way to get data in and out of your home directory
- secure, but not a high performance tool; the quota is 10 GB, so you're unlikely to make large transfers
- no restart capability, so if a transfer fails you must start over
- sftp and rsync over ssh are also good options
$ scp archive.tar bigred2.uits.iu.edu:~
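Since scp has no restart capability, it is good practice to checksum the file before and after the transfer to confirm it arrived intact. A local sketch in which a `cp` into a `remote/` directory stands in for the scp step:

```shell
#!/bin/sh
# Sketch: verify a transferred file with checksums.
set -e
echo "payload" > archive_demo.tar            # stand-in for a real archive
sha256sum archive_demo.tar > archive_demo.tar.sha256

# In place of: scp archive_demo.tar bigred2.uits.iu.edu:~
mkdir -p remote
cp archive_demo.tar remote/

# On the receiving side, re-check the checksum.
( cd remote && sha256sum -c ../archive_demo.tar.sha256 )
```

If the transfer was corrupted or truncated, `sha256sum -c` reports FAILED and exits non-zero, and you know to resend.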
Getting Data In and Out of Data Capacitor II
The IU Cyberinfrastructure Gateway allows you to transfer data between your machine and Data Capacitor II
- IU CI Gateway information: https://kb.iu.edu/d/bdfo
- transferring data with the CI Gateway to DC2: https://kb.iu.edu/d/bdqp
The IU CI Gateway uses Globus Online, a parallel transfer tool
- requires software to be installed
- follow the instructions in the KB article to request an account for DC2
- the endpoint is iu#dcwan_internal
- your path is /~/N/dc2/scratch/<username>/
You can still use scp/rsync/sftp, but they're not high performance tools
Getting Data In and Out of the Scholarly Data Archive
Fast access
- hsi and htar command line tools (to use HSI on Big Red II, you must load the HPSS module: module load hpss)
- GridFTP clients
- Kerberized FTP
- GlobusOnline, also available through the IU CI Gateway
Convenience protocols
- web access via browser
- sftp
- mount to desktop via CIFS/Samba (mapped drive)
Knowledge Base article on SDA access: https://kb.iu.edu/d/aiyr
Pull/Push Data Between the SDA and DC2 or Home Directory
Use hsi on a Big Red II login node
- add the module statement to your profile: module load hpss
Can be done interactively
Can be scripted through Kerberos keytab authentication - https://kb.iu.edu/d/aiyu
hsi can be used in many different ways (ftp-style commands)
- manual available at https://computing.llnl.gov/lcdocs/hsi/
Scholarly Data Archive - HSI Example
jupmille@login1:~> module load hpss
HPSS (command-line utility for access to the SDA) version 4.0 loaded.
jupmille@login1:~> hsi
Kerberos Principal: jupmille
Password for jupmille@ads.iu.edu:
? put samplefile.tar
? ls
/hpss/j/u/jupmille:
samplefile.tar
? get samplefile.tar
? du -k
? help
? exit
Knowledge Base: https://kb.iu.edu/d/avdb
Getting Data onto Big Red II
Big Red II, Mason, and Quarry login nodes do not enforce a time limit on data transfer tools
- scp, sftp, hsi, htar, wget, curl, etc.
I recommend putting your data into the Scholarly Data Archive first
- then use command line tools to pull from the SDA into DC2 or home directories
- there are many ways to access the SDA, with robust permissions
- you'll always have a distributed copy of your data!
Other RT Storage Resources
The focus of this presentation was the basics of storage available on Big Red II
There are more storage options available
- Research File System (RFS): distributed copies, very accessible, robust permissions - https://kb.iu.edu/d/aroz
- Data Capacitor WAN (DC-WAN): Lustre over the wide area network, share at high speed - https://kb.iu.edu/d/avvh
System Outages
If there are any problems with the system, we will update IT Notices - http://itnotices.iu.edu/
Data Capacitor II has regularly scheduled system maintenance
- first Tuesday of every month
Join the maintenance mailing list to be notified - https://kb.iu.edu/d/avvh#support
Questions?