Research Technologies Data Storage for HPC
1 Research Technologies Data Storage for HPC Supercomputing for Everyone February 17-18, 2014 Research Technologies High Performance File Systems Indiana University
2 Intro to HPC on Big Red II Workshop Data Storage Overview of presentation 1. Data Workflow Thinking about the lifecycle and types of data you use and create 2. Storage Resources - How to efficiently store and use data - Policies, best practices, and optimizing performance 3. Getting your data in and out of storage systems Questions welcome any time! There will also be time at the end for discussion.
3 Consider the types of data you may have Source code Documentation Reports Scientific data Computational data Intermediate steps Output Results of computation
4 Data Storage Requirements Source Code & Documents Easily accessible Backed up Computational Data Matches potential of BR2 FAST! (parallel) Store lots of input/output large capacity Ability to work together collaboration Results Store safely for long time Potentially very large amounts of data Ability to share data with collaborators
5 Simple Workflow Input instructions Read, compute, write data in parallel Archive results
6 Home Directory Default location For small files Data is backed up /N/u/username/BigRed2 Data Capacitor II (DC2) Large capacity High throughput For compute data Data is not backed up /N/dc2/ Scholarly Data Archive (SDA) Disk to tape archiving Distributed copies For long term storage
7 Big Red II Storage Resource Analogies Home Directories the family station wagon daily trips to school backed up / used always Data Capacitor II (DC2) the race car VERY FAST - wear a seatbelt not backed up / workspace Scholarly Data Archive (SDA) the all-terrain vehicle extremely reliable keeps your critical data safe
8 Home Directories Input instructions
9 Home Directory The Family Station Wagon Default storage location for your account - Available as soon as you log in The place to store source code, shell scripts, and other small files Not meant for computational data! Do not compute against data in home directories! - Don't take your station wagon to the race track More information:
10 Home Directory Quota 10 GB quota for home directory Use the command `quota -s` to see current usage `-s` flag gives output in MB, otherwise in 1 KB blocks
jupmille@login1:~> quota -s
Filesystem blocks quota limit grace files quota limit grace
bl-nas2:/vol/hd03 184M 9900M 10240M m 4295m
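As a rough cross-check on `quota -s`, the sketch below estimates what share of a limit a directory tree occupies. It is only an approximation under stated assumptions: the function name `usage_percent` is hypothetical, the default limit of 10485760 KB is the 10 GB figure from the slide, and `du` measures one directory tree rather than everything charged to your account, so `quota -s` remains the authoritative number.

```shell
# Estimate what percentage of a quota a directory tree uses.
# Usage: usage_percent DIR [LIMIT_KB]
# LIMIT_KB defaults to 10485760 (10 GB expressed in 1 KB blocks); this default
# and the function name are illustrative assumptions, not site tooling.
usage_percent() {
    dir=$1
    limit_kb=${2:-10485760}
    # du -sk prints the total size of DIR in 1 KB blocks.
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    # Integer percentage, rounded down.
    echo $(( used_kb * 100 / limit_kb ))
}
```

For example, `usage_percent "$HOME"` prints a whole-number percentage of the 10 GB limit.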
11 Home Directory Snapshots Hourly and nightly snapshots are made of your home directory Snapshots are in a hidden .snapshot directory within the source directory
jupmille@login1:~> cd .snapshot
jupmille@login1:~/.snapshot> ls
hourly.0 hourly.1 hourly.2 hourly.3 hourly.4 hourly.5 nightly.0 nightly.1
Snapshots are taken daily at 8am, 12pm, 4pm, 8pm, midnight For more information see the Knowledge Base document
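Recovering a file from a snapshot is an ordinary copy out of the .snapshot directory. The sketch below shows one cautious way to do it; the function name `restore_from_snapshot` and the `.restored` suffix are hypothetical choices for illustration, and it assumes the snapshot layout shown on the slide (`.snapshot/hourly.0/...`).

```shell
# Copy one file out of a snapshot without overwriting the live copy.
# Usage: restore_from_snapshot SNAPSHOT_NAME RELATIVE_PATH [BASE_DIR]
# BASE_DIR defaults to $HOME. The ".restored" suffix is an illustrative choice.
restore_from_snapshot() {
    snap=$1
    rel=$2
    base=${3:-$HOME}
    src="$base/.snapshot/$snap/$rel"
    if [ ! -f "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions on the restored copy.
    cp -p "$src" "$base/$rel.restored"
}
```

For example, `restore_from_snapshot hourly.0 notes.txt` would leave the snapshot version next to the current file as `notes.txt.restored`, so the two can be compared before anything is replaced.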
12 Home Directory Shared Across Systems Home directory space is shared across RT systems System and path: Big Red 2 /N/u/username/BigRed2, Mason /N/u/username/Mason, Quarry /N/u/username/Quarry
jupmille@login1:~> pwd
/N/u/jupmille/BigRed2
jupmille@login1:~> cd ..
jupmille@login1:/N/u/jupmille> ls
BigRed2 Mason Quarry
13 Data Capacitor II Compute against data
14 Data Capacitor II The Race Car Parallel high-speed storage based on Lustre file system 3.5 PB total size, 50 GB/s throughput to BR2 Store input and output application data - intended as a temporary workspace for computation, not for indefinite storage of data DC2 is not backed up Available on IU's HPC resources - Big Red II, Mason, Quarry More information available on the Knowledge Base
15 Data Capacitor II Lustre File System Linux + Cluster = Lustre Lustre is a parallel distributed file system High performance file system used by many Top500 supercomputers (>50%) POSIX compliant, so it behaves like other file systems Open source software under GPLv2 Guided by non-profit Open Scalable File Systems, Inc. - Intel maintains canonical tree - Active development - IU contributes code
16 Data Capacitor II Usage Lustre is designed for high-speed data access, not for metadata speed This intentional design consideration comes with a tradeoff: file metadata lookups can be relatively slow The metadata server must ask each object server for size - Otherwise metadata would be constantly updating with size info Tips to improve interactive performance: Avoid more than 10K files in one directory - separate input, output, and final results, and delete unneeded data; this helps with data management as well Limit the number of metadata actions you perform - reduce file and directory operations, such as stat-ing files
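To see whether any directory has grown past the 10K-file guideline, a check along the following lines may help. This is a minimal sketch: the function name `crowded_dirs` is hypothetical, it assumes GNU find, and file names containing newlines would be miscounted. The scan itself walks the whole tree, so it is something to run occasionally rather than inside a job loop.

```shell
# Print "COUNT DIR" for each directory under ROOT with more than LIMIT entries.
# Usage: crowded_dirs ROOT [LIMIT]
# LIMIT defaults to 10000, the guideline from the slide.
crowded_dirs() {
    root=$1
    limit=${2:-10000}
    find "$root" -type d | while IFS= read -r d; do
        # Count direct children only; -mindepth 1 excludes the directory itself.
        n=$(find "$d" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        if [ "$n" -gt "$limit" ]; then
            echo "$n $d"
        fi
    done
}
```

For example, `crowded_dirs /N/dc2/scratch/<username>` lists only the directories that exceed the default limit.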
17 Data Capacitor II Scratch Directories A scratch directory is available to every Big Red II user Path to your scratch space is /N/dc2/scratch/username/ Intended as temporary workspace for your data, not for sharing Files not accessed in 60 days may be purged The name scratch is from the phrase scratch paper - a piece of paper used while performing calculations, separate from answer sheet - you may know it as scrap paper - implies an impermanence to the data
18 Data Capacitor II Project Directories Project directories available by application for users, groups, or labs with special needs Makes it possible to share data amongst users (Unix groups) Files not accessed in 180 days may be purged Project space can be applied for by submitting application at this URL - Longer storage time than scratch space, but still not forever
19 Data Capacitor II Purging Old Data Administrators of Data Capacitor II routinely purge old data Data not accessed in a certain amount of time will be deleted - scratch = 60 days, project = 180 days You will be notified before and after any action is taken against your data An email will be sent to you listing the eligible files A file will be placed in the root of your scratch or project directory - Will-Purge-These-uid jupmille-Files-On txt You will have seven (7) days to take action Afterwards, a file listing the actions taken will be created See
20 Data Capacitor II Find Old Files You can be proactive about managing your data to prevent purging Use this command to list the files in your directory sorted by last access time - May take a while, depending on the number of files you have
find /N/dc2/scratch/<username> -type f -exec stat --format="%n %x" '{}' \; | sort -k2,3
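If you only want the files that are approaching the purge window, rather than a full sorted listing, `find` can filter on access time directly. The sketch below assumes GNU find; the function name `stale_files` is hypothetical, and the 60-day default mirrors the scratch policy stated in the slides.

```shell
# List regular files under DIR that have not been accessed for more than DAYS days.
# Usage: stale_files DIR [DAYS]
stale_files() {
    dir=$1
    days=${2:-60}
    # -atime +N ignores fractional days, so it matches files last accessed
    # at least N+1 full days ago. Pass a smaller DAYS value for early warning.
    find "$dir" -type f -atime +"$days"
}
```

For example, `stale_files /N/dc2/scratch/<username> 50` gives roughly ten days of notice before files reach the 60-day threshold.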
21 Data Capacitor II Space Quota There is no strict limit on the amount of space you can use Space available for all users varies depending on system use - df -h gives you current space available Please don't use any more space than strictly necessary - Data Capacitor II is a shared resource It is intended for computation Use this command to find the total amount of space you're using: du -hc /N/dc2/scratch/<username>
22 Data Capacitor II HIPAA and ePHI DC2 is HIPAA aligned, but you are responsible for ensuring the privacy and security of ePHI data Technical safeguards Set directory permissions to restrict read and write access - The most secure method is to allow access only to you
jupmille@login1:/N/dc2/scratch/jupmille> chmod 700 ephi_file
jupmille@login1:/N/dc2/scratch/jupmille> ls -l ephi_file
-rwx------ 1 jupmille uits 0 Jan 23 14:25 ephi_file
Use `umask` to ensure all new files are created with safe permissions - Add `umask 077` to shell profile - See
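After tightening permissions, it can be worth confirming that nothing in a directory is still open to your group or to other users. The following is a minimal sketch that assumes GNU find (the `-perm /MODE` form); the function name `exposed_paths` is hypothetical.

```shell
# List files and directories under DIR that grant any permission to group or others.
# Usage: exposed_paths DIR
exposed_paths() {
    # -perm /077 matches when at least one group or other permission bit is set.
    # Symbolic links are skipped because their own mode bits are not meaningful.
    find "$1" ! -type l -perm /077
}
```

An empty result from `exposed_paths /N/dc2/scratch/<username>/ephi_dir` (a hypothetical path) means every entry is accessible to the owner only.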
23 Data Capacitor II Job Scheduling Specify DC2 as a requirement for your batch job: add the dc2 file system property to the nodes directive in your TORQUE job script For example, if your job requires two nodes, thirty-two processors per node, and the Data Capacitor II file system (/N/dc2), the resource specification line in your TORQUE job script would look like: #PBS -l nodes=2:ppn=32:dc2 Specifying the dc2 property in your script directs TORQUE to dispatch your job to only those compute nodes with the Data Capacitor II file system mounted. If DC2 is down, your job won't run. More information at:
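To show where the resource line sits in a complete job script, the sketch below writes out a minimal TORQUE script for a given node count. Treat it as an illustration only: the function name `write_job_script` is hypothetical, the application launch step is left as a placeholder comment, and any walltime or queue directives your site requires are not shown. `PBS_O_WORKDIR` is the TORQUE variable holding the directory the job was submitted from.

```shell
# Write a minimal TORQUE job script that requests nodes with /N/dc2 mounted.
# Usage: write_job_script FILE NODES PPN
write_job_script() {
    file=$1
    nodes=$2
    ppn=$3
    # The heredoc expands ${nodes} and ${ppn} now; \$PBS_O_WORKDIR is escaped
    # so that it is expanded later, when the job actually runs.
    cat > "$file" <<EOF
#!/bin/bash
#PBS -l nodes=${nodes}:ppn=${ppn}:dc2
# Run from the directory the job was submitted from.
cd "\$PBS_O_WORKDIR"
# Placeholder: launch your application here, reading and writing under /N/dc2.
EOF
}
```

For example, `write_job_script job.pbs 2 32` produces a script containing the `#PBS -l nodes=2:ppn=32:dc2` line from the slide.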
24 Data Capacitor II Reporting Issues If you encounter any problems using Data Capacitor II, please include these details when reporting the issue: Date and time the event occurred Which system your job was running on The directory being used A brief description of what was happening when the issue occurred All Data Capacitor II issues should be reported to hpfs-admin@iu.edu
25 Scholarly Data Archive Archive results
26 Scholarly Data Archive Massive near-line and archival data storage Disk cache front end ~600 TB Magnetic tape storage 15 PB (uncompressed) Hierarchical storage management (HSM) Data migrates from disk to tape over time Retrieval from tape a small cost for safety Data integrity Geographically replicated - IUB and IUPUI each get a copy Checksums and error detection
27 Scholarly Data Archive Details Account can be applied for easily Default quota is 50 TB replicated copy of data is not counted additional storage is available HIPAA aligned but you must secure the data Group or department accounts are available Data can be shared with Access Control Lists More information available on the Knowledge Base
28 Scholarly Data Archive Usage Best Uses Files of at least 1MB Single file can be up to 10TB Archive files Files rarely updated Files need to be kept long time Files are read often frequently accessed files tend to stay on disk cache Poor Uses Small files small files should be aggregated with a tool like WinZip or tar Files that will frequently change Do not edit files in place - If you need to edit: Copy -> Edit -> Reupload
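Before archiving, you may want to find the files that fall below the 1 MB guideline so they can be aggregated with tar first. A minimal sketch, assuming GNU find; the function name `small_files` is hypothetical.

```shell
# List regular files under DIR that are smaller than 1 MB (1048576 bytes).
# Usage: small_files DIR
small_files() {
    # The "c" suffix compares exact byte counts. Writing -size -1M instead
    # would round sizes up to whole megabytes and match only empty files.
    find "$1" -type f -size -1048576c
}
```

For example, `small_files results/ | wc -l` gives a quick count of how many files are candidates for bundling.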
29 Scholarly Data Archive Helpful Tip Data stored on the SDA can be kept for a long time So long that you might even forget - Or the people who did know have left Do your future self a favor and document the data Create a manifest or annotation of the data Keep it at the top of your storage directory, and keep it up to date
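One possible shape for such a manifest is a short description followed by a checksum for every file, written to the top of the directory. The sketch below is only a suggestion: the function name `write_manifest`, the file name MANIFEST.txt, and the layout are all assumptions, and it relies on `sha256sum` from GNU coreutils.

```shell
# Write MANIFEST.txt at the top of DIR: a description, a UTC timestamp,
# and a SHA-256 checksum for every file below DIR.
# Usage: write_manifest DIR "description of the data"
write_manifest() {
    dir=$1
    note=$2
    (
        cd "$dir" || exit 1
        {
            echo "Description: $note"
            echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
            echo "Files (SHA-256, path):"
            # Skip the manifest itself; sort by path for a stable listing.
            find . -type f ! -name MANIFEST.txt -exec sha256sum {} + | sort -k2
        } > MANIFEST.txt
    )
}
```

Re-running `write_manifest` after adding data keeps the manifest current, and the checksums let a future reader confirm that retrieved files are intact.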
30 Transferring Data In and Out of RT Storage
31 Preparing to Transfer Data It is recommended to bundle your data before transferring - Easier to manage a single file - Preserves layout, permissions - Transferring one large file is often faster than transferring many small files
jupmille@login1:/N/dc2/scratch/jupmille> ls
input output results
jupmille@login1:/N/dc2/scratch/jupmille> tar -cvf archive.tar input/ output/ results/
input/
output/
results/
jupmille@login1:/N/dc2/scratch/jupmille> ls
archive.tar input output results
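Since the bundle becomes the only copy you transfer, it is worth confirming that the archive is readable before moving it. A minimal sketch; the function name `bundle` is hypothetical.

```shell
# Bundle directories into one tar file and confirm the archive can be read back.
# Usage: bundle ARCHIVE DIR...
bundle() {
    archive=$1
    shift
    tar -cf "$archive" "$@" || return 1
    # Listing the archive walks every header, which catches a truncated file.
    tar -tf "$archive" > /dev/null
}
```

For example, `bundle archive.tar input/ output/ results/ && echo ok` prints `ok` only if the archive was both written and read back successfully.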
32 Getting data in and out of Home Directories scp is the easiest way to get data in and out of your home directory - secure, but not built for high performance; the quota is 10 GB, so you're unlikely to make large transfers - no restart capabilities, so if it fails you must start over - sftp and rsync over ssh are also good options $ scp archive.tar bigred2.uits.iu.edu:~
33 Getting data in and out of Data Capacitor II The IU Cyberinfrastructure Gateway allows you to transfer data between your machine and Data Capacitor II IU CI Gateway information: - transferring data with CI Gateway to DC2: - The IU CI Gateway uses Globus Online, a parallel transfer tool which requires software to be installed; follow the instructions in the KB article to request an account for DC2 - The endpoint is iu#dcwan_internal - Your path is /~/N/dc2/scratch/<username>/ You can still use scp/rsync/sftp, but they're not high performance tools
34 Getting data in and out of Scholarly Data Archive Fast access hsi and htar command line tools - To use HSI on Big Red II, you must load the HPSS module module load hpss GridFTP clients Kerberized FTP GlobusOnline also available through the IU CI Gateway Convenience protocols Web access via browser sftp Mount to desktop via CIFS/Samba (mapped drive) Knowledge Base article on SDA access
35 Pull/Push data in SDA to DC2 or Home Directory Use hsi on Big Red II login node Add module statement to profile module load hpss Can be done interactively Can be scripted through Kerberos keytab authentication hsi can be used in many different ways (ftp style commands) - manual available at
36 Scholarly Data Archive HSI Example
module load hpss
HPSS (command-line utility for access to the SDA) version 4.0 loaded.
hsi
Kerberos Principal: jupmille
Password for
? put samplefile.tar
? ls
/hpss/j/u/jupmille:
samplefile.tar
? get samplefile.tar
? du -k
? help
? exit
Knowledge Base: https://kb.iu.edu/d/avdb
37 Getting Data onto Big Red II Big Red II, Mason, Quarry login nodes do not enforce a time limit on data transfer tools scp, sftp, hsi, htar, wget, curl, etc. I recommend putting your data into the Scholarly Data Archive first Then use command line tools to pull from SDA into DC2, Home Directories Many ways to access the SDA, robust permissions You'll always have a distributed copy of your data!
38 Other RT Storage Resources Focus of this presentation was basics of storage available on Big Red II There are more storage options available - Research File System (RFS) Distributed copies, very accessible, robust permissions - Data Capacitor WAN (DC-WAN) Lustre over the wide area network, share at high speed
39 System Outages If there are any problems with the system, we will update IT Notices Data Capacitor II has regularly scheduled system maintenance First Tuesday of every month Join the maintenance mailing list to be notified -
41 Questions?
More informationHadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela
Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance
More informationHow to Backup XenServer VM with VirtualIQ
How to Backup XenServer VM with VirtualIQ 1. Using Live Backup of VM option: Live Backup: This option can be used, if user does not want to power off the VM during the backup operation. This approach takes
More informationWinSCP PuTTY as an alternative to F-Secure July 11, 2006
WinSCP PuTTY as an alternative to F-Secure July 11, 2006 Brief Summary of this Document F-Secure SSH Client 5.4 Build 34 is currently the Berkeley Lab s standard SSH client. It consists of three integrated
More informationWolfr am Lightweight Grid M TM anager USER GUIDE
Wolfram Lightweight Grid TM Manager USER GUIDE For use with Wolfram Mathematica 7.0 and later. For the latest updates and corrections to this manual: visit reference.wolfram.com For information on additional
More informationGetting Started with HPC
Getting Started with HPC An Introduction to the Minerva High Performance Computing Resource 17 Sep 2013 Outline of Topics Introduction HPC Accounts Logging onto the HPC Clusters Common Linux Commands Storage
More informationBerkeley Research Computing. Town Hall Meeting Savio Overview
Berkeley Research Computing Town Hall Meeting Savio Overview SAVIO - The Need Has Been Stated Inception and design was based on a specific need articulated by Eliot Quataert and nine other faculty: Dear
More informationStorage Systems: 2014 and beyond. Jason Hick! Storage Systems Group!! NERSC User Group Meeting! February 6, 2014
Storage Systems: 2014 and beyond Jason Hick! Storage Systems Group!! NERSC User Group Meeting! February 6, 2014 The compute and storage systems 2013 Hopper: 1.3PF, 212 TB RAM 2.2 PB Local Scratch 70 GB/s
More informationArchival Storage At LANL Past, Present and Future
Archival Storage At LANL Past, Present and Future Danny Cook Los Alamos National Laboratory dpc@lanl.gov Salishan Conference on High Performance Computing April 24-27 2006 LA-UR-06-0977 Main points of
More informationFile transfer in UNICORE State of the art
Mitglied der Helmholtz-Gemeinschaft File transfer in UNICORE State of the art Bernd Schuller, Björn Hagemeier, Michael Rambadt Federated Systems and Data division Jülich Supercomputer Centre Forschungszentrum
More informationMulti-site Best Practices
DS SOLIDWORKS CORPORATION Multi-site Best Practices SolidWorks Enterprise PDM multi-site implementation [SolidWorks Enterprise PDM 2010] [] [Revision 2] Page 1 Index Contents Multi-site pre-requisites...
More informationAFS Usage and Backups using TiBS at Fermilab. Presented by Kevin Hill
AFS Usage and Backups using TiBS at Fermilab Presented by Kevin Hill Agenda History and current usage of AFS at Fermilab About Teradactyl How TiBS (True Incremental Backup System) and TeraMerge works AFS
More informationUsage of the mass storage system. K. Rosbach PPS 19-Feb-2008
Usage of the mass storage system K. Rosbach PPS 19-Feb-2008 Disclaimer This is just a summary based on the information available online at http://dv-zeuthen.desy.de/services/dcache_osm/e717/index_eng.html
More informationUsing WestGrid. Patrick Mann, Manager, Technical Operations Jan.15, 2014
Using WestGrid Patrick Mann, Manager, Technical Operations Jan.15, 2014 Winter 2014 Seminar Series Date Speaker Topic 5 February Gino DiLabio Molecular Modelling Using HPC and Gaussian 26 February Jonathan
More informationAn Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing
An Alternative Storage Solution for MapReduce Eric Lomascolo Director, Solutions Marketing MapReduce Breaks the Problem Down Data Analysis Distributes processing work (Map) across compute nodes and accumulates
More informationHPCC - Hrothgar Getting Started User Guide
HPCC - Hrothgar Getting Started User Guide Transfer files High Performance Computing Center Texas Tech University HPCC - Hrothgar 2 Table of Contents Transferring files... 3 1.1 Transferring files using
More informationUnix Sampler. PEOPLE whoami id who
Unix Sampler PEOPLE whoami id who finger username hostname grep pattern /etc/passwd Learn about yourself. See who is logged on Find out about the person who has an account called username on this host
More informationGDC Data Transfer Tool User s Guide. NCI Genomic Data Commons (GDC)
GDC Data Transfer Tool User s Guide NCI Genomic Data Commons (GDC) Contents 1 Getting Started 3 Getting Started.......................................................... 3 The GDC Data Transfer Tool: An
More informationXenData Product Brief: SX-550 Series Servers for LTO Archives
XenData Product Brief: SX-550 Series Servers for LTO Archives The SX-550 Series of Archive Servers creates highly scalable LTO Digital Video Archives that are optimized for broadcasters, video production
More informationActive Directory Compatibility with ExtremeZ-IP. A Technical Best Practices Whitepaper
Active Directory Compatibility with ExtremeZ-IP A Technical Best Practices Whitepaper About this Document The purpose of this technical paper is to discuss how ExtremeZ-IP supports Microsoft Active Directory.
More informationLustre* is designed to achieve the maximum performance and scalability for POSIX applications that need outstanding streamed I/O.
Reference Architecture Designing High-Performance Storage Tiers Designing High-Performance Storage Tiers Intel Enterprise Edition for Lustre* software and Intel Non-Volatile Memory Express (NVMe) Storage
More informationActive Directory Comapatibility with ExtremeZ-IP A Technical Best Practices Whitepaper
Active Directory Comapatibility with ExtremeZ-IP A Technical Best Practices Whitepaper About this Document The purpose of this technical paper is to discuss how ExtremeZ-IP supports Microsoft Active Directory.
More informationRECOVER ( 8 ) Maintenance Procedures RECOVER ( 8 )
NAME recover browse and recover NetWorker files SYNOPSIS recover [-f] [-n] [-q] [-u] [-i {nnyyrr}] [-d destination] [-c client] [-t date] [-sserver] [dir] recover [-f] [-n] [-u] [-q] [-i {nnyyrr}] [-I
More informationAttix5 Pro Server Edition
Attix5 Pro Server Edition V7.0.3 User Manual for Linux and Unix operating systems Your guide to protecting data with Attix5 Pro Server Edition. Copyright notice and proprietary information All rights reserved.
More informationAvaya G700 Media Gateway Security - Issue 1.0
Avaya G700 Media Gateway Security - Issue 1.0 Avaya G700 Media Gateway Security With the Avaya G700 Media Gateway controlled by the Avaya S8300 or S8700 Media Servers, many of the traditional Enterprise
More informationDeploying a distributed data storage system on the UK National Grid Service using federated SRB
Deploying a distributed data storage system on the UK National Grid Service using federated SRB Manandhar A.S., Kleese K., Berrisford P., Brown G.D. CCLRC e-science Center Abstract As Grid enabled applications
More informationWhite Paper. Mimosa NearPoint for Microsoft Exchange Server. Next Generation Email Archiving for Exchange Server 2007. By Bob Spurzem and Martin Tuip
White Paper By Bob Spurzem and Martin Tuip Mimosa Systems, Inc. January 2008 Mimosa NearPoint for Microsoft Exchange Server Next Generation Email Archiving for Exchange Server 2007 CONTENTS Email has become
More informationSystem Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks
System Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks OnurSoft Onur Tolga Şehitoğlu November 10, 2012 v1.0 Contents 1 Introduction 3 1.1 Purpose..............................
More information