High Performance Computing Modernization Program Mass Storage
|
|
|
- Dwight Cook
- 10 years ago
- Views:
Transcription
1 High Performance Computing Modernization Program Mass Storage September 21, 2012 Designing Storage Architectures for Digital Collections Library of Congress Gene McGill, Arctic Region Supercomputing Center/HTL
2 What is the HPCMP? Five large DoD Supercomputing Resource Centers (DSRCs) and various smaller facilities Air Force Research Lab (AFRL DSRC), Dayton, OH Army Research Lab (ARL DSRC), Aberdeen Proving Ground, MD US Army Engineer Research and Development Center (ERDC DSRC), Vicksburg, MS Navy DoD Supercomputing Resource Center (Navy DSRC), Stennis Space Center, MS Maui High Performance Computing Center (MHPCC DSRC), Kihei, Maui, HI ARSC HEUE Test Lab (ARSC HTL), Fairbanks, AK Other smaller, more focused centers
3 What is the HPCMP? (cont) ~1,500 users Diverse scientific & engineering applications, systems are general-purpose, no single usage profile Ten large, unclassified, HPC clusters Total TFLOPs: 1421; 6.1 PBs local, high speed disk; Mass Storage servers SAM-QFS, SL8500, SL3000, T10000[A B C] + Remote DR. 25 PB 1 st copy, nearly all files get DR copy Less than 2% of the users own 50% of the PBs Growth is accelerating
4 Legacy Architecture Batch jobs run on HPC clusters Jobs use fast, expensive, local disk Insufficient space on HPCs to store restart files between jobs Users store files on mass storage servers that shouldn t live long-term Restart file, intermediate results, and users rarely go back and remove what they don t need Users tell us: Tools to manage the millions of files on mass storage servers are lacking HPC clusters not good for pre- & post-processing, this discourages postprocessing and saving only high-value PP d files Users don t think all files need a DR copy Program management wants to live within current tape slots
5 The HPCMP Enhanced User Environment We don t want to be an Archive like this group thinks of them! We are implementing tools that could be used for an Archive for the purpose of reducing rate of growth of data. The solution: Add Utility Server (US) Interactive use for pre-, post-processing Install PB-scale center-wide file system (CWFS) 30 day residency commitment vs. ~10 days on HPC disk Add Storage Lifecycle Management tool (SLM) to manage the archive Enhance metadata for users Implement storage management policy
6 The HPCMP Enhanced User Environment Nirvana Storage Resource Broker (SRB): Commercial offering, w/ vendor support Application runs on top of a database Automated Storage Lifecycle Management using metadata-driven policy SRB presents a global namespace of potentially many storage resources, some of which may be geographically distributed. HPCMP is layering SRB on top of existing SAM-QFS file systems
7 The HPCMP Enhanced User Environment SRB (cont): Policies we re implementing in SRB ARCHIVE: Local tape copy is automatic DR: Remote copy by user request per object, default is no DR copy EXPIRE: Automatically removes objects when they reach the end of their retention period Enhanced metadata Retention period Default is 30 days, Users can specify decades but it cannot be infinite Users must confirm retention every 3 years
8 The HPCMP Enhanced User Environment SRB (cont): Enhanced User metadata: Dublin core Name Value scheme Multi-level security features added in Up to 16 site-defined flags per object Opportunity for doing science in the metadata Automated processing during ingest can mine data to populate metadata needed for scientific visualization later Avoids accessing files in favor of using metadata in SRB/DB Metadata is query-able.
9 The HPCMP Enhanced User Environment Currently in Pilot phases at two DSRCs
10 Acknowlegements Many people working on this: Tom Kendall, Executive Agent, Army Research Lab DSRC Stephen Scherr, HPCMP Robert Hunter, HPCMP Constantin Scheder, Nirvana SRB Chief Architect Reid Bingham, Nirvana And many others.
Big Data in Test and Evaluation by Udaya Ranawake (HPCMP PETTT/Engility Corporation)
Big Data in Test and Evaluation by Udaya Ranawake (HPCMP PETTT/Engility Corporation) Approved for Public Release. Distribution Unlimited. Data Intensive Applications in T&E Win-T at ATC Automotive Data
HPCMP New Users Guide Who Are We? Distribution A: Approved for Public release; distribution is unlimited.
HPCMP New Users Guide Who Are We? HPCMP Vision and Mission VISION Our vision is one in which a pervasive culture exists within the DOD that drives the routine use of advanced computational environments
<Insert Picture Here> Solution Direction for Long-Term Archive
1 Solution Direction for Long-Term Archive Donna Harland Oracle Optimized Solutions: Solutions Architect Program Agenda Archive Layers SAM QFS connectivity for
The Maui High Performance Computing Center Department of Defense Supercomputing Resource Center (MHPCC DSRC) Hadoop Implementation on Riptide - -
The Maui High Performance Computing Center Department of Defense Supercomputing Resource Center (MHPCC DSRC) Hadoop Implementation on Riptide - - Hadoop Implementation on Riptide 2 Table of Contents Executive
UNCLASSIFIED R-1 ITEM NOMENCLATURE FY 2013 OCO
Exhibit R-2, RDT&E Budget Item Justification: PB 2013 Office of Secretary Of Defense DATE: February 2012 COST ($ in Millions) FY 2011 FY 2012 Base PE 0603755D8Z: High Performance Computing OCO Total FY
Backup and Recovery of SAP Systems on Windows / SQL Server
Backup and Recovery of SAP Systems on Windows / SQL Server Author: Version: Amazon Web Services sap- on- [email protected] 1.1 May 2012 2 Contents About this Guide... 4 What is not included in this guide...
巨 量 資 料 分 層 儲 存 解 決 方 案
巨 量 資 料 分 層 儲 存 解 決 方 案 Lower Costs and Improve Efficiencies Cano Lei Senior Sales Consulting Manager Oracle Systems Oracle Confidential Internal/Restricted/Highly Restricted Agenda 1 2 3 Why Tiered Storage?
The Key Elements of Digital Asset Management
The Key Elements of Digital Asset Management The last decade has seen an enormous growth in the amount of digital content, stored on both public and private computer systems. This content ranges from professionally
Reference Architectures for Repositories and Preservation Archiving
Reference Architectures for Repositories and Preservation Archiving Keith Rajecki Education Solutions Architect Sun Microsystems, Inc. 1 Agenda Challenges Solution Architectures > Open Storage/Open Archive
LSKA 2010 Survey Report Job Scheduler
LSKA 2010 Survey Report Job Scheduler Graduate Institute of Communication Engineering {r98942067, r98942112}@ntu.edu.tw March 31, 2010 1. Motivation Recently, the computing becomes much more complex. However,
irods at CC-IN2P3: managing petabytes of data
Centre de Calcul de l Institut National de Physique Nucléaire et de Physique des Particules irods at CC-IN2P3: managing petabytes of data Jean-Yves Nief Pascal Calvat Yonny Cardenas Quentin Le Boulc h
Data Masking Secure Sensitive Data Improve Application Quality. Becky Albin Chief IT Architect [email protected]
Data Masking Secure Sensitive Data Improve Application Quality Becky Albin Chief IT Architect [email protected] Data Masking for Adabas The information provided in this PPT is entirely subject
USGS Guidelines for the Preservation of Digital Scientific Data
USGS Guidelines for the Preservation of Digital Scientific Data Introduction This document provides guidelines for use by USGS scientists, management, and IT staff in technical evaluation of systems for
A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle
A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle Growth in Data Diversity and Usage 1.8 Zettabytes of Data in 2011, 20x Growth by 2020
Benefits of upgrading to Enterprise Vault 11.0.1
Benefits of upgrading to Enterprise Vault 11.0.1 Benefits of upgrading to Enterprise Vault 11.0.1 With the release of Symantec Enterprise Vault 11.0.1, there are several features that will allow your business
Grid Scheduling Dictionary of Terms and Keywords
Grid Scheduling Dictionary Working Group M. Roehrig, Sandia National Laboratories W. Ziegler, Fraunhofer-Institute for Algorithms and Scientific Computing Document: Category: Informational June 2002 Status
QStar White Paper. Tiered Storage
QStar White Paper Tiered Storage QStar White Paper Tiered Storage Table of Contents Introduction 1 The Solution 1 QStar Solution 3 Conclusion 4 Copyright 2007 QStar Technologies, Inc. QStar White Paper
STORAGE. Buying Guide: TARGET DATA DEDUPLICATION BACKUP SYSTEMS. inside
Managing the information that drives the enterprise STORAGE Buying Guide: DEDUPLICATION inside What you need to know about target data deduplication Special factors to consider One key difference among
Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015
Work Environment David Tur HPC Expert HPC Users Training September, 18th 2015 1. Atlas Cluster: Accessing and using resources 2. Software Overview 3. Job Scheduler 1. Accessing Resources DIPC technicians
The Archival Upheaval Petabyte Pandemonium Developing Your Game Plan Fred Moore President www.horison.com
USA The Archival Upheaval Petabyte Pandemonium Developing Your Game Plan Fred Moore President www.horison.com Did You Know? Backup and Archive are Not the Same! Backup Primary HDD Storage Files A,B,C Copy
PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD. Natasha Balac, Ph.D.
PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD Natasha Balac, Ph.D. Brief History of SDSC 1985-1997: NSF national supercomputer center; managed by General Atomics
Dr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory. The Nation s Premier Laboratory for Land Forces UNCLASSIFIED
Dr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory 21 st Century Research Continuum Theory Theory embodied in computation Hypotheses tested through experiment SCIENTIFIC METHODS
Optimizing IT Deployment Issues
Optimizing IT Deployment Issues Trends and Challenges for Engineering Simulation Barbara Hutchings [email protected] 1 Outline Deployment Challenges and Trends Extreme scale up and scale out
Hadoop Cluster Applications
Hadoop Overview Data analytics has become a key element of the business decision process over the last decade. Classic reporting on a dataset stored in a database was sufficient until recently, but yesterday
THE EMC ISILON STORY. Big Data In The Enterprise. Copyright 2012 EMC Corporation. All rights reserved.
THE EMC ISILON STORY Big Data In The Enterprise 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology Summary 2 What is Big Data? 3 The Big Data Challenge File Shares 90 and Archives 80 Bioinformatics
Exhibit to Data Center Services Service Component Provider Master Services Agreement
Exhibit to Data Center Services Service Component Provider Master Services Agreement DIR Contract No. DIR-DCS-SCP-MSA-002 Between The State of Texas, acting by and through the Texas Department of Information
Condor Supercomputer
Introducing the Air Force Research Laboratory Information Directorate Condor Supercomputer DOD s Largest Interactive Supercomputer Ribbon Cutting Ceremony AFRL, Rome NY 1 Dec 2010 Approved for Public Release
Data Deduplication in Tivoli Storage Manager. Andrzej Bugowski 19-05-2011 Spała
Data Deduplication in Tivoli Storage Manager Andrzej Bugowski 19-05-2011 Spała Agenda Tivoli Storage, IBM Software Group Deduplication concepts Data deduplication in TSM 6.1 Planning for data deduplication
The Economics of File-based Storage
Technology Insight Paper The Economics of File-based Storage Implementing End-to-End File Storage Efficiency By John Webster March, 2013 Enabling you to make the best technology decisions The Economics
Lessons from the field: Implementing Information Governance and Records Management with Microsoft SharePoint
Lessons from the field: Implementing Information Governance and Records Management with Microsoft SharePoint Veli-Matti Vanamo - Principal Consultant at Ignia - 12 Year SharePoint Veteran (there should
ETERNUS CS High End Unified Data Protection
ETERNUS CS High End Unified Data Protection Optimized Backup and Archiving with ETERNUS CS High End 0 Data Protection Issues addressed by ETERNUS CS HE 60% of data growth p.a. Rising back-up windows Too
Long term retention and archiving the challenges and the solution
Long term retention and archiving the challenges and the solution NAME: Yoel Ben-Ari TITLE: VP Business Development, GH Israel 1 Archive Before Backup EMC recommended practice 2 1 Backup/recovery process
Applications of LTFS for Cloud Storage Use Cases
Applications of LTFS for Cloud Storage Use Cases Version 1.0 Publication of this SNIA Technical Proposal has been approved by the SNIA. This document represents a stable proposal for use as agreed upon
Growth of Unstructured Data & Object Storage. Marcel Laforce Sr. Director, Object Storage
Growth of Unstructured Data & Object Storage Marcel Laforce Sr. Director, Object Storage Agenda Unstructured Data Growth Contrasting approaches: Objects, Files & Blocks The Emerging Object Storage Market
Make the Most of Big Data to Drive Innovation Through Reseach
White Paper Make the Most of Big Data to Drive Innovation Through Reseach Bob Burwell, NetApp November 2012 WP-7172 Abstract Monumental data growth is a fact of life in research universities. The ability
Introduction to NetApp Infinite Volume
Technical Report Introduction to NetApp Infinite Volume Sandra Moulton, Reena Gupta, NetApp April 2013 TR-4037 Summary This document provides an overview of NetApp Infinite Volume, a new innovation in
SOLUTION BRIEF KEY CONSIDERATIONS FOR BACKUP AND RECOVERY
SOLUTION BRIEF KEY CONSIDERATIONS FOR BACKUP AND RECOVERY Among the priorities for efficient storage management is an appropriate protection architecture. This paper will examine how to architect storage
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Stan Posey, MSc and Bill Loewe, PhD Panasas Inc., Fremont, CA, USA Paul Calleja, PhD University of Cambridge,
Big Data Services at DKRZ
Big Data Services at DKRZ Michael Lautenschlager and Colleagues from DKRZ and Scientific Computing Research Group MPI-M Seminar Hamburg, March 31st, 2015 Big Data in Climate Research Big data is an all-encompassing
HPC technology and future architecture
HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange [email protected] Toàn Nguyên [email protected]
3Gen Data Deduplication Technical
3Gen Data Deduplication Technical Discussion NOTICE: This White Paper may contain proprietary information protected by copyright. Information in this White Paper is subject to change without notice and
Accelerating Innovation with Self- Service HPC
Accelerating Innovation with Self- Service HPC Thomas Goepel Director Product Management Hewlett-Packard BOEING is a trademark of Boeing Management Company Copyright 2014 Boeing. All rights reserved. Copyright
<Insert Picture Here> Refreshing Your Data Protection Environment with Next-Generation Architectures
1 Refreshing Your Data Protection Environment with Next-Generation Architectures Dale Rhine, Principal Sales Consultant Kelly Boeckman, Product Marketing Analyst Program Agenda Storage
SGI High Performance Computing
SGI High Performance Computing Accelerate time to discovery, innovation, and profitability 2014 SGI SGI Company Proprietary 1 Typical Use Cases for SGI HPC Products Large scale-out, distributed memory
HPC and Big Data. EPCC The University of Edinburgh. Adrian Jackson Technical Architect [email protected]
HPC and Big Data EPCC The University of Edinburgh Adrian Jackson Technical Architect [email protected] EPCC Facilities Technology Transfer European Projects HPC Research Visitor Programmes Training
European Data Infrastructure - EUDAT Data Services & Tools
European Data Infrastructure - EUDAT Data Services & Tools Dr. Ing. Morris Riedel Research Group Leader, Juelich Supercomputing Centre Adjunct Associated Professor, University of iceland BDEC2015, 2015-01-28
AFS Usage and Backups using TiBS at Fermilab. Presented by Kevin Hill
AFS Usage and Backups using TiBS at Fermilab Presented by Kevin Hill Agenda History and current usage of AFS at Fermilab About Teradactyl How TiBS (True Incremental Backup System) and TeraMerge works AFS
Maui High Performance Computing Center Overview
Maui High Performance Computing Center Overview March 2013 Peter Colvin MHPCC Director of Outreach & Program Development University of Hawaii 1 Agenda Background Strategy Resources Clients Summary 2 Background
Pentaho Enterprise and Community Editions Feature Comparison
Pentaho Enterprise and Community Editions Feature Comparison Copyright 2008 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information,
UNCLASSIFIED. R-1 Program Element (Number/Name) PE 0603461A / High Performance Computing Modernization Program. Prior Years FY 2013 FY 2014 FY 2015
Exhibit R-2, RDT&E Budget Item Justification: PB 2015 Army Date: March 2014 2040: Research, Development, Test & Evaluation, Army / BA 3: Advanced Technology Development (ATD) COST ($ in Millions) Prior
Hitachi NAS Platform and Hitachi Content Platform with ESRI Image
W H I T E P A P E R Hitachi NAS Platform and Hitachi Content Platform with ESRI Image Aciduisismodo Extension to ArcGIS Dolore Server Eolore for Dionseq Geographic Uatummy Information Odolorem Systems
Running Agilent GeneSpring MPP on the Cloud
Running Agilent GeneSpring MPP on the Cloud Technical Overview Authors Stephen Madden, Rick A. Fasani, and Michael Rosenberg Agilent Technologies, Inc. Santa Clara, California, USA Introduction Cloud computing
MEEC Webinar Daly and EMC BRS - EMC Backup & Archive Solutions
MEEC Webinar Daly and EMC BRS - EMC Backup & Archive Solutions Accelerating Transformation Presented By: Gabe Sales-Smith Backup & Recovery Specialist EMC Jim Rowland Senior System Architect and Project
EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE: MEETING NEEDS FOR LONG-TERM RETENTION OF BACKUP DATA ON EMC DATA DOMAIN SYSTEMS
SOLUTION PROFILE EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE: MEETING NEEDS FOR LONG-TERM RETENTION OF BACKUP DATA ON EMC DATA DOMAIN SYSTEMS MAY 2012 Backups are essential for short-term data recovery
Government Technology Trends to Watch in 2014: Big Data
Government Technology Trends to Watch in 2014: Big Data OVERVIEW The federal government manages a wide variety of civilian, defense and intelligence programs and services, which both produce and require
A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES
A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES By: Edward Whalen Performance Tuning Corporation INTRODUCTION There are a number of clustering products available on the market today, and clustering has become
Document Management and Records Management in SharePoint 2013. Scott Jamison
Document Management and Records Management in SharePoint 2013 Scott Jamison Chief Architect & CEO Digital Asset Management Document Imaging Workflow Document Management Records Management
SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS
Sean Lee Solution Architect, SDI, IBM Systems SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS Agenda Converging Technology Forces New Generation Applications Data Management Challenges
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
WHITE PAPER. DATA DEDUPLICATION BACKGROUND: A Technical White Paper
WHITE PAPER DATA DEDUPLICATION BACKGROUND: A Technical White Paper CONTENTS Data Deduplication Multiple Data Sets from a Common Storage Pool.......................3 Fixed-Length Blocks vs. Variable-Length
Energy Efficient Storage - Multi- Tier Strategies For Retaining Data
Energy and Space Efficient Storage: Multi-tier Strategies for Protecting and Retaining Data NOTICE This White Paper may contain proprietary information protected by copyright. Information in this White
Milestone Solution Partner IT Infrastructure Components Certification Summary
Milestone Solution Partner IT Infrastructure Components Certification Summary Spectra Logic ntier Verde NAS and NVR3 Storage Solutions 10-01-2014 Table of Contents Introduction... 3 Test Process... 3 Topology...
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 [email protected] San Diego Supercomputer Center
Data Deduplication Background: A Technical White Paper
Data Deduplication Background: A Technical White Paper NOTICE This White Paper may contain proprietary information protected by copyright. Information in this White Paper is subject to change without notice
DoD High Performance Computing Modernization Program: a Distributed Computing Program January 2015 www.hpc.mil Larry Davis
DoD High Performance Computing Modernization Program: a Distributed Computing Program January 2015 www.hpc.mil Larry Davis Distribution C: Distribution authorized to U.S. Government Agencies and their
Data storage services at CC-IN2P3
Centre de Calcul de l Institut National de Physique Nucléaire et de Physique des Particules Data storage services at CC-IN2P3 Jean-Yves Nief Agenda Hardware: Storage on disk. Storage on tape. Software:
Big Data and Cloud Computing for GHRSST
Big Data and Cloud Computing for GHRSST Jean-Francois Piollé ([email protected]) Frédéric Paul, Olivier Archer CERSAT / Institut Français de Recherche pour l Exploitation de la Mer Facing data deluge
Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery
Center for Information Services and High Performance Computing (ZIH) Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery Richard Grunzke*, Jens Krüger, Sandra Gesing, Sonja
Versity 2013. All rights reserved.
From the only independent developer of large scale archival storage systems, the Versity Storage Manager brings enterpriseclass storage virtualization to the Linux platform. Based on Open Source technology,
SURFsara HPC Cloud Workshop
SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk [email protected] Agenda Introduction and Overview (current
Introduction to Optical Archiving Library Solution for Long-term Data Retention
Introduction to Optical Archiving Library Solution for Long-term Data Retention CUC Solutions & Hitachi-LG Data Storage 0 TABLE OF CONTENTS 1. Introduction... 2 1.1 Background and Underlying Requirements
Archiving and Managing Remote Sensing Data Using State of the Art Storage Technologies
Archiving and Managing Remote Sensing Data Using State of the Art Storage Technologies Ms B Lakshmi C Chandrasekhar Reddy SVSRK Kishore SDAPSA, NRSC, Hyderabad NRSC Functions Remote Sensing Data Acquisition
Research Technologies Data Storage for HPC
Research Technologies Data Storage for HPC Supercomputing for Everyone February 17-18, 2014 Research Technologies High Performance File Systems [email protected] Indiana University Intro to HPC on Big
Server & Cloud Management
Technical Bootcamp: The Cloud-enabled Datacenter with Windows Server 2012 and System Center 2012 This 3-day, instructor-led course will help you understand how to evolve a traditional datacenter configuration
