Standards-based science data archiving at NASA's National Space Science Data Center
|
|
|
- Gladys McDaniel
- 9 years ago
- Views:
Transcription
1 Standards-based science data archiving at NASA's National Space Science Data Center J.H. King, D.M. Sawyer, P.W. McCaslin*, R.W. Post* National Space Science Data Center NASA/Goddard Space Flight Center Greenbelt, MD * also Raytheon/ITSS Introduction During the 35+ years of operations of the National Space Science Data Center (NSSDC) at NASA's Goddard Space Flight Center, NSSDC has accumulated an archive of over 4,000 distinct "data sets" from instruments flown on 370+ mostly-nasa scientific spacecraft. For many years, NSSDC has held digital data (~25 TB as of 12/31/01) on magnetic disk for immediate user access (by ftp or higher functionality interfaces), on an optical disk jukebox for minutes-response-time user access, and/or on offline tapes, CD- ROMs and other media. The optical disk jukebox was retired last year, and a Digital Linear Tape (DLT) jukebox system was introduced to support the permanent archive function. NSSDC also has extensive holdings of imagery and other data on film of various form factors. During 2001, NSSDC added 3.3 TB of digital data in starts or extensions of 132 distinct data sets from 24 spacecraft. Virtually no non-digital data arrive at NSSDC these days. In addition to its basic data stores, NSSDC has multiple information systems that facilitate NSSDC's management of the data archives and support the finding, understanding and use of the data mainly by the space science research community but also by the general public. The twin foci of this meeting are the ensuring of long term data preservation and the adding of value to the data to ensure the convertibility of data to information. Accordingly, the twin foci of this presentation are the recently implemented standardsbased approach to NSSDC permanent archiving and the management and evolution of the multiple information (metadata) systems at NSSDC that enable both NSSDC's data management and users' ability to convert data to information. [NSSDC employs data standards in facilitating data access and display (e.g., the management and evolution of the Common Data Format that underlies many NASA and other spacecraft data systems (e.g., ISTP, Cluster) and multiple NSSDC systems (e.g., CDAWeb, OMNIWeb). NSSDC also provides "value added" service by creating crossnormalized multi-source data sets (e.g., OMNI) and such systems as CDAWeb and OMNIWeb. Readers may learn of these through the NSSDC web page at but they will not be the major foci of this paper.]
2 NSSDC Permanent Archiving About three years ago, NSSDC committed to move from an older archiving environment with a number of "shortcomings" to a new environment exploiting newer data management standards and technologies. Among the shortcomings were: That NSSDC's major user-accessible nearline mass storage system (NDADS, a VMSbased optical disk jukebox system) was approaching the end of its 10-year life; That there were significant VMS dependencies in data structures and metadata in a world moving beyond VMS; That the large majority of NSSDC's digital data sets existed only on offline media, notably 9-track tapes, 3480 tape cartridges, 4-mm and 8-mm tape and CD-ROMs. The offline nature of the archive made it labor-intensive to maintain and meant these data were not network-accessible to users in an era where "accessible" and "networkaccessible" have become virtually synonymous; That the offline data sets were in a wide range of formats including many largely obsolete vendor-specific binary formats. Also, they were accompanied by documentation of a broad quality spectrum (relative to completeness, clarity, etc.). The number of data sets from the early years of the space age is large. Many such data sets, although uniquely time-stamped, have had their scientific value largely negated by later higher-resolution data sets (and hence have little future access potential). Therefore, it was judged infeasible (unaffordable and not needed) to upgrade the usability of all legacy data sets equally. That the overall documentation of the media and files of a given data set was distributed across multiple NSSDC information systems that interacted only weakly if at all. Three distinct thrusts were needed to convert from the state of three years ago to the future: First, a unix-based automated permanent archive was needed in which data files accompanied by at least minimal supporting material needed for their use had to be created and populated. Second, a conversion of formats from obsolete ones to modern ones, plus documentation reviews and upgrades, were needed for data sets having significant future use potential, while a documentation of obsolete formats was needed for those data sets of too little future-use potential to justify the expense of reformatting. Finally, NSSDC had to ensure an optimal organization of its metadata and that all relevant and available metadata would be easily accessible, along with corresponding data, both to NSSDC staffers managing the archive and to NSSDC's external users. The last of these three is still a work in progress and is more fully described in the next section. The second has led to (1) the conversion of some formats from various binaries
3 to ASCII or to NSSDC's Common Data Format and to (2) the documentation of the bitlevel structures of real and integer words of each of many vendor-specific binaries. This documentation should allow the reading of words/numbers from any old NSSDC data sets now judged of too-low future-access potential to warrant format conversion, although such usage will not be easy. This bit-level documentation is at The first of these is also a work in progress and is described in the remainder of this section. NSSDC's new nearline permanent archive presently uses a 264-slot DLT (Digital Linear Tape) jukebox hosted by a Sun Enterprise 3000 computer. The data on the jukebox is directly accessible only to NSSDC staff. An external-user-accessible RAID magnetic disk array, currently attached to the same Sun computer, hosts much of the data written to the jukebox. As of this writing (early July, 2002), NSSDC has written about one terabyte of data to the RAID array and is awaiting delivery of a second unit with double the initial unit's capacity. A database knows of the location of data files on both the DLT jukebox and magnetic disk. The DLT-archived data files are actually held as "Archival Information Packages" (AIP) where each AIP is a single file consisting of a data object (the bits from the data file), an attribute object and Consultative Committee for Space Data Systems (CCSDS)/ISO standard labels ( ). The labels include internationally unique CCSDS/ISO-recognized pointers ( ) to additional information about the data. Currently this additional information base is not uniformly populated for NSSDC assigned pointers and is discussed in more detail in the following section. The AIPs are compatible with the Open Archival Information System (OAIS) reference model ( ) developed under CCSDS and ISO sponsorship with a leading role played by NSSDC's NOST (NASA/Science Office of Standards and Technologies). The software enabling this new archive has been developed at NSSDC. On the one hand certain software modules accept a set of input data files as a given "job" and transform them to AIPs. The software ascertains files attributes, employs processing algorithms to transform data file contents to canonically formatted data objects (see below), computes Cyclic Redundancy Check checksums, creates attribute objects and finally creates the AIPs. Other modules write a tar file, containing all the AIPs of a given job, to DLT tape. Yet other modules "split" most AIPs by writing their constituent data and attribute objects to externally accessible magnetic disk files. Finally entries are made to a database about the AIPs and their constituent files including their locations in the jukebox and on magnetic disk. While various modules have different names at NSSDC, the name DIOnAS (Data Ingest and Online Access System) will be applied to the full software set for this paper. The database about the AIPs and files is called the DIOnAS database.
4 To date, the DIOnAS software has been used to create AIPs from the data files held by the retired VMS-based NSSDC Data Archive and Distribution System (NDADS). Much information from the NDADS Files 11 Extended Attribute Records (XAR) was captured into the AIP attribute objects. Certain data sets (e.g., those with files having variable length records) required conversions of the original data byte streams to conform to the appropriate NSSDC-defined canonical form. For example, for binary files with variable record lengths (as known to DIOnAS from Files 11 XARs), a 2-byte integer field, unsigned and formatted Big Endian, was inserted to prefix each record. That this was done to a data file is documented in its companion attribute object. This and many other aspects of NSSDC's new data management approaches are discussed more fully at Generalizations to the DIOnAS software are currently being developed to handle the case of AIPs with multiple data objects. This will be important for data sets of paired or otherwise mated "data" files. For example, NSSDC has several cases of "flat file" data sets from UCLA (University of California at Los Angeles) wherein an ASCII header file accompanies each binary data file. From a DIOnAS perspective, these are two data files that need to be held together. NSSDC is currently preparing to ingest data from its legacy magnetic tape archive to DIOnAS. The extensions needed for the attribute objects are being assessed. Possible generalizations to the range of "canonical files" are also being assessed. The range of NSSDC tape data sets is being reviewed to exclude from DIOnAS-archiving any which contain no unique science data and also to identify what small subset might warrant reformatting or other upgrades to facilitate future usability. It is intended that the DIOnAS processing of tape data sets, which may involve about 10,000 tapes and 2,000 distinct data sets (with unique characteristics, documentation packages, etc.), will be as automated as possible and will require no more than 2 years to complete once in production. Another very important aspect of NSSDC's transition to OAIS-specified archiving is NSSDC's working with data suppliers to enable them to create AIPs for submission to NSSDC. This significantly facilitates NSSDC's data ingest activity. To date, this has been pursued with the IMAGE (Imager for Magnetopause-to-Aurora Global Exploration) spacecraft managed at NASA/GSFC. NSSDC provided to the IMAGE team an upgraded version of its Package Generator Utility software (a module previously included under "DIOnAS"). With this software and a number of configuration files to set key values, the IMAGE Science and Missions Operations Center was able to compute needed attributes, create the attribute objects, compute other packaging information, create the CCSDS/ISO standardized labels, and finally bundle these with the data file content into AIPs for transmitting to NSSDC. For completeness we note that data currently ftp-accessible to the world from NSSDC (from "nssdcftp"), but not yet processed into AIPs, will be processed by DIOnAS. This will further populate the new permanent archive and will give NSSDC the option of building a higher level interface to all its ftp-accessible data since their whereabouts on
5 magnetic disk will be captured in the DIOnAS database. We will similarly DIOnASprocess CDF-formatted data from CDAWeb whose science content are not otherwise managed under DIOnAS. Value-added services and metadata "Value-added" can mean many different things. NSSDC provides value-added data sets by cross-normalizing and merging single-source data sets, of which the prime example is the OMNI data set of hourly solar wind data and some other data. NSSDC provides value-added services allowing concurrent access to many data sets with graphical browse and subsetting capabilities, of which CDAWeb and OMNIWeb are prime examples. NSSDC provides largely common but specific-dataset-tuned graphical browse and subset interfaces to a subset of its data holdings otherwise only ftpaccessible; this family of interfaces is known as "FTPBrowse." For the remainder of this section we take "value-added" to mean only the management of data-accompanying metadata and other material needed to transform data into information. It is useful to consider distinctions among data sets, files and media volumes. A data set is a set of like files holding data in one or more records per file. The files typically have common technical characteristics and have data at a given "processing level" acquired at successive points or intervals in the relevant independent variable space (e.g., time, or look direction to a remote object). The files and records of a given data set may have data from one or multiple sources (e.g., from multiple instruments on a given spacecraft). [To digress, the concept of processing level is important to appreciate. Scientific instruments generate and provide currents or voltages or the like which are digitized (the raw data). The raw data must then be processed, sometimes through a series of steps, to yield the "physical parameters" (e.g., magnetic field values, particle fluxes, plasma densities, calibrated images, etc.) which are used in scientific analyses. For much of its existence, NSSDC has archived physical parameter data sets only, judging that raw data would be unusable by others and that most of the science potential of the raw data had been reliably transformed to the physical parameter level. With space science instruments getting increasingly sophisticated and multi-modal, it is becoming increasingly unlikely that most of the science potential of raw data will be reliably transformed to the physical parameter level. Hence, there is now an emphasis on archiving, in addition to physical parameter data sets, the best-annotated and organized data that are not yet irreversibly transformed. This is close to the "raw data" level (cf. Ensuring the long-term correct and independent use of such low level data requires a great deal of documentation and supporting material, often more than is provided to archives with such data.] Relative to data media, most legacy data sets were provided to NSSDC on tape media, although in recent years data have arrived at NSSDC electronically or on CD's and DVD's. Not including second copies, the files of a given data set may be held on one or
6 multiple media volumes, or may reside on a media volume along with the files of distinctly different data sets. A media volume "supports" (not "belongs to") a data set. In this media, file, data set framework, the following types of information must be captured and managed: Technical information about files and media (file organization, record lengths and separators, tape densities, etc.); Inventory information: what files and media exist for a given data set and where are they; Transaction and other provenance information about the files and media; Record structure and content information enabling the interpretation of the record's bits and bytes as a series of pixel intensities, values of named variables, or other values; Scientific information about the contents of the data records. Here we address such questions as: How were the physical parameters of this data set produced from lower level data, or how can a user produce physical parameters from the low level data of this data set? What are the assumptions made in the data processing? What are the known errors in the data, under what conditions do they arise and how are they recognized? What are the uncertainties in the various physical parameters and how do their magnitudes depend on values of other dependent or independent variables? What range of science studies can the data reliably be applied to? What scientific results have resulted from prior analyses of the data? NSSDC has multiple databases and other information collections capturing much of the above information. The following paragraphs identify these databases and collections, and outline some of the issues facing NSSDC as it strives to define and implement an optimally integrated metadata and data environment. Except where noted, the databases are relational and use Oracle. The JEDS (Java-based Experiments, Data sets and Spacecraft) database contains information about data sets as a whole, as well as information about the sources of the data, typically spacecraft and their instruments. Entries are typically created prior to first data arrival at NSSDC. From an archive management perspective, JEDS knows: data set identifiers; plans and status for archiving this data set; data arrival-at-nssdc dates and types and numbers of media for each date; submitter name and other contacts; supporting material to send, with data, to requesters of data; numbers of requests for this data set, by year; etc. From an end-user perspective, JEDS knows: data set time span,
7 discipline-oriented keywords enabling a narrowing of data searches, various attributes (resolutions, data processing level, parameters contained, format, etc.) as summarized in a free-text description, etc. For most data sets, JEDS does not contain all the "scientific information" identified above as desirable in understanding the data processing already applied to, or needed by, the data. Such information, when available for legacy data sets, are typically found in publications or other reports of data generators/providers. These publications are identified in NSSDC's Technical Reference File (TRF) database that uniquely links publications to experiments or data sets. They may also be replicated in the data set catalogs discussed below. The TRF also links science papers uniquely to an experiment or data set thereby allowing data requesters to read some of the results of prior scientific analyses. A major challenge for scientists preparing low processing level data for archiving is the preparation of adequate documentation and other supporting material (e.g., software) that will be survivable over time. For metadata about individual digital data media, NSSDC has the IDA (Interactive Digital Archive) database that knows: what data set(s) a given volume supports, the location of the volume and its backup(s), the transaction history of the volume, technical attributes such as tape densities and block sizes, format (e.g., binary vs. ASCII) and, if binary, the vendor, time span of the data on the volume, etc. The IDA file is currently being evolved to become Java-based and to better support a broader range of media types. The DIOnAS database, mentioned earlier in connection with the new NSSDC approach to nearline permanent archiving, captures information about Archive Information Packages and their constituent data and attribute files. DIOnAS links these to NSSDC data sets, and captures AIP creation history information, technical information (file types, sizes, CRC checksums, etc.), location information (on DLT permanent archive and on customer-accessible RAID magnetic disk) and time span information. In addition to the above three databases which characterize at a high level NSSDC data sets and their sources, the digital media and the DIOnAS-ingested data files of NSSDC, there are three major collections of further information about NSSDC data. These primarily give more detailed information needed to understand and correctly use the data. They are readme's in the ftp-accessible area, dataset catalogs, and records of the NSSDCmaintained CCSDS Control Authority database of registered data set descriptions. The readme's are an ad hoc collection of dataset-specific descriptive texts giving record format information and variable levels of documentation on processing histories, uncertainties, etc. These are relevant to the subset of NSSDC data available from
8 ftp://nssdcftp.gsfc.nasa.gov/spacecraft_data/, some of which are known to the DIOnAS database and others not yet DIOnAS-known. The data set catalogs are collections of pages per data set accumulated by NSSDC over its history. These catalogs include format statements, tape organization and inventory information, partial tape dumps, and miscellaneous reports and communications between NSSDC staff and data providers all contributing to data understanding and usability. About 4 years ago, NSSDC scanned the contents of these hard copy catalogs and wrote the resultant GIF files to CD-ROMs indexed by NSSDC data set ID's. These catalogs, while somewhat ad hoc and varying in detail, are NSSDC's most comprehensive collections of information about the contents of data sets in a detail greater than that found in the JEDS database described above. The NASA/Science Office of Standards and Technologies (NOST) hosted by NSSDC plays several key roles relative to the international CCSDS. One such role is to be the "lead node" in an ensemble of CCSDS Control Authority Offices (CAO) which capture and "register" descriptions of data sets, and assign internationally unique CCSDS/ISO identifiers to these descriptions. The basic concept is that distributed data files can point, using these identifiers, to such descriptions held available and accessible by quasipermanent entities. NOST's CAO database at holds descriptions (rich in format and technical information but non-uniform in terms of data processing information, etc.) for ~100 data sets, a small subset of the digital data sets at NSSDC. One final virtual collection of information should be mentioned. Just as data availability is becoming increasingly distributed, so too is information about the distributed data. It is important to ensure data usability on a time scale long relative to the durations over which individual data creators/providers may make data available. Therefore the distributed information about data, in addition to the data themselves, need to be swept up by relevant archives. Conclusion Except for data made accessible by NSSDC's "active archive" partners within NASA, NSSDC is making virtually all its currently important space science data externally accessible via ftp and via a series of higher-functionality interfaces. However, NSSDC's very longevity has led to a highly heterogeneous environment relative to data formats and media and to metadata systems. NSSDC has taken important first steps to transform this legacy heterogeneous environment to a modern and integrated NSSDC Data and Information System encompassing its full archive reaching back to the dawn of the space age. Data preservation will be ensured. Convertibility of data to information will only be constrained by the adequacy of the supporting material provided with the data to NSSDC. That most legacy data are "physical parameter" data helps to minimize this problem. That more nearly raw data are being archived for recent missions places a major requirement on data providers to prepare supporting material to enable the long-term correct and provider-independent conversion of data to information. SUMMARY SESSION 1
Digital Preservation. OAIS Reference Model
Digital Preservation OAIS Reference Model Stephan Strodl, Andreas Rauber Institut für Softwaretechnik und Interaktive Systeme TU Wien http://www.ifs.tuwien.ac.at/dp Aim OAIS model Understanding the functionality
The Key Elements of Digital Asset Management
The Key Elements of Digital Asset Management The last decade has seen an enormous growth in the amount of digital content, stored on both public and private computer systems. This content ranges from professionally
AHDS Digital Preservation Glossary
AHDS Digital Preservation Glossary Final version prepared by Raivo Ruusalepp Estonian Business Archives, Ltd. January 2003 Table of Contents 1. INTRODUCTION...1 2. PROVENANCE AND FORMAT...1 3. SCOPE AND
State of Michigan Document Imaging Guidelines
State of Michigan Document Imaging Guidelines Responsibility of Source Document Owners to Monitor the Work With limited human resources, the time it takes to maintain, process, and store records is becoming
SPASE: THE CONNECTION AMONG SOLAR AND SPACE PHYSICS DATA CENTERS
SPASE: THE CONNECTION AMONG SOLAR AND SPACE PHYSICS DATA CENTERS J. R. Thieman 1, D. A. Roberts 2, and T. A. King 3 * 1 Code 690.1, NASA/GSFC, Greenbelt, MD, 20771 United States. Email: [email protected]
Introduction to Optical Archiving Library Solution for Long-term Data Retention
Introduction to Optical Archiving Library Solution for Long-term Data Retention CUC Solutions & Hitachi-LG Data Storage 0 TABLE OF CONTENTS 1. Introduction... 2 1.1 Background and Underlying Requirements
Data Storage Solutions
Data Storage Solutions Module 1.2 2006 EMC Corporation. All rights reserved. Data Storage Solutions - 1 Data Storage Solutions Upon completion of this module, you will be able to: List the common storage
Organization of VizieR's Catalogs Archival
Organization of VizieR's Catalogs Archival Organization of VizieR's Catalogs Archival Table of Contents Foreword...2 Environment applied to VizieR archives...3 The archive... 3 The producer...3 The user...3
XenData Video Edition. Product Brief:
XenData Video Edition Product Brief: The Video Edition of XenData Archive Series software manages one or more automated data tape libraries on a single Windows 2003 server to create a cost effective digital
Implementing Offline Digital Video Storage using XenData Software
using XenData Software XenData software manages data tape drives, optionally combined with a tape library, on a Windows Server 2003 platform to create an attractive offline storage solution for professional
XenData Product Brief: SX-550 Series Servers for LTO Archives
XenData Product Brief: SX-550 Series Servers for LTO Archives The SX-550 Series of Archive Servers creates highly scalable LTO Digital Video Archives that are optimized for broadcasters, video production
Implementing an Automated Digital Video Archive Based on the Video Edition of XenData Software
Implementing an Automated Digital Video Archive Based on the Video Edition of XenData Software The Video Edition of XenData Archive Series software manages one or more automated data tape libraries on
XenData Archive Series Software Technical Overview
XenData White Paper XenData Archive Series Software Technical Overview Advanced and Video Editions, Version 4.0 December 2006 XenData Archive Series software manages digital assets on data tape and magnetic
GEOSPATIAL DIGITAL ASSET MANAGEMENT A SOLUTION INTEGRATING IMAGERY AND GIS WHERE WILL ALL THE PIXELS GO?(AND HOW WILL WE EVER FIND THEM?
GEOSPATIAL DIGITAL ASSET MANAGEMENT A SOLUTION INTEGRATING IMAGERY AND GIS WHERE WILL ALL THE PIXELS GO?(AND HOW WILL WE EVER FIND THEM?) Dr. Joan Lurie, GCC, Inc. 30 West 61 st Street, Apt 9A New York,
POLICY AND GUIDELINES FOR THE MANAGEMENT OF ELECTRONIC RECORDS INCLUDING ELECTRONIC MAIL (E-MAIL) SYSTEMS
POLICY AND GUIDELINES FOR THE MANAGEMENT OF ELECTRONIC RECORDS INCLUDING ELECTRONIC MAIL (E-MAIL) SYSTEMS 1. Purpose Establish and clarify a records management policy for municipal officers with respect
Implementing a Digital Video Archive Based on XenData Software
Based on XenData Software The Video Edition of XenData Archive Series software manages a digital tape library on a Windows Server 2003 platform to create a digital video archive that is ideal for the demanding
Chapter 7 Types of Storage. Discovering Computers 2012. Your Interactive Guide to the Digital World
Chapter 7 Types of Storage Discovering Computers 2012 Your Interactive Guide to the Digital World Objectives Overview Differentiate between storage devices and storage media Describe the characteristics
Archive Data Retention & Compliance. Solutions Integrated Storage Appliances. Management Optimized Storage & Migration
Solutions Integrated Storage Appliances Management Optimized Storage & Migration Archive Data Retention & Compliance Services Global Installation & Support SECURING THE FUTURE OF YOUR DATA w w w.q sta
Establishing a Mechanism for Maintaining File Integrity within the Data Archive
Establishing a Mechanism for Maintaining File Integrity within the Data Archive Thomas C. Stein, Edward A. Guinness, Susan H. Slavney Earth and Planetary Sciences, Washington University, St. Louis, MO,
North Carolina Digital Preservation Policy. April 2014
North Carolina Digital Preservation Policy April 2014 North Carolina Digital Preservation Policy Page 1 Executive Summary The North Carolina Digital Preservation Policy (Policy) governs the operation,
FlexArray Virtualization
Updated for 8.2.1 FlexArray Virtualization Installation Requirements and Reference Guide NetApp, Inc. 495 East Java Drive Sunnyvale, CA 94089 U.S. Telephone: +1 (408) 822-6000 Fax: +1 (408) 822-4501 Support
Archive exchange Format AXF
Archive exchange Format AXF What is AXF? Archive exchange Format (AXF) is an open format that supports interoperability among disparate content storage systems and ensures the content s long-term availability
Implementing a Digital Video Archive Using XenData Software and a Spectra Logic Archive
Using XenData Software and a Spectra Logic Archive With the Video Edition of XenData Archive Series software on a Windows server and a Spectra Logic T-Series digital archive, broadcast organizations have
Guidelines on Information Deliverables for Research Projects in Grand Canyon National Park
INTRODUCTION Science is playing an increasing role in guiding National Park Service (NPS) management activities. The NPS is charged with protecting and maintaining data and associated information that
Version 1.0 MCGILL UNIVERSITY SCANNING STANDARDS FOR ADMINISTRATIVE RECORDS. Summary
Version 1.0 MCGILL UNIVERSITY SCANNING STANDARDS FOR ADMINISTRATIVE RECORDS Summary... 1 Introduction... 2 McGill Record-Keeping: Obligations and Technical Issues... 3 Scope & Procedures... 4 Standards...
Considerations for Management of Laboratory Data
Considerations for Management of Laboratory Data 2003 Scientific Computing & Instrumentation LIMS Guide, November 2003 Michael H Elliott Drowning in a sea of data? Nervous about 21 CFR Part 11? Worried
XenData Product Brief: SX-250 Archive Server for LTO
XenData Product Brief: SX-250 Archive Server for LTO An SX-250 Archive Server manages a robotic LTO library creating a digital video archive that is optimized for broadcasters, video production companies,
Long Term Preservation of Earth Observation Data
Long Term Preservation of Earth Observation Data QA4EO Workshop RAL, October 18-20 th 2011 Mirko Albani and Bojan Bojkov* (ESA/ESRIN) Page 1 Outline Earth Observation data preservation: the need and the
Florida Digital Archive (FDA) Policy and Procedures Guide
Florida Digital Archive (FDA) Policy and Procedures Guide Version 3.0, May, 2011 Last reviewed May, 2010 without updates superseded versions: version 2.5, April 2009 version 2.4, August 2007 version 2.3,
IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report
Ex Libris Rosetta: A Digital Preservation System Product Description
Ex Libris Rosetta: A Digital Preservation System Product Description CONFIDENTIAL INFORMATION The information herein is the property of Ex Libris Ltd. or its affiliates and any misuse or abuse will result
Long-term Archiving of Relational Databases with Chronos
First International Workshop on Database Preservation (PresDB'07) 23 March 2007, at the UK Digital Curation Centre and the Database Group in the School of Informatics, University of Edinburgh Long-term
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &
Virtual Tape Systems for IBM Mainframes A comparative analysis
Virtual Tape Systems for IBM Mainframes A comparative analysis Virtual Tape concepts for IBM Mainframes Mainframe Virtual Tape is typically defined as magnetic tape file images stored on disk. In reality
1. Redistributions of documents, or parts of documents, must retain the SWGIT cover page containing the disclaimer.
Disclaimer: As a condition to the use of this document and the information contained herein, the SWGIT requests notification by e-mail before or contemporaneously to the introduction of this document,
Introduction. Chapter 1. Introducing the Database. Data vs. Information
Chapter 1 Objectives: to learn The difference between data and information What a database is, the various types of databases, and why they are valuable assets for decision making The importance of database
STORNEXT PRO SOLUTIONS. StorNext Pro Solutions
STORNEXT PRO SOLUTIONS StorNext Pro Solutions StorNext PRO SOLUTIONS StorNext Pro Solutions offer Post-Production and Broadcast Professionals the fastest, easiest, and most complete high-performance shared
Functionality. Overview. Broad Compatibility. 125 TB LTO Archive System Fast Robotics for Restore Intensive Applications
125 TB LTO Archive System Fast Robotics for Restore Intensive Applications managed by XenData6 Server software Functionality Fast Robotics 7s slot to drive time 125 TB near-line LTO capacity Unlimited
Chapter 13. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing
Interagency Science Working Group. National Archives and Records Administration
Interagency Science Working Group 1 National Archives and Records Administration Establishing Trustworthy Digital Repositories: A Discussion Guide Based on the ISO Open Archival Information System (OAIS)
HathiTrust Digital Assets Agreement
HathiTrust Digital Assets Agreement Complete the Agreement form below for submission of digital assets to the HathiTrust Digital Library ( HathiTrust ). Print two copies of the completed form and have
Creating the ultimate digital insurance
By Piql AS Rev. 140625 Creating the ultimate digital insurance Behind the curtains Ensuring future access Digital preservation is about ensuring future access to digital material. When long-term access
CHESS DAQ* Introduction
CHESS DAQ* Introduction Werner Sun (for the CLASSE IT group), Cornell University * DAQ = data acquisition https://en.wikipedia.org/wiki/data_acquisition Big Data @ CHESS Historically, low data volumes:
AXF Archive exchange Format: Interchange & Interoperability for Operational Storage and Long-Term Preservation
AXF Archive exchange Format: Interchange & Interoperability for Operational Storage and Long-Term Preservation Report to SMPTE Washington DC Section Bits By the Bay Conference 21 May, 2014 S. Merrill Weiss
FUJIFILM Recording Media U.S.A., Inc. LTO Ultrium Technology
FUJIFILM Recording Media U.S.A., Inc. LTO Ultrium Technology 1 What is LTO Ultrium Technology? LTO stands for Linear Tape Open, since it is linear data tape, and targeted to open systems environments in
Media Cloud Building Practicalities
Media Cloud Building Practicalities MEDIA-TECH Hamburg May 2012 Damien Bommart Director - Product Marketing www.fpdigital.com Agenda Front Porch Digital DIVASolutions Family What is a Media Content Storage
A Comparative TCO Study: VTLs and Physical Tape. With a Focus on Deduplication and LTO-5 Technology
White Paper A Comparative TCO Study: VTLs and Physical Tape With a Focus on Deduplication and LTO-5 Technology By Mark Peters February, 2011 This ESG White Paper is distributed under license from ESG.
Multi-Terabyte Archives for Medical Imaging Applications
Multi-Terabyte Archives for Medical Imaging Applications This paper describes how Windows servers running XenData Archive Series software provide an attractive solution for storing and retrieving multiple
Tier 2 Nearline. As archives grow, Echo grows. Dynamically, cost-effectively and massively. What is nearline? Transfer to Tape
Tier 2 Nearline As archives grow, Echo grows. Dynamically, cost-effectively and massively. Large Scale Storage Built for Media GB Labs Echo nearline systems have the scale and performance to allow users
2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, [email protected])
CSP CHRONOS Compliance statement for ISO 14721:2003 (Open Archival Information System Reference Model) 2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, [email protected]) The international
Virtualization s Evolution
Virtualization s Evolution Expect more from your IT solutions. Virtualization s Evolution In 2009, most Quebec businesses no longer question the relevancy of virtualizing their infrastructure. Rather,
File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System
CS341: Operating System Lect 36: 1 st Nov 2014 Dr. A. Sahu Dept of Comp. Sc. & Engg. Indian Institute of Technology Guwahati File System & Device Drive Mass Storage Disk Structure Disk Arm Scheduling RAID
Nevada NSF EPSCoR Track 1 Data Management Plan
Nevada NSF EPSCoR Track 1 Data Management Plan August 1, 2011 INTRODUCTION Our data management plan is driven by the overall project goals and aims to ensure that the following are achieved: Assure that
Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer
Disks and RAID Profs. Bracy and Van Renesse based on slides by Prof. Sirer 50 Years Old! 13th September 1956 The IBM RAMAC 350 Stored less than 5 MByte Reading from a Disk Must specify: cylinder # (distance
AUTOMATED IMAGE RECEPTION, PROCESSING AND DISTRIBUTION SYSTEM FOR HIGH RESOLUTION REMOTE SENSING SATELLITES
AUTOMATED IMAGE RECEPTION, PROCESSING AND DISTRIBUTION SYSTEM FOR HIGH RESOLUTION REMOTE SENSING SATELLITES Moon-Gyu Kim, Taejung Kim, Sung-Og Park, Dongseok Shin*, Min Nyo Hong*, Sunghee Kwak*, Wookhyun
Florida Digital Archive (FDA) Policy and Procedures Guide
Florida Digital Archive (FDA) Policy and Procedures Guide Version 3.1, July 1, 2012 Last reviewed May, 2010 without updates superseded versions: version 3.0, May, 2011 version 2.5, April 2009 version 2.4,
Benchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1
Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible
Let s Talk Digital An Approach to Managing, Storing, and Preserving Time-Based Media Art Works. DAMS Branch Manager Smithsonian Institution, OCIO
Let s Talk Digital An Approach to Managing, Storing, and Preserving Time-Based Media Art Works DAMS Branch Manager Smithsonian Institution, OCIO 1 Love and Marriage Digital Asset Management?? Digital Diamonds
XenData Product Brief: SX-250 Archive Server for LTO
XenData Product Brief: SX-250 Archive Server for LTO An SX-250 Archive Server manages a robotic LTO library creating a digital video archive that is optimized for broadcasters, video production companies,
Digital Media Storage
Summary State and local governments use computers to create, capture, or maintain public records. To be accountable to the citizens of Minnesota, government agencies are required by law to keep records
Cyber Security: Guidelines for Backing Up Information. A Non-Technical Guide
Cyber Security: Guidelines for Backing Up Information A Non-Technical Guide Essential for Executives, Business Managers Administrative & Operations Managers This appendix is a supplement to the Cyber Security:
Management Challenge. Managing Hardware Assets. Central Processing Unit. What is a Computer System?
Management Challenge Managing Hardware Assets What computer processing and storage capability does our organization need to handle its information and business transactions? What arrangement of computers
DISTRIBUTED MULTIMEDIA SYSTEMS
Components of distributed multimedia system DISTRIBUTED MULTIMEDIA SYSTEMS Application software Container object store Image and still video store Audio and component store Object directory service agent
Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows
Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows Sponsored by: Prepared by: Eric Slack, Sr. Analyst May 2012 Storage Infrastructures for Big Data Workflows Introduction Big
Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS Digital Preservation Committee November 2006
Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS Digital Preservation Committee November 2006 Introduction: Fixity, in preservation terms, means that the digital object
Product Brief: XenData X2500 LTO-6 Digital Video Archive System
Product Brief: XenData X2500 LTO-6 Digital Video Archive System Updated: March 21, 2013 Overview The XenData X2500 system includes XenData6 Workstation software which provides the archive, restore and
STORNEXT PRO SOLUTIONS. StorNext Pro Solutions
STORNEXT PRO SOLUTIONS StorNext Pro Solutions StorNext PRO SOLUTIONS StorNext Pro Solutions offer post-production and broadcast professionals the fastest, easiest, and most complete high-performance shared
Avid ISIS 2500-2000 v4.7.7 Performance and Redistribution Guide
Avid ISIS 2500-2000.7 Performance and Redistribution Guide Change History Date Release Changes 11/13/2015 4.7.7 Added support for El Capitan (Mac OS 10.11) Added support for Atto Thunderlink 10Gb for Mac
How To Store Data On A Computer (For A Computer)
TH3. Data storage http://www.bbc.co.uk/schools/gcsebitesize/ict/ A computer uses two types of storage. A main store consisting of ROM and RAM, and backing stores which can be internal, eg hard disk, or
How To Scan A Document
Guidelines For Scanning University Records Scanning, or digital imaging, is an increasingly popular strategy for dealing with records. Scanning can be a useful tool for managing your records and enhancing
RAID Technology Overview
RAID Technology Overview HP Smart Array RAID Controllers HP Part Number: J6369-90050 Published: September 2007 Edition: 1 Copyright 2007 Hewlett-Packard Development Company L.P. Legal Notices Copyright
Chapter 13 Disk Storage, Basic File Structures, and Hashing.
Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files
STATE OF NEBRASKA STATE RECORDS ADMINISTRATOR DURABLE MEDIUM WRITTEN BEST PRACTICES & PROCEDURES (ELECTRONIC RECORDS GUIDELINES) OCTOBER 2009
STATE OF NEBRASKA STATE RECORDS ADMINISTRATOR DURABLE MEDIUM WRITTEN BEST PRACTICES & PROCEDURES (ELECTRONIC RECORDS GUIDELINES) OCTOBER 2009 Following is a voluntary guideline issued by the State Records
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide for Windows Release 7.5 Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide The software described in this
Physical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
Solution Brief: Creating Avid Project Archives
Solution Brief: Creating Avid Project Archives Marquis Project Parking running on a XenData Archive Server provides Fast and Reliable Archiving to LTO or Sony Optical Disc Archive Cartridges Summary Avid
Computer Storage. Computer Technology. (S1 Obj 2-3 and S3 Obj 1-1)
Computer Storage Computer Technology (S1 Obj 2-3 and S3 Obj 1-1) Storage The place in the computer where data is held while it is not needed for processing A storage device is device used to record (store)
GENERAL INFORMATION COPYRIGHT... 3 NOTICES... 3 XD5 PRECAUTIONS... 3 INTRODUCTION... 4 FEATURES... 4 SYSTEM REQUIREMENT... 4
1 Table of Contents GENERAL INFORMATION COPYRIGHT... 3 NOTICES... 3 XD5 PRECAUTIONS... 3 INTRODUCTION... 4 FEATURES... 4 SYSTEM REQUIREMENT... 4 XD5 FAMILULARIZATION... 5 PACKAGE CONTENTS... 5 HARDWARE
