Digital Preservation. OAIS Reference Model



Similar documents
AHDS Digital Preservation Glossary

Interagency Science Working Group. National Archives and Records Administration

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (

Archival Data Format Requirements

North Carolina Digital Preservation Policy. April 2014

ADRI. Digital Record Export Standard. ADRI v1.0. ADRI Submission Information Package (ASIP)

Florida Digital Archive (FDA) Policy and Procedures Guide

REFERENCE MODEL FOR AN OPEN ARCHIVAL INFORMATION SYSTEM (OAIS)

Florida Digital Archive (FDA) Policy and Procedures Guide

Ex Libris Rosetta: A Digital Preservation System Product Description

Long-term archiving and preservation planning

Best Archiving Practice Guidance

Organization of VizieR's Catalogs Archival

Why archiving erecords influences the creation of erecords. Martin Stürzlinger scopepartner Vienna, Austria

TERRITORY RECORDS OFFICE BUSINESS SYSTEMS AND DIGITAL RECORDKEEPING FUNCTIONALITY ASSESSMENT TOOL

Queensland recordkeeping metadata standard and guideline

Digital Assets Repository 3.0. PASIG User Group Conference Noha Adly Bibliotheca Alexandrina

Technology Watch Report

INFORMATION AND DOCUMENTATION RECORDS MANAGEMENT PART 1: GENERAL IRISH STANDARD I.S. ISO :2004. Price Code

Digital Repository Initiative

Assessing a Scientific Data Center as a Trustworthy Digital Repository

The Deposit System for Electronic Publications. A Process Model. Titia van der Werf. Koninklijke Bibliotheek. Report Series 6

Space data and information transfer systems Open archival information system (OAIS) Reference model

Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS Digital Preservation Committee November 2006

DIGITAL PRESERVATION AT THE U.S. GOVERNMENT PRINTING OFFICE: WHITE PAPER. Version July 2008 UNITED STATES GOVERNMENT PRINTING OFFICE

Database preservation toolkit:

Preservation Metadata and the OAIS Information Model. A Metadata Framework to Support the Preservation of Digital Objects

OCLC Digital Archive Preservation Policy and Supporting Documentation Last Revised: 8 August 2006

System Requirements for Archiving Electronic Records PROS 99/007 Specification 1. Public Record Office Victoria

Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide

In ediscovery and Litigation Support Repositories MPeterson, June 2009

Technical concepts of kopal. Tobias Steinke, Deutsche Nationalbibliothek June 11, 2007, Berlin

Long term archiving (LTA) of digital product data which are not based on technical drawings. Part 4: Certification

The challenges of becoming a Trusted Digital Repository

DELAWARE PUBLIC ARCHIVES POLICY STATEMENT AND GUIDELINES MODEL GUIDELINES FOR ELECTRONIC RECORDS

Advanced Solutions of Microsoft SharePoint Server 2013 (20332) H6C76S

#MMTM15 #INFOARCHIVE #EMCWORLD 1

Technical. Overview. ~ a ~ irods version 4.x

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE

Preservation and Dissemination Policy of the LISS Data Archive

Brown County Information Technology Aberdeen, SD. Request for Proposals For Document Management Solution. Proposals Deadline: Submit proposals to:

DIGITAL ARCHIVES & PRESERVATION SYSTEMS

Management of Official Records in a Business System

STATE OF NEBRASKA STATE RECORDS ADMINISTRATOR DURABLE MEDIUM WRITTEN BEST PRACTICES & PROCEDURES (ELECTRONIC RECORDS GUIDELINES) OCTOBER 2009

What is Digital Archiving? 4/10/2015. Archiving Digital Content ARMA Spring Meeting. Digital Archiving

IBM Campaign and IBM Silverpop Engage Version 1 Release 2 August 31, Integration Guide IBM

Digital Archiving Policy

Archivematica 0.8 tutorial

Chapter 5: The DAITSS Archiving Process

Policy Policy--driven Distributed driven Distributed Data Management (irods) Richard M arciano Marciano marciano@un

Tivoli Storage Manager Explained

Harnessing Media Micro-Services for Stewardship of Digital Assets at CUNY TV. NDSR Project:

05.0 Application Development

Systems Requirements: Gap Analysis Tool

Long term retention and archiving the challenges and the solution

POLICY AND GUIDELINES FOR THE MANAGEMENT OF ELECTRONIC RECORDS INCLUDING ELECTRONIC MAIL ( ) SYSTEMS

Using EMC SourceOne Management in IBM Lotus Notes/Domino Environments

MHRA GMP Data Integrity Definitions and Guidance for Industry January 2015

2.2 INFORMATION SERVICES Documentation of computer services, computer system management, and computer network management.

Life Cycle of Records

Page 1 of 7 Effective Date: 12/18/03 Software Supplier Process Requirements

Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide

Why long time storage does not equate to archive

Final Report - HydrometDB Belize s Climatic Database Management System. Executive Summary

IBM Tivoli and Maximo Asset Management Development Update & Maximo 7.1 Preview

Modern Approaches to Archiving and Application Retirement

Best Practices for Long-Term Retention & Preservation. Michael Peterson, Strategic Research Corp. Gary Zasman, Network Appliance

James Hardiman Library. Digital Scholarship Enablement Strategy

Union County. Electronic Records and Document Imaging Policy

Information Security Policy September 2009 Newman University IT Services. Information Security Policy

The Australian War Memorial s Digital Asset Management System

SP A Framework for Designing Cryptographic Key Management Systems. 5/25/2012 Lunch and Learn Scott Shorter

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

Managing Data Growth Within Contract Management Systems with Archiving Strategies

Analyst 1.6 Software. Laboratory Director s Guide

InfinityQS SPC Quality System & FDA s 21 CFR Part 11 Requirements

9. GOVERNANCE. Policy 9.8 RECORDS MANAGEMENT POLICY. Version 4

Assessment of Vaisala Veriteq vlog Validation System Compliance to 21 CFR Part 11 Requirements

Archival principles and data sustainability. Dr Andrew Wilson Queensland State Archives

Transcription:

Digital Preservation OAIS Reference Model Stephan Strodl, Andreas Rauber Institut für Softwaretechnik und Interaktive Systeme TU Wien http://www.ifs.tuwien.ac.at/dp

Aim OAIS model Understanding the functionality of entities Understanding the interaction of entities Understanding OAIS as reference model

Overview Principle of OAIS model Technical overview Functional overview Information model Conclusion

Background: OAIS & NASA National Space Science Data Center NASA s first digital archive Experienced many technology changes since 1966 Consultative Committee for Space Data Systems International group of space agencies Developed variety of science discipline- independent standards Became working body for an ISO TC 20/ SC 13 about 1990 TC20: Aircraft and Space Vehicles SC13: Space Data and Information Transfer Systems ISO suggested that SC 13 should develop archive standards Address data used in conjunction with space missions Address intermediate and indefinite long term storage of digital data

What is a Reference Model? A Framework for understanding significant relationships among the entities of some environment, and for the development of consistent standards or specifications supporting that environment. A reference model is based on a small number of unifying concepts is an abstraction of the key concepts, their relationships, and their interfaces both to each other and to the external environment may be used as a basis for education and explaining standards to a non-specialist.

OAIS OAIS is a Reference Model NO implementation plan, NO data model, NO specification OAIS describes elements, concepts and defines terminology Aim: Determine functional entities of the reference model which are implemented by a current system Identification of functions and responsibilities for a potential solutions

OAIS References Reference Model for an Open Archival Information System (OAIS), Blue Book, CCSDS 650.0-B-1, January 2002 Preservation is based auf Blue Book and: Don Sawyer, Lou Reich: ISO Reference Model for an Open Archival Information System (OAIS) Tutorial Presentation, LOC, 13. Juni 2003 http://ssdoo.gsfc.nasa.gov/nost/isoas/overview.html

Overview Principle of OAIS model Technical overview Functional overview Information model Conclusion

Open Archival Information System (OAIS) Open Reference Model standard(s) are developed using a public process and are freely available Information Any type of knowledge that can be exchanged Independent of the forms (i.e., physical or digital) used to represent the information Data are the representation forms of information Archival Information System Hardware, software, and people who are responsible for the acquisition, preservation and dissemination of the information

Purpose, Scope, and Applicability Framework for understanding and applying concepts needed for longterm digital information preservation Long-term is long enough to be concerned about changing technologies Starting point for model addressing non-digital information Provides set of minimal responsibilities to distinguish an OAIS from other uses of archive Framework for comparing architectures and operations of existing and future archives Basis for development of additional related standards Addresses a full range of archival functions Applicable to all long-term archives and those organizations and individuals dealing with information that may need long-term preservation Does NOT specify an implementation

Model View of an OAIS Environment Producer OAIS (archive) Consumer Management Producer is the role played by those persons, or client systems, who provide the information to be preserved Management is the role played by those who set overall OAIS policy as one component in a broader policy domain Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest

OAIS Information Definition Information is always expressed (i.e., represented) by some type of data Data interpreted using its Representation Information yields Information Information Object preservation requires clear identification and understanding of the Data Object and its associated Representation Information Interpreted Using its Yields Data Object Representation Information Information Object

Information Package Definition Content Information Preservation Description Information An Information Package is a conceptual container holding two types of information Content Information Preservation Description Information (PDI)

Information Package Variants SIP: Submission Information Package Negotiated between Producer and OAIS Sent to OAIS by a Producer AIP: Archival Information Package Information Package used for preservation Includes complete set of Preservation Description Information (PDI) for the Content Information DIP: Dissemination Information Package Includes part or all of one or more Archival Information Packages Sent to a Consumer by the OAIS

External Data Flow View Producer Submission Information Packages OAIS Archival Information Packages Dissemination Information Packages orders result sets queries Consumer

Overview Principle of OAIS model Technical overview Functional overview Information model Conclusion

Open Archival Information System: Six Functional Entities Preservation Planning P R O D U C E R SIP Ingest Data Management Descriptive Info. Archival Storage AIP Access queries result sets orders DIP C O N S U M E R Administration MANAGEMENT SIP = Submission Information Package AIP = Archival Information Package DIP = Dissemination Information Package

Functional Entities in an OAIS (1/2) Ingest: This entity provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive Archival Storage: This entity provides the services and functions for the storage, maintenance and retrieval of Archival Information Packages Data Management: This entity provides the services and functions for populating, maintaining, and accessing both descriptive information which identifies and documents archive holdings and internal archive administrative data.

Functional Entities in an OAIS (2/2) Administration: This entity manages the overall operation of the archive system Preservation Planning: This entity monitors the environment of the OAIS and provides recommendations to ensure that the information stored in the OAIS remain accessible to the Designated User Community over the long term even if the original computing environment becomes obsolete. Access: This entity supports consumers in determining the existence, description, location and availability of information stored in the OAIS and allowing consumers to request and receive information products

Ingest Data Functions

Receive Submissions: Ingest Data Functions Intermediate storage (Staging Area) for submissions Confirmation of reception of a SIP Quality Assurance Validatoin of submission (CRC, logs, identity checks, media) Generate AIP Transformation of SIPs into AIPs that conform to the archive's standard (Transformation, Migration) Sends AIP for Audit to Administration Generate Descriptive Information Collection and extraction of descriptive information of AIPs for data management and access aids Coordinate Updates Transfer of AIPs to Archival Storage and Descriptive Information to Data Management Receives confirmation for Archival Storage Update request to Data Management

Archival Storage Functions

Archival Storage Functions Receive Data: Receives Storage Request for AIP Selects appropriate storage devices and media Returns storage confirmation Manage Storage Hierarchy Executes policies Monitors von error logs, operational statistics Replace Media Reproduction of AIPs over time (no change of content or preservation description information, change of Packaging Information) Error Checking PDI Fixity Information (CRCs, error-correcting codes,...) Disaster Recovery Duplicate of digital content Duplicate a physically separate facility Provide Data Copy of AIPs for access

Data Management Functions

Data Management Functions Administer Database Integrity of DB for Descriptive Information und system information Perform Queries Executes requests for Access Generate Reports Supplies reports to Ingest, Access, Administration Receive DB Updates Adds, modifies or deletes information in Management DB Ingest: new AIPs, Administration: updates

Administration Functions

Administration Functions Negotiate Submission Agreement Agreements with Producers (templates) Manage System Configuration Monitors functionality, performance Plans system evolution, policies Archival Information Update Updating content of the archive: Change of DIPs und Re-Submission -> Migration Establish Standards and Policies Budget, Standards, Policies Audit Submission Verification of SIPs und AIPs regarding submission agreements Verification of Representation und Package Information

Administration Functions Activate Requests Periodical comparison of content in the archive Orders on a periodic basis for consumers Customer Service Customer accounts Collects billing information from Access, sends bills and collects payment

Preservation Planning Functions

Preservation Planning Functions Monitor Designated Community Interaction with producer and consumer Monitor Technology Development of technology: HW, SW, standards, formats Develop Preservation Strategies and Standards Recommendation and development of strategies and standards Develop Packaging Designs and Migration Plans Creation of detailed migration plan Provides new Preservation Description Information

Migration Context Data Management And Access View Content Information Identifier Descriptive Information Mapping AIP Identifier Archival Storage View Archival Storage Mapping Packaging Information Content Information Preservation Description Information

Digital Migration Digital Migration is defined to be the transfer of digital information, while intending to preserve it, within the OAIS. Focus on preservation of the full information content New information implementation replaces the old OAIS has full control and responsibility over all aspects of the transfer

Migration Motivators Motivators driving digital migrations Media Decay Often this is superceded by escalating media drive maintenance costs Increased Cost Effectiveness More cost-effective media types with higher volumes and lower drive maintenance costs New User/Consumer Service Requirements New formats more compatible with user s technology and applications Proprietary software evolution New software versions used to upgrade formats of the information objects being preserved

Digital Migration Approaches Four primary types of digital migration in response to motivators, ordered by increasing risk of information loss: Refreshment Media replacement with no bit changes Replication No change to Packaging Information or Content Information bits Repackaging Some bit changes in Packaging Information Transformation Reversible: Bit changes in Content Information are reversible by an algorithm Non-reversible: Bit changes in Content Information are not reversible by an algorithm

Digital Migration and AIPs Unless migration involves transformation: no new AIP version Transformation: new AIP Version Upgrading or improvement of AIPs: new AIP Edition Extracting or aggregating from multiple AIPs: Derived AIP

Access Functions

Access Functions Coordinate Access Activities User interface, authorizing 3 categories of requests: Query requests for Data Management, returns result set Orders, access Data Management und Archival Storage, result DIP Dissemination Requests from Administration for Archival Information Update Generate DIP Retrieves AIP from Archival Storage Obtain Descriptive Information from Data Management Transforms AIPs in appropriate DIPs Deliver Response on-line und off-line deliveries Billing information for Administration

Access Preservation Effective access to digital information requires the use of software Application Programming Interfaces (APIs) may be cost-effectively maintained across time by an OAIS when: API is not too complex API is applicable to a wide variety of AIUs API source code may be ported to new environments Extensive testing is needed to ensure against information loss Preservation of executables by full emulation of underlying hardware is problematic Hard to know what is the information being preserved May not be possible to fully emulate associated devices

Common Services Modern, distributed computing applications assume a number of supporting services Examples of Common Services include: inter-process communication name services temporary storage allocation exception handling security file and directory services

Common Services Important: All actions are documented Reporting Confirmation

OAIS Composite Functional Entities

Open Archival Information System: Summary Preservation Planning P R O D U C E R SIP Ingest Data Management Descriptive Info. Archival Storage AIP Access queries result sets orders DIP C O N S U M E R Administration MANAGEMENT SIP = Submission Information Package AIP = Archival Information Package DIP = Dissemination Information Package

Overview Principle of OAIS model Technical overview Functional overview Information model Conclusion

Information Package Definition Content Information Preservation Description Information An Information Package is a conceptual container holding two types of information Content Information Preservation Description Information (PDI)

Information Object Information Object Data Object Interpreted using 1+ 1+ Representation Information Interpreted using Physical Object Digital Object 1+ Bit Sequence

Representation Information The Representation Information accompanying a physical object, like a moon rock, may give additional meaning It typically is a result of some analysis of the physically observable attributes of the rock The Representation Information accompanying a digital object, or sequence of bits, is used to provide additional meaning. It typically maps the bits into commonly recognized data types such as character, integer, and real and into groups of these data types. It associates these with higher level meanings which can have complex inter-relationships that are also described

Recursive Nature of Representation Information Interpreted using Structure Information Semantic Information Other Representation Information * Representation Information 1 1 Structure Information adds meaning to Semantic Information * Other Representation Information

Sample Representation Net

Types of Information Used in OAIS Information Object Content Information Preservation Description Information Packaging Information Descriptive Information...

Content Information The information which is the primary object of preservation An instance of Content Information is the information that an archive is tasked to preserve. Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer The Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., a physical sample, microfilm)

Preservation Description Information Provenance Information Describes the source of Content Information, who has had custody of it, what is its history Context Information Describes how the Content Information relates to other information outside the Information Package Reference Information Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified Fixity Information Protects the Content Information from undocumented alteration

PDI Examples Content Information Type Space Science Data Digital Library Collections Software Package Reference Provenance Context Fixity Object identifier Journal reference Mission, instrume nt, titl e, attribute set Bibliographic description Persistent identifier Name Author/Originator Version number Serial number Instrument description Processing history Sensor description Instrument Instrument mode Decommu tation ma p Software interface specification For scanned collections: metadata about the digitisation process pointer to master version For born-digital publications: pointer to the digital original Metadata about the preservation process: pointers to earlier versions of the collection item change history Revision history License holder Registration Copyright Calibration history Related data sets Mission Funding history Pointers to related docume nts in original environme nt at the time of publication Help file User guide Related software Language CRC Checksum Reed-Solomon coding Digital signature Checksum Authenticity indicator Certif icate Checksum Encryption CRC

Descriptive Information Contain the data that serves as the input to documents or applications called Access Aids. Access Aids can be used by a consumer to locate, analyze, retrieve, or order information from the OAIS.

Packaging Information Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media Examples of Packaging Information include tape marks, directory structures and filenames

OAIS Archival Information Package Package Description derived from Archival Information Package (AIP) delimited by Packaging Information e.g., Information supporting customer searches for AIP e.g., How to find Content information and PDI on some medium Content Information e.g., Hardcopy document Document as an electronic file together with its format description Scientific data set consisting of image file, text file, and format descriptions file describing the other files further described by Preservation Description Information (PDI) e.g., How the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured

AIP detailed view

AIP Types Archival Information Unit (AIU) contains a single Data Object as the Content Object Archival Information Collection (AIC) contains multiple AIPs in its Content Object Each member of an AIC is an AIP containing Content Information and PDI The AIC contains unique PDI on the collection process Archival Information Unit Archival Information Package Archival Information Collection

Package Descriptions and Access Aids Package Descriptions are needed by an OAIS to provide visibility and access to the OAIS holdings Package Descriptions contain 1 or more Associated Descriptions which describe the AIP Content Information from the point of view of a single Access Aid Some example of Access Aids Include: Finding Aids - assist the consumer in locating information of interest Ordering Aids - allow the consumer to discover the cost of and order AIUs of interest Retrieval Aids - enable authorized users to retrieve the AIU described by the Unit Descriptor from Archival Storage

Information Model Summary Presented a model of information objects as containing data objects and representation objects Classified information required for Long-term archiving into 4 classes: Content Information, PDI, Packaging Information and Descriptive Information Described how these classes would be aggregated and related in an AIP to fully describe an instance of Content Information Presented information needed for Access, in addition to that needed for Long-term Preservation Put the Access oriented structures in the context of the other data needed to operate an OAIS

Conclusion OAIS is a reference model NO implementation plan, NO data model, NO specification Provides a framework including terminology, concepts, roles, operations Applicable for all archives and organization Applicable for all kinds of information objects