AXF Archive eXchange Format: Interchange & Interoperability for Operational Storage and Long-Term Preservation
Report to SMPTE Washington DC Section, Bits By the Bay Conference, 21 May 2014
S. Merrill Weiss / Merrill Weiss Group LLC
Chairman, TC-31FS30 AXF Working Group
Background: History
Proprietary Systems
  Formats on Media
  Interface Protocols
Same System Type Required for Recovery
Media Migration Not Easy
Danger of Orphaned Archives If System Support Ended
High Costs of Implementation & Operation
  Individualized System Integration Requirements
  Transfer Costs Resulting from Inability to Interchange Media
Objectives: User Requirements
Movement of Material Between Different Archive Systems
  Between Different Operations of Same Company
  Between Companies
Retrieving Files & Metadata from Media Into Different Systems
Flexibility in Changing Archive Management Vendors
  No Loss of Data or Metadata When Changing Vendors
No Requirement for Native Use of Format
Objectives: User Requirements (cont'd.)
Archive Investment Protection
  Ability to Retrieve Files & Metadata In Absence of Creating System
  Ability to Read Media Into Other Archive Systems
  Ability to Retrieve Media Contents w/ Simple Utility on Many OSs
Automation Metadata Support
  Inclusion of Metadata for Systems Interacting w/ Archive Systems
    E.g., Traffic, Automation, Color Correction, Editing Systems
  Allow Importing Discovered Items into Databases
Objectives: Additional Requirements
Providing Unlimited Storage Capability
  File Size, Number of Files, Number of Media, etc.
Providing an Implementation Strategy
Providing Support for All Types of Media
  Current & Future
Providing for Storage on More Than One Medium
  Spanning
Providing for Updates/Changes to Archive Objects over Time
Providing for Information Recovery from Damaged Media
Providing for File Exchange via Communications Channels
  Enables Use of Cloud Storage
Objectives: Underlying Assumption
Same Type of Media Supported on Both Systems
  Source of Medium & Recipient of Medium
  Drives, Drivers, & Control Software
Implementation Strategy
Large Existing Installed Base & Large Archive Inventory
Initially, Explicit Export & Import
  Export of Archive Objects to Media Specifically for Interchange
  Importing of Archive Objects into Receiving Systems
    By Translating to Native Formats of Receiving Systems
    By Inclusion of Interchanged Objects in Databases
Later, Adoption of Format as Native in Archive Systems
  Eliminating Need for Separate Export/Import Steps
  Permitting Direct Transfer of Media Between Systems
Implementation Strategy
Data Recovery Utilities Available Through SMPTE
  To Be Contributed by WG Participants
  For Wide Variety of Operating Systems
  For All Media Types
  Permitting Data Recovery without an Archive System
  To Help Ensure Access to Archived Files & Associated Metadata
SMPTE Standard
SMPTE ST 2034
Part 1 Written & In Ballot
  Result of Years of Work & Refinement
  Uses XML Schema to Define Most Structures
  Expected to Be Completed This Year
Part 2 Will Be Standard on Use of AXF Schema Upstream
  Providing Mechanism for Communicating Metadata to Archive Systems
  Most Major Functions Defined
  Writing of Document Begun
How Are Digital Assets Stored?
Steps Required to Ensure Long-Term Accessibility
Valuable File-Based Assets Stored in Digital Archives in All Industries
Key Goals of the Ideal Storage Format Ensuring Long-Term Accessibility
  Self-Describing Assets & Self-Describing Storage Media
  Encapsulation to Maintain Important Metadata & File Relationships
  Scalability for Any Number of Elements of Any Size & Type
  Standardized Regardless of Storage Media Technology
  Transportability & Compatibility between Systems
  Preservation (OAIS) Features such as Fixity, Provenance, etc.
Existing Storage Choices: TAR
Tape ARchive Format (Originally from UNIX)
  Has Been Around for Many Decades
No Truly Universal TAR Standard
  Rather, Many Customized Implementations
TAR Is a Legacy Format
  Cannot Support Many Core Functionalities Required in M&E Space
TAR Misses Many of Storage Format Goals Outlined
Existing Storage Format Choices: LTFS
Linear Tape File System Is Simple File System for Linear Data Tape
  Makes Data Tapes Appear as Removable Storage
Currently No Standards Bodies Have Documented LTFS
  Often Mistakenly Referred to as a Standard
  Being Documented by Storage Networking Industry Association
Considered By Itself, Has Some Significant Limitations
  With Respect to Storage Format Goals Outlined
By Itself, Is Very Useful for Physical Transport of Content
  But Not for Long-Term Storage or Preservation
  Why Not?
Existing Storage Format Choices: LTFS (cont'd.)
No Media Encapsulation
  Relies on Simple Folder Hierarchies to Form Important Asset Relationships
  Lacks Context
Does Not Scale Well
  Due to Lack of Support for Spanning Across Storage Media
  A Significant Problem in M&E
Only Supports Modern Data Tape Technologies
  Is Not Applicable to Any Other Storage Technologies
Existing Storage Format Choices: TAR & LTFS
Neither Achieves 100% of Long-Term Storage & Preservation Goals
What Other Choices Are There?
Caution: Following Detailed Descriptions Are Not Adopted Yet
  Some Details Still May Change in Response to Ballot Inputs
AXF Archive eXchange Format
AXF: A File Collection Wrapper
  Essentially an Advanced ZIP
  Can Encapsulate Any Number of Files of Any Type & Size
Brings Universal Transport & Interoperability to Storage
  At Same Level as MXF Brings to Content
Designed to Support All Storage Technologies
  Now and into the Long-Term Future
IT Centric and Not Tied to Media Applications Alone
Being Standardized by SMPTE
AXF Layered Context
Considered within an IT System Stack
[Diagram: Server/Storage Stack with AXF Support. Labels: AXF-Aware Application; Archive eXchange Format (AXF), Including Internal File System; Block Level Addressing; File System; Operating System; Hardware Abstraction Layer; Driver; Physical Drive; Medium (without File System); Medium (with File System)]
AXF Features
Unlimited Storage Support
  Any Number of Files, Any Size of Files, Plus Media Spanning
Resilience to Media Damage and Corruption
  Redundancy in All Structures
  Payload Independently Recoverable
Support for Media With & Without a File System
  Raw Data Tape, LTFS Data Tape, Spinning Disk, Flash Media, Optical Storage, Holographic Storage, & Anything in Future
Support for Any Operating System and/or File System
  Embedded File System Abstracts Underlying Systems
AXF Features
Self-Describing Objects & Self-Describing Media
  Enable Simple Transport of Objects & Media between Systems
Object Versioning & Collection Support
  Support Complex Relationships between Objects
  Support Additions, Updates, etc.
Support for All File Types
  Not Just Media Files
IT Centric Implementation
  Based on Experience Handling Big Data in M&E
AXF Features
Streaming & File-Based Asset Transport & Delivery
  Support for Streaming De-Encapsulation
  Support for In-Path Checksums for Structures & Files
  Support for Cloud Storage & Delivery Applications
Streaming Design Maximizes Transfer Speeds To/from All Media Types
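The in-path checksum idea on this slide can be illustrated with a minimal Python sketch: the digest is updated as each chunk streams past, so no second read of the data is needed. The function name, chunk size, and the choice of SHA-256 are assumptions made for illustration; the slides do not say which checksum algorithms AXF uses.

```python
import hashlib
import io


def stream_with_checksum(src, dst, chunk_size=1 << 20, algorithm="sha256"):
    """Copy src to dst in chunks, hashing every chunk as it passes through.

    Hypothetical helper: name, chunk size, and algorithm are illustrative
    assumptions, not taken from the AXF draft.
    """
    digest = hashlib.new(algorithm)
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)   # checksum computed in the transfer path
        dst.write(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


# Usage with in-memory streams standing in for a medium and a network sink.
payload = b"example payload bytes" * 1000
source, sink = io.BytesIO(payload), io.BytesIO()
size, checksum = stream_with_checksum(source, sink, chunk_size=4096)
```

A receiver can run the same function on the incoming stream and compare the resulting digest with the one carried alongside the file.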
What Is AXF?
[Diagram: AXF Object. Labels: Asset Components; Metadata (Structured, Unstructured, Proprietary, Open); File Payload (A, A, V, V, Ancillary Files); Preservation Elements; File System; Access Control; Provenance; Fixity; Context; Reference; Universal Storage-Agnostic File System]
Storage Technology & File Systems
[Diagram: AXF Object shown with the following storage technologies and file systems]
Spinning Disk (NTFS, FAT, etc.)
Solid State Disk (NTFS, FAT, etc.)
Flash Media (NTFS, FAT, etc.)
Data Tape (Block-Based, LTFS, etc.)
Blu-Ray (BDFS, UDF, etc.)
DVD (UDF, etc.)
Holographic (UDF, etc.)
Future Storage Technologies
AXF on Storage Medium
[Diagram: layout of AXF structures on a storage medium]
Medium: AXF Medium Identifier, AXF Object 1, AXF Object 2, ..., AXF Object N
AXF Object: AXF Object Header, Metadata Container, Metadata Container, File 1, AXF File Footer, File 2, AXF File Footer, ..., File N, AXF File Footer, AXF Object Footer
Legend Labels: Metadata Payload, File Payload
Inside an AXF Object
Binary Structure Containers Define Elements
Inside an AXF Object (2)
File Tree Acts as Light-Weight File System
  Contained in Object Header & Footer
  Identifies Payload Files, Locations, Sizes, & Other Characteristics
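To make the file-tree idea concrete, here is a small Python sketch of an index that records each payload file's path, location, and size, and is then used to pull one file back out of the payload. The class and field names (FileEntry, path, offset, size) are hypothetical; the real structures are defined by the AXF XML schema, which these slides do not reproduce.

```python
from dataclasses import dataclass
import io


@dataclass(frozen=True)
class FileEntry:
    # Hypothetical field names for illustration, not the AXF schema.
    path: str
    offset: int   # byte position of the file within the payload
    size: int     # length of the file in bytes


def build_object(files):
    """Concatenate file payloads and return (payload_bytes, file_tree)."""
    payload = io.BytesIO()
    tree = []
    for path, data in files.items():
        tree.append(FileEntry(path, payload.tell(), len(data)))
        payload.write(data)
    return payload.getvalue(), tree


def read_file(payload, tree, path):
    """Locate a file through the tree and slice it out of the payload."""
    for entry in tree:
        if entry.path == path:
            return payload[entry.offset:entry.offset + entry.size]
    raise KeyError(path)


files = {"video/clip.mxf": b"VVVV", "audio/left.wav": b"AA", "notes.txt": b"n"}
payload, tree = build_object(files)
audio = read_file(payload, tree, "audio/left.wav")
```

Because the tree alone is enough to find every file, keeping a copy in both the object header and the object footer, as the slide describes, means that damage to one copy need not make the payload unreadable.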
Spanned Sets
Medium 1 (Initial Medium of a Spanned Set)
  Medium Identifier: Medium.UUID = Um1
  Object Header: Object.UUID = Uo1
  Medium Content
  Fragment Footer: FragmentPairUUID = Uf1, FragmentNumber = 1, NextMediumUUID = Um2
Medium 2 (Intermediate Medium of a Spanned Set)
  Medium Identifier: Medium.UUID = Um2
  Object Header: Object.UUID = Uo1
  Fragment Header: FragmentPairUUID = Uf1, FragmentNumber = 2, PreviousMediumUUID = Um1
  Medium Content
  Fragment Footer: FragmentPairUUID = Uf2, FragmentNumber = 2, NextMediumUUID = Um3
Medium 3 (Final Medium of a Spanned Set)
  Medium Identifier: Medium.UUID = Um3
  Object Header: Object.UUID = Uo1
  Fragment Header: FragmentPairUUID = Uf2, FragmentNumber = 3, PreviousMediumUUID = Um2
  Medium Content
  Object Footer: Object.UUID = Uo1
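The linkage in the spanned-set example can be sketched in Python as a chain walk: start at the medium with no PreviousMediumUUID, follow NextMediumUUID, and check that each fragment footer pairs with the next fragment header. The SpannedMedium class and order_spanned_set function are hypothetical names; only the UUID fields and sample values come from the slide, and the validation rules are an assumption about how a reader might use them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpannedMedium:
    # Hypothetical container for the identifiers shown on the slide.
    medium_uuid: str
    object_uuid: str
    fragment_number: int
    header_pair_uuid: Optional[str] = None      # FragmentPairUUID in Fragment Header
    previous_medium_uuid: Optional[str] = None
    footer_pair_uuid: Optional[str] = None      # FragmentPairUUID in Fragment Footer
    next_medium_uuid: Optional[str] = None


def order_spanned_set(media):
    """Return medium UUIDs in read order, validating each link in the chain."""
    by_uuid = {m.medium_uuid: m for m in media}
    starts = [m for m in media if m.previous_medium_uuid is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one initial medium")
    ordered = [starts[0]]
    while ordered[-1].next_medium_uuid is not None:
        current = ordered[-1]
        nxt = by_uuid.get(current.next_medium_uuid)
        if nxt is None:
            raise ValueError(f"missing medium {current.next_medium_uuid}")
        consistent = (
            nxt.object_uuid == current.object_uuid
            and nxt.header_pair_uuid == current.footer_pair_uuid
            and nxt.previous_medium_uuid == current.medium_uuid
            and nxt.fragment_number == current.fragment_number + 1
        )
        if not consistent:
            raise ValueError(f"inconsistent link {current.medium_uuid} -> {nxt.medium_uuid}")
        ordered.append(nxt)
    return [m.medium_uuid for m in ordered]


# The three media from the slide, deliberately supplied out of order.
media = [
    SpannedMedium("Um3", "Uo1", 3, header_pair_uuid="Uf2", previous_medium_uuid="Um2"),
    SpannedMedium("Um1", "Uo1", 1, footer_pair_uuid="Uf1", next_medium_uuid="Um2"),
    SpannedMedium("Um2", "Uo1", 2, header_pair_uuid="Uf1", previous_medium_uuid="Um1",
                  footer_pair_uuid="Uf2", next_medium_uuid="Um3"),
]
read_order = order_spanned_set(media)
```

Since every medium names its neighbours, a system handed the media in any order can work out the read sequence and detect a missing member of the set.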
Collected Sets
Anchor Object + Subsequent Object(s) -> Compile per Instructions -> Product Object
Anchor Object (Initial Object of a Collected Set)
  Medium Identifier: Medium.UUID = Um1
  Object Header: Object.UUID = Uo1, CollectedSetUUID = Uo1, CollectedSetSequence = 1
  File: Video1 (processing=default)
  File: Audio1 (processing=default)
  File: Audio2 (processing=default)
  Object Footer: Object.UUID = Uo1, CollectedSetUUID = Uo1, CollectedSetSequence = 1
Subsequent Object (Final Object of a Collected Set)
  Medium Identifier: Medium.UUID = Um1
  Object Header: Object.UUID = Uo2, CollectedSetUUID = Uo1, CollectedSetSequence = 2
  File: Audio1 (processing=replace)
  File: Audio2 (processing=delete)
  File: Closed Caption1 (processing=add)
  Object Footer: Object.UUID = Uo2, CollectedSetUUID = Uo1, CollectedSetSequence = 2
Processing Instructions: ADD, REPLACE, DELETE
  New Files Carried with ADD & REPLACE
  No Files Carried with DELETE
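The compile step for a collected set can be sketched in Python as replaying each object's processing instructions in sequence order. The function name, the tuple layout, and the strict error handling (for example, rejecting a REPLACE of a file that is not present) are assumptions for illustration; the slide gives only the instruction names and the example files.

```python
def compile_collected_set(objects):
    """Build the product object's file map from a collected set.

    objects: list of (sequence, entries); entries are (name, processing, data).
    Hypothetical sketch: the error rules below are assumed, not from the draft.
    """
    product = {}
    for _, entries in sorted(objects, key=lambda obj: obj[0]):
        for name, processing, data in entries:
            if processing in ("default", "add"):
                if name in product:
                    raise ValueError(f"{name} already present")
                product[name] = data
            elif processing == "replace":
                if name not in product:
                    raise KeyError(name)
                product[name] = data      # new file carried with REPLACE
            elif processing == "delete":
                del product[name]         # no file carried with DELETE
            else:
                raise ValueError(f"unknown instruction {processing!r}")
    return product


# The anchor and subsequent objects from the slide (payloads are placeholders).
anchor = (1, [("Video1", "default", b"video-v1"),
              ("Audio1", "default", b"audio1-v1"),
              ("Audio2", "default", b"audio2-v1")])
subsequent = (2, [("Audio1", "replace", b"audio1-v2"),
                  ("Audio2", "delete", None),
                  ("Closed Caption1", "add", b"cc-v1")])
product = compile_collected_set([subsequent, anchor])
```

For the slide's example the product object holds Video1 unchanged, the replaced Audio1, and the added Closed Caption1, while Audio2 is gone.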
AXF Features Summary
Objects Scale to Any Size & Encapsulate Any Number of Files
  With Full Support for Media Spanning
No Need to Upgrade Existing Storage Infrastructures
Ensures Long-Term Compatibility & Resiliency
  With Self-Describing Features for Both AXF Objects & AXF Media
Overcomes All Technical, Operational, & Functional Limitations of TAR & LTFS
  Works Harmoniously with LTFS
An IT Centric Implementation Not Limited to Media Files Alone
  Supports Documents, Imaging Data, etc.
Satisfies All of Ideal Storage Format Goals Outlined