Archive eXchange Format (AXF)

December 2011

What is AXF?

Archive eXchange Format (AXF) is an open format that supports interoperability among disparate content storage systems and ensures the content's long-term availability no matter how storage or file system technology evolves. It supports interoperability among existing, discrete storage systems irrespective of the operating and file systems used, and it future-proofs digital storage by abstracting the underlying technology.

At the most basic level, AXF is an IT-centric file container that can encapsulate any number and any type of files in a fully self-contained and self-describing package. The encapsulated package contains its own file system, which abstracts the underlying operating system, storage technology, and original file system from the AXF object and its valuable payload. It is like a file system within a file that can store any type of data on any type of storage media.

The Embedded File System Makes All the Difference

This embedded file system approach is AXF's defining attribute. It allows AXF to be both content- and storage-agnostic. In other words, because the AXF object itself contains the file system, it can exist on any generation of data tape, spinning disk, flash, optical media, or other storage technology that exists today or might exist tomorrow.
Because of this neutrality, AXF supports the modern generation of data tape technologies such as LTO5, IBM TS1140 and Oracle T10000C. Because it does not depend on features of the storage technology itself (such as dual-partition tape drives), it supports all legacy data tape technologies as well (any LTO generation, any SAIT generation, any IBM TS11xx drive, any Oracle T10000x, STK 9840, STK 9940, etc.).

What Makes AXF Better?

AXF offers many significant advantages over other formats for long-term storage, protection and preservation, such as:

Encapsulation

AXF is built on a concept similar to TAR, where the individual files and elements that comprise an asset can be collected into a single container or object. This allows applications to treat a complex asset as a single file when copying, moving or deleting it. AXF also layers advanced features on top of this encapsulation, including metadata encapsulation, per-structure checksums to ensure validity, and recovery and index structures that add resiliency, all while preserving advanced CSM features such as timecode-based partial restore.

Scalability

AXF Objects can grow to any size and encapsulate any number of individual files of any size and type. AXF was designed from an IT-centric perspective while also serving the key requirements of industries such as media and entertainment, medical imaging, and GIS. AXF can store assets ranging from kilobytes of data to the largest data sets envisioned today and into the future, bound only by available storage technologies. AXF also supports spanning of objects across media (such as over multiple data tapes), overcoming storage media capacity limitations.
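The idea of spanning an object across media can be illustrated with a little arithmetic. The sketch below is not taken from the AXF specification: the `Segment` type, the `plan_span` function, and the rule of filling media in order are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One contiguous piece of an object placed on one medium (hypothetical type)."""
    medium_index: int   # which medium (e.g. which tape) holds this piece
    object_offset: int  # where this piece starts within the object, in bytes
    length: int         # how many bytes of the object this piece holds


def plan_span(object_size: int, free_space: list[int]) -> list[Segment]:
    """Assign consecutive byte ranges of one object to media, in order.

    Assumption: media are filled first to last, each up to its free space.
    """
    segments = []
    offset = 0
    for index, free in enumerate(free_space):
        if offset >= object_size:
            break
        length = min(free, object_size - offset)
        if length > 0:
            segments.append(Segment(index, offset, length))
            offset += length
    if offset < object_size:
        raise ValueError("not enough free space across the given media")
    return segments
```

For example, `plan_span(2500, [1000, 1000, 1000])` places 1000 bytes on each of the first two media and the remaining 500 bytes on the third.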
Storage Agnostic

AXF is built from the ground up to be fully storage agnostic. This means that an AXF Object is exactly the same whether it resides on spinning disk, flash memory, data tape media or in the cloud, avoiding complexities surrounding file system, storage technology or operating system intricacies. Further, because of its embedded file system approach, AXF will also extend to yet-to-be-discovered storage technologies, as there is no dependency aside from the ability to store structured bits of data in an orderly and predictable fashion.

Designed for Long Term Preservation

AXF includes features necessary for true preservation operations as defined in the OAIS (Open Archival Information System) reference model. For the preservationist community, AXF supports the core OAIS model with built-in features such as fixity (per-file checksums and per-structure checksums), provenance, context, reference, open metadata encapsulation, and access control. These are all key elements of the AXF design, although most are optional so that the format scales down to simpler applications.

Standardization

At NAB 2011, Front Porch Digital (FPD) announced the release of AXF as part of their DIVArchive V7.0 CSM solution and committed to providing their AXF invention, designs and intellectual property to SMPTE for standardization. Work within the SMPTE TC-31FS30 WG Archive eXchange Format (AXF) group continues based on the strawman submission from FPD, with participation from several manufacturers and end users from various industries. We encourage interested parties to visit smpte.org for more information on this important standardization initiative.
Sustainability

As part of the SMPTE mandate, once AXF is standardized it will be maintained and controlled in a centralized repository and versioned responsibly. All specifications, versions, enhancements and modifications to the standard will be published, tracked and made available to all interested parties. For an application vendor to claim AXF compliance, they must conform to the published AXF standards, and it is expected that validation toolkits will be made available to the greater community to assist in future compliance testing.

Resiliency

AXF incorporates resiliency features that make it possible to recover object contents, descriptive metadata, and media catalogs in a multitude of failure and corruption situations. AXF also incorporates comprehensive fixity and error-checking capabilities in the form of multiple per-file and per-structure checksums, which are considered mandatory for modern systems.

Embedded File System

The embedded file system enables AXF to translate between any generic set of files and logical block positions on any storage medium, whether or not the medium has its own file system. This ensures long-term abstraction and isolation from the idiosyncrasies common in the operating system, file system and storage technologies of yesterday, today and the future. AXF overcomes limitations such as filename size constraints, limits on the number of files per folder, and character set constraints that can negatively affect the future recoverability of valuable file-based assets. Because of the targeted simplicity of its embedded file system, AXF ensures accessibility and long-term preservation of these valuable assets.

No Legacy Issues

AXF was invented with an eye to the future but also with due respect for the past and what we have learned from decades of experience. Unlike LTFS, which relies on modern technologies, AXF can
support all future as well as past storage devices transparently, alleviating the need for expensive infrastructure upgrades to leverage its significant benefits. AXF also overcomes the well-documented issues with other legacy formats such as TAR. As new storage devices are envisioned and developed, a pragmatic evolutionary path for AXF can be followed, ensuring historical and future protection and preservation of valued assets.

Self-Describing Objects and Media

Unlike other approaches such as LTFS and TAR, AXF includes self-describing characteristics at both the object level and the media level. This means that once prepared for AXF, all media and objects can be fully indexed and recognized by any other system that also comprehends AXF. Further, because AXF media and objects are fully self-describing, foreign systems can receive them and have enough information to rebuild asset databases with media, object and file information, including any type and any amount of structured and/or unstructured metadata.

Transportability

AXF Objects and AXF media can be moved between any systems that comprehend AXF, in a fashion very similar to that offered by LTFS and TAR but without their various limitations. AXF offers the same level of content transportability while adding significant functional and preservation-centric benefits.

Application Driven

As AXF is based on a maintained specification, application builders can simply refer to the most recent published documentation to design and develop their own specific takes on the format. This means users can ensure their long-term storage, archive and preservation goals are embodied when selecting an application to create and maintain their valuable assets.
These are some of the main benefits of AXF. They highlight its ability to support large-scale archive and preservation systems as well as simple, standalone applications in any industry.

How Does AXF Work?

AXF is designed so that each AXF Object (or package) is composed of four main components, regardless of what technology is used to store it (spinning disk, flash media, data tape without a file system, data tape with a file system, etc.). These are:

1. Each AXF Object begins with an AXF Object Header, a structure containing descriptive XML metadata such as the AXF Object's unique identifier (UUID and UMID), creation date, object provenance, and file-tree information including file permissions, paths, etc.

2. Following the AXF Object Header is any number of optional AXF Generic Metadata packages. These are self-contained, open metadata containers in which applications can include AXF Object-specific metadata. This metadata can be structured or unstructured, open or vendor-specific, binary or XML.

3. The next part of the AXF Object construct is the AXF File Payload, the actual byte data of the files encapsulated in the object. The payload consists of any number of triplets: File Data + File Padding + File Footer. File padding, which ensures alignment of all AXF Object elements on storage medium block boundaries, is key to the AXF specification. The File Footer structure contains the exact size of the preceding file, along with a file-level checksum designed to be processed on the fly by the application during restore operations with little or no overhead. These file footers add to the resiliency of AXF, as they can be used to recover file payload data even if the AXF Object Header and Footer structures are missing or corrupt.

4. The final portion of an AXF Object is the AXF Object Footer, which repeats the information contained in the AXF Object Header and adds information captured during the AXF Object's
creation, including per-file checksums and precise file size and structure block positions. The AXF Object Footer is important to the resiliency of the AXF specification because it allows efficient re-indexing by foreign systems when the media content is not previously known, enabling media transport between systems that follow the AXF specification.

Because this standardized AXF Object construct abstracts the underlying complexities of the storage media itself, simple access to the content is ensured regardless of how technology evolves.

Additional Structures Enhance Storage Media

Although the AXF Object itself is always exactly the same regardless of the underlying storage technology, some additional structures are added to AXF media to ensure its recoverability and self-descriptive nature. For example, when used with linear data tape (typical in large-scale archives today), an AXF implementation includes three additional structures that incorporate key self-describing characteristics on the medium itself, ensuring recoverability and transportability:

1. The first structure to appear on the medium is an ISO/ANSI standard VOL1 volume label. This is included for compatibility with legacy applications, to ensure they do not erroneously handle AXF-formatted media, and to signal to applications that do understand AXF that they can proceed with accessing the objects contained on the medium.

2. The second structure is the Medium Identifier, which contains the AXF volume signature, a UUID and label for the media, and some other information about the storage medium itself. The implementation of the Medium Identifier differs slightly depending on whether the
storage medium is linear or nonlinear, and whether or not it includes a file system, but the overall structures are fully compatible.

3. The third structure is the AXF Object Index, an optional structure that assists in the rapid recovery of AXF-formatted media by foreign systems. Information contained in this structure is sufficient to recover and rapidly reconstruct the entire catalog of AXF Objects on the storage medium. Where the application has not maintained the optional AXF Object Index structures, the contents of each AXF Object can still be reconstructed by simply processing each AXF Object Footer structure, adding to the resiliency of the format.

Who is the Ideal AXF User?

Anyone. AXF was designed from an IT-centric perspective and developed to meet a broad spectrum of user needs, from those accessing petabytes of data in a high-performance environment to those looking to simply encapsulate a few files and send them to a friend via email. AXF scales from the most basic to the most complex environments and applications, accommodating an operation of any size.

In all cases, AXF offers an abstraction layer that hides the complexities of the storage technology from higher-level applications, while also offering fundamental encapsulation, provenance, fixity, portability, and preservation characteristics. In addition, the same self-describing AXF format can be used interchangeably on all current storage technologies, such as spinning disk, flash media, and data tape from any manufacturer, now and into the future.

The Bottom Line

AXF has the ability to support interoperability among systems, help ensure long-term accessibility to valued assets, and keep up with evolving storage technologies.
It offers profound present and future benefits for any enterprise that uses media, from heritage institutions to schools to broadcasters to
simple IT-based operations, and is well on its way to becoming the long-awaited, worldwide, open standard for file-based archiving, preservation, and exchange.

More information on AXF and standards-body activities is available at www.openaxf.org and www.smpte.org.
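The File Payload triplet described under "How Does AXF Work?" (File Data + File Padding + File Footer) can be illustrated with a minimal sketch. It shows only the two ideas stated above: padding aligns every element on a block boundary, and a footer carrying the file size and a checksum is enough to recover a file without the object header. The 512-byte block size, the `FTR1` marker, the binary footer layout, and the use of SHA-256 are all assumptions for illustration. They are not the on-media format defined by the AXF specification.

```python
import hashlib
import struct

BLOCK_SIZE = 512          # assumed block size, for illustration only
FOOTER_MAGIC = b"FTR1"    # hypothetical marker, not part of AXF
# Hypothetical footer layout: marker, exact file size, SHA-256 of the file data.
FOOTER_STRUCT = struct.Struct(">4sQ32s")


def pad_to_block(length: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how many padding bytes bring `length` up to a block boundary."""
    return (-length) % block_size


def pack_triplet(data: bytes) -> bytes:
    """Build File Data + File Padding + File Footer, each ending on a block boundary."""
    padding = b"\x00" * pad_to_block(len(data))
    footer = FOOTER_STRUCT.pack(FOOTER_MAGIC, len(data), hashlib.sha256(data).digest())
    footer += b"\x00" * pad_to_block(len(footer))
    return data + padding + footer


def recover_files(payload: bytes) -> list[bytes]:
    """Recover files using only the footers, as if the object header were lost."""
    recovered = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        block = payload[offset:offset + FOOTER_STRUCT.size]
        if len(block) < FOOTER_STRUCT.size:
            continue
        magic, size, digest = FOOTER_STRUCT.unpack(block)
        if magic != FOOTER_MAGIC:
            continue
        # The footer gives the exact file size, so the padded data span is known.
        start = offset - (size + pad_to_block(size))
        if start < 0:
            continue
        data = payload[start:start + size]
        # The checksum both verifies fixity and rejects accidental marker matches.
        if hashlib.sha256(data).digest() == digest:
            recovered.append(data)
    return recovered
```

Because every element starts on a block boundary, the recovery loop only needs to inspect the first bytes of each block rather than search byte by byte.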