CCS Content Conversion Specialists. METS / ALTO introduction



Similar documents
Overview of NDNP Technical Specifications

Newspaper Digitization Brief Background

METADATA STANDARDS AND GUIDELINES RELEVANT TO DIGITAL AUDIO

WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA?

How To Improve Mets

TRANSKRIBUS. Research Infrastructure for the Transcription and Recognition of Historical Documents

Creating an EAD Finding Aid. Nicole Wilkins. SJSU School of Library and Information Science. Libr 281. Professor Mary Bolin.

Best Practices for Structural Metadata Version 1 Yale University Library June 1, 2008

11 ways to migrate Lotus Notes applications to SharePoint and Office 365

The Rutgers Workflow Management System. Workflow Management System Defined. The New Jersey Digital Highway

Express Technology. Maintenance Solutions Express Technology Inc.

MBooks: Google Books Online at the University of Michigan Library

Islandora: An Open Source Institutional Repository Solution. Consortium of MnPALS Libraries Annual Meeting April 2014

erecording Best Practices for Recorders

How to add Financial Dimensions to Default cubes

HP ThinPro. Table of contents. Enabling RemoteFX for RDP. Technical white paper

PDF Primer PDF. White Paper

Internet Technologies for Digital Libraries

Submission Information Package (SIP) Specification Version 1.0 DRAFT December 19, 2001

Best practices for producing high quality PDF files

Chapter 5: The DAITSS Archiving Process

Sample Customer Reports

Frequently Asked Questions for: Settlement Disputes System Submittal Process

STANDARDS FOR ONLINE COURSE MATERIALS

Creating Content for ipod + itunes

Billy Chi-hing Kwan Associate Museum Librarian/Systems. The Image Library

Standards Development. PROS 14/00x Specification 3: Long term preservation formats

Kuali Financial System Interface Specification for Electronic Invoice Feed

Solutions Lab. AXIS Barcode Reader Beta 0.3

bbc Adobe PDF/XML Architecture - Working Samples

VThis A PP NOTE PROCESSING P2 MEDIA WITH FLIPFACTORY

Adobe Acrobat 9 Pro Accessibility Guide: Creating Accessible PDF from Microsoft Word

Business Portal for Microsoft Dynamics GP. Requisition Management User s Guide Release 10.0

A Adobe RGB Color Space

- a Humanities Asset Management System. Georg Vogeler & Martina Semlak

Digital Asset Management in Museums

New, changed, or deprecated features

How To Use The Mets Document In A Webmail Document In An Html File On A Microsoft Powerbook (Html) On A Macbook 2 (Html).1.5 (Html2)

Lou Burnard Consulting

Performance Testing Web 2.0

How To Manage Your Digital Assets On A Computer Or Tablet Device

Follow up Information. Caring for Digital Materials: Metadata, finding aids, and digital asset management. Danielle Cunniff Plumer, instructor

IBM Unica emessage Version 8 Release 6 February 13, User's Guide

PEOPLESOFT MOBILE INVENTORY MANAGEMENT FOR THE HEALTHCARE INDUSTRY

Using Dublin Core for DISCOVER: a New Zealand visual art and music resource for schools

PANAVISION Panalog After Effects Plugin

Preservation Handbook

An Oracle White Paper May Creating Custom PDF Reports with Oracle Application Express and the APEX Listener

Keep it Simple... 7 Transformation-based Development (2013 and Beyond)...7 Less Customization and More Innovation...8 Time to Market...

SAP NetWeaver MDM 5.5 SP3 SAP Portal iviews Installation & Configuration. Ron Hendrickx SAP NetWeaver RIG Americas Foundation Team

By following these guidelines, the illustrator will help production proceed more efficiently.

Cisco TelePresence VCR Converter 1.0(1.8)

Should Costing Version 1.1

The A-Z of Building a Digital Newspaper Archive: A Case Study of the Upper Hutt City Leader

Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM)

REGULATIONS COMPLIANCE ASSESSMENT

Archival Data Format Requirements

WebSphere Business Monitor

EDITING YOUR THESIS Some useful pointers. Editing is all about making it easy for the reader to read your work.

ELFRING FONTS UPC BAR CODES

Navigating Microsoft Word 2007

File Formats for Electronic Document Review Why PDF Trumps TIFF

Extender for HDMI 1.3 over CAT6. GTV-HDMI1.3-CAT6 User Manual.

E-Content Service Group Virtual Meeting. Digital Preservation: How to Get Started

All You Wanted To Know About the Management of Digital Resources in Alma

PDF/A the standard for long-term archiving

City of Mercer Island

Consumer Trend Research: Quality, Connection, and Context in TV Viewing

Avature CRM. Get Engaged to Talent

Introduction to TWAIN Direct (DRAFT COPY) December 2, 2014 Revision 0.1

Making Structured MS Word 2007 or 2010 documents from Seeing Ear.txt files

STORRE: Stirling Online Research Repository Policy for etheses

[MS-ASMS]: Exchange ActiveSync: Short Message Service (SMS) Protocol

Scanning Guidelines. Records Management

Motorola 8- and 16-bit Embedded Application Binary Interface (M8/16EABI)

Oracle Database 10g: Building GIS Applications Using the Oracle Spatial Network Data Model. An Oracle Technical White Paper May 2005

UNIMARC, RDA and the Semantic Web

ACE: Dreamweaver CC Exam Guide

Quick installation guide for the Vista Quantum QNVR Network Video Recorder

Whitepaper. NVIDIA Miracast Wireless Display Architecture

Using the Acrobat tab in Microsoft Word: Setting PDF Preferences

Geospatial Data Stewardship at an Interdisciplinary Data Center

BASIC VIDEO EDITING: IMOVIE

Transcription:

CCS Content Conversion Specialists METS / ALTO introduction

Why would I use METS and ALTO?! I digitize lots of different items and each type is digitized to a different format. Some are Word, some are PDF, some are XML, some are just JPG I need it all to be the same, but what is the best way?! I d like to offer full text search to the scientists and researchers in the complete collection, so I need to build a text index of the complete collection and I might want to change the presentation system in three years, so I need the source data in a non-proprietory format! I need to organize my digitization project accordingly to long term preservation standards how long will PDF exist? I might need something more robust

What is METS?! METS -> Metadata Encoding and Transmission Standard! Established in 2001! XML based open standard! Schema is hosted at Library of Congress (LOC)! Maintained by METS Editorial Board! Current version: 1.10! Usesd for longterm preservation

What does METS do?! METS files are used to describe a digital object! printed media (book, newspaper, journal), audio or video media! any other item! METS usually embeds different kinds of metadata / metadata standards! Descriptive metadata (DC, MODS, MARC21)! Administrative metadata (Mix, PREMIS, )! May contain structural information! Physical structure (page based / file based)! Logical structure (chapter based / article based / track based)! May link to any other digital objects! Image files, Audio / video files, Text files! External metadata objects

What is ALTO?! ALTO > Analyzed Layout and Text Object! XML based open standard! Schema is hosted at Library of Congress (LOC)! Maintained by ALTO Board! Current version: 2.1 (Draft 3.0)

What does ALTO do?! Contains the content of a single page! Describes the layout of a printed page to re-build the original page! Describes the styles, layout and block type information! May contain tags which contain more information about content (e.g. named entities) ALTO

Why would you use XML standards?! Fully documented XML format! Can be used by any IT party later on! Can be transformed to other formats in the future (for longterm preservation)! Readable by humans

What are your benefits using METS/ALTO?! It is the current industry standard for digitization used by hundreds of libraries and content providers! The long-term sustainability of your digital objects is greatly enhanced! Supports article and chapter segmentation! You can handle objects in an easy way and exchange them with other parties! You can create PDF, EPUB, DAISY and other formats from METS/ ALTO! It is constantly maintained and developed (hosted by the Library of Congres)

How does a METS file looks like? METS METS consists of five sections plus the METS header metshdr METS document header dmdsec descriptive metadata section amdsec administrative metadata section filesec file inventory section structlink structural map linking structmap structural map

What information include the METS sections?! dmdsec Descriptive Metadata Section, per logical unit " title and author, publishing date, language, copyright details, etc! amdsec Administrative Metadata Section, per image " resolution, bit depth, dimensions, source file name, scanner name, etc! filesec File Inventory Section " File inventory and referencing (pointers to each image file, each ALTO file, derivatives)! StructLink Structural Link " Describes the logical structure of a complex digital object! StructMap StructuralMap " Decribes page by page the structure of the digital object

Summary METS and ALTO! Established for the description of digitization of printed material! The idea was to split the descriptive information and the content itself! In order to handle objects in an easy way! If all data would be in one XML file (e.g. TEI) the XML file would be huge! METS and ALTO are maintained and continuously developed by libraries (hosted at Library of Congress)! METS/ALTO data can be exchanged with other parties! Many vendors are offering products for handling METS/ALTO data! You can make PDF, EPUB, DAISY and other formats from METS/ALTO

More Information! http://www.loc.gov/standards/mets/! http://www.loc.gov/standards/alto/! http://en.wikipedia.org/wiki/alto_(xml)! http://www.veridiansoftware.com/knowledge-base/metsalto/! http://www.nla.gov.au/ndp/project_details/documents/ ANDP_Use_of_METSv2.pdf

Contact CCS Content Conversion Specialists GmbH Weidestr. 134 22083 Hamburg Germany dwsupport@content-conversion.com www.content-conversion.com

Disclaimer All of the information in this document is the property of CCS Content Conversion Specialists GmbH (CCS). It may NOT, under any circumstances, be distributed, transmitted, copied, or displayed without the written permission of CCS. The information contained in this document has been prepared for the sole purpose of providing information about theme described in the following title. The material herein contained has been prepared in good faith; however, CCS disclaims any obligation or warranty as to its accuracy and/or suitability for any usage or purpose other than that for which it is intended. CCS Content Conversion Specialists GmbH, 2012