Mass Digitization of Manuscripts and Rare Books: Challenges and Experiences at Bavarian State Library

Mass Digitization of Manuscripts and Rare Books: Challenges and Experiences at Bavarian State Library
Dr. Markus Brantl

Agenda
1. The Bavarian State Library (BSB)
   1. Institute for Book and Manuscript Conservation (IBR)
   2. Munich Digitization Center (MDZ)
2. Mass digitization and BSB's digitization strategy
3. The digitization process of manuscripts and rare prints
4. Project examples: robot scanners and hand-operated book scanners in 16th-century digitization projects
5. Throughput

The Bavarian State Library (1)
Founded in 1558
European universal library and international research library of world renown
Central regional and archival library of Bavaria (legal deposit since 1668)
713 employees; annual budget: EUR 48.2 million
9.5 million volumes; 55,000 current periodicals
Acquisitions per year: 140,000 volumes
Open daily from 8 a.m. to midnight (112 hours per week)
Visits to the General Reading Hall: 1.1 million (2009)
Loans: 1.9 million (2009); document delivery: 400,000 (2009)

The Bavarian State Library (2)
92,000 medieval manuscripts (No. 4 worldwide)
20,000 incunabula (No. 1 worldwide)
140,000 16th-century rare books (No. 1 in Germany)

Institute of Book and Manuscript Conservation (IBR)
Founded in 1963
Staff: 16
Focus on preventive conservation
Training of conservators: bachelor's and master's programs

The Munich Digitization Center (MDZ)
National competence center for digitization technology and workflows
More than 100 projects since 1997
Mass digitization of 16th-century books with state-of-the-art technology (scan robots)
Long-term preservation in cooperation with the Leibniz Supercomputing Centre (LRZ)
Staff, including the Scanning Center: 45, mostly third-party funded

MDZ homepage: collection of Mss. graec. available

Mass digitization
Production of more than a million pages? Volumes?
Definition today: production of more than a million pages within a limited time, with different levels of indexing (shallow to deep)

BSB's strategy for mass digitization
Objective: to digitize all copyright-free BSB holdings (~1.2 million objects) and make them accessible on the Internet free of charge

How?
Third-party funds: materials from the 6th-16th centuries (manuscripts, incunabula, special collections)
Public-private partnership: 17th-19th centuries
Third-party funds: 20th-21st centuries

Current projects
EU project "Europeana Regia": collaborative initiative between European libraries for the digitization of royal manuscripts in Carolingian and Renaissance Europe; funded by the EU
  BSB participates with 116 delicate manuscripts, ca. 42,000 pages
  Term: 2010-2012
DFG-funded project "Digitization of the BSB Incunables"
  Ca. 9,000 titles, 1.8 million pages
  Term: 2008-2011
DFG-funded project "VD16": BSB books printed in the 16th century
  37,000 titles, 7.5 million pages
  Term: 2007-2013
  Operated with ScanRobots
In comparison: BSB's public-private partnership with Google
  More than 1 million books (in less than 10 years)
  More than 250 million pages
  Contract signed in February 2007

The digitization process of manuscripts and rare prints
1. Preparation
2. Image capture
3. Workflow and indexing
4. Storage and digital long-term preservation
5. Access

Preparation: cooperation between the IBR and the MDZ
Conservational checks at the shelves
Transport to the in-house Scanning Center
Selection of scanners
Training of scan staff in the gentle handling of rare books
Provision of tools
Assistance by conservators when scanning sensitive and high-value books

Handling of the originals
Manuscripts
Incunabula
Rare books
Maps
Special materials on different writing materials and in various sizes

Handling of delicate books: digitization of the Fugger Ehrenbuch
Two conservators and one scan operator

Handling: materiality
Inflexible paper
Paper distortions
Book spine
Difference between papermaking and printing process, by printing process, or tight stapling
Size and thickness
Stapling, gutter, back gluing

Handling: the opening angle
120° opening angle: spine complex stretched but tolerable
Same book at a 180° opening angle: spine complex (endbands, sewing, spine lining) extremely overstretched, covering leather detached
Not allowed at BSB!

Handling: no glass plate
70% of manuscripts and rare books cannot be opened at 180° (which reduces throughput)
No plane pressure
No direct contact between the glass and the original

Lighting: as short as possible
Cold fluorescent lamp, synchronized with the CCD line
Flash with Pyrex dome
LED
No continuous lighting during reproduction: exposure damage is cumulative

BSB conservational requirements for the reproduction of manuscripts and rare books
The scanning devices have to follow the book's requirements, i.e. an opening angle that is good for the book
Short exposure time
Different book cradles

MDZ scan facilities for manuscripts and rare books

Standard book scanner with 180° book cradle
Reproduction without glass plate

Working without a glass plate, you need assistance
The Munich Digit, invented by our conservators

Angle bracket from 90° up to 140°, with holder

Traverse support with 110° aperture angle

Foam wedge covered with acid-free cardboard

Special cradle: Grazer Camera Table

or the mobile version: Grazer Traveller

The ScanRobot cradle
Very flexible, from 60° upward
Steplessly adaptable to the book's requirements
Self-centering cradle (book's position in relation to the scanning head)

Image production parameters: manuscripts, rare books and special collections
Color depth: 24 bit
Resolution: 400 to 600 ppi optical, always in relation to the size of the original document
Digital master file: uncompressed TIFF
Media-neutral, with the ICC profile of the scanning device attached (color management)
Authentic, i.e. with a visible border around the page and with color, grayscale and size targets
Image storage size: between 20 megabytes and 800 megabytes per image
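The storage sizes quoted above follow from the scanned area, the resolution and the color depth. The sketch below is only a rough estimate under stated assumptions: the function name and the 10 x 10 inch scan area are hypothetical, and TIFF header and ICC profile overhead are ignored.

```python
def uncompressed_tiff_bytes(width_in, height_in, ppi, bits_per_pixel=24):
    """Approximate pixel-data size of an uncompressed image.

    Assumption: TIFF headers and the attached ICC profile are ignored,
    so real master files are slightly larger.
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Hypothetical scan area of 10 x 10 inches (page plus border and targets).
    for ppi in (400, 600):
        size = uncompressed_tiff_bytes(10, 10, ppi)
        print(f"{ppi} ppi: {size / 1_000_000:.0f} MB")
```

For this assumed scan area the estimate is 48 MB at 400 ppi and 108 MB at 600 ppi, which falls inside the 20 to 800 megabyte range given on the slide.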

Scanning Output: Examples

Workflow and indexing: basic conditions
ZEND = Zentrale Erfassungs- und Nachweisdatenbank (central capture and reference database)
MySQL database, Apache Cocoon and Solr
Mapping of the entire production process in a modular system
Different service providers (scanning, text capture) can supply unlimited data to ZEND
Workflow control: every BSB item that is to be digitized follows the ZEND workflow only
Time and cost reduction through automation of standard processes

The Workflow with ZEND at a glance

ZEND-modules

Indexing: basic tasks
All metadata in one XML framework: TEI P5
Administrative metadata: job management
Technical metadata: image information (ICC profiles, formats)
Bibliographical metadata: data import from the catalog via Z39.50; allocation of a URN (catalog)
Structural metadata: table of contents or full text
Backbone of the production line: the unique, persistent identifier (ID) and standardized image names, e.g. bsb00001119_00001.tif
Assignment of a URN (National Bibliography Number), for example urn:nbn:de:bvb:12-bsb00001119
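The naming scheme above can be sketched as follows. This is an illustration only, not the ZEND implementation: the function names are hypothetical, and the patterns (eight digits after "bsb", a five-digit image counter) are inferred from the single example on the slide. The URN is built exactly as in that example; any check digit or suffix a resolver may expect is not modelled.

```python
import re

# Assumption: object IDs look like the slide's example, "bsb" plus 8 digits.
OBJECT_ID = re.compile(r"bsb\d{8}")
URN_PREFIX = "urn:nbn:de:bvb:12-"


def image_name(object_id, image_no):
    """Build a standardized image file name such as bsb00001119_00001.tif."""
    if not OBJECT_ID.fullmatch(object_id):
        raise ValueError(f"unexpected object ID: {object_id!r}")
    if not 1 <= image_no <= 99999:
        raise ValueError(f"image number out of range: {image_no}")
    return f"{object_id}_{image_no:05d}.tif"


def urn_for(object_id):
    """Build the URN in the form shown on the slide."""
    if not OBJECT_ID.fullmatch(object_id):
        raise ValueError(f"unexpected object ID: {object_id!r}")
    return URN_PREFIX + object_id


if __name__ == "__main__":
    print(image_name("bsb00001119", 1))
    print(urn_for("bsb00001119"))
```

Deriving both the file names and the URN from one identifier is what lets every later stage (indexing, storage, access) refer to the same object without a lookup table.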

Example: TEI P5 XML data with OCR text

Indexing example: from XML to HTML, with hit highlighting in the image

Data storage and long-term preservation: status and forecast (terabytes)
2004: 2; 2005: 10; 2006: 25; 2007: 50; 2008: 100; 2009: 190; 2010: 300; 2011: 500; 2012: 800; 2013: 1,200

Leibniz Supercomputing Centre (LRZ)
Partner of the Bavarian State Library in storage hosting and digital long-term preservation

LRZ backup and archive system
Total capacity: 300,000 tapes, 146 petabytes

Access: different ways
Internet: viewers (DFG-Viewer, MDZ-Viewer); PDF, download, printing; 3D
New: iPad, iPhone
Gesture-based 3D presentation system for exhibitions
Print
Document delivery of high-resolution images

Direct access to the digital object via catalog, Google, WorldCat

and PDF download: the entire book on your PC
1,000 PDF downloads per day

3D viewer
Selected treasures of BSB: see http://www.bayerische-landesbibliothek-online.de/3d (requires Java)

3D-Viewer

iPad/iPhone application: Famous books of the Bavarian State Library

BSB-Explorer: gesture-based 3D exhibition system
http://www.youtube.com/watch?v=qmmmmvnnxli

Digitization of 16th-century rare books in the VD16-1 and VD16-2 projects
Goal: online publication of all 16th-century books that are unique in the inventory of BSB

Project: VD16-1
Unique books printed between 1500 and 1517
Project duration: 2006-2008
Production: 4,700 books in 24 months with 3 hand-operated scanners
Reduced opening angle handled with a book cradle support: only one-sided scanning was possible
1 scan (click) = 1 image
3 process steps:
1. Scanning all left pages
2. Scanning all right pages
3. Assembling left and right pages
Problem: no pagination in 16th-century books!
Higher error rate
Reduction of throughput
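The third process step, assembling left and right pages, amounts to interleaving two image sequences. The following is a minimal sketch under assumptions, not the software used in the project: the function name is hypothetical, and it assumes each opening yields exactly one left and one right image, with the left page first. Because the books carry no pagination, the only automatic check available here is that both passes produced the same number of images.

```python
from itertools import chain


def assemble_pages(left_images, right_images):
    """Interleave the left-page pass and the right-page pass into reading order.

    Assumption: image i of each pass belongs to opening i, left page first.
    """
    if len(left_images) != len(right_images):
        # A skipped or doubled page in either pass shifts every later opening.
        raise ValueError(
            f"pass lengths differ: {len(left_images)} left, "
            f"{len(right_images)} right"
        )
    return list(chain.from_iterable(zip(left_images, right_images)))


if __name__ == "__main__":
    left = ["left_001.tif", "left_002.tif"]
    right = ["right_001.tif", "right_002.tif"]
    print(assemble_pages(left, right))
```

A count check catches a missing page but not two compensating errors, which is one reason the two-pass procedure has a higher error rate than capturing both pages of an opening in a single scan.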

The way to the ScanRobot
Objective: optimizing throughput when scanning early and rare prints with a limited opening angle
2006: market evaluation of automatic book scanners (hardware and software) and tender
Since 2007: development partnership with Treventus for scanning 16th-century books (ongoing)

The use of the ScanRobot in the VD16-2 project
Follow-up project: books printed 1518-1600
Project phases: 2008-2009 and 2010-2012
Production: 37,000 books, 7.5 million pages
Book cradle with 60° opening angle, continuously adjustable
Up-and-down movement of the scan unit, continuously adjustable
Pages are picked up gently by air flow (no suction!)
1 scan = 2 images

Throughput: our measurement base
The entire base for the throughput calculation covers:
1. Preprocessing: transport, creating a check form, selection of a suitable scanner
2. Scan operation: creating a scan job, positioning the book, scanning, target scanning, storage, data operations
3. Postprocessing: quality control, complaints and rescanning, web delivery, return transport
4. Long-term preservation: automated data transfer into the archiving system, quality control, deletion of the production files in the Scanning Center after successful long-term preservation

Scanning throughput at the MDZ Scanning Center
Results of 2009, based on the state of the art of our scanner equipment (among them 8 devices from 2005 onward), on BSB's strict conservational requirements (more than 70% of manuscripts and rare books cannot be opened at 180°), and on the assumption that the real working time is 6 hours per day
Manuscripts/rare books and difficult objects, hand-operated scanner: up to ca. 200 pages/day
Manuscripts/rare books in normal condition, hand-operated scanner: ca. 380 pages/day
Rare books with ScanRobot: ca. 1,000 pages/day
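These daily rates can be turned into a rough planning estimate. The sketch below uses only the figures from the slide (pages per day and a 6-hour working day); the function names, the dictionary keys and the number of devices in the example are hypothetical, and the hand-operated rate for difficult objects is an upper bound rather than an average.

```python
import math

# Daily rates from the slide (2009 results); "difficult" is an upper bound.
PAGES_PER_DAY = {
    "hand_operated_difficult": 200,
    "hand_operated_normal": 380,
    "scanrobot": 1000,
}
WORKING_HOURS_PER_DAY = 6  # real working time assumed on the slide


def pages_per_hour(mode):
    """Average pages per working hour for a scanner type."""
    return PAGES_PER_DAY[mode] / WORKING_HOURS_PER_DAY


def scanner_days(total_pages, mode):
    """Working days a single device needs for the given number of pages."""
    return math.ceil(total_pages / PAGES_PER_DAY[mode])


def working_days(total_pages, mode, devices):
    """Working days needed when several devices run in parallel."""
    return math.ceil(total_pages / (PAGES_PER_DAY[mode] * devices))


if __name__ == "__main__":
    # 7.5 million pages is the VD16 volume quoted earlier in the talk.
    print(scanner_days(7_500_000, "scanrobot"))
    # Hypothetical fleet of 5 ScanRobots; the actual number is not stated.
    print(working_days(7_500_000, "scanrobot", 5))
```

At about 1,000 pages per day, 7.5 million pages correspond to 7,500 scanner-days, so a project of that size is only feasible within a few years if several devices run in parallel.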

What have we done so far?
1.2 million copyright-free books
Books available online (28 October 2010): 394,000
Up to 1600: ~59,000 books

Contact
MDZ: brantl[at]bsb-muenchen.de
IBR: irmhild.schaefer[at]bsb-muenchen.de
All images: copyright BSB