METADATA MANAGEMENT IN EUROPEANA PHOTOGRAPHY

Similar documents
Appendix A: Inventory of enrichment efforts and tools initiated in the context of the Europeana Network

DELIVERABLE. Grant Agreement number: Europeana Cloud: Unlocking Europe s Research via The Cloud

dati.culturaitalia.it a Pilot Project of CulturaItalia dedicated to Linked Open Data

Links, languages and semantics: linked data approaches in The European Library and Europeana.

User research for information architecture projects

The digital Strategy of the Dresden State Art Collection

Service Road Map for ANDS Core Infrastructure and Applications Programs

Linked Open Data Infrastructure for Public Sector Information: Example from Serbia

Europeana Core Service Platform

Europeana Food and Drink. D2.4 Europeana Food and Drink Content Upload Report

CURATING DIGITAL HERITAGE BY MUSIS - THE SOUTH-WESTERN GERMAN MUSEUM NETWORK

Alere: diagnosing and monitoring health conditions globally.

Importance of Metadata in Digital Asset Management

Big Data in the Digital Cultural Heritage

Coordination of standard and technologies for the enrichment of Europeana

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web

Market Analysis For Photography: Extract From D5.1

Data Quality Committee Mission Statement

Queensland recordkeeping metadata standard and guideline

Cataloguing is riding the waves of change Renate Beilharz Teacher Library and Information Studies Box Hill Institute

The challenges of becoming a Trusted Digital Repository

Why a single source for assets should be. the backbone of all your digital activities

Cross-Sectional Integration of LAM Resources on the Basis of Authority Data

Metadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries

Scalable End-User Access to Big Data HELLENIC REPUBLIC National and Kapodistrian University of Athens

Digital Asset Management Software & Brand Portal

Social media strategy

Judy Jeng, Ph. D. International Conference of Digital Archives and Digital Humanities Taipei, Taiwan December 1-2, 2009

WHITE PAPER TOPIC DATE Enabling MaaS Open Data Agile Design and Deployment with CA ERwin. Nuccio Piscopo. agility made possible

Secure Semantic Web Service Using SAML

Indigenous Knowledge and Resource Management in Northern Australia (IKRMNA) - ARC Linkage Project

Using NVivo to Manage Qualitative Data. R e i d Roemmi c h R HS A s s e s s me n t Office A p r i l 6,

Copyright Notice: digital images, photographs and the internet

DELIVERABLE. D5.3: Product Development Plan. Project Acronym: Ev3 Grant Agreement number: Project Title: Europeana Version 3

Reason-able View of Linked Data for Cultural Heritage

Information and documentation The Dublin Core metadata element set

Towards an EXPAND Assessment Model for ehealth Interoperability Assets. Dipak Kalra on behalf of the EXPAND Consortium

Digital Humanities in a scientific museum context: the case of KMKG-MRAH

IT Challenges for the Library and Information Studies Sector

A guide to the lifeblood of DAM:

Design of a Federation Service for Digital Libraries: the Case of Historical Archives in the PORTA EUROPA Portal (PEP) Pilot Project

local content in a Europeana cloud for small & medium content providers

Digital libraries of the future and the role of libraries

T HE I NFORMATION A RCHITECTURE G LOSSARY

The Rutgers Workflow Management System. Workflow Management System Defined. The New Jersey Digital Highway

Databases & Data Infrastructure. Kerstin Lehnert

Cite My Data M2M Service Technical Description

ONTOLOGY-BASED APPROACH TO DEVELOPMENT OF ADJUSTABLE KNOWLEDGE INTERNET PORTAL FOR SUPPORT OF RESEARCH ACTIVITIY

Standard 1: Learn and develop skills and meet technical demands unique to dance, music, theatre/drama and visual arts.

From Stored Knowledge to Smart Knowledge

The ECB Statistical Data Warehouse: improving data accessibility for all users

in a nutshell EUROPEANA FASHION disclosing European fashion heritage online Marco Rendina Europeana Fashion International Association

Role Description Curator - Digital Assets

Semantically Enhanced Web Personalization Approaches and Techniques

1.2. The key benefits of the shift to the cloud The emergence of No-code metadata driven application platforms... 4

ONE YEAR COURSES FASHION IMAGE & STYLING INTENSIVE

your terminology as a part of the semantic web recommendations for design and management

Don t Build the Zoo: Discover Content in its Natural Habitat Written By: Chris McKinzie CEO & Co-Founder

Sustainable Internet marketing technical framework concept

Checklist: Persistent identifiers

OVERVIEW OF JPSEARCH: A STANDARD FOR IMAGE SEARCH AND RETRIEVAL

ISSUES ON FORMING METADATA OF EDITORIAL SYSTEM S DOCUMENT MANAGEMENT

of the public interface service and will also act as the national aggregator for Europeana.

3D models and IPR based on the experiences of the 3D-ICONS Project

Websites & Copyright. INFORMATION SHEET G057v12 April info@copyright.org.au

Building next generation consortium services. Part 3: The National Metadata Repository, Discovery Service Finna, and the New Library System

The FAO Open Archive: Enhancing Access to FAO Publications Using International Standards and Exchange Protocols

CAPZLES TUTORIAL INTRODUCTION

Digital Preservation Strategy,

EFFECTS+ Clustering of Trust and Security Research Projects, Identifying Results, Impact and Future Research Roadmap Topics

Introduction. Metadata is

LinkZoo: A linked data platform for collaborative management of heterogeneous resources

Publishing Europe s Television Heritage on the Web.

Semantic Exploration of Archived Product Lifecycle Metadata under Schema and Instance Evolution

How To Use The Alabama Data Portal

Information Standards on the Net

One Complete Intranet Solution

AAC Road Map. Introduction

Digital Collecting Strategy

AN AUTOMATIC AND METHODOLOGICAL APPROACH FOR ACCESSIBLE WEB APPLICATIONS

Monitoring and Reporting Drafting Team Monitoring Indicators Justification Document

The AXIS Conceptual Reference Model (Concepts and Implementation)

Creative media and digital activity

EUR-Lex 2012 Data Extraction using Web Services

The National Arts Education Standards: Curriculum Standards <

Healthcare, transportation,

METADATA STANDARDS AND GUIDELINES RELEVANT TO DIGITAL AUDIO

Open Government Data in Austria --- Organisation, Procedures and Uptake

WATER-RELATED EDUCATION, TRAINING AND TECHNOLOGY TRANSFER - Finding Information - Paul Nieuwenhuysen

MULTILINGUAL ACCESS TO CONTENT THROUGH CIDOC CRM ONTOLOGY

Free web-based solution to manage photographs that could be used to manage collection items online if there is a photo of every item.

Checklist for a Data Management Plan draft

Building the European HEA: Building quality in information systems

How To Write A Blog Post On Globus

DATA MANAGEMENT PLAN DELIVERABLE NUMBER RESPONSIBLE AUTHOR. Co- funded by the Horizon 2020 Framework Programme of the European Union

Impelsys: Your Partner for Digital Product Development & Commercialization

Assessing the Opportunities Presented by the Modern Enterprise Archive

Knowledge Management and elearning in Professional Development

The Importance of a Digital Library in Modern Business

BUILDING A BIODIVERSITY CONTENT MANAGEMENT SYSTEM FOR SCIENCE, EDUCATION, AND OUTREACH

Transcription:

SCIentific RESearch and Information Technology Ricerca Scientifica e Tecnologie dell'informazione Vol 4, Issue 2 (2014), 127-132 e- ISSN 2239-4303, DOI 10.2423/i22394303v4n2p127 CASPUR- CIBER Publishing, http://caspur- ciberpublishing.it METADATA MANAGEMENT IN EUROPEANA PHOTOGRAPHY *Royal Museums of Art and History Brussels, Belgium Abstract Nacha Van Steen* This paper describes the metadata management experience of the Europeana Photography project, a digitization action with final aim of making available in the internet over 430.000 items of early photographs with historical, cultural and artistic value, belonging to the first 100 years of the art of photography. The metadata converged in Europeana, the European Digital Library which collects about 31 million digitized items of European cultural heritage. It illustrates the way the Europeana Photography project has created and enriched metadata of images being delivered to Europeana through the project, and the context in which it has done so. Keywords Europeana Photography, semantic web, vocabulary Whether you choose to see the current possibilities as a distinct evolution of the internet, web 3.0, or rather share the opinion that those possibilities were present form the first, the semantic web is the future of online life, for cultural heritage and their guardian institutions just as well. While we re evolving to a place where your smartphone, - watch or tablet becomes your personal butler, knowing what you want before you even realize it yourself, memory institutions stay rather slow on the uptake for various reasons financial, a matter of priorities, or simply because the cultural heritage sector is often not the envisioned target audience of technical and technological developements. However, as the Europeana Photography project 1 is coming to a close at the end of January 2015, having delivered over 400.000 photos from the period 1839-1939 to the Europeana 2 portal, we d like to look back at the metadata management within the project and share with you how we tried to play a part in delivering data that is not only interesting, but also rich in metadata and description, and searchable in a large variety of languages. For your enjoyment, and as a teaser, Fig. 1-4 give you an idea of the content as delivered by Europeana Photography. 1 Project under the 2007-2013 CIP framework by the European Commission 2 www.europeana.eu Fig. 1: Gerona. Paseo Central de la Dehesa. Photo by Desconegut, 1900, courtesy of Ajuntament de Girona. Public Domain marked For us as a project, delivering content from 19 partners from 13 different European member states, metadata management and quality control was a key issue to guarantee high quality metadata on Europeana. While the images may speak for themselves, and include masterpieces from eminent private and public photo agencies, institutions and museums, they still are in danger of getting lost on a platform that operates on the scale of Europeana. Good metadata is key when you want your information found, discussed and shared, so from the beginning, good metadata was the rule within our consortium.

(2014), n. 2 N. Van Steen This started with defining the metadata elements which we considered to be elementary, necessary, and mandatory for all project partners. 3 Luckily for us, there was a shared sensibility within the consortium, but even at that time, the differences in approach between partners became visible. As a consortium grouping both private partners, who depend on high quality data and metadata in order to supply whatever demands of their clients, and public partners, who tend to breach the subject of metadata traditionally from a scientific point of view, this was both a very interesting contrast as well as a point of attention in dealing with the different datasets. Fig 2: Portrait of Count Mišelio de Boré. Photo by Fotoateljé "Kaufmann & Kesseler", 1868, courtesy of Kretingos muziejus. Public Domain marked 3 Europeana Photography deliverable 2.1: Content seminar proceedings, www.europeana- photography.eu/index.php?en/115/deliverables The Europeana Photography vocabulary After setting out the minimal requirements, it was possible to provide the partners with a software platform which would allow them to upload, map and transform their own in- house databases into a metadata schema compatible with the Europeana portal. This is the MINT online metadata mapping tool provided by the National Technical University of Athens (NTUA), in use for many other Europeana- feeder projects, which was customized to suit our needs, and came complete with bookmarks and a user manual. Then, a second issue had to be addressed: there were not only differences in approach when creating metadata, but also 12 different languages in use. To us as a consortium, it was not interesting enough to simply provide our partners data as small photo albums within a larger one, we wanted to reach our public together, and across language barriers. So, we created a vocabulary. The vocabulary was to be a thesaurus, as it provided us with structure and control, while not being as time consuming to create as an ontology we needed it ready for use within the year. It allowed us to add synonyms and a variety of languages, and could be created rather easily in a spreadsheet. Because we wanted to achieve the highest quality of vocabulary possible, we used the sources available to us: the Art and Architecture Thesaurus from the Getty Institute 4 for the techniques, the headers from IPTC 5 for keywords, the information gathered throughout the Sepia 6 project, but mainly, we drew on the experience and expertise of our partners, all experts in their field, and with great knowledge of the needs of their collections. Creating the shared Europeana Photography vocabulary meant creating a tool that allowed us to tag our metadata in the languages of all partners (since its conception, it has become available in 16 languages, including Chinese and Hebrew), aligning our collections to improve 4 www.getty.edu/research/tools/vocabularies/aat/. The translations of the technical concepts will also be delivered to the AAT itself, enriching their vocabulary and adding to the sustainability of ours. 5 International Press Telecommunications Council, www.iptc.org 6 Safeguarding European Photographic Images for Access. Some documentation is available here: www.ica.org/5671/paag- resources/publications- for- archivists- managing- photograph- and- film- collections.html 128

(2014), n. 2 Metadata management in Europeana photography searchability on Europeana, to more easily identify underlying themes in the collections, and to present the collections as a whole. It meant, from a more technical point of view, identifying those metadata elements who would most benefit from the multilinguality and uniformisation, without losing the individual characters from the datasets, and implementing the vocabulary in the delivery workflow. Therefore, we decided to focus our vocabulary on three main topics: the photographic techniques used in the creation of the original, the keywords describing the image, and the reason for which the image was created, which we dubbed photographic practice 7. We organized the concepts in a thesaurus structure, finding middle ground between high detail and rich descriptions, high quality of both concepts, translations and structure, and usability for partners as well as the public. Using a spreadsheet for the first drafts allowed to drag and drop, add new concepts easily, and share fast and simple between partners. Using the thesaurus structure allowed us to make use of synonyms, broader and narrower terms, but also to add scope notes and sources, again improving the quality of the vocabulary. The vocabulary was then added, either in our partners local databases, or in the mapping stages before delivery, to the existing metadata. Further enrichment was done by Europeana itself, who, upon delivery of the content from all partners, matched agents names, place names, time periods and keywords to databases as DBPedia and Geonames, further improving searchability. Throughout the whole process, quality control was a key issue for us, from the digitization of the images supported by fact sheets 8 summarizing the available knowledge on the subject into an easily accessible format -, over the development of the vocabulary by experts, to the extensive use of the available resouces by the partners. We needed our partners to deliver rich, multilingual data to Europeana to do justice to the images they keep. Creating a vocabulary on behalf of a project of course also has downsides: the timeframe in which to work is limited, therefore not all wishes can be granted; it is dependent on the input of the 7 The entire vocabulary, in all 16 languages, is available online at bib.arts.kuleuven.be/photovocabulary/en.html 8 www.europeana- photography.eu/index.php?en/117/documents partners, who may have their own priorities; and it is project- driven, so not all themes are covered as they are/were not pertinent to the project. Enriched metadata as part of the semantic web: a web of opportunities While enriching metadata with vocabularies, either specifically created or existing ones, improves searchability and multilinguality dramatically and immediately, it s important to see it in the larger context of the semantic web, and understand and discover its potential for the cultural heritage world. Fig. 3: Detail from The Birth of Christ. Anonymous, courtesy of KU Leuven. Public Domain marked The semantic web is not a new concept: it has been around for over a decade, but is slowly finding footing in the cultural heritage community. It is, depending on the sources you wish to follow, either a third phase in the evolution of the internet, and therefore often considered synonymous to web 3.0, or a fulfillment of potential of things already there at the invention of the world wide web. 9 Either way, to be able to consider its potential, a basic understanding of the concept is useful. The semantic web harvests all the knowledge available on the internet to deliver you highly personalized content. To do so, it needs to understand the available information, so content creators (publishers, memory institutions, or even people on social media) need to identify their content: which strings of letters are names, 9 www.w3.org/2001/sw 129

(2014), n. 2 N. Van Steen which are place names, and how do they relate to each other? These relationships between concepts then create a second layer of information in the background, and make it possible to link data to each other: you create a web of semantically 10 identified information. This allows you for instance to compare flight prices, get personalized publicity, of get introduced to the right people on social media. A new world of interaction, exploration and possibilities opens up, at a price. 11 When contributing to the semantic web, whether it s information on the object in your memory institution or highly specialized vocabularies, you need to think first of the license under which you will publish your content, and find a balance between protecting what is your intellectual property and what you wish to share. In practice, an open license is virtually necessary. spreadsheet, and SKOSifying 14 the vocabulary, before publishing it online under an open license, and adding the links in our metadata. Making use of, and contributing to, the possibilities of the semantic web means sharing your stories beyond traditional channels, making them available to interest groups you hadn t thought of, or whose languages you do not speak. It means showing them in context, and making them more sustainable through sharing of the information. And it means engaging a larger audience in new ways: from here, it s a very short step to virtual exhibitions Europeana Photography has created the All our Yesterdays virtual exhibition 15, based on our partners images crowdsourcing or augmented reality applications. While Europeana Photography may be coming to the end of its life as a project, the group of partners will stay together, and for this reason founded an association: the expertise and drive developed thanks to Europeana Photography will continue on in the Photoconsortium 16, exploring the different roads to take with our content, providing innovative services related to the photographic heritage and sharing our knowledge. Fig. 4: Balkan Wars (1912-1913). Commander of 51th Infantry Regiment gives the order to attack. Photo by Karastoyanov, D., 1912, courtesy of NALIS Foundation. Public Domain marked Next, you identify your content with URIs 12, and you link your data through existing or bespoke software 13, being careful to use standards every step of the way. For the Europeana Photography project, this meant adding URIs in our 10 Information that has been identified by its meaning, that has been defined 11 Generally speaking, the infringement on privacy and the question on the trustworthiness of the links are considered the greatest threats to the system. 12 Uniform Resource Identifiers. URLs Uniform Resource Locators can be considered a subtype of URIs 13 The Europeana Photography project works with MINT, developed by NTUA, for both metadata and vocabulary mapping. We also collaborate with the projects Linked Heritage and Athena Plus, who have developed a Terminology Management Platform precisely for this kind of vocabulary linking. 14 Simple Knowledge Organisation System, standard for organizing concepts; classifies resources in terms of broader or narrower, preferred and alternate labels and allows for quick publication of thesauri and glossaries to the internet. 15 www.earlyphotography.eu and in the App Store 16 www.photoconsortium.net 130

(2014), n. 2 Metadata management in Europeana photography Fig. 5: Europeana Photography consortium during the Europeana Photography plenary meeting at Vilnius, Latvia, 10 September 2013. Photo by Vaidotas Aukštaitis (Lithuanian Art Museum), 2013. This photo has been made with an ICA photocamera (Dresden, ca. 1920), slightly modified for contemporary use. CC- BY- NC- ND. 131

(2014), n. 2 N. Van Steen REFERENCES Bachi, V., Debernardi, E. & A. Fresa (2014). D4.3.2 Enrichment Report (second release) preliminary version. To be retrieved from http://www.europeana- photography.eu/index.php?en/115/deliverables Bruce, T.R., & Hillmann, D.I. (2004). The continuum of metadata quality: defining, expressing, exploiting. Retrieved from http://www.ecommons.cornell.edu/handle/1813/7895 Feigenbaum, L., Herman, I. et al. (2007). The semantic web in action. Retrieved from http://www.thefigtrees.net/lee/sw/sciam/semantic- web- in- action Harping, P. (2014). The Getty vocabularies and Linked Open Data: introduction and editorial perspective. Retrieved from http://www.getty.edu/research/tools/vocabularies/linked_data_getty_vocabularies.pdf Hassanzadeh, O. (2011). Introduction to Semantic Web Technologies & Linded Data. Retrieved from http://www.cs.toronto.edu/~oktie/slides/web- of- data- intro.pdf Herman, I. (2009). Introduction to the Semantic Web (tutorial). Retrieved from http://www.w3.org/2009/talks/0615- SanJose- tutorial- IH/Slides.pdf Margaritopoulos, T., Margaritopoulos, M. et al. (2008). A conceptual framework for metadata quality assessment. Retrieved from http://dcpapers.dublincore.org/pubs/article/viewfile/923/919 Truyen, F. (2012). D2.1 Content seminar proceedings. Retrieved from http://www.europeana- photography.eu/index.php?en/115/deliverables Van Steen, N. (2013). D4.1 EuropeanaPhotography Vocabulary Definition. Retrieved from http://www.europeana- photography.eu/index.php?en/115/deliverables 132