Mapping VRA Core 4.0 to the CIDOC/CRM ontology



Similar documents
Mapping Encoded Archival Description to CIDOC CRM

From MARC21 and Dublin Core, through CIDOC CRM: First Tenuous Steps towards Representing Library Data in FRBRoo

Definition of the CIDOC Conceptual Reference Model

CIDOC-CRM Extensions for Conservation Processes: A Methodological Approach

ARIADNE CONSERVATION DOCUMENTATION SYSTEM: CONCEPTUAL DESIGN AND PROJECTION ON THE CIDOC CRM. FRAMEWORK AND LIMITS

INFORMATION INTEGRATION: MAPPING CULTURAL HERITAGE METADATA INTO CIDOC CRM CARRASCO, L. B., THALLER, M., CARVALHO, J. R. ***

Definition of the CIDOC Conceptual Reference Model

Facilitating access to cultural heritage content in Czechia: National Authority Files and INTERMI project

OWL based XML Data Integration

Information Services for Smart Grids

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web

Application of ontologies for the integration of network monitoring platforms

Some Methodological Clues for Defining a Unified Enterprise Modelling Language

CRM dig : A generic digital provenance model for scientific observation

Concept for an Ontology Based Web GIS Information System for HiMAT

Ontology and automatic code generation on modeling and simulation

Performance Analysis, Data Sharing, Tools Integration: New Approach based on Ontology

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials

Cataloging in the Cloud: Shared Shelf and ArchaeoCore

Monika Hagedorn-Saupe: Museums in Europeana ATHENA Linked Heritage LIDO

Context Aware Mobile Network Marketing Services

Business Process Models as Design Artefacts in ERP Development

Trailblazing Metadata: a diachronic and spatial research platform for object-oriented analysis and visualisations

A Semantic Approach for Access Control in Web Services

Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object

Implementing Ontology-based Information Sharing in Product Lifecycle Management

Semantic Search in Portals using Ontologies

Databases & Data Infrastructure. Kerstin Lehnert

A Methodology for the Development of New Telecommunications Services

A Generic Database Schema for CIDOC-CRM Data Management

Service Oriented Architecture

technische universiteit eindhoven WIS & Engineering Geert-Jan Houben

UNIMARC, RDA and the Semantic Web

Secure Semantic Web Service Using SAML

Big Data in the Digital Cultural Heritage

Semantic Exploration of Archived Product Lifecycle Metadata under Schema and Instance Evolution

Semantic Information on Electronic Medical Records (EMRs) through Ontologies

A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS

Lightweight Data Integration using the WebComposition Data Grid Service

ONTOLOGY-BASED MULTIMEDIA AUTHORING AND INTERFACING TOOLS 3 rd Hellenic Conference on Artificial Intelligence, Samos, Greece, 5-8 May 2004

Core Enterprise Services, SOA, and Semantic Technologies: Supporting Semantic Interoperability

A generic approach for data integration using RDF, OWL and XML

Improving EHR Semantic Interoperability Future Vision and Challenges

ONTOLOGY-BASED APPROACH TO DEVELOPMENT OF ADJUSTABLE KNOWLEDGE INTERNET PORTAL FOR SUPPORT OF RESEARCH ACTIVITIY

SWAP: ONTOLOGY-BASED KNOWLEDGE MANAGEMENT WITH PEER-TO-PEER TECHNOLOGY

Supporting Change-Aware Semantic Web Services

Knowledge Management

Information for the Semantic Web. Procedures for Data Integration through h CIDOC CRM Mapping

European standards for the documentation of historic buildings and their relationship with CIDOC-CRM

USAGE OF BUSINESS RULES IN SUPPLY CHAIN MANAGEMENT

Towards the Russian Linked Culture Cloud: Data Enrichment and Publishing

FRBR. object-oriented definition and mapping to FRBR ER (version 1.0)

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo

Automatic Timeline Construction For Computer Forensics Purposes

Modern Databases. Database Systems Lecture 18 Natasha Alechina

Online multilingual generation of Cultural Heritage content

Data Integration Hub for a Hybrid Paper Search

On the general structure of ontologies of instructional models

DC2AP: A Dublin Core Application Profile to Analysis Patterns

Managing enterprise applications as dynamic resources in corporate semantic webs an application scenario for semantic web services.

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

From Databases to Natural Language: The Unusual Direction

Joint Steering Committee for Development of RDA

INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS

Integrating XML Data Sources using RDF/S Schemas: The ICS-FORTH Semantic Web Integration Middleware (SWIM)

Software Engineering. System Models. Based on Software Engineering, 7 th Edition by Ian Sommerville

SEARCH The National Consortium for Justice Information and Statistics. Model-driven Development of NIEM Information Exchange Package Documentation

Designing a Semantic Repository

GEOG 482/582 : GIS Data Management. Lesson 10: Enterprise GIS Data Management Strategies GEOG 482/582 / My Course / University of Washington

An Approach to Eliminate Semantic Heterogenity Using Ontologies in Enterprise Data Integeration

The Ontological Approach for SIEM Data Repository

Transcription:

1 st Workshop on Digital Information Management March 30-31, 2011 Mapping VRA Core 4.0 to the CIDOC/CRM ontology Panorea Gaitanou, Manolis Gergatsoulis Database and Information Systems Group (DBIS) Laboratory on Digital Libraries and Electronic Publishing Department of Archives and Library Science Ionian University {rgaitanou@ionio.gr, manolis@ionio.gr}

Presentation Overview Cultural heritage domain Interoperability issues Ontology-based integration Our integration scenario Mapping VRA Core 4 to the CIDOC/CRM ontology Brief presentation of VRA Core 4.0 and CIDOC/CRM Mapping Description Language (MDL) Mapping VRA elements to CIDOC/CRM paths Conclusions Future directions References 2

Cultural heritage domain With the advent of new digital information technologies, access to cultural heritage repositories is obviously improved. The documentation, management and preservation of cultural artifacts and collections still seems to be rather a complex task. Even in homogeneous environments (relational databases etc), there are problems in the exchange of information, because of different structures and syntaxes used. In cultural heritage collections, the problem of heterogeneity is even more complex, as these collections contain multiple material. 3

Interoperability issues The continuous development of various metadata schemas leads to the need to integrate all these standards, in such a way that data and information exchange and sharing can be fully achieved. Cultural heritage communities explore several methodologies: to reduce search time to guarantee unified access to these resources (under a global schema) 4

Ontology-based integration "An ontology is a formal, explicit specification of a shared conceptualization". (Gruber, 1993) Ontologies can play a leading role in this area, as they are considered to be an important building block for integration architectures. provide the means for defining common vocabularies, representing the domain knowledge, facilitating at the same time knowledge sharing and reuse among heterogeneous and distributed application systems. can act as a mediating schema that semantically integrates all meta(data) 5

Our integration scenario 6

VRA Core 4.0 (1/2) Visual Resources Association's Data Standards Committee + Network Development and MARC Standards Office of the Library of Congress (LC) Provides guidance on describing works of visual culture, collections, as well as images that document them. Two XML Schema versions Unrestricted version (specifies the basic structure and imposes no restrictions on the values entered into any of the elements, sub-elements, or attributes) Restricted version (extends the unrestricted one by imposing controlled type lists and date formats) 7

VRA Core 4.0 (2/2) Record types: Work, Image, Collection 19 elements (with their subelements) (agent, culturalcontext, date, description, inscription, location, material, measurements, relation, rights, source, stateedition, styleperiod, subject, technique, textref, title, worktype) global attributes (datadate, extent, href, pref, refid, rules, source, vocab, xml:lang) 8

A VRA record in XML <?xml version="1.0" encoding="utf-8"?> <vra> <work id="w_000777" refid="000597" source="history of Art Visual Resources Collection, UCB"> <agentset> <agent> <name type="personal">cropsey, Jasper Francis</name> <dates type="life"> <earliestdate>1823</earliestdate> <latestdate>1900</latestdate> </dates> </agent> </agentset> <dateset> <date type="creation"> <earliestdate>1860</earliestdate> <latestdate>1860</latestdate> </date> </dateset> <inscriptionset> <inscription> <author vocab="ulan" refid="500012491">cropsey, Jasper Francis</author> <position>lower center</position> <text>autumn-on the Hudson River/J. F Cropsey/London 1860</text> </inscription> </inscriptionset> </work> </vra> 9

CIDOC Conceptual Reference Model (CIDOC/CRM, Version 5.0.2) CIDOC Documentation Standards Group (1999) formal extensible ontology, which aims at providing a conceptual representation of cultural heritage domain, promoting semantic interoperability and integration. object-oriented model composed of entities, organized into a hierarchy, and semantically related to each other properties. targets to cover contextual information: the historical, geographical, and theoretical background in which individual items are placed and which gives them much of their significance and value defines the complex interrelationships between objects, actors, events, places, and other concepts used in the cultural heritage domain consists of a hierarchy of 86 classes and 137 properties. 10

A sample of CIDOC/CRM properties Property Id and Name Entity - Domain Entity - Range P1 is identified by (identifies) E1 CRM Entity E61 Appellation P2 has type (is type of) E1 CRM Entity E55 Type P4 has time-span (is time span of) P14 carried out by (performed) P53 has former or current location (is former or current location of) P108 has produced (was produced by) E2 Temporal Entity E7 Activity E18 Physical Thing E12 Production E52 Time-Span E39 Actor E53 Place E24 Physical Man-Made Thing 11

Mapping methodology Path-oriented approach: a mapping from a source schema to a target schema transforms each instance of the source schema into a valid instance of the target schema. VRA paths based on XPath equivalent CIDOC/CRM paths. VRA path: a sequence of VRA elements and subelements, starting from the schema root element <vra> separated by the slash symbol (/). CIDOC/CRM path: a chain in the form entity-property-entity (e-p-e), such that the entities associated by a property correspond to the property's domain and range. 12

Mapping Description Language (MDL) The syntax of MDL (in EBNF notation) is: 13

Agent element (work level) (1/7) Element <agent> Subelements: <name> Attribute: type Values: personal, corporate, family, other <dates> Attribute: type Values: life, activity, other Subelements: <earliestdate> <latestdate> <culture> <role> <attribution> 14

Agent element (work level) (2/7) VRA: work/agentset/agent CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor VRA: work/agentset/agent/name[@type] CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type -> E55 Type 15

Agent element (work level) (3/7) VRA: work/agentset/agent/name[@type] Possible values for type attribute= personal, corporate, family, other. The value of the type attribute of the subelement name in the VRA paths defines different equivalent CIDOC/CRM paths. 16

Agent element (work level) (4/7) VRA: work/agentset/agent/name[@type= personal ] 1) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type > E55 Type (instance= personal ) Alternatively, 2) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E21 Person -> Ρ131 is identified by -> E82 Actor Appellation VRA: work/agentset/agent/name[@type= corporate ] 1) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type > E55 Type (instance= corporate ) Alternatively, 2) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E40 Legal Body -> Ρ131 is identified by -> E82 Actor Appellation VRA: work/agentset/agent/name[@type= family ] 1) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type > E55 Type (instance= family ) Alternatively, 2) CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E74 Group -> Ρ131 is identified by -> E82 Actor Appellation 17

Agent element (work level) (5/7) VRA: work/agentset/agent[name[@type= personal ]]/dates[@type] Possible values for the type attribute of the dates subelement: life, activity, other 18

Agent element (work level) (6/7) VRA: work/agentset/agent[name[@type= personal ]]/dates[@type= life ]/earliest Date CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E21 Person -> P98B was born -> E67 Birth -> P4 has time-span -> E52 Time-Span -> P78 is identified by -> E50 Date VRA: work/agentset/agent[name[@type= personal ]]/dates[@type= life ]/latestd ate CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E21 Person -> P100B died in -> E69 Death -> P4 has time-span -> E52 Time-Span -> P78 is identified by -> E50 Date 19

Agent element (work level) (7/7) VRA: work/agentset/agent/culture CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E21 Person -> P107B is current or former member of -> E74 Group VRA: work/agentset/agent/role CIDOC/CRM: E24 Physical Man-Made Thing > P108B was produced by > E12 Production > P14 carried out by [P14.1 in the role of > E55 Type] > E21 Person 20

Agent element (work level) (type=personal) 21

Agent element (work level) (type=corporate) VRA: work/agentset/agent[name[@type= corporate ]]/dates[@type= activity ]/ earliestdate CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E40 Legal Body -> P92B was brought into existence by -> E63 Beginning of Existence -> P4 has time-span -> E52 Time- Span -> P78 is identified by -> E50 Date VRA: work/agentset/agent[name[@type= corporate ]]/dates[@type= activity ]/ latestdate CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E40 Legal Body -> P93B was taken out of existence by -> E64 End of Existence -> P4 has time-span -> E52 Time-Span -> P78 is identified by -> E50 Date 22

Agent element (work level) (type=corporate) 23

Rule No MDL rules for the agent element XPath location paths R1 Work {X1} E24 {C1} CIDOC/CRM paths R2 $X1/agentSet {R1} $C1 -> P108B -> E12 {J1} R3 $R1/agent*name/@type = personal + {R5} $J1 -> P14 {S2} -> E21 {J5} R4 $R1/agent*name/@type = corporate + {R10} $J1 -> P14 {S3} -> Ε40 {J10} R5 $R1/agent*name/@type = family + {R15} $J1 -> P14 {S4} -> Ε74 {J15} R6 $R5 $R10 $R15/name* $J5 $J10 $J15 -> P131 -> E82 R7 $R5 $R10 $R15/culture* $J5 $J10 $J15 -> P107 -> E74 R8 $R5 $R10 $R15/role* $S2 $S3 $S4 -> P14.1 -> E55 R9 $R5/dates*@type= life + {T1} R10 $T1/earliestDate* $J5 -> P98 > E67 > P4 > E52 -> P78 -> E50 R11 $T1/latestDate* $J5 -> P100B > E69 > P4 > E52 -> P78 -> E50 R12 $R10/dates*@type= activity + {T2} R13 $T2/earliestDate* $J10 -> P92B -> E63 -> P4 -> E52 -> P78 -> E50 R14 $T2/latestDate* 24 $J10 -> P93B -> E64 -> P4 -> E52 -> P78 -> E50

Inscription element (work level) Element <inscription> Subelements: <author> <position> <text> Attribute: type Values: signature, mark, caption, date, text, translation, other 25

Inscription element (work level) 26

MDL rules for the inscription element Rule No XPath location paths R1 Work {X1} E24 {C1} CIDOC/CRM paths R2 $X1/inscriptionSet/inscription {Y1} $C1 -> P128 -> E37 {D1} R3 $Y1/author* $D1 -> P94B-> E65 -> P14 -> E39 -> P131-> E82 R4 $Y1/text* {Y3} $D1 -> P138 -> E33 {D3} R5 $Y3/@type* $D3 -> P2 -> E55 R6 $Y3/@xml:lang* $D3 -> P72 -> E56 R7 $Y1/position* $D1 -> P58 -> E46 27

Rule No MDL Rules for the material, measurements, title, rights elements XPath location paths R1 Work {X1} E24 {C1} CIDOC/CRM paths R2 $X1/materialSet/material* {Z1} $C1 -> P45 -> E57 {A1} R3 $Z1/@type* $A1 -> P2 -> E55 R4 $X1/measurementsSet/measurements* {W1} $C1 ->P43 -> E54 {B1} -> P90 -> E60 R5 $W1/@type* $B1 -> P2 -> E55 R6 $W1/@unit* $B1 -> P91 -> E58 R7 $X1/titleSet/title*{Q1} $C1 -> P102 {S1} -> E35 R8 $Q1/@type* $S1 -> P102.1 -> E55 R9 $X10/rightsSet/rights {F1} $C1 -> P104 -> E30 {G1} R10 $F1/@type* $G1 -> P2 -> E55 R11 $F1/rightsHolder* $G1 -> P75B -> E39 R12 $F1/notes* $G1 ->P3 -> E62 R13 $F1/text* $G1 -> P1 -> E75 28

Agent element (image level) VRA: work/agentset/agent/name[@type] CIDOC/CRM: E24 Physical Man-Made Thing -> P108B was produced by -> E12 Production -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type -> E55 Type VRA: image/agentset/agent/name[@type] CIDOC/CRM: E38 Image -> P94B was created by -> E65 Creation -> P14 carried out by -> E39 Actor -> Ρ131 is identified by -> E82 Actor Appellation -> P2 has type -> E55 Type In the image level, all the subelements of the element agent are represented accordingly to the mappings of the element agent in the work level, with the difference of : E38 Image -> P94B was created by -> E65 Creation 29

Inscription element (image level) VRA: work/inscriptionset/inscription/author CIDOC/CRM: E24 Physical Man-Made Thing -> P128 carries -> E37 Mark -> P94B was created by -> E65 Creation -> P14 carried out by -> E39 Actor > P131 is identified by > E82 Actor Appellation VRA: image/inscriptionset/inscription/author CIDOC/CRM: E38 Image -> P128 carries -> E37 Mark -> P94B was created by -> E65 Creation -> P14 carried out by -> E39 Actor > P131 is identified by > E82 Actor Appellation 30

The use of MDL Metadata exchange from the VRA paths to equivalent CIDOC/CRM paths Transformation of XPath queries posed on XML-based metadata into equivalent queries on the CIDOC/CRM ontology 31

Conclusions Mapping VRA elements to CIDOC/CRM paths proved to be a rather timeconsuming activity, which requires a deep conceptual work. CIDOC/CRM provides: very rich structuring mechanisms for metadata descriptions and an abstract but fine-grained conceptualization for events, objects, agents, things, etc. the combination of a wide range of classes and properties that generates a large number of conceptual expressions that need to be examined thoroughly (classes and properties hierarchies). The use of the classes E12 Production and E65 Creation reveal one of the main characteristics of CIDOC/CRM, event-orientation. The <type> attribute assigned to several elements and subelements defines different semantic mappings, making mapping even more complex The use of several global attributes (e.g. <xml:lang> attribute) expands further the CIDOC/CRM paths 32

Future Research Directions To validate our approach, we plan to test the proposed mappings by running real life examples and evaluate the results. Another direction to explore is the definition of the reverse semantic mappings from the CIDOC/CRM ontology to the VRA Core 4.0, in order to enrich the mapping procedure proposed by our research group 33

References Library of Congress. VRA Core: a data standard for the description of works of visual culture, 2011. http://www.loc.gov/standards/vracore/ International Council of Museums (ICOM). The CIDOC Conceptual Reference, 2010. http://www.cidoc-crm.org/ T. R. Gruber (1993). Toward Principles for the Design of Ontologies Used for Knowledge Sharing. In Guarino, N. and Poli, R. (eds.), Formal Ontology in Conceptual Analysis and Knowledge Representation. Kluwer Academic Th. Stasinopoulou et al. (2007). Ontology-based metadata integration in the cultural heritage domain. In Proceedings of the 10th International Conference on Asian Digital Libraries, ICADL-2007, Hanoi, Vietnam, Lecture Notes in Computer Science (LNCS). M. Gergatsoulis et al. (2010). Mapping Cultural Metadata Schemas to CIDOC Conceptual Reference Model. In S. Konstantopoulos et al. (eds.), SETN 2010, Athens, Greece, May 4-7, 2010, Proceedings. LNAI, Vol. 6040, pages 321-326, Springer-Verlag 34

Thank you for your attention!!!! Questions??? 35