Publishing Linked Data: There is no One-Size-Fits-All Formula




Publishing Linked Data: There is no One-Size-Fits-All Formula
Asunción Gómez-Pérez
Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
http://www.oeg-upm.net asun@fi.upm.es
Acknowledgements: O. Corcho, D. Garijo, D. Vila, L. Vilches, B. Villazón
Our partners at: BNE, IGN,
Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
LOV Symposium: Linking and Opening Vocabularies. 18th June, 2012

Table of contents: 1. The concept. 2. Foundations. 3. The process. 4. Examples: Libraries (http://datos.bne.es), Geo (http://geo.linkeddata.es/), Meteorology (http://aemet.linkeddata.es/), Travelling (http://webenemasuno.linkeddata.es/).

Complex queries using data from heterogeneous Web pages. Scenario: a Cervantes enthusiast from Germany is visiting Madrid and wants to know more about Cervantes's work and life. Sources: http://www.bne.es/ http://elviajero.elpais.com/ http://www.viaf.org/ http://www.aemet *Picture attribution: http://commons.wikimedia.org/wiki/user:gugerell

[Figure: data integration across the databases (BD) of BNE, VIAF, AEMET, IGN, Prisa, and DBpedia. Recoverable labels: El Quijote, year of publication 1605, author M. Cervantes; Don Quixote, creator M. Cervantes, year of publication 1960, translated into Hebrew; M. Cervantes, birthplace Alcalá de Henares; located in Alcalá de Henares; guide: Tapas, Siglo de Oro, Alcalá de Henares; Alcalá de Henares, temperature 20º. Same As links connect the matching entities across sources.]


Linked Data: why is it important? It facilitates data integration from heterogeneous sources, in different formats, at different granularity, in different languages, and from different countries. Slide adapted from "5min Introduction to Linked Data" by Olaf Hartig.

Foundations: RDF(S) models. Unique identifiers: a URI identifies or names a resource. Equivalence links to other datasets (Same As). Data navigation.
Example: Cervantes (http://datos.bne.es/resource/xx1718747) is a Person (http://iflastandards.info/ns/fr/frbr/frbrer/c1005) and is creator of El Quijote (http://datos.bne.es/resource/xx3383563), which is a Work (http://iflastandards.info/ns/fr/frbr/frbrer/c1001). Cervantes in BNE is Same As Cervantes in VIAF (http://viaf.org/viaf/17220427) and Cervantes in DBpedia (http://dbpedia.org/resource/miguel_de_cervantes).
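The navigation idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the tooling behind datos.bne.es: triples are held as plain tuples, the URIs are copied from the slide as extracted, and the property URI used for "is creator of" is a hypothetical placeholder because the slide does not give one.

```python
# Triples are (subject, predicate, object) tuples; resource URIs come from the slide.
FRBR = "http://iflastandards.info/ns/fr/frbr/frbrer/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
CREATOR_OF = "http://example.org/isCreatorOf"  # hypothetical property URI

CERVANTES = "http://datos.bne.es/resource/xx1718747"
QUIJOTE = "http://datos.bne.es/resource/xx3383563"
VIAF = "http://viaf.org/viaf/17220427"
DBPEDIA = "http://dbpedia.org/resource/miguel_de_cervantes"

triples = [
    (CERVANTES, RDF_TYPE, FRBR + "c1005"),  # Cervantes is a Person
    (QUIJOTE, RDF_TYPE, FRBR + "c1001"),    # El Quijote is a Work
    (CERVANTES, CREATOR_OF, QUIJOTE),
    (CERVANTES, SAME_AS, VIAF),
    (CERVANTES, SAME_AS, DBPEDIA),
]


def same_as(uri, triples):
    """Return every URI reachable from `uri` through owl:sameAs, in either direction."""
    seen, todo = {uri}, [uri]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    todo.append(b)
    return seen - {uri}


if __name__ == "__main__":
    for equivalent in sorted(same_as(VIAF, triples)):
        print(equivalent)
```

Starting from the VIAF identifier, the function reaches both the BNE and the DBpedia resources, which is the kind of cross-dataset navigation the equivalence links are meant to enable.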

Foundations: aligning models with OWL. EquivalentClass links align Person (http://iflastandards.info/ns/fr/frbr/frbrer/c1005) with Person (http://schema.org/person) and Person (http://xmlns.com/foaf/0.1/person), and Municipality (http://dbpedia.org/resource/municipalities_of_spain) with Municipio (http://geo.linkeddata.es/ontology/municipio). Person is related to Municipality by birthplace. At the data level, Alcalá de Henares (http://dbpedia.org/page/alcal%c3%a1_de_henares) is a Municipality, Alcalá de Henares (http://geo.linkeddata.es/resource/alcalá de Henares) is a Municipio, and the two are linked by Same As.
Lessons learnt: 1. Reuse existing models. 2. Align the data and the concepts.
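As a rough sketch of what the alignment buys you, the snippet below groups classes connected by EquivalentClass links so that a query for one class can be expanded to its equivalents. The class URIs are the ones on the slide (as extracted); the union-find grouping is an illustrative choice, not a description of how any OWL reasoner works internally.

```python
from collections import defaultdict

# EquivalentClass pairs as shown on the slide.
EQUIVALENT_CLASS = [
    ("http://iflastandards.info/ns/fr/frbr/frbrer/c1005", "http://schema.org/person"),
    ("http://iflastandards.info/ns/fr/frbr/frbrer/c1005", "http://xmlns.com/foaf/0.1/person"),
    ("http://dbpedia.org/resource/municipalities_of_spain",
     "http://geo.linkeddata.es/ontology/municipio"),
]


def equivalence_groups(pairs):
    """Group class URIs into sets of mutually equivalent classes (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return list(groups.values())


def equivalents(cls, pairs):
    """Return the set of classes equivalent to `cls`, including `cls` itself."""
    for group in equivalence_groups(pairs):
        if cls in group:
            return group
    return {cls}


if __name__ == "__main__":
    for uri in sorted(equivalents("http://schema.org/person", EQUIVALENT_CLASS)):
        print(uri)
```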


Methodology: data sources analysis, URI design, license definition. (CNIG-OEG bilateral meeting, OTALEX project)

Identification and selection of data sources: Spanish National Geographic Institute (IGN), Spanish National Statistics Institute (INE), National Library of Spain (BNE), Spanish Meteorological Office (AEMET).

1. Identification and selection of the data sources
Spanish National Geographic Institute: multilingual (Spanish, Basque, Galician, Catalan); conceptualization mismatches; granularity (scale concept); domain vocabulary, e.g. hydrographic information (embalse [reservoir], albufera [lagoon], río [river], etc.), transport (vía desdoblada [dual carriageway], ferrocarril [railway]), administrative units (municipio [municipality]); particularities: longitude and latitude.
Spanish National Statistics Institute: monolingual; numerical information; particularities: geo (textual level) and temporal.

1. Identification and selection of the data sources: Geographical information IGN-E

1. Identification and selection of the data sources: Statistical information

Records in the MARC 21 format: 3.9 million bibliographical records and 4.2 million authority records. Version: November, 2011.

URI design: meaningful URIs versus opaque URIs. Separate the TBox (ontology model) from the ABox (data).
Base URIs: http://linkeddata.es/ http://datos.bne.es/ http://geo.linkeddata.es/ http://otalex.linkeddata.es/
Ontology (TBox URIs): http://iflastandards.info/ns/fr/frbr/frbrer/c1005 http://phenomenontology.linkeddata.es/ontology/{concept property} http://phenomenontology.linkeddata.es/ontology/municipio We use the Data Cube Vocabulary and/or other vocabularies.
Data (ABox URIs): http://datos.bne.es/resource/xx1718747 http://geo.linkeddata.es/resource/{resource type}/{resource name} http://geo.linkeddata.es/resource/municipio/badajoz
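A minimal sketch of how the meaningful-URI patterns above could be applied in code. The base URIs and the {resource type}/{resource name} pattern come from the slide; the `slug` policy (lowercase, spaces to underscores, percent-encoding of everything else) is an assumption made for this example, not a documented rule of geo.linkeddata.es.

```python
from urllib.parse import quote

# Base URIs from the slide; TBox and ABox live under separate paths.
ONTOLOGY_BASE = "http://geo.linkeddata.es/ontology/"
RESOURCE_BASE = "http://geo.linkeddata.es/resource/"


def slug(name):
    """Assumed policy: trim, lowercase, spaces to underscores, percent-encode the rest."""
    return quote("_".join(name.strip().lower().split()), safe="")


def ontology_uri(term):
    """TBox URI for a concept or property."""
    return ONTOLOGY_BASE + slug(term)


def resource_uri(resource_type, resource_name):
    """ABox URI following {base}/{resource type}/{resource name}."""
    return f"{RESOURCE_BASE}{slug(resource_type)}/{slug(resource_name)}"


if __name__ == "__main__":
    print(ontology_uri("municipio"))
    print(resource_uri("municipio", "Badajoz"))
    print(resource_uri("municipio", "Alcalá de Henares"))
```

Keeping the minting logic in one place makes it easier to apply the same convention across every generation tool used in a project.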

Ontologies: a set of terms plus a set of explicit assumptions regarding the intended meaning of the terms. They almost always include concepts and their classification, and almost always include properties between concepts. They capture a shared understanding of a domain of interest. Ontologies are expressed in OWL or RDF(S), both based on RDF. The NeOn methodology helps to build ontologies.

2. Vocabulary development
Features: lightweight (taxonomies and a few properties); agreed vocabularies, to avoid mapping problems; multilingual, because linked data are multilingual.
The NeOn methodology can help to: re-engineer non-ontological resources into ontologies (pro: it uses domain terminology already agreed by domain experts); remove from heavyweight ontologies the features that you don't need; reuse existing vocabularies.

The Ontology for BNE: based on IFLA vocabularies

[Figure: Geolinkeddata ontology, a network of reused vocabularies following the INSPIRE (INfrastructure for SPatial InfoRmation in Europe) recommendation: hydrOntology (hydrographical phenomena: rivers, lakes, etc.); WGS84 (W3C vocabulary for geo positioning, linked via hasLat/Long); SCOVO (statistics ontology with scv:dimension, scv:item, scv:dataset, linked via hasStatisticalData); GML (ontology for the OGC Geography Markup Language, linked via hasGeometry); FAO Geopolitical ontology (names and international code systems for territories and groups, linked via hasLocation/isLocated); Time (W3C Time Ontology: vocabulary for instants, intervals, durations, etc.). Thesauri reused: UNESCO, EGM / ERM, GeoNames. The legend distinguishes classes, object properties, data properties, and reused thesauri.]

3. RDF generation
From the data sources: geographic information (databases), statistical information (.xls), geospatial information, bibliographic information (MARC 21).
Different technologies for RDF generation: NOR2O (from Excel, XML, text files, ...), R2O and ODEMapster (from databases), geometry2rdf and shp2rdf (for geo data), MARiMbA for libraries.
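To make the generation step concrete, here is a small, generic sketch that turns CSV rows into N-Triples. It does not reproduce NOR2O, R2O, or any other tool named above; the column names (`id`, `name`, `lang`) and the sample row are assumptions for illustration, while the class and resource URIs follow the patterns shown elsewhere in the slides.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
MUNICIPIO = "http://geo.linkeddata.es/ontology/municipio"
BASE = "http://geo.linkeddata.es/resource/municipio/"


def escape(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Emit a type triple and a language-tagged label triple per CSV row.

    Assumed columns (hypothetical): id, name, lang.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"
        label = escape(row["name"])
        lang = row["lang"]
        lines.append(f"{subject} <{RDF_TYPE}> <{MUNICIPIO}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}"@{lang} .')
    return "\n".join(lines)


SAMPLE = "id,name,lang\nbadajoz,Badajoz,es\n"

if __name__ == "__main__":
    print(csv_to_ntriples(SAMPLE))
```

Real sources such as MARC 21 records or geometries need domain-specific handling, which is why the slide lists a different tool per source type.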

Libraries (BNE): MARiMbA uses the ontology to generate RDF.


MARiMbA links with other resources: VIAF, DNB, SUDOC, LIBRIS, DBpedia. The BNE resource http://datos.bne.es/resource/xx1718747 is linked by Same As to DNB (http://d-nb.info/gnd/11851993x), VIAF (http://viaf.org/viaf/17220427), DBpedia (http://dbpedia.org/resource/miguel_de_cervantes), SUDOC (http://www.idref.fr/026774771/id), and LIBRIS (http://libris.kb.se/resource/auth/45369).

Publication: data publication, and metadata publication using VoID. To facilitate discovery: register your dataset in CKAN, use sitemap4rdf to generate the site map, and upload the site map to Google and Sindice.
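For readers unfamiliar with VoID, the sketch below builds a minimal dataset description in Turtle. The VoID terms used (void:Dataset, void:uriSpace, void:sparqlEndpoint) and dcterms:title are real vocabulary terms; the dataset URI, title, and endpoint passed in the example are hypothetical placeholders, not the actual metadata published for any dataset mentioned in these slides.

```python
def void_description(dataset_uri, title, uri_space, sparql_endpoint):
    """Return a minimal VoID description of a dataset as a Turtle string."""
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f'    void:uriSpace "{uri_space}" ;',
        f"    void:sparqlEndpoint <{sparql_endpoint}> .",
    ])


if __name__ == "__main__":
    print(void_description(
        dataset_uri="http://example.org/void#bne",        # hypothetical
        title="Example library dataset",                   # hypothetical
        uri_space="http://datos.bne.es/resource/",         # pattern from the slides
        sparql_endpoint="http://example.org/sparql",       # hypothetical
    ))
```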

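Specification: web interface generation and SPARQL queries. Example query counting the works (obras) of which Cervantes is author:
SELECT (COUNT(DISTINCT ?obras) AS ?total) WHERE { <http://datos.bne.es/resource/xx1718747> <http://iflastandards.info/ns/fr/frbr/frbrer/p2010> ?obras }
Here http://datos.bne.es/resource/xx1718747 is the URI of Cervantes and http://iflastandards.info/ns/fr/frbr/frbrer/p2010 is the "is author" property.
Demo: http://linkeddata3.dia.fi.upm.es/bne-demo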
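A query like the one on this slide is typically sent to an endpoint through the SPARQL Protocol, which accepts the query text in a `query` parameter of a GET request. The sketch below only builds the request URL and does not contact any server; the endpoint address is a hypothetical placeholder, since the slide does not give one.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Count the distinct works linked to Cervantes through the "is author" property.
QUERY = """SELECT (COUNT(DISTINCT ?obras) AS ?total) WHERE {
  <http://datos.bne.es/resource/xx1718747>
      <http://iflastandards.info/ns/fr/frbr/frbrer/p2010> ?obras .
}"""

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    url = sparql_get_url(ENDPOINT, QUERY)
    print(url)
    # Decoding the URL again recovers the original query text.
    print(parse_qs(urlsplit(url).query)["query"][0])
```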

Table of contents: 1. The concept. 2. Foundations. 3. The process. 4. Examples: Libraries (http://datos.bne.es, http://linkeddata3.dia.fi.upm.es/bne-demo), Geo (http://geo.linkeddata.es/), Meteorology (http://aemet.linkeddata.es/), Travelling (http://webenemasuno.linkeddata.es/).

[Screenshot: map view showing AEMET observations and an El Viajero layer. Station MADRID, RETIRO, 21:40, 26/5/2011: mean wind direction 276 degrees; wind run 13 Hm; mean wind speed 2.2 m/s; direction of maximum wind speed 251 degrees; maximum wind speed 4.7 m/s; air temperature 18.5 degrees C; relative humidity 75 %; dew point temperature 13.9 degrees C; precipitation 0 litres/m2; pressure 938.4 hPa; pressure reduced to sea level 1013.6 hPa. The El Viajero layer lists items such as "Las chicas de Antón Martín", "Un paseo por Madrid", and "Visitando El Escorial".]

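There is no One-Size-Fits-All Formula. Vocabularies and tools used per phase across the BNE, IGN, AEMET, PRISA, and INE datasets (listed in slide order; the column layout of the original table was lost in extraction):
Modeling: DC, hydrOntology, WGS84, Time, SSN ontology, SIOC, SCOVO, Data Cube.
RDF generation: MARiMbA, geometry2rdf, NOR2O, CSV parser, CSV parser, NOR2O.
Link generation: Silk (three datasets) and NOR2O; link targets include DNB, VIAF, LIBRIS, DBpedia, GeoNames, and Geolinkeddata.es.
Publication and exploitation: Pubby, sitemap4rdf, map4rdf, SPARQL.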

Lessons learnt
URI: follow existing design guidelines for new URIs; reuse existing URIs from authoritative sources.
Models: reuse existing models when available; create new models from authoritative sources; do not forget to align your model with existing models.
RDF: vertical domains usually require specific tools for RDF generation.
Link: generic link discovery tools perform well in vertical domains; link to other datasets using equivalence links (sameAs, e.g. bne:cervantes sameAs Dbpedia:cervantes) and typed links (e.g. Person birthplace Municipality).
Discovery: use sitemap4rdf to allow search engines to find your data.
Use an iterative-incremental life cycle in your development.
Learn about Linked Data with UPM official courses in one week.
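The lesson about generic link discovery can be illustrated with a toy matcher that proposes sameAs candidates by comparing normalised labels. This is a sketch of the general idea only: it is not Silk and does not use its configuration language, and the example.org URIs, labels, and the 0.9 threshold are all hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(label):
    """Lowercase, strip diacritics, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())


def propose_links(source, target, threshold=0.9):
    """Return (source_uri, target_uri, score) for label pairs above the threshold."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = SequenceMatcher(None, normalise(s_label), normalise(t_label)).ratio()
            if score >= threshold:
                links.append((s_uri, t_uri, round(score, 2)))
    return links


# Hypothetical datasets: URI -> label.
SOURCE = {
    "http://example.org/a/alcala": "Alcalá de Henares",
    "http://example.org/a/badajoz": "Badajoz",
}
TARGET = {
    "http://example.org/b/1": "Alcala de  Henares",
    "http://example.org/b/2": "Madrid",
}

if __name__ == "__main__":
    for s_uri, t_uri, score in propose_links(SOURCE, TARGET):
        print(f"<{s_uri}> owl:sameAs <{t_uri}> .  # score {score}")
```

Candidates produced this way would still need review before publication, since label similarity alone can confuse distinct places that share a name.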
