Data Integration and Fusion using RDF
1 Mustafa Jarrar: Lecture Notes, Web Data Management (MCOM7348), University of Birzeit, Palestine, 1st Semester, 2013
Data Integration and Fusion using RDF
Dr. Mustafa Jarrar, University of Birzeit
2 Watch this lecture and download the slides from [link not preserved]. Thanks to Anton Deik for helping me prepare this lecture.
3 Example from the Government Domain
Consider this simplified example from the government domain: three governmental agencies each record information about companies. We will integrate the three databases by transforming each one into RDF and then concatenating the resulting RDF tables into one table. After that, we examine the concatenated data and link the different resources. Data integration is achieved simply by concatenating RDF graphs and linking the resources that describe the same entity. It is also achieved when building and executing queries over the concatenated dataset.
The three databases: Companies DB in Ministry of Justice, Companies DB in Chamber of Commerce, Companies DB in Ministry of Economy.
4 Ministry of Justice
The Ministry of Justice records information about companies and about the advocates who represent them.
Tables: Company, Advocate
5 Ministry of Justice: To RDF
Tables: Company, Advocate
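The table-to-RDF step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the mapping used in the lecture: the helper `rows_to_triples`, the column names, and the sample row are hypothetical (the slides' actual tables are not preserved here). Only the identifier :YH852, the company name Palestine Antiques, and the URIs :Name and :Country appear in the lecture.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def rows_to_triples(table: str, key: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Turn each relational row into one type triple plus one triple per non-key column."""
    triples: List[Triple] = []
    for row in rows:
        subject = ":" + row[key]  # the primary key value becomes the subject URI
        triples.append((subject, "rdf:type", ":" + table))
        for column, value in row.items():
            if column == key:
                continue  # the key is already represented by the subject
            triples.append((subject, ":" + column, value))
    return triples


# Hypothetical Company row; the column layout and the Country value are assumptions.
company_rows = [{"ID": "YH852", "Name": "Palestine Antiques", "Country": "Palestine"}]
company_triples = rows_to_triples("Company", "ID", company_rows)

for triple in company_triples:
    print(triple)
```

Each row yields a small S P O table, which is the shape the later integration slides rely on.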
6 Chamber of Commerce
The Chamber of Commerce records information about companies and about the companies' owners.
Tables: Company, Owner, Company_Owner
7 Chamber of Commerce: To RDF
8 Ministry of Economy
The Ministry of Economy records information about companies, their owners, and their advocates.
Tables: Company, Owner, Lawyer
9 Ministry of Economy: To RDF
10 Integration of RDF Data
As simple as concatenating the S P O (subject, predicate, object) tables: S P O + S P O + S P O.
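A minimal sketch of this concatenation, assuming each source has already been turned into a set of (subject, predicate, object) tuples. The function name `integrate`, the assignment of identifiers to agencies, and the identifier :ECON_ID are assumptions made for illustration only.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def integrate(*graphs: Set[Triple]) -> Set[Triple]:
    """Concatenate any number of triple sets into one; duplicate triples collapse."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


# Which identifier belongs to which agency is assumed; :ECON_ID is a placeholder.
justice = {(":YH852", ":Name", "Palestine Antiques")}
commerce = {(":4354JU", ":Name", "Palestine Antiques")}
economy = {(":ECON_ID", ":Name", "Palestine Antiques")}

integrated = integrate(justice, commerce, economy)
print(len(integrated))
```

Because every source shares the same three-column shape, no schema alignment is needed before merging; the same company still appears under three different identifiers, which is what the linking step addresses.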
11 In our example
12 Linking resources
How are the same entities, described in different datasets, linked? By linking their global identifiers, that is, their URIs**! Let's have a look:
:YH852 owl:sameAs :
:YH852 owl:sameAs :4354JU
- Links the company called Palestine Antiques in the three databases.
- This is called entity resolution/disambiguation.
:H782YU owl:sameAs :L85652r
- Links the lawyer called Tony Deik recorded in the Ministry of Justice and the Ministry of National Economy.
- This is called entity resolution/disambiguation.
** Note that in our example we used colons to distinguish URIs. For example, :JK452, :H782YU, :Country, and :Name are all URIs, and :H782YU might actually be shorthand for a full URI.
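One way an application might act on such owl:sameAs links is to group linked identifiers and rewrite every triple to use a single representative per group. The sketch below uses a small union-find structure for this; it is an assumption about how fusion could be implemented, not the lecture's method. The predicate :Advocate and the function names are hypothetical, while the identifiers and names come from the slide.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
SAME_AS = "owl:sameAs"


def canonical_map(triples: Iterable[Triple]) -> Dict[str, str]:
    """Map every identifier that occurs in an owl:sameAs link to one representative."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Keep the lexicographically smaller root so the result is deterministic.
                keep, drop = sorted((root_s, root_o))
                parent[drop] = keep
    return {x: find(x) for x in list(parent)}


def fuse(triples: Set[Triple]) -> Set[Triple]:
    """Rewrite subjects and objects to their representatives and drop the links."""
    canon = canonical_map(triples)
    return {
        (canon.get(s, s), p, canon.get(o, o))
        for s, p, o in triples
        if p != SAME_AS
    }


integrated = {
    (":YH852", ":Name", "Palestine Antiques"),
    (":4354JU", ":Name", "Palestine Antiques"),
    (":H782YU", ":Name", "Tony Deik"),
    (":L85652r", ":Name", "Tony Deik"),
    (":YH852", ":Advocate", ":L85652r"),  # hypothetical predicate and pairing
    (":YH852", SAME_AS, ":4354JU"),
    (":H782YU", SAME_AS, ":L85652r"),
}

fused = fuse(integrated)
for triple in sorted(fused):
    print(triple)
```

After fusion the company and the lawyer each appear under one identifier, so facts recorded by different agencies attach to the same node.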
13 Data Integration and Fusion
Concatenating RDF graphs and linking entities in different datasets forms an integrated view in which applications see all datasets as one integrated database.
Source: Christian Bizer
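To illustrate what "one integrated database" can mean for an application, here is a rough sketch of a query that follows facts which might originate from different agencies. Everything apart from the names Palestine Antiques and Tony Deik and the identifiers :YH852, :H782YU, and :JK452 is an assumption: the predicates :Advocate and :Owner, the owner's name, the role given to :JK452, and the helper functions are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(graph: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def names_linked_to(graph: Set[Triple], company_name: str, predicate: str) -> List[str]:
    """Names of the people connected to the named company through the given predicate."""
    names: List[str] = []
    for company, _, _ in match(graph, p=":Name", o=company_name):
        for _, _, person in match(graph, s=company, p=predicate):
            for _, _, name in match(graph, s=person, p=":Name"):
                names.append(name)
    return names


# A fused graph after entity resolution; the source agency of each fact is assumed.
fused = {
    (":YH852", ":Name", "Palestine Antiques"),
    (":YH852", ":Advocate", ":H782YU"),   # assumed to come from the Ministry of Justice
    (":H782YU", ":Name", "Tony Deik"),
    (":YH852", ":Owner", ":JK452"),       # assumed to come from the Chamber of Commerce
    (":JK452", ":Name", "Sample Owner"),  # hypothetical value
}

print(names_linked_to(fused, "Palestine Antiques", ":Advocate"))
print(names_linked_to(fused, "Palestine Antiques", ":Owner"))
```

The query never mentions which agency supplied a fact, which is the point of the integrated view; a SPARQL engine would express the same joins declaratively.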
14 Practical Session
15 Practical Session
Description (from previous practical sessions): The central management of students' profiles by the Ministry of Higher Education has become an urgent need in recent years. Many students in Palestine move from one university to another, and they need to transfer their academic records. The ministry also needs to certify students' diplomas and mark sheets. Moreover, there is a need to centrally manage and monitor students' financial aid. Therefore, the ministry has decided to build a national student registry: each semester, every university has to send the academic record of every student to the ministry. The ministry will then update and integrate the academic records, combining the data from all universities into the national student registry.
The ministry wants to use RDF to integrate this data. Thus, each university must map its relational data (or data in any other model) into RDF, and at the ministry this data is integrated and fused.
Task: Map the universities' relational data into RDF, then integrate and fuse it.
16 Practical Session
- Every two students form a group. Each group must be composed of students from different universities (in their first-level degrees).
- Students are expected to use three different mark sheets from different universities to construct three different hypothetical relational schemas of students' records.
- Students must populate the three databases (one per schema) with sample data.
- Students must integrate and fuse all data using RDF.
- Students are strongly encouraged to use the ontologies developed in previous practical sessions when mapping and integrating RDF data.
- Students must write at least three SPARQL queries on the integrated RDF data, each involving data from all three sources.
- Students must carry out this practical session using Oracle Semantic Technologies.
- After finalizing their work, each group will be asked to present it to all students in order to collect comments and feedback.
- The final delivery includes: (i) snapshots of the three hypothetical databases and schemas taken from Oracle DB; (ii) the RDF mapping of each database (SPO tables); (iii) the integrated final RDF, showing how entities were disambiguated; (iv) the executed SPARQL queries and their results.
- Note that the final delivery should take the form of a report in which the various steps are clearly discussed.
