KD2R: a Key Discovery method for semantic Reference Reconciliation
1 KD2R: a Key Discovery method for semantic Reference Reconciliation
Danai Symeonidou, Nathalie Pernelle and Fatiha Saïs
LRI (University Paris-Sud XI)
February 8th, 2013
2 Linked Open Data cloud (LOD)
- The LOD contains the RDF sources published on the Web and the links between them.
- SameAs is the most important type of link: it combines the information given in different data sources.
- The number of already existing links is very small.
- How can links be created automatically?
3 Reference Reconciliation Problem
Dataset1, Dataset2
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer]
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer]
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Teacher]
4 Reference Reconciliation Problem
Dataset1, Dataset2
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer] SameAs [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer]
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Teacher]
5 Reference Reconciliation Problem
Dataset1, Dataset2
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer]
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Teacher]
- SameAs, SameAs
- [FirstName: Michael, LastName: Jackson, SSN: , Job: Singer]
6 Reference Reconciliation Problem
How do we decide whether two identifiers refer to the same real-world entity?

SOURCE1
ID  Name                   Located  inCountry  TicketPrice
11  Madame Tussauds        London   UK         O
12  Royal Academy of Arts  London   UK         O

SOURCE2
ID  Name                   Located  inCountry  TicketPrice
21  Tate Britain           London   England    Free
22  Royal Academy of Arts  London   England    Free
8 Reference Reconciliation Problem
Same SOURCE1 and SOURCE2 tables. A similarity score Sim is computed between identifiers 12 and 22.
9 Reference Reconciliation Problem
Sim(12, 22) = 0.5
10 Reference Reconciliation Problem
Name is a KEY.
11 Reference Reconciliation Problem
Using keys: Sim = 1
12 Reference Reconciliation Problem
Using keys: Sim(12, 22) = 1 → SameAs
13 Reference Reconciliation Problem
Solution → Use keys to reconcile data.
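The two scores on these slides can be reproduced with a small Python sketch. The slides do not define Sim, so `naive_sim` (fraction of properties with identical values) and `key_sim` are assumptions made for illustration only; the record values are copied from the SOURCE1 and SOURCE2 tables.

```python
# Records 12 and 22 as they appear in the SOURCE1 / SOURCE2 tables.
r12 = {"Name": "Royal Academy of Arts", "Located": "London",
       "inCountry": "UK", "TicketPrice": "O"}
r22 = {"Name": "Royal Academy of Arts", "Located": "London",
       "inCountry": "England", "TicketPrice": "Free"}


def naive_sim(a, b):
    """Hypothetical similarity: share of common properties with equal values."""
    props = sorted(set(a) & set(b))
    return sum(a[p] == b[p] for p in props) / len(props)


def key_sim(a, b, key):
    """Hypothetical key-aware similarity: agreement on every key property
    is treated as certainty (1.0); otherwise fall back to naive_sim."""
    if all(a.get(p) is not None and a.get(p) == b.get(p) for p in key):
        return 1.0
    return naive_sim(a, b)


print(naive_sim(r12, r22))          # 0.5: Name and Located match, the rest differ
print(key_sim(r12, r22, ["Name"]))  # 1.0: the key property Name matches
```

With all properties weighted equally the pair stays ambiguous; once Name is known to be a key, agreement on Name alone is enough to assert SameAs.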
14 Reference Reconciliation with or without key constraints
- No knowledge given about the properties: all the properties have the same importance.
- Knowledge given by an expert:
  - Specific expert rules [Arasu et al. 09, Low et al. 01, Volz et al. 09 (Silk)]
    Example: max(jaro(phone-number, phone-number), jaro-winkler(ssn, ssn)) > 0.88
  - Key constraints [Saïs, Pernelle and Rousset 09]
    Example: haskey( () (museumname, museumaddress))
- Problem: when data sources contain numerous data and/or complex ontologies,
  - some keys are not obvious for the expert to find;
  - erroneous keys can be given by the expert.
- Aim: automatic discovery of a complete set of keys from RDF data.
15 Key discovery methods
- Supervised → learn keys using a set of reconciled data
- Unsupervised → no additional information is given
- Property-based → guided by the properties
  - Suchanek et al. (only single keys)
  - Atencia et al. (CWA)
- Instance-based → guided by the instances
  - Symeonidou et al. (multi keys, OWA)
16 Key definition
- RDF data conform to an OWL2 RL ontology.
- Key for a class expression: a combination of (inverse) properties which uniquely identifies an entity.

HasKey( CE ( OPE_1 ... OPE_m ) ( DPE_1 ... DPE_n ) )

\[
\begin{aligned}
&\forall x, y, z_1, \dots, z_m, w_1, \dots, w_n:\\
&\quad \text{if } x \in (CE)^{C} \text{ and } \mathrm{ISNAMED}_O(x) \text{ and } y \in (CE)^{C} \text{ and } \mathrm{ISNAMED}_O(y) \text{ and}\\
&\quad (x, z_i) \in (OPE_i)^{OP} \text{ and } (y, z_i) \in (OPE_i)^{OP} \text{ and } \mathrm{ISNAMED}_O(z_i) \text{ for each } 1 \le i \le m \text{ and}\\
&\quad (x, w_j) \in (DPE_j)^{DP} \text{ and } (y, w_j) \in (DPE_j)^{DP} \text{ for each } 1 \le j \le n\\
&\quad \text{then } x = y
\end{aligned}
\]

Example: if we consider haskey(city (Inverse(isincity)) ()) as a key and the dataset contains
isincity(restaurant1, city1), isincity(restaurant1, city2), isincity(restaurant2, city2),
then we infer that city1 = city2.
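The inference in the example above can be sketched in Python. This is only an illustration of the key on Inverse(isincity), assuming every individual is named; `infer_same_as` is a hypothetical helper, not part of KD2R or of any OWL library.

```python
from collections import defaultdict
from itertools import combinations

# isincity(restaurant, city) facts from the slide.
FACTS = [("restaurant1", "city1"),
         ("restaurant1", "city2"),
         ("restaurant2", "city2")]


def infer_same_as(is_in_city_facts):
    """Apply the key on Inverse(isincity): two cities that share a value
    for the inverse property (the same restaurant) must be equal."""
    cities_by_restaurant = defaultdict(set)
    for restaurant, city in is_in_city_facts:
        cities_by_restaurant[restaurant].add(city)
    same = set()
    for cities in cities_by_restaurant.values():
        for a, b in combinations(sorted(cities), 2):
            same.add((a, b))
    return same


print(infer_same_as(FACTS))  # {('city1', 'city2')}
```

restaurant1 is located in both city1 and city2, so the two cities share a value for the inverse property and are identified; restaurant2 contributes nothing because it is linked to a single city.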
17 Key Discovery Problem in OWA
- A set of RDF data sources: each data source conforms to an OWL 2 ontology.
- Multivalued properties may exist.
- Open world assumption (incomplete data).

ID  name     firstname  hasfriend
i1  Atencia  Manuel     i2, i3
i2  Atencia  Madalina
i3  David    Jérôme     i2, i4
i4  Chein    Michel

How can keys be discovered when we do not know whether:
- i1 =?= i2 =?= i3 =?= i4
- hasfriend(i1, i4)? hasfriend(i2, i3)?
- firstname(i1, Elodie)?
18 Key Discovery Problem: our assumptions
- Unique Name Assumption (UNA): two distinct URIs refer to two different real-world entities.
  - In the LOD, we consider the data sources generated from relational databases or those built in a way that fulfils the UNA (Yago).
  - i1 ≠ i2 ≠ i3 ≠ i4
- Two literals that are syntactically different are semantically different (e.g. Napoleon Bonaparte ≠ Napoleon).
- Heuristic 1 - Pessimistic:
  - Not instantiated property → all the values are possible.
    Example: hasfriend(i2, i3), hasfriend(i2, i4) are possible.
  - Instantiated property → only the given values are considered.
    Example: not hasfriend(i1, i4)
19 Key Discovery Problem: our assumptions
- A set of property expressions {pe_1, ..., pe_n} is a non key for the class c in a data source s_i if:
  Example: {name} and {hasfriend} are non keys.
- A set of property expressions {pe_1, ..., pe_n} is a key for the class c in a data source s_i if:
  Example: {firstname}, {name, firstname}, {firstname, hasfriend} are keys.
- {hasfriend, name} is neither a key nor a non key; it is called an undetermined key.
20 Key Discovery Problem: our assumptions
- Heuristic 2 - Optimistic:
  - Not instantiated property → the value is not one of the already existing ones.
    Example: not hasfriend(i2, i3), not hasfriend(i2, i1), not hasfriend(i2, i4).
  - Instantiated property → only the given values are considered.
    Example: not hasfriend(i1, i4)
- The definition of non keys is the same.
- A set of property expressions {pe_1, ..., pe_n} is a key for the class c in a data source s_i if:
  pe_j, Z pe_j(X, Z) W pe_j(Y, W) or
  Example: {firstname}, {name, firstname}, {firstname, hasfriend} are keys.
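The formal definitions are not fully legible in this transcript, so the following Python sketch encodes one possible reading of the two heuristics and checks it against the four-instance example (i1 to i4). The function `classify` and its pairwise tests are assumptions for illustration, not the authors' definitions.

```python
from itertools import combinations

# The four instances from the OWA example; a missing entry means the
# property is not instantiated for that instance.
DATA = {
    "i1": {"name": {"Atencia"}, "firstname": {"Manuel"}, "hasfriend": {"i2", "i3"}},
    "i2": {"name": {"Atencia"}, "firstname": {"Madalina"}},
    "i3": {"name": {"David"}, "firstname": {"Jérôme"}, "hasfriend": {"i2", "i4"}},
    "i4": {"name": {"Chein"}, "firstname": {"Michel"}},
}


def classify(data, props, heuristic="pessimistic"):
    """Return 'non key', 'key' or 'undetermined' for a set of properties.

    Assumed reading: a pair of instances makes the set a non key when both
    share a value for every property. Under the pessimistic heuristic a
    missing value may be anything, so it never separates a pair; under the
    optimistic heuristic a missing value is assumed to differ.
    """
    undetermined = False
    for x, y in combinations(data.values(), 2):
        shared_all = True   # the pair provably agrees on every property
        may_collide = True  # no property provably separates the pair
        for p in props:
            vx, vy = x.get(p), y.get(p)
            if vx is None or vy is None:
                shared_all = False
                if heuristic == "optimistic":
                    may_collide = False
            elif not (vx & vy):
                shared_all = False
                may_collide = False
        if shared_all:
            return "non key"
        if may_collide:
            undetermined = True
    return "undetermined" if undetermined else "key"


for props in (["name"], ["hasfriend"], ["firstname"],
              ["firstname", "hasfriend"], ["hasfriend", "name"]):
    print(props, classify(DATA, props))
```

Under the pessimistic setting this reproduces the statuses listed on the slides: {name} and {hasfriend} are non keys, {firstname} and {firstname, hasfriend} are keys, and {hasfriend, name} is undetermined because i1 and i2 share a name while hasfriend is not instantiated for i2.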
21 KD2R approach
- Find all minimal keys that are valid w.r.t. the previous definition, in all the considered data sources.
- Scalability:
  - do not check all the combinations of properties;
  - scan the data only partially.
- First find the set of maximal non keys and undetermined keys (inspired by Gordian [Sismanis et al. 2006]) → derive the keys from this set.
- Unlike Gordian, KD2R:
  - is ontology based: the subsumption relation is exploited to inherit keys;
  - considers multi-valued properties and incomplete information.
22 KD2R approach
- Topological sort of the classes (subsumption).
- The keys are obtained by selecting the minimal keys of the Cartesian product (w.r.t. mappings) of the minimal key sets discovered in the sources S1 and S2.
- Example:
  K1 = {{name, firstname}, {hasfriend}}
  K2 = {{firstname}}
  K1-2 = {{name, firstname}, {hasfriend, firstname}}
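The merge of K1 and K2 can be checked with a short Python sketch. It assumes the properties of the two sources are already aligned (the mapping is the identity); `merge_keys` and `minimal_sets` are hypothetical helper names.

```python
from itertools import product


def minimal_sets(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {s for s in sets
            if not any(t != s and t.issubset(s) for t in sets)}


def merge_keys(k1, k2):
    """Union every key of k1 with every key of k2 (Cartesian product),
    then keep the minimal results."""
    return minimal_sets(a | b for a, b in product(k1, k2))


K1 = {frozenset({"name", "firstname"}), frozenset({"hasfriend"})}
K2 = {frozenset({"firstname"})}

for key in merge_keys(K1, K2):
    print(sorted(key))
```

Each merged key is a union of one key per source, so it is valid in both sources; the result here is {name, firstname} and {hasfriend, firstname}, matching K1-2 on the slide.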
23 KD2R approach: Key Finder
- The set of maximal non keys and undetermined keys is computed on a prefix tree (a compact representation of the data of one class).
- Key derivation:
  - computation of the complement set of each non key and undetermined key;
  - computation of the Cartesian product of the complement sets;
  - selection of the minimal keys.
- Time complexity: quadratic in terms of the number of discovered keys.
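The three derivation steps listed above can be sketched as follows. This is a minimal illustration, not the KD2R implementation; `derive_keys` is a hypothetical name, and the first call uses made-up property names a to d.

```python
from itertools import product


def derive_keys(properties, non_keys):
    """Derive minimal keys from maximal non keys / undetermined keys.

    1. complement of each non key w.r.t. the full property set,
    2. Cartesian product of the complements (one property from each),
    3. keep only the minimal candidate sets.
    """
    complements = [frozenset(properties) - frozenset(nk) for nk in non_keys]
    candidates = {frozenset(choice) for choice in product(*complements)}
    return {c for c in candidates
            if not any(o != c and o.issubset(c) for o in candidates)}


# Hypothetical properties and non keys.
print(derive_keys("abcd", [{"a", "b", "c"}, {"a", "d"}]))

# Property names taken from the museum example of the later slides.
museum_props = ["incountry", "located", "contains", "Name", "Address"]
print(derive_keys(museum_props, [{"incountry", "located", "contains"}]))
```

A candidate contains at least one property outside every non key, so it cannot be a subset of any of them. In the first call the derived keys are {b, d} and {c, d}; in the second, a single non key {incountry, located, contains} would leave {Name} and {Address} as candidates.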
24 Pessimistic: Prefix-tree Creation - Step 1

ID  incountry  located  contains      museumname       museumaddress
1   Greece     City                   Archaeological   44 Patission Street
2   France              S1_p4, S1_p5                   19 rue Beaubourg
3   France     City                   Musee d'orsay    62, rue de Lille
4   England    City                   Madame Tussauds  Marylebone Road

Prefix tree (one level per attribute, nodes separated by "|", each cell written as value {URI list}):
incountry: Greece {M1}, France {M2, M3}, England {M4}
located:   City1 {M1} | Null, City 3 {M3} | City 4 {M4}
contains:  Null {M1} | P4, P5 | Null {M3} | Null {M4}
Name:      Archaeological {M1} | Musee d'orsay {M3} | Madame Tussauds {M4}
Address:   44 Patission Street {M1} | rue Beaubourg | rue Beaubourg | rue de Lille {M3} | Marylebone Road {M4}

- Each level represents an attribute of a class.
- Each node describes instances that share the same father-cell value.
- Each cell contains a value and a list of identifiers (URI list).
25 Pessimistic: Prefix-tree Creation - Step 1
Same table. Tree before the cells of instance M3 are added:
incountry: Greece {M1}, France, England {M4}
located:   City1 {M1} | Null | City 4 {M4}
contains:  Null {M1} | P4, P5 | Null {M4}
Name:      Archaeological {M1} | Madame Tussauds {M4}
Address:   44 Patission Street {M1} | rue Beaubourg | rue Beaubourg | Marylebone Road {M4}
26 Pessimistic: Prefix-tree Creation - Step 1
The incountry level now reads France {M2, M3}.
27 Pessimistic: Prefix-tree Creation - Step 1
The located level gains the cell City 3 {M3}.
28 Pessimistic: Prefix-tree Creation - Step 1
The contains level gains the cell Null {M3}.
29 Pessimistic: Prefix-tree Creation - Step 1
The Name level gains the cell Musee d'orsay {M3}.
30 Pessimistic: Prefix-tree Creation - Step 1
The Address level gains the cell rue de Lille {M3}.
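Step 1 can be sketched in Python as a nested dictionary. This is an assumed representation (value → URI set plus child node), not the data structure used by KD2R; a missing value is stored under `None`, mirroring the Null cells, and the blank museumname of instance M2 is treated as missing.

```python
ATTRIBUTES = ["incountry", "located", "contains", "museumname", "museumaddress"]

# Instances M1..M4 from the table; absent properties are not instantiated.
MUSEUMS = {
    "M1": {"incountry": ["Greece"], "located": ["City1"],
           "museumname": ["Archaeological"],
           "museumaddress": ["44 Patission Street"]},
    "M2": {"incountry": ["France"], "contains": ["P4", "P5"],
           "museumaddress": ["19 rue Beaubourg"]},
    "M3": {"incountry": ["France"], "located": ["City3"],
           "museumname": ["Musee d'orsay"],
           "museumaddress": ["62, rue de Lille"]},
    "M4": {"incountry": ["England"], "located": ["City4"],
           "museumname": ["Madame Tussauds"],
           "museumaddress": ["Marylebone Road"]},
}


def insert(node, uri, description, attributes, level=0):
    """Insert one instance; a multi-valued property creates one cell per value."""
    if level == len(attributes):
        return
    values = description.get(attributes[level]) or [None]
    for value in values:
        cell = node.setdefault(value, {"uris": set(), "child": {}})
        cell["uris"].add(uri)
        insert(cell["child"], uri, description, attributes, level + 1)


def build_prefix_tree(instances, attributes):
    root = {}
    for uri, description in instances.items():
        insert(root, uri, description, attributes)
    return root


tree = build_prefix_tree(MUSEUMS, ATTRIBUTES)
print(sorted(tree["France"]["uris"]))                  # ['M2', 'M3']
print(sorted(tree["France"]["child"][None]["child"]))  # ['P4', 'P5']
```

Instances that share a value at a level share the cell, which is why France carries both M2 and M3; the Null cells are kept at this stage and are only resolved in Step 2.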
31 Pessimistic: Prefix-tree Creation - Step 2
Starting from the Step 1 tree, Step 2 merges the cells of a node and then merges the nodes.

Final prefix tree:
incountry: Greece {M1}, France {M2, M3}, England {M4}
located:   City 1 {M1} | City3 {M2, M3} | City4 {M4}
contains:  Null {M1} | P4 {M2, M3}, P5 {M2, M3} | Null {M4}
Name:      Archaeological {M1} | Musee d'orsay {M3} | Musee d'orsay {M3} | Madame Tussauds {M4}
Address:   44 Patission Street {M1} | rue Beaubourg, rue de Lille {M3} | rue Beaubourg, rue de Lille {M3} | Marylebone Road {M4}
32 UNKeyFinder
Wax(S1_m2), museumname(S1_m2, Wax),
Prefix tree creation → UNKeyFinder → maximal undetermined keys and non keys
- Input: one dataset, one class, a set of known keys.
- Output: the set of maximal non keys and undetermined keys.
- Examination of each possible subset of attributes.
- Recursive method; the traversal is top-down and left-first.
- When the URI list has more than one element, at least two instances share the same value for a specific subset of attributes → the subset of attributes belongs to a UNKey.
- Different prunings:
  - Key monotonicity: detection of paths describing one entity; use existing inherited keys to avoid exploring sub-trees in the prefix tree.
  - Non key anti-monotonicity: use the already computed non keys to avoid exploring sub-trees in the prefix tree.
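The anti-monotonic pruning can be illustrated on a flat table. The sketch below is deliberately not the prefix-tree traversal of UNKeyFinder: it swaps in a plain subset enumeration (largest subsets first) over hypothetical, complete, single-valued rows, and only shows why subsets of a known non key need not be examined.

```python
from itertools import combinations

# Hypothetical toy rows, fully instantiated and single-valued.
ROWS = [
    {"country": "France", "city": "Paris", "name": "Orsay"},
    {"country": "France", "city": "Paris", "name": "Louvre"},
    {"country": "UK", "city": "London", "name": "Tate"},
]


def is_non_key(rows, attrs):
    """True when two rows share the same values for all attrs."""
    seen = set()
    for row in rows:
        signature = tuple(row[a] for a in attrs)
        if signature in seen:
            return True
        seen.add(signature)
    return False


def maximal_non_keys(rows, attributes):
    """Enumerate subsets from largest to smallest, skipping every subset
    of an already known non key (non key anti-monotonicity)."""
    maximal = []
    for size in range(len(attributes), 0, -1):
        for combo in combinations(attributes, size):
            subset = frozenset(combo)
            if any(subset.issubset(m) for m in maximal):
                continue  # implied non key, and not maximal
            if is_non_key(rows, combo):
                maximal.append(subset)
    return maximal


print(maximal_non_keys(ROWS, ["country", "city", "name"]))
```

Here {country, city} is found to be a non key at size 2, so {country} and {city} are never tested; the only maximal non key is {country, city}.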
33 UNKeyFinder Example
Final prefix tree used throughout the example (attributes: incountry, located, contains, Name, Address):
incountry: Greece {M1}, France {M2, M3}, England {M4}
located:   City 1 {M1} | City3 {M2, M3} | City4 {M4}
contains:  Null {M1} | P4 {M2, M3}, P5 {M2, M3} | Null {M4}
Name:      Archaeological {M1} | Musee d'orsay {M3} | Musee d'orsay {M3} | Madame Tussauds {M4}
Address:   44 Patission Street {M1} | rue Beaubourg, rue de Lille {M3} | rue Beaubourg, rue de Lille {M3} | Marylebone Road {M4}
We call the UNKeyFinder for the highlighted node. Since the URI list has size 1, we stop: pruning step (key monotonicity).
34 UNKeyFinder Example
We call the UNKeyFinder for the highlighted node.
35 UNKeyFinder Example
We call the UNKeyFinder for the highlighted node.
36 UNKeyFinder Example
We call the UNKeyFinder for the node. In the next step we follow the left child of the highlighted node.
37 UNKeyFinder Example
We call the UNKeyFinder for the highlighted node.
- Cell with URI list of size 1: pruning step (1).
- Cell Musee d'orsay with URI list of size 1: pruning step (1).
- Now we have to merge the children of the node and call UNKeyFinder for the merged node.
38 UNKeyFinder Example
We call the UNKeyFinder for the highlighted node. The Name level now reads: Archaeological {M1}, Musee d'orsay {M3}, Madame Tussauds {M4}.
39 UNKeyFinder Example
Since there is a cell whose URI list has more than one element, the curunkey is a UNKey: {incountry, located, contains}.
40 Experiments: OAEI 10 datasets

Dataset      RDF file         #instances
Restaurants  Restaurant1.rdf  339
             Restaurant2.rdf  1390
Person       Person11.rdf     1000
             Person12.rdf     1000
             Person21.rdf     1200

Experiments executed to compare KD2R keys and expert keys.

Dataset                Class       Property set
Restaurants (2 files)  Restaurant  name, phonenumber, hascategory, hasaddress
                       Address     street, city, Inverse(hasAddress)
Person (3 files)       Person      givenname, state, surname, dateofbirth, socsecurityid, phonenumber, age, hasaddress
                       Address     street, housenumber, postcode, isinsuburb
41 Person Dataset
The Person dataset consists of 2000 instances of the classes Person and Address.
42 Restaurant Dataset
The Restaurant dataset describes 1729 instances (classes Restaurant and Address).
43 ChefMoz Dataset
Instances of the class Restaurant.
44 DBpedia Dataset
- DBpedia Person → 6 discovered keys (instances, RDF triples)
- Natural Places → 21 discovered keys (instances, RDF triples)
- Subclasses of Natural Places:
  - Lake → 6 discovered keys
  - BodyOfWater → 17 discovered keys
45 Conclusion
- An approach that discovers composite keys in RDF datasets:
  - different ontologies (aligned)
  - Unique Name Assumption
- Experiments:
  - Discovered keys improve the data linking.
  - KD2R is scalable thanks to the pruning techniques. Ex.: DBpedia Natural Places, 5% of the data explored.
46 Future work
- DAVI approach
- Keys with N exceptions: keys for which N instances violate the definition of the key
- Conditional keys
47 QUESTIONS???
48 THANK YOU!!!
Binary Trees and Huffman Encoding Binary Search Trees
Binary Trees and Huffman Encoding Binary Search Trees Computer Science E119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary
Automating Big Data Management, by DISIT Lab Distributed [Systems and Internet, Data Intelligence] Technologies Lab Prof. Ph.D. Eng.
Automating Big Data Management, by DISIT Lab Distributed [Systems and Internet, Data Intelligence] Technologies Lab Prof. Ph.D. Eng. Paolo Nesi Dipartimento di Ingegneria dell Informazione, DINFO Università
Semantic Description of Distributed Business Processes
Semantic Description of Distributed Business Processes Authors: S. Agarwal, S. Rudolph, A. Abecker Presenter: Veli Bicer FZI Forschungszentrum Informatik, Karlsruhe Outline Motivation Formalism for Modeling
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, and Bhavani Thuraisingham University of Texas at Dallas, Dallas TX 75080, USA Abstract.
COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model
COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model The entity-relationship (E-R) model is a a data model in which information stored
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
Joint Steering Committee for Development of RDA
Page 1 of 11 To: From: Subject: Joint Steering Committee for Development of RDA Gordon Dunsire, Chair, JSC Technical Working Group RDA models for authority data Abstract This paper discusses the models
Object-Process Methodology as a basis for the Visual Semantic Web
Object-Process Methodology as a basis for the Visual Semantic Web Dov Dori Technion, Israel Institute of Technology, Haifa 32000, Israel [email protected], and Massachusetts Institute of Technology,
No More Keyword Search or FAQ: Innovative Ontology and Agent Based Dynamic User Interface
IAENG International Journal of Computer Science, 33:1, IJCS_33_1_22 No More Keyword Search or FAQ: Innovative Ontology and Agent Based Dynamic User Interface Nelson K. Y. Leung and Sim Kim Lau Abstract
Big Data Management Assessed Coursework Two Big Data vs Semantic Web F21BD
Big Data Management Assessed Coursework Two Big Data vs Semantic Web F21BD Boris Mocialov (H00180016) MSc Software Engineering Heriot-Watt University, Edinburgh April 5, 2015 1 1 Introduction The purpose
Visual Analysis of Statistical Data on Maps using Linked Open Data
Visual Analysis of Statistical Data on Maps using Linked Open Data Petar Ristoski and Heiko Paulheim University of Mannheim, Germany Research Group Data and Web Science {petar.ristoski,heiko}@informatik.uni-mannheim.de
Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees
Learning Outcomes COMP202 Complexity of Algorithms Binary Search Trees and Other Search Trees [See relevant sections in chapters 2 and 3 in Goodrich and Tamassia.] At the conclusion of this set of lecture
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
A Logical Approach to NoSQL Databases
A Logical Approach to NoSQL Databases Francesca Bugiotti, Luca Cabibbo, Paolo Atzeni, Riccardo Torlone Università Roma Tre, Italy {bugiotti,cabibbo,atzeni,torlone}@dia.uniroma3.it ABSTRACT Although NoSQL
Semantic Variability Modeling for Multi-staged Service Composition
Semantic Variability Modeling for Multi-staged Service Composition Bardia Mohabbati 1, Nima Kaviani 2, Dragan Gašević 3 1 Simon Fraser University, 2 University of British Columbia, 3 Athabasca University,
CRM dig : A generic digital provenance model for scientific observation
CRM dig : A generic digital provenance model for scientific observation Martin Doerr, Maria Theodoridou Institute of Computer Science, FORTH-ICS, Crete, Greece Abstract The systematic large-scale production
XV. The Entity-Relationship Model
XV. The Entity-Relationship Model The Entity-Relationship Model Entities, Relationships and Attributes Cardinalities, Identifiers and Generalization Documentation of E-R Diagrams and Business Rules The
SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA
SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA J.RAVI RAJESH PG Scholar Rajalakshmi engineering college Thandalam, Chennai. [email protected] Mrs.
LiDDM: A Data Mining System for Linked Data
LiDDM: A Data Mining System for Linked Data Venkata Narasimha Pavan Kappara Indian Institute of Information Technology Allahabad Allahabad, India [email protected] Ryutaro Ichise National Institute of
A Collaborative System Software Solution for Modeling Business Flows Based on Automated Semantic Web Service Composition
32 A Collaborative System Software Solution for Modeling Business Flows Based on Automated Semantic Web Service Composition Ion SMEUREANU, Andreea DIOŞTEANU Economic Informatics Department, Academy of
