Module 16. Semantic Search

Similar documents
Slide 7. Jashapara, Knowledge Management: An Integrated Approach, 2 nd Edition, Pearson Education Limited Nisan 14 Pazartesi

Mining Text Data: An Introduction

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Weblogs Content Classification Tools: performance evaluation

How To Make Sense Of Data With Altilia

AN INTEGRATION APPROACH FOR THE STATISTICAL INFORMATION SYSTEM OF ISTAT USING SDMX STANDARDS

Text Analytics Software Choosing the Right Fit

Databases in Organizations

How To Use Data Analysis To Get More Information From A Computer Or Cell Phone To A Computer

Text Analytics Evaluation Case Study - Amdocs

In principle, SAP BW architecture can be divided into three layers:

Search and Information Retrieval

Artificial Intelligence & Knowledge Management

Delivering Smart Answers!

A Data Browsing from Various Sources Driven by the User s Data Models

Taxonomies in Practice Welcome to the second decade of online taxonomy construction

ETL Tools. L. Libkin 1 Data Integration and Exchange

IBM Cognos Training: Course Brochure. Simpson Associates: SERVICE associates.co.uk

Search Result Optimization using Annotators

Pulsar TRAC. Big Social Data for Research. Made by Face

Natural Language Processing in the EHR Lifecycle

Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot

- a Humanities Asset Management System. Georg Vogeler & Martina Semlak

POLAR IT SERVICES. Business Intelligence Project Methodology

Data Warehouse: Introduction

OLAP Visualization Operator for Complex Data

The Value of Taxonomy Management Research Results

Metadata Repositories in Health Care. Discussion Paper

ToxiCat: Hybrid Named Entity Recognition services to support curation of the Comparative Toxicogenomic Database

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

Knowledge as a Service for Agriculture Domain

A Design and implementation of a data warehouse for research administration universities

Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

Open Ontology Repository Initiative

AIIM ECM Certificate Programme

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

A Knowledge Base Approach for Genomics Data Analysis

HOW TO DO A SMART DATA PROJECT

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

Semantic SharePoint. Technical Briefing. Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company

Chapter 11 Mining Databases on the Web

SemWeB Semantic Web Browser Improving Browsing Experience with Semantic and Personalized Information and Hyperlinks

Presente e futuro del Web Semantico

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo

SQL Server 2012 Business Intelligence Boot Camp

Turning Big Data into a Big Opportunity

Chapter 6. Attracting Buyers with Search, Semantic, and Recommendation Technology

Web 3.0 image search: a World First

Internet of Things, data management for healthcare applications. Ontology and automatic classifications

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Monitoring Genebanks using Datamarts based in an Open Source Tool

Personalized Business Intelligence

THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE

QAD Business Intelligence Data Warehouse Demonstration Guide. May 2015 BI 3.11

SharePoint Training DVD Videos

IFS-8000 V2.0 INFORMATION FUSION SYSTEM

ETL TESTING TRAINING

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc.

Text Mining for Health Care and Medicine. Sophia Ananiadou Director National Centre for Text Mining

SWIFT: A Text-mining Workbench for Systematic Review

The Ontological Approach for SIEM Data Repository

Flattening Enterprise Knowledge

TopBraid Insight for Life Sciences

Data Mart/Warehouse: Progress and Vision

The use of Semantic Web Technologies in Spatial Decision Support Systems

Effects of global risk in transition countries

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches

Microsoft Data Warehouse in Depth

Ganzheitliches Datenmanagement

Scalable End-User Access to Big Data HELLENIC REPUBLIC National and Kapodistrian University of Athens

Implementing Data Models and Reports with Microsoft SQL Server

Self-Service Business Intelligence

CONCEPTCLASSIFIER FOR SHAREPOINT

Semantic annotation and retrieval of services in the Cloud.

Recommending Web Pages using Item-based Collaborative Filtering Approaches

Improving Knowledge Discovery. By Combining Text-Mining (TDM) And Link-Analysis Techniques

Fluency With Information Technology CSE100/IMT100

CHAPTER SIX DATA. Business Intelligence The McGraw-Hill Companies, All Rights Reserved

Semantic EPC: Enhancing Process Modeling Using Ontologies

SPATIAL DATA CLASSIFICATION AND DATA MINING

Information Management Metamodel

Structured Content: the Key to Agile. Web Experience Management. Introduction

PONTE: A Context-Aware Approach hfor Automated Clinical Trial Protocol Design

EHR Search Engine SIBM. SJ. Darmoni, R. Lelong, C. Cabot, B. Dahamna

Systematic Engineering in Designing Architecture of Telecommunications Business Intelligence System

Data-Gov Wiki: Towards Linked Government Data

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

Jason Essig DBMS Consulting OCUG 2009 New Orleans 06 October 2009 CTMS Focus Group Session 18

Distance Learning and Examining Systems

Data Warehousing and Data Mining

HELP DESK SYSTEMS. Using CaseBased Reasoning

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data

An industry perspective on deployed semantic interoperability solutions

Microsoft Implementing Data Models and Reports with Microsoft SQL Server

Get More Value from Your Reference Data Make it Meaningful with TopBraid RDM

4.2. Topic Maps, RDF and Ontologies Basic Concepts

Transcription:

Module 16 Semantic Search

Module 16 schedule 9.45-11.00 xxx Xxx 11.00-11.15 Coffee break 11.15-12.30 xxx Xxx 12.30-14.00 14.00-16.00 Lunch Break xxx xxx

Module 16 outline Traditional approaches to search and retrieval Semantic annotation & search Overview of KIM and LifeSKIM platforms Demos #3

Traditional approaches to search and retrieval

IR models Boolean (set-theoretic) Documents and queries are represented as sets (of terms/keywords) Retrieval is based on set intersection Advantages Easy to implement Disadvantages Difficult to rank results no term weighting #5

IR models (2) Algebraic Documents and queries are represented as vectors in a multidimensional space (one dimension per term/keyword) Retrieval is based on vector similarities Cosine similarity Advantages Simple model Ranking & Term weights Disadvantages Documents with similar topic but different vocabulary are not associated #6

Precision & Recall Precision Measure of the quality of results What % of the retrieved documents are relevant to the query? Recall Measure of the completeness of results What % of the documents which are relevant to the query are retrieved? #7

Classical IR limitations Example Query Documents about a telecom companies in Europe related to John Smith from Q1 or Q2/2010 Document containing At its meeting on the 10th of May, the board of Vodafone appointed John G. Smith as CTO will not match Classical IR will fail to recognise that Vodafone is a mobile operator, and mobile operator is a type of telecom Vodafone is in the UK, which is part of Europe => Vodafone is a telecom company in Europe 5 th of May is in Q2 and John G. Smith may be the same as John Smith #8

Semantic Annotation & Search

Semantic Annotation Semantic annotation (of text) The process of linking text fragments to structured information Organisations, Places, Products, Human Genes, Diseases, Drugs, etc. Combines Text Mining (Information Extraction) with Semantic Technologies Benefits of semantic annotations Improves the text analysis process by employing Ontologies and knowledge from external Knowledge Bases / structured data sources #10

Semantic Annotation (2) Benefits of semantic annotations (cont.) Provides unambiguous (global) references for entities discovered in text Different from tagging Provide the means for semantic search Together or independently of the original text Improved data integration Documents from different data sources can share the same semantic concepts #11

Example #12

Example (2) Demo of a GATE annotated document about Asthma and chronic obstructive pulmonary disease Annotations of Genes Each annotation is linked to an ontology class Each annotation is linked to an ontology instance #13

Semantic Annotations Document indoc 5 15 start end Annotation1 about Entity1 type Class1 indoc Annotation2 about Entity2 type Class2 indoc indoc Annotation3 Annotation4 about about Entity3 type #14

Semantic Search Semantic Search In addition to the terms/keywords, explore the entity descriptions found in text Make use of the semantic relations that exist between these entities Example Query Documents about a telecom companies in Europe related to John Smith from Q1 or Q2/2010 Document containing At its meeting on the 10th of May, the board of Vodafone appointed John G. Smith as CTO will not match #15

Semantic Search (2) Classical IR will fail to recognise that Vodafone is a mobile operator, and mobile operator is a type of telecom Vodafone is in the UK, which is part of Europe => Vodafone is a telecom company in Europe 5 th of May is in Q2 John G. Smith may be the same as John Smith #16

Types of Semantic Search What semantics? Lexical semantics Named entities Factual knowledge Ontologies / taxonomies Hybrid approaches #17

Types of Semantic Search (2) Types of queries Occurrence Co-occurrence Structured queries Faceted search Pattern-matching #18

Types of Semantic Search (2) Structured queries Query entities in the Knowledge Base Very expressive and flexible Pattern queries A set of predefined structured queries where some search criteria is already pre-specified Faceted search & navigation Extracted entities are organised into facets (intelligent columns) Easy to find documents that contain information about specific types of entities #19

Ontologies for semantic search #20

#21 Structured query in KIM Show me all people who were mentioned as spokesmen in IBM

Structured query example Demo of a structured query with KIM Go to http://ln.ontotext.com Select STRUCTURE Build a query for: Persons (unspecified name) who have a Position of type Job Position (unspecified name) within an Organisation which is a Company which name starts with IBM Select Entities Documents mentioning the entities #22

Pattern query example (2) Demo of a structured query with KIM Go to http://ln.ontotext.com Select PATTERNS Build a query for: Organisations (unspecified name) located in Montreal Select Entities Documents mentioning the entities #23

Faceted search in KIM #24

Faceted search in KIM document results #25

Faceted search example Demo of a faceted navigation with KIM Go to http://ln.ontotext.com Select Facets Restrict Organisations to McGill University Restrict Locations to Montreal Select researcher from Related Entities (document results displayed on bottom of page) #26

Overview of KIM and LifeSKIM

The KIM Platform A platform offering services and infrastructure for: Automatic semantic annotation of text Text-mining and ontology population Semantic indexing and retrieval of content Query and navigation across heterogeneous text and data Based on an Information Extraction technology built on top of GATE Offers unparalleled heterogeneous querying facilities #28

KIM platform (2) Visual Interface 3rd party App Document & Metadata Aggregator or Crawler Multi-paradigm Search/Retrieval Population Service Semantic Annotation Semantic Indexing & Storing Semantic Index #29

LifeSKIM & Linked Data #30

LifeSKIM / Linked Data ETL Data Source Identification Flat files OBO files XML RDBMS RDF Special tailored transformer OBO to SKOS converter Custom XSLT RDBMS to RDF formatter RDF warehouse Reasoner Instance Mappings Semantic Annotations #31

Timelines for entity popularity in KIM Timelines for entity occurrences over some period of time Can be used & extended for sentiment analysis #32

Timelines in KIM #33

Timelines example Demo of timeline with KIM Go to http://ln.ontotext.com Select Timelines Build a monthly timeline comparing mentions of Concordia, McGill and University of Montreal Time period: max Granularity: month Based on: occurences #34