Why RDF for Healthcare Interoperability?



Similar documents
The Yosemite Project for Healthcare Information Interoperability

The Yosemite Project A Roadmap for Healthcare Information Interoperability

Comparing the Yosemite Project and ONC Roadmaps for Healthcare Information Interoperability

IDRT: Integration and Maintenance of Medical Terminologies in i2b2

Semantic Interoperability

Network Graph Databases, RDF, SPARQL, and SNA

IBM Watson and Medical Records Text Analytics HIMSS Presentation

LINKED OPEN DRUG DATA FROM THE HEALTH INSURANCE FUND OF MACEDONIA

RDF Resource Description Framework

SMART Apps. Rob Tweed M/Gateway Developments

12 The Semantic Web and RDF

Linked Statistical Data Analysis

Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo

Terminology Services in Support of Healthcare Interoperability

Open Source egovernment Reference Architecture Osera.modeldriven.org. Copyright 2006 Data Access Technologies, Inc. Slide 1

How To Build A Cloud Based Intelligence System

DISCOVERING RESUME INFORMATION USING LINKED DATA

A.I. in health informatics lecture 5 standards & ontolologies

We have big data, but we need big knowledge

HL7 V2 Implementation Guide Authoring Tool Proposal

Health Care Information System Standards

HL7 FHIR & IHE MHD yet more choices

3M Health Information Systems

Setting the World on FHIR

Taming Big Data Variety with Semantic Graph Databases. Evren Sirin CTO Complexible

SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA

Relational Database Basics Review

Open Data Integration Using SPARQL and SPIN

Using the MetaMap Java API

HL7 & HL7 CDA: The Implementation of Thailand s Healthcare Messaging Exchange Standards Nawanan Theera-Ampornpunt, M.D., Ph.D.

STAR Semantic Technologies for Archaeological Resources.

To define and explain different learning styles and learning strategies.

MarkLogic Semantics in Healthcare and Life Sciences for LIDER COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

OSLC Primer Learning the concepts of OSLC

Department of Defense. Enterprise Information Warehouse/Web (EIW) Using standards to Federate and Integrate Domains at DOD

Experiences from a Large Scale Ontology-Based Application Development

Creating Knowledge From Interlinked Data From Big Data to Smart Data

How To Use An Orgode Database With A Graph Graph (Robert Kramer)

FHIM Model Content Overview

Healthcare Terminologies and Classifications:

An industry perspective on deployed semantic interoperability solutions

RDF Support in Oracle Oracle USA Inc.

Heterogeneous databases mediation

A generic approach for data integration using RDF, OWL and XML

ON DEMAND ACCESS TO BIG DATA. Peter Haase fluid Operations AG

Integration Information Model

Standardised and Flexible Health Data Management with an Archetype Driven EHR System (EHRflex)

Using Big Data in Healthcare

Principal MDM Components and Capabilities

A Repository of Semantic Open EHR Archetypes

Lift your data hands on session

Role of EMRs in Patient Registries & Other Post- Marketing Research

What s next for openehr. Sam Heard Thomas Beale

Theodoros. N. Arvanitis, RT, DPhil, CEng, MIET, MIEEE, AMIA, FRSM

Industry 4.0 and Big Data

DLDB: Extending Relational Databases to Support Semantic Web Queries

A Framework for Collaborative Project Planning Using Semantic Web Technology

Electronic Health Record (EHR) Standards Survey

> Semantic Web Use Cases and Case Studies

Semantic Web Success Story

SMART-on-FHIR Genomics: Enabling Precision Medicine by Bridging Clinical and Genomic Information

VistA Evolution Presentation to World Vista

Getting Started Guide

Siemens Future HANNOVER MESSE Internet of Things and Services Guido Stephan

Multi-Agent Based Semantic Web Service Model and Ontological Description for Medical Health Planning

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

BUSINESS VALUE OF SEMANTIC TECHNOLOGY

Designing a Semantic Repository

13 RDFS and SPARQL. Internet Technology. MSc in Communication Sciences Program in Technologies for Human Communication.

XML for Manufacturing Systems Integration

SNOMED-CT

The Semantic Web for Application Developers. Oracle New England Development Center Zhe Wu, Ph.D. 1

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Data Normalization Reconsidered

Meaningful use. Meaningful data. Meaningful care. The 3M Healthcare Data Dictionary: Enabling effective health information exchange

Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing. October 29th, 2015

Big Data Analytics. Rasoul Karimi

Semantic Web based e-learning System for Sports Domain

GetLOD - Linked Open Data and Spatial Data Infrastructures

Connecticut Department of Public Health Electronic Laboratory Reporting HL7 v2.5.1 Message Validation Tool User Guide

Transcription:

Why RDF for Healthcare Interoperability? Part 2 of Yosemite Series David Booth, Ph.D. HRG / Rancho BioSciences david@dbooth.org 23-Jul-2015 Latest version of these slides: http://yosemiteproject.org/2015/webinars/why-rdf/

About the Speaker David Booth, PhD, is a senior software architect at Hawaii Resource Group and at Rancho BioSciences, using Semantic Web technology to make clinical healthcare data interoperable between diverse systems. He previously worked at KnowMED, using Semantic Web technology for healthcare quality-of-care and clinical outcomes measurement, and at PanGenX, applying Semantic Web technology to genomics in support of personalized medicine. Before that he worked on Cleveland Clinic's SemanticDB project, which uses RDF and other semantic technologies to perform cardiovascular research. Prior to that was a software architect at HP Software. He was also a W3C Fellow from 2002 to 2005, where he worked on Web Services standards before becoming involved in Semantic Web technology. He holds a Ph.D. in Computer Science from UCLA. 2

3

4

Outline What is RDF? Why RDF (in general)? Why RDF as a universal information representation for healthcare? 5

What is RDF? "Resource Description Framework" But think "Reusable Data Framework" Language for representing information International standard by W3C Mature: 10+ years Used in many domains, including biomedical and pharma 6

RDF Assertions (a/k/a "Triples") PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 7

RDF Assertions (a/k/a "Triples") Subject Predicate or Property Object or Value PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 8

RDF Assertions (a/k/a "Triples") Equivalent English sentence: Patient319 has full name "John Doe". PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 9

RDF Assertions (a/k/a "Triples") Equivalent English sentence: Patient319 has a systolic blood PREFIX ex: <http://.../data/> pressure observation obs_001. PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 10

RDF Assertions (a/k/a "Triples") PREFIX ex: <http://.../data/> Equivalent English sentence: PREFIX v: <http://.../vocab/> has. a value of 120. ex:patient319 v:fullname Obs_001 "John Doe" ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 11

RDF Assertions (a/k/a "Triples") PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullnameequivalent "John Doe". English sentence: has units ex:patient319 v:systolicbpobs_001 ex:obs_001. of mmhg. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. 12

RDF Assertions (a/k/a "Triples") PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. ex:obs_001 v:units v:mmhg. Sets of assertions form an RDF graph... 13

RDF Graph PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. RDF graph ex:obs_001 v:units v:mmhg. 14

RDF Graph PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. RDF graph ex:obs_001 v:units v:mmhg. 15

RDF Graph PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. RDF graph ex:obs_001 v:units v:mmhg. 16

RDF Graph PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. RDF graph ex:obs_001 v:units v:mmhg. 17

RDF Graph PREFIX ex: <http://.../data/> PREFIX v: <http://.../vocab/> ex:patient319 v:fullname "John Doe". ex:patient319 v:systolicbp ex:obs_001. ex:obs_001 v:value 120. RDF graph ex:obs_001 v:units v:mmhg. 18

Why RDF (in general)? #5: RDF is self describing RDF uses URIs as identifiers #4: RDF is easy to map from other data representations RDF data is made of assertions #3: RDF captures information not syntax RDF is format independent #2: Multiple data models and vocabularies can be easily combined and interrelated RDF is multi-schema friendly #1: RDF enables smarter data use and automated data translation RDF enables inference 19

#5: RDF is self describing Uses URIs as identifiers http://www.drugbank.ca/drugs/db00945 20

#5: RDF is self describing Uses URIs as identifiers http://www.drugbank.ca/drugs/db00945 drugbank: DB00945 Often abbreviated in RDF: PREFIX drugbank: <http://www.drugbank.ca/drugs/> drugbank:db00945.... 21

#5: RDF is self describing Uses URIs as identifiers http://www.drugbank.ca/drugs/db00945 22

Why is this important? Terms, data models, vocabularies, etc., can be linked to definitions Definition can be found by any party Reduces ambiguity Aids in bootstrapping new terms toward standardization Supports standards and innovation 23

#4: RDF is easy to map from other data representations RDF is made up of lots of small, atomic statements, called assertions or triples Easy to represent any data Easy to incorporate any data model Hierarchical, relational, graph, etc. 24

Hierarchical data model in RDF 25

Relational data model in RDF People Addresses ID fname addr ID City State 7 Bob 18 18 Concord NH 8 Sue 19 19 Boston MA See W3C Direct Mapping of Relational Data to RDF: http://www.w3.org/tr/rdb-direct-mapping/ 26

Why does this matter? Easy to map any data format to RDF E.g., XML, JSON, CSV, SQL tables, etc. Enables RDF to act as a universal information representation 27

#3: RDF captures information not syntax RDF is format independent There are multiple RDF syntaxes: Turtle, N-Triples, JSON-LD, RDF/XML, etc. The same information can be written in different formats Any data format can be mapped to RDF 28

Different source formats, same RDF HL7 v2.x OBX 1 CE 3727 0^BPsystolic, sitting 120 mmhg Maps to FHIR <Observation xmlns="http://hl7.org/fhir"> <system value="http://loinc.org"/> <code value="3727 0"/> <display value="bpsystolic, sitting"/> <value value="120"/> <units value="mmhg"/> </Observation> Maps to RDF graph 29

Why does this matter? Emphasis is on the meaning (where it should be) RDF acts as a universal information representation Different formats can be exchanged with the same meaning 30

RDF as a universal information representation RDF FHIR <Observation...> <system value="http://loinc.org"/> <code value="3727 0"/>... </Observation> HL7 v2.x OBX 1 CE 3727 0^BPsystolic, sitting 120 mmhg 31

Why does this matter? Helps avoid the bike shed effect in standards, a/k/a Parkinson's Law of Triviality Standards committees often spend hours arguing over syntax and naming -- irrelevant to computable information content 32

Bike shed effect a/k/a Parkinson's Law of Triviality Organizations spend disproportionate time on trivial issues. -- C.N. Parkinson, 1957 1. Nuclear Plant 2. Bike Shed Cost: $28,000,000 Discussion: 2.5 minutes Cost: $1,000 Discussion: 45 minutes 33

@@ TODO: Add slide showing two computers thinking RDF @@ TODO: Maybe also add a slide showing how data can be added without breaking clients 34

#2: Multiple data models and vocabularies can be easily combined and interrelated RDF is multi-schema friendly* Multiple data models/schemas and vocabularies can peacefully co-exist, semantically connected *A/k/a schema-promiscuous, schema-flexible, schema-less, etc. 35

Multi-schema friendly Green Model Red Model HomePhone Town ZipPlus4 FullName Country Blue Model Address FirstName LastName Email hasfirst sameas haslast City ZipCode subclassof Multiple models peacefully co-exist 36

Multi-schema friendly Blue app sees Blue model Green Model Red Model HomePhone Town ZipPlus4 FullName Country City Blue Model Address FirstName LastName ZipCode 37 Email

Multi-schema friendly Red app sees Red model Green Model Red Model HomePhone Town ZipPlus4 FullName Country City Blue Model Address FirstName LastName ZipCode 38 Email

Multi-schema friendly Green app sees Green model Green Model Red Model HomePhone Town ZipPlus4 FullName Country City Blue Model Address FirstName LastName ZipCode 39 Email

Why is this important? Different formats, data models and vocabularies can be: used together harmoniously semantically linked New ones (or new versions) can be gracefully incorporated Healthcare vocabularies are revised ~3-8% per year 40

Different views for different systems 41

Different views for different systems 42

Different views for different systems 43

#1: RDF enables smarter data use and automated data translation RDF enables inference Inference derives new assertions from old "Entailments" Query for v:heartvalve surgeries can find v:mitralvalve surgeries 44

Inference example If you know:?x a v:mitralvalve. v:mitralvalve rdfs:subclassof v:heartvalve. 45

Inference example Infer If you know:?x a v:mitralvalve. v:mitralvalve rdfs:subclassof v:heartvalve. You can infer:?x a v:heartvalve. 46

Inference example: sameas Green Model Red Model HomePhone Town ZipPlus4 FullName Country Blue Model Address FirstName LastName hasfirst sameas haslast City ZipCode subclassof If you know: Town You can infer: City (or vice versa) 47 Email

Inference example: composition Green Model Red Model HomePhone Town ZipPlus4 FullName Country Blue Model Address FirstName LastName hasfirst sameas haslast City ZipCode subclassof If you know: FirstName + LastName You can infer: FullName But not necessarily vice versa 48 Email

Why is this important? Smarter data use Query for v:heartvalve surgeries can find v:mitralvalve surgeries Automated data translation Red Model data + Blue Model data => Green Model data 49

Translation as inference 50

Translation as inference Translate 51

Why RDF as a universal information representation for healthcare? 52

53

Standard Vocabularies in UMLS AIR ALT AOD AOT BI CCC CCPSS CCS CDT CHV COSTAR CPM CPT CPTSP CSP CST DDB DMDICD10 DMDUMD DSM3R DSM4 DXP FMA HCDT HCPCS HCPT HL7V2.5 HL7V3.0 HLREL ICD10 ICD10AE ICD10AM ICD10AMAE ICD10CM ICD10DUT ICD10PCS ICD9CM ICF ICF-CY ICPC ICPC2EDUT ICPC2EENG ICPC2ICD10DUT ICPC2ICD10ENG ICPC2P ICPCBAQ ICPCDAN ICPCDUT ICPCFIN ICPCFRE ICPCGER ICPCHEB ICPCHUN ICPCITA ICPCNOR ICPCPOR ICPCSPA ICPCSWE JABL KCD5 LCH LNC_AD8 LNC_MDS30 MCM MEDLINEPLUS MSHCZE MSHDUT MSHFIN MSHFRE MSHGER MSHITA MSHJPN MSHLAV MSHNOR MSHPOL MSHPOR MSHRUS MSHSCR MSHSPA MSHSWE MTH MTHCH MTHHH MTHICD9 MTHICPC2EAE MTHICPC2ICD10AE MTHMST MTHMSTFRE MTHMSTITA NAN NCISEER NIC NOC OMS PCDS PDQ PNDS PPAC PSY QMR RAM RCD RCDAE RCDSA RCDSY SNM SNMI SOP SPN SRC TKMT ULT UMD USPMG UWDA WHO WHOFRE WHOGER WHOPOR WHOSPA Over 100! 54

Each standard is an island 55

How RDF helps standardize the standards Meaning of each standard can be captured in RDF (& family) RDF as a universal information representation Encourages clarity and consistency Enables semantic linkage between standards Allows distributed extensibility and late linkage Facilitates standards convergence 56

Bridging healthcare standards 57

Bridging healthcare standards 58

Summary: Why RDF? Standardizing the Standards How RDF helps: Captures information content, regardless of data format Allows multiple data models to co-exist, interlinked Enables linkage between standards Facilitates standards convergence 59

Summary: Why RDF? Crowdsourcing Translations How RDF helps: Facilitates automated data translation, for: Legacy data formats and models Versioning standards Diverse use cases Common basis for crowdsourcing translation rules 60

Summary: Why RDF? The "best available candidate" for a universal information representation See Yosemite Manifesto 61

Upcoming webinars Aug 6 - drugdocs: Using RDF to produce one coherent, definitive dataset about drugs, Conor Dowling, Caregraf Part 3 of Yosemite Series Sept 3 - Linked VistA: VA Linked Data Approach to Semantic Interoperability, Rafael Richards, Veterans Affairs - Part 4 of Yosemite Series Sept 17 - Clinical data in FHIR RDF: Intro and Representation, Josh Mandel, Children's Hospital Informatics Program at Harvard-MIT, and David Booth, HRG - Part 5 of Yosemite Series Others to be announced 62

Join the announcements list http://yosemiteproject.org/ 63

Join the announcements list http://yosemiteproject.org/ 64

Questions? 65

BACKUP SLIDES 66

De jure versus de facto standards De facto standards evolve faster than de jure standards RDF supports both 67