IDRT: Integration and Maintenance of Medical Terminologies in i2b2



Similar documents
IDRT: Platform Architecture And Tools to Support The Re-use of Routine Clinical Data For Research

The Integrated Data Repository Toolkit (IDRT)

The Yosemite Project A Roadmap for Healthcare Information Interoperability

The Yosemite Project for Healthcare Information Interoperability

Meaningful Use Stage 2 Update: Deploying SNOMED CT to provide decision support in the EHR

Why RDF for Healthcare Interoperability?

Bringing Semantics to the i2b2 Framework

From Fishing to Attracting Chicks

TRANSFoRm: Vision of a learning healthcare system

Meaningful use. Meaningful data. Meaningful care. The 3M Healthcare Data Dictionary: Enabling effective health information exchange

Talend Metadata Manager. Reduce Risk and Friction in your Information Supply Chain

Using Open Source software and Open data to support Clinical Trial Protocol design

Find the signal in the noise

Meaningful use. Meaningful data. Meaningful care. The 3M Healthcare Data Dictionary: Standardizing lab data to LOINC for meaningful use

What is the Certified Health Record Analyst (CHDA)?

PharmaSUG Paper CD13

Environmental Health Science. Brian S. Schwartz, MD, MS

IBM Watson and Medical Records Text Analytics HIMSS Presentation

Daniel Diekmann ID Information und Dokumentation im Gesundheitswesen Leiter Geschäftsbereich Medizin

Use of the Research Patient Data Registry at Partners Healthcare, Boston

Oracle Data Integrator 12c: Integration and Administration

Programa de Actualización Profesional ACTI Oracle Database 11g: SQL Tuning Workshop

Meaningful Use Stage 2 Certification: A Guide for EHR Product Managers

Adam Rauch Partner, LabKey Software Extending LabKey Server Part 1: Retrieving and Presenting Data

Smart Dataset-XML Viewer: Web Services

Health Data Analysis Specialty Track Curriculum Competencies

Employing SNOMED CT and LOINC to make EHR data sensible and interoperable for clinical research

Rational Reporting. Module 3: IBM Rational Insight and IBM Cognos Data Manager

Electronic Health Record. Standards, Coding Systems, Frameworks, and Infrastructures

The Development of the Clinical Trial Ontology to standardize dissemination of clinical trial data. Ravi Shankar

Udo Siegmann member of e3c, CDISC Sen. Dir. Acc. Management PAREXEL

Automate Data Integration Processes for Pharmaceutical Data Warehouse

Report and Dashboard Template User Guide

What is your level of coding experience?

Problem-Centered Care Delivery

Data Management for Biobanks

Understanding Diagnosis Assignment from Billing Systems Relative to Electronic Health Records for Clinical Research Cohort Identification

Master Data Services. SQL Server 2012 Books Online

Data Integration with Talend Open Studio Robert A. Nisbet, Ph.D.

Oracle Data Integrator: Administration and Development

CREATING AND APPLYING KNOWLEDGE IN ELECTRONIC HEALTH RECORD SYSTEMS. Prof Brendan Delaney, King s College London

Classifying Adverse Events From Clinical Trials

Community Edition. Master Data Management 3.X. Administrator Guide

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Health Care Information System Standards

Microsoft Data Warehouse in Depth

AN INTEGRATION APPROACH FOR THE STATISTICAL INFORMATION SYSTEM OF ISTAT USING SDMX STANDARDS

SAP BODS - BUSINESS OBJECTS DATA SERVICES 4.0 amron

COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Implementing a Data Warehouse with Microsoft SQL Server

Practical Implementation of a Bridge between Legacy EHR System and a Clinical Research Environment

3M Health Information Systems

THE STIMULUS AND STANDARDS. John D. Halamka MD

Table of Contents Chapter 1 - Getting Started with Oracle Data Relationship Management (DRM) 1

Development of an open metadata schema for Prospective Clinical Research (openpcr)

Pediatric Research Database

Data Management for Large Studies Robert R. Kelley, PhD. Thursday, September 27, 2012

CODING. Neighborhood Health Plan 1 Provider Payment Guidelines

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya

Data processing goes big

Clinical and research data integration: the i2b2 FSM experience

Implementing a Data Warehouse with Microsoft SQL Server

How to extract transform and load observational data?

«How we did it» Implementing CDISC LAB, ODM and SDTM in a Clinical Data Capture and Management System:

Theodoros. N. Arvanitis, RT, DPhil, CEng, MIET, MIEEE, AMIA, FRSM

Oracle Data Integrator 11g: Integration and Administration

Software Architecture Document

BRIDGing CDASH to SAS: How Harmonizing Clinical Trial and Healthcare Standards May Impact SAS Users Clinton W. Brownley, Cupertino, CA

Cerner i2b2 User s s Guide and Frequently Asked Questions. v1.3

Designing ETL Tools to Feed a Data Warehouse Based on Electronic Healthcare Record Infrastructure

Electronic Submission of Regulatory Information, and Creating an Electronic Platform for Enhanced Information Management

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days

Practical application of SAS Clinical Data Integration Server for conversion to SDTM data

SNOMED CT in the EHR Present and Future. with the patient at the heart

SAP HANA Live & SAP BW Data Integration A Case Study

ICD-9-CM to MedDRA Mapping How Well Do the. Disclaimer

Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5.

D83167 Oracle Data Integrator 12c: Integration and Administration

Implementing a Data Warehouse with Microsoft SQL Server

Data Domain Discovery in Test Data Management

The i2b2 Hive and the Clinical Research Chart

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (

Implementing a Data Warehouse with Microsoft SQL Server 2012

Managing DICOM Image Metadata with Desktop Operating Systems Native User Interface

Standards and their role in Healthcare ICT Strategy. 10th Annual Public Sector IT Conference

Data Standards and the National Cardiovascular Research Infrastructure (NCRI)

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Get More Value from Your Reference Data Make it Meaningful with TopBraid RDM

Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS

OLAP Cube Manual deployment and Error resolution with limited licenses and Config keys

D3.1.1 Initial Overall PONTE Architecture - Interface definition and Component design

Transcription:

1. European i2b2 Academic User Group Meeting IDRT: Integration and Maintenance of Medical Terminologies in i2b2 TMF-funded project "Integrated Data Repository Toolkit Matthias Löbe, Sebastian Stäubert, Alfred Winter Institute for Medical Informatics (IMISE), Universität Leipzig

2010 i2b2 Prototype Building a research DW is a strategic aim Strategic like long-range effort Not so strategic in defining milestones and success criteria Business data warehouse already exists (since 2006) Used for financial controlling Not accessible for researchers Data islands What do we want to do with a research warehouse? Center for clinical trials (2010): facilitate patient recruitment! Find eligible subjects by querying for inclusion criteria Or at least support cohort estimation by retrospective queries 6-month evaluation using i2b2 Nice GUI, easy to understand, scales to large datasets Problems with getting data into it => IDRT project I vegota mission for thetwoof you Seite 2

University Hospital Leipzig Big Picture 3LGM-Leipzig EHR data Clinical Trials Unit Long-term Cohorts Raiseyourhandifyoucan find the Data Warehouse! Diseasespecific Registries Seite 3

Motivation Research database will store all research data untouched and versioned Data warehouse will provide an integrated, up-to-date view on research-specific hypotheses (To-be-developed) Metadata Repository will store complex semantic annotations on data elements Seite 4

Why Standard Terminologies? Research data are only interpretable by those who made them In our use case, metadata are links to standard terminologies Seite 5

Typical Domains and Terminologies Patient demographics Diagnoses and observations Procedures and surgeries Laboratory Drugs and medication Biobank samples Encounters and admissions Billing and cost Patient reported data AOD AOT BioC CBO CCS CDC CDISC CDT COH COSTAR CRCH CSP CST CTCAE CTEP DCP DICOM DTP DXP ELC FDA GDRG GO HCPCS HL7V3.0 HUGO ICD10AE ICD10 ICD9CM ICDO ICH ICPC2ICD10ENG ICPC JAX KEGG LNC LNC MBD MCM MDBCAC MDR MEDLINEPLUS MED MGED MSH MTHFDA MTHHH MTHICD9 MTHICPC2ICD107B MTHICPC2ICD10AE MTHMST MTHSPL MTH NCBI NCI-GLOSS NCI-HL7 NCIMTH NCISEER NCI NDFRT OMIM OPS PDQ PMA PNDS QMR RADLEX RAM RENI RXNORM SNOMEDCT SPN SRC TNM UCUM UMD USPMG UWDA VANDF ZFIN Data elements in clinical research as well as in patient care often use codes from controlled vocabularies to define their permissible values i2b2-included terminologies are not used in Germany Seite 6

IDRT Architecture Loading Concept Dimension Table Observation Fact Table Patient Dimension Table i2b2 Transformation Target Metadata Mapping Talend Open Studio Oracle Database Modelling Staging Standard Metadata Source Metadata Source Fact Data Oracle Database PIDgen Extraction Individual extractors for each terminology XML CSV... Configurable extractors for each type of datasource ODM SQL CSV Talend Open Studio Java Datasources ICD-10, OPS, ICD-O, TNM 21, DRG, LOINC... CDMS EHR 21 Standard Terminologies Operative Systems Seite 7

A Clinical Form in Real Life Data Entry on Case Report Form in OpenClinica Attribute: an abstract concept which describes a object s characteristic Item Group CRF Can be part of a medical classification Value: a value that is being assigned to an attribute Example: true, 38, 176 Attribute 38 176 Value Data Element Source: S. Mate Seite 8

i2b2 Web Client Hierarchy for navigation and content selection Folder Leaf Seite 9

i2b2 Star Schema Entity-Attribute-Value Seite 10

i2b2 Ontology Hierarchy for navigation and content selection Hierarchy Clinical Data Concepts Subjects Seite 11

IDRT: Realization of integration Job in Talend Open Studio Mission: Import wide-spread medical terminologies for diagnosis, procedures, laboratory observations and administrative code systems like diagnosis-related groups Results: Job in Talend Open Studio generating i2b2 ontologies: 21 code lists German DRG ICD-10-GM OPS LOINC TNM ICD-O Seite 12

Example 1: LOINC import Step (1): Using the normative files from Regenstrief as input Only three prerequisites: 1. LOINC_Root: a manual-created file containing the root node for the LOINC hierarchy 2. LOINC_MH-Input: a CSV file containing the LOINC multi-axial hierarchy 3. LOINC_CONCEPT-Input: a CSV file containing the actual LOINC concepts The only Talend component used is the mighty tmap Seite 13

Example 1: LOINC import Step (2): Inside the tmap joining concepts and hierarchy Seite 14

Example 2: ICD-O-3 import Step (1): Transform from ClaML into a simple XML Normative ClaML (Classification Markup Language) input file XSL-Stylesheets for generating i2b2 folder and leafs XML intermediate format Seite 15

Example 2: ICD-O-3 import Step (2): Map simple XML to i2b2 database columns XML intermediate format (continued) Schema mapping to i2b2 Seite 16

Finally: Writing to Oracle Seite 17

But users won t see much of this Job is encapsulated in an dedicated import tool Import is done with just one click Only terminologies with corresponding factual data are imported Seite 18

Results and Discussion All terminologies were imported successfully i2b2 has only simple needs (Concept: label + code, Concepts relate to each other <broader> or <narrower>) Talend Open Studio is fast and flexible enough to do the job Import (125.000 concepts) takes one minute on our server and about 20 seconds on a subnotebook Problems Modifier: Primary/secondary diagnosis is not distinguished Versions: Most terminologies are updated once or twice a year, codes retire and reincarnate Format: Some terminologies come as colored Excel files with textual annotations, requiring a manual cleaning Seite 19

Outlook More terminologies will be imported at request Planning for IDRT2: Fine-grained mapping capabilities (Göttingen i2b2 ontology editor) Connection to the National Metadata Repository (reading i2b2 ontologies from harmonized datasets) Seite 20