Using Open Source software and Open data to support Clinical Trial Protocol design




Using Open Source software and Open data to support Clinical Trial Protocol design
Nikolaos Matskanis, Joseph Roumier, Fabrice Estiévenart
{nikolaos.matskanis, joseph.roumier, fabrice.estievenart}@cetic.be
CETIC, Centre of Excellence in Information and Communication Technologies
Med-e-Tel Conference, Luxembourg, 10th April 2014

Adoption: Free, Libre and Open Source
- Semantic web software and data have commercial and scientific support.
- Openness is useful for inter-linking.
- The medical domain shows strong adoption, e.g. the Biomedical Ontology Portal.
- Openness is useful when you need trust in data.
Requirements for our project:
- Re-use existing components (libraries, software, ...).
- Ensure that modifications published by third parties are kept open, even for SaaS.
License: Affero GPL v3

Supporting Clinical Trial Protocol Design
The project goal is to assist Clinical Trial Protocol (CTP) design using:
- Open source software
- Open linked data
We have developed open source components:
- Clinical Trial Protocol Repository
- Semantic Mapper
- Linked Data Application

Clinical Trial Protocol Repository

Ontologies integration for Search Engines
- The medical domain is split into many fields of expertise, each with its own health-related ontologies, models, coding systems, protocols, etc.
- PONTE aims to cover all the domains of clinical trial design by integrating many points of view: LOINC, KEGG Compounds & Lipids, NCI Common Terminology Criteria for Adverse Events (CTCAE) v.4, ICD-10-CM, the Animals ontology from the GO3R project, etc.
- The resulting ontology is a hierarchy of 49,500 concepts.
- It was developed in the Web Ontology Language (OWL) and translated into OBO.

Design CTP Ontology
- Based on standards (DICOM, ICD-10-CM, ChEBI, ATC, LOINC) for study/trial design
- Driven by input and feedback from medical partners
- Validated with medical experts in workshops and demo events
- Used as the backbone of the PONTE Platform
- Provides the high- and low-level structure of the CTP document
- Is linked with the eligibility criteria ontology
Ontology Metrics:

CTP Repository Architecture
- CTPRepository Web Service (open source libraries and container)
- RDF repository (OpenRDF Sesame): querying and reasoning operations
- XML database (BaseX, custom implementation): caching of XML documents
[Architecture diagram. Components: CTP Editing Interface, EHR Communication, Decision Support, CTPRepository, RDF Repository, XML Database. Data flows: CTP sections, hospitals; criteria, hospitals; patient information]

[Diagram: XML, Java Model, RDF Triples]
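The slide names three representations of a protocol (XML, a Java model, RDF triples) without showing how they relate. As a rough illustration only, the sketch below turns a small XML document into triples in Python. The element names, the `http://example.org/ctp/` namespace and the predicate names are all hypothetical; the actual PONTE schema and ontology terms are not given in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, chosen only for illustration.
BASE = "http://example.org/ctp/"

SAMPLE_XML = """
<protocol id="ctp-001">
  <title>Example study</title>
  <section name="eligibility">
    <criterion>Age over 18</criterion>
    <criterion>No prior infarction</criterion>
  </section>
</protocol>
"""


def xml_to_triples(xml_text):
    """Parse a protocol document and return (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    protocol = BASE + root.attrib["id"]
    triples = [(protocol, BASE + "title", root.findtext("title"))]
    for section in root.findall("section"):
        section_uri = protocol + "/" + section.attrib["name"]
        triples.append((protocol, BASE + "hasSection", section_uri))
        for criterion in section.findall("criterion"):
            triples.append((section_uri, BASE + "hasCriterion", criterion.text))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples; objects that are not URIs become literals."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http://"):
            obj = "<" + o + ">"
        else:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append("<" + s + "> <" + p + "> " + obj + " .")
    return "\n".join(lines)


triples = xml_to_triples(SAMPLE_XML)
print(to_ntriples(triples))
```

In a setup like the one on the architecture slide, the resulting triples would be loaded into the RDF repository for querying, while the original XML stays in the XML database.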

Semantic Mapper

Semantic Term Code Mapper
A service dedicated to mapping between different vocabularies and classification schemes.
Example:
Vocabulary   Code    Name
ICD-9-CM     41071   Subendocardial infarction, initial episode of care
ICD-10-CM    I21.4   Non-ST elevation (NSTEMI) myocardial infarction

Vocabulary Subsets
Vocabularies managed by the mapper:
- Disorders: ICD-9-CM, ICD-10-CM
- Pharmacological substances: IOPR, ChEBI, ATC
- Genders: CNR, DICOM
- Marital status: CNR, Ponte
- And others

Architecture
Mappings come from:
- manual work for small vocabularies
- mapping files (GEM: General Equivalence Mappings)
Technologies:
- MySQL relational database with abstraction layer and result caching
- SOAP service with Java implementation
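A code lookup with result caching, as described for the mapper, might look roughly like the Python sketch below. It is not the project's Java/SOAP implementation: the in-memory dictionary stands in for the MySQL tables, the function names are hypothetical, and the only real mapping row is the ICD-9-CM to ICD-10-CM example from the earlier slide.

```python
from functools import lru_cache

# In-memory stand-in for the mapping tables (assumption: the real service
# reads these rows from a relational database). The single row is the
# example pair given in the slides.
MAPPINGS = {
    ("ICD-9-CM", "41071", "ICD-10-CM"): ["I21.4"],
}


def _lookup(source_vocab, code, target_vocab):
    """Uncached lookup; returns a tuple because one code may map to several."""
    return tuple(MAPPINGS.get((source_vocab, code, target_vocab), ()))


@lru_cache(maxsize=1024)
def map_code(source_vocab, code, target_vocab):
    """Cached lookup: repeated requests for the same code skip the backend."""
    return _lookup(source_vocab, code, target_vocab)


print(map_code("ICD-9-CM", "41071", "ICD-10-CM"))
print(map_code("ICD-9-CM", "41071", "ICD-10-CM"))  # served from the cache
print(map_code.cache_info().hits)
```

Returning a tuple rather than a single code leaves room for one-to-many mappings, which equivalence mapping files can contain.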

Linked Data Application

Linked Open Data
- Open and freely available data
- The value of data increases the more it is interlinked with other data
- RDF to structure the data
- HTTP URIs to publish it
- Semantic references such as owl:sameAs to semantically associate and link resources
Linked Data benefits:
- By following the links, humans can browse and search engines can search/crawl
- Query traversal: extend a query by following links in the results
- Gather and aggregate results over distributed data sources
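Query traversal over owl:sameAs links can be illustrated with a small in-memory Python sketch. The two "sources", their URIs and the `usedIn` predicate are invented for the example; a real application would dereference the URIs or query remote endpoints instead. For simplicity the sketch follows owl:sameAs only in the direction it is stated.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Two hypothetical data sources, each a list of (subject, predicate, object).
SOURCES = {
    "drugs": [
        ("http://example.org/drugs/D1", RDFS_LABEL, "liothyronine"),
        ("http://example.org/drugs/D1", OWL_SAME_AS,
         "http://example.org/trials/intervention/42"),
    ],
    "trials": [
        ("http://example.org/trials/intervention/42",
         "http://example.org/trials/usedIn",
         "http://example.org/trials/study/7"),
    ],
}


def describe(uri):
    """Collect every triple about `uri` from all sources."""
    return [t for triples in SOURCES.values() for t in triples if t[0] == uri]


def follow_same_as(start_uri):
    """Aggregate triples about a resource and everything it is owl:sameAs."""
    seen, queue, collected = set(), [start_uri], []
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in describe(uri):
            if p == OWL_SAME_AS:
                queue.append(o)  # extend the search to the linked resource
            else:
                collected.append((s, p, o))
    return collected


for triple in follow_same_as("http://example.org/drugs/D1"):
    print(triple)
```

Starting from the drug resource, the traversal returns both its label from the first source and the trial link from the second, which is the aggregation effect the slide describes.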

The Linked Open Data Cloud
Linked Open Data communities:
- Governments, media, academic institutes
- Medical and life sciences:
  - Clinical research (PubMed, clinicaltrials.gov, GeneOntology)
  - Disease (Diseasome)
  - Drug (DrugBank, DailyMed, KEGG, SIDER)

Consuming Linked Data: State of the Art
- The Bio2RDF project has created a framework for on-demand data mash-ups.
- The FedBench project provides a benchmark framework for analysing the efficiency and performance of Linked Data querying.
- SQUIN: an engine and model for traversal-based query execution over Linked Data.
- SPARQLeR: designed for finding semantic associations in RDF bases.
- PHP and JavaScript libraries for Linked Data mash-ups: ARC2, Graphite, EasyRDF, Moriarty.
- Services for publishing linked data (Virtuoso).

Linked Data Application
- The LDApp interface allows the clinician to enter a question from one of four main perspectives: Disease, Drug, Target and Clinical Trial.
- Provides mechanisms to query these LOD sources.
- Offers query expansion across multiple sources and navigation through them.
- Aggregates the retrieved information.

Application User Interface.

How it works

    SELECT DISTINCT ?d1 ?l1 ?i
    WHERE {
      ?i a drugbank:drug_interactions .
      ?i drugbank:interactiondrug1 ?d1 .
      ?i drugbank:interactiondrug2 ?d2 .
      ?d1 rdfs:label ?l1 .
      ?d2 rdfs:label ?l2 .
      FILTER (?l1 = "liothyronine" || ?l2 = "liothyronine")
    }
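To show how an application could generate the interaction query on this slide for any drug name, here is a minimal Python sketch. Several parts are assumptions: the endpoint URL and the `drugbank:` namespace URI are placeholders, the function names are hypothetical, and the `||` joining the two label tests is assumed because the operator is missing from the transcript. Only the `rdfs:` namespace and the `query` parameter of the SPARQL protocol are standard.

```python
from urllib.parse import urlencode

# Placeholder endpoint and vocabulary namespace (assumptions, not real URLs).
ENDPOINT = "http://example.org/drugbank/sparql"
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX drugbank: <http://example.org/drugbank/vocab/>\n"
)


def build_interaction_query(drug_label):
    """Return the interaction query for one drug label, safely quoted."""
    escaped = drug_label.replace("\\", "\\\\").replace('"', '\\"')
    literal = '"' + escaped + '"'
    return (
        PREFIXES
        + "SELECT DISTINCT ?d1 ?l1 ?i WHERE {\n"
        + "  ?i a drugbank:drug_interactions .\n"
        + "  ?i drugbank:interactiondrug1 ?d1 .\n"
        + "  ?i drugbank:interactiondrug2 ?d2 .\n"
        + "  ?d1 rdfs:label ?l1 .\n"
        + "  ?d2 rdfs:label ?l2 .\n"
        # Assumed operator: match the drug on either side of the interaction.
        + "  FILTER (?l1 = " + literal + " || ?l2 = " + literal + ")\n"
        + "}"
    )


def build_request_url(endpoint, query):
    """Encode the query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


query = build_interaction_query("liothyronine")
print(query)
print(build_request_url(ENDPOINT, query))
```

Escaping the label before embedding it keeps a user-entered drug name from breaking out of the string literal in the query.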

Evaluation
Evaluation workshop with medical partners:
- Expansion on trial queries: results from 2 data sources; most results were characterised as relevant.
- Expansion on disease and drug targets: results from up to 3 data sources; results were relevant.
- Aggregation of linked drugs and trials was very helpful.
Response times for queries:
- Queries to DrugBank and LinkedCT: 5 to 10 seconds.
- Expansions: same as single-source searches.
- Retrieval of linked instances is usually very fast.

Questions