Building a Collaborative Informatics Platform for Translational Research: Prof. Yike Guo Department of Computing Imperial College London




Building a Collaborative Informatics Platform for Translational Research: An IMI Project Experience
Prof. Yike Guo, Department of Computing, Imperial College London

Living in the Era of BIG
Big Data: Massive amounts of information derived from dry/wet lab investigations, feasibility studies and clinical trials.
Big Science: Research silos are evaporating as scientific methods merge. Traditional hypothesis-testing studies will be coupled with data-driven research.
Big Collaboration: As evidence accumulates, personalized medicine will become a reality, and patient-specific cancer interventions will become available. Teams of disease specialists, researchers and bioinformaticians will work in concert in a virtual frontier.

Collaboration Platform is the Key for Doing Big Science with Big Data

New Science Economy Model: Public-Private Partnership for Research

The Requirements for PPP-based Translational Research IT
Infrastructure: hosted and shared by a community
Data management: shared, with access control configurable by the various subgroups of a community
Software: PaaS for the development community
Analysis: collaborative analysis
Knowledge management: dynamic integration of knowledge based on semantics

PPP Example : EU IMI

U-BIOPRED Project: An IMI Project
Innovative Medicines Initiative (IMI): the world's largest public-private partnership
2 billion euro: 1 billion from the EU, 1 billion from EFPIA
U-BIOPRED: Unbiased Biomarkers for predicting respiratory disease outcomes
40-party collaboration between 10 pharma companies and 30 academic medical centres
Knowledge Management Work Package: Imperial (Data Co-ordinator), JnJ (TranSMART), AZ, Roche, UCB, CNRS

UBIOPRED Knowledge Product: Handprints
1. Reaching international consensus on diagnostic criteria
2. Creating adult/pediatric cohorts and biobanks
3. Creating novel biology handprints by combining molecular, histological, clinical and patient-reported data
4. Validating such handprints in relation to exacerbations and disease progression
5. Refining the handprints by using preclinical and human exacerbation models
6. Predicting efficacy of gold-standard and novel interventions
7. Refining the diagnostic criteria and phenotypes
8. Establishing a platform for exchange, education and dissemination

A Typical Querying Process
Query (e.g. age): sub-set of «X» matched patients
Specific «Y» criteria (e.g. diagnostic): sub-set of patients
«Z» measurements originating from different partners (e.g. proteomics)
Integrated analysis methods (3 pathways)
Omics data export to analysis tool: export the omics data matrix and context to tab files, upload the exported files into the application (cloud/systems), perform the analysis, and export the analysis results into the KR
Search for connections to inflammation-regulating drugs, query for disease association, create own pathway
Use public and proprietary omics data to infer network extensions and connect disjunct networks
Select the patients who provided the samples associated with the results and check their physiologic data for significant differences
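
The querying steps on this slide (filter by a query such as age, narrow by a criterion such as diagnosis, attach measurements from partners, export a tab-separated matrix) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names, the `case`/`control` labels and the functions `select_cohort` and `export_matrix` are all hypothetical and are not part of TranSMART or any U-BIOPRED tool.

```python
import csv
import io

# Hypothetical patient records; field names and values are illustrative only.
PATIENTS = [
    {"id": "P1", "age": 34, "diagnosis": "case", "partner": "A"},
    {"id": "P2", "age": 61, "diagnosis": "case", "partner": "B"},
    {"id": "P3", "age": 47, "diagnosis": "case", "partner": "B"},
    {"id": "P4", "age": 29, "diagnosis": "control", "partner": "A"},
]

# Hypothetical omics measurements keyed by patient id (e.g. proteomics intensities).
PROTEOMICS = {
    "P1": {"prot_1": 10.2, "prot_2": 3.1},
    "P3": {"prot_1": 12.8, "prot_2": 2.7},
    "P4": {"prot_1": 6.4, "prot_2": 3.0},
}


def select_cohort(patients, min_age, max_age, diagnosis):
    """Step 1: filter by age range. Step 2: narrow by diagnostic criterion."""
    by_age = [p for p in patients if min_age <= p["age"] <= max_age]
    return [p for p in by_age if p["diagnosis"] == diagnosis]


def export_matrix(cohort, measurements):
    """Step 3: join measurements to the cohort and write a tab-separated matrix."""
    features = sorted({name for row in measurements.values() for name in row})
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["patient_id", "age", "diagnosis"] + features)
    for patient in cohort:
        row = measurements.get(patient["id"])
        if row is None:
            continue  # no measurement was contributed for this patient
        values = [row.get(name, "") for name in features]
        writer.writerow([patient["id"], patient["age"], patient["diagnosis"]] + values)
    return buffer.getvalue()


cohort = select_cohort(PATIENTS, 30, 60, "case")
table = export_matrix(cohort, PROTEOMICS)
print(table)
```

The exported text keeps the clinical context (age, diagnosis) next to the omics columns, which is what lets a downstream analysis tool trace a result back to the patients who provided the samples.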

The Challenge in Building the KM for UBIOPRED
The consortium is formed as a loosely coupled virtual organisation (VO)
Such a VO conducts highly collaborative and multidisciplinary scientific activities
Data is generated by many people in the VO, across a wide range of connected modalities
Knowledge is built through an iterative knowledge production process in which data needs to be incrementally collected, flexibly integrated, systematically analysed and interactively reported
The project has a fixed period, but the knowledge needs to be managed forever
A very big task with very little money

U-BIOPRED KM System Design Goal and KM Key Features
Cohort-based information integration to support clinically driven integrative biomarker studies
Cloud-based infrastructure to facilitate collaborative translational research
An integrated framework that supports the management of a collaborative knowledge production process (curation, analysis, content enhancement, reporting, modelling and simulation)
A sustainable environment with longevity and total ownership: workflow-based PaaS for integrative analytics (curation/ETL workbenches, analytics workbenches)
A scalable platform adaptable to future science and technology developments (such as NGS technology, medical imaging, etc.)

UBIOPRED Collaborative TR Platform
Analytical Register: Research Collaboration Management
TranSMART: Cohort-based Omics Data Management
IDBS InforSense Workflow: Analytical Engine + Development PaaS
[Architecture diagram. Components: TSmart DB, Sample Tracking, GenePattern & R, Analytical Registry, EWB for Assay Data Management, InforSense KDE Workflow. Layers beneath them: Virtual Machines, IC Cloud (Virtualisation Layer), Physical Layer (Imperial-based Data Centre)]
30/04/2012 Deploying TranSMART to UBIOPRED

UBIOPRED TR Platform Components
Content Building: Curation (ETL/Ontology)
Omics Experiment Management (ELN/LIMS)
Collaborative Study Management (Collaboration Management)
System Biology Study (Modelling and Simulation)
Study Data Management (TranSMART)
Biomarker Data Analysis (Analytical Workflow)

Curation: Building Study Contents and Background Knowledge
TranSMART-based UBIOPRED Study Data Management System

Omics Experiment Management
Omics experiments (proteomics and lipidomics experiments, WP7) and the pre-clinical study (WP6) are managed with the EWB of IDBS

Study Data Management
Study-oriented curated data, together with patient information, are integrated and warehoused in TranSMART for analysis (WP8)
From John Shon et al., TranSMART, AMIA TBI 2012

Workflow-based Analytics: A Proteomics Example
[Workflow figure; plots of mass spectra and smoothed MS peaks, with axes m/z and Intensity]
Q-TOF mass spectrometry experiment: mass spectra
MS data preprocessing: Intensity Calibration, Baseline Correction, Noise Reduction and Smoothing, Peak Detection, Peak Alignment, Intensity Normalization
m/z peak data
Peak Selection: coefficient of variation test, statistical tests (e.g., KS test, permutation test)
Selected peaks
Cross-validation: CV training set, CV testing set
Supervised learning (e.g., SVM, discriminant analysis): classification model, label prediction, model performance
Use the Monte Carlo method to randomly select other biomarker detection models and alternative parameter settings
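
To make the peak selection and cross-validation stages named on this slide concrete, here is a minimal, standard-library-only sketch. It is not the InforSense workflow. The data are synthetic, all function names are hypothetical, and a nearest-centroid classifier is swapped in for the SVM or discriminant analysis mentioned on the slide purely to keep the example short. Peak selection uses a permutation test, one of the statistical tests the slide lists.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def select_peaks(samples, labels, alpha=0.05):
    """Return indices of peaks whose intensity differs between the two classes."""
    selected = []
    for j in range(len(samples[0])):
        a = [row[j] for row, y in zip(samples, labels) if y == 0]
        b = [row[j] for row, y in zip(samples, labels) if y == 1]
        if permutation_p_value(a, b) < alpha:
            selected.append(j)
    return selected


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in for SVM)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda c: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[c])),
    )


def cross_validate(samples, labels, k=5):
    """k-fold CV; peaks are re-selected on each training set to avoid leakage."""
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(len(samples)) if i % k == fold]
        train_idx = [i for i in range(len(samples)) if i % k != fold]
        train_x = [samples[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        # Fall back to all peaks if none pass the test.
        peaks = select_peaks(train_x, train_y) or list(range(len(samples[0])))
        reduced = [[row[j] for j in peaks] for row in train_x]
        for i in test_idx:
            x = [samples[i][j] for j in peaks]
            correct += nearest_centroid_predict(reduced, train_y, x) == labels[i]
    return correct / len(samples)


# Synthetic peak intensities: peak 0 depends on the class, peaks 1 and 2 are noise.
rng = random.Random(42)
labels = [i % 2 for i in range(20)]
samples = [
    [rng.gauss(5.0 + 4.0 * y, 0.5), rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)]
    for y in labels
]
accuracy = cross_validate(samples, labels)
print("selected peaks:", select_peaks(samples, labels))
print("cross-validated accuracy:", accuracy)
```

Selecting peaks inside each fold, rather than once on the full data set, keeps the testing set out of the feature selection step, so the reported accuracy is not inflated by information leaking from the held-out samples.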

Workflows in InforSense

Workflow Deployed as Cloud Applications
[Diagram. Scientists develop, deploy and share experiments on the cloud; other scientists consume experiments on the cloud. Stack, top to bottom: Experiments, Experiment Deployment, Elastic Scheduler, OS/VM instances, Hypervisor, Physical Infrastructure]

A Development Cycle of Cloud Application Building via Workflow

Analytical Register: Enables Collaborative Knowledge Management

Mapping Research Collaboration in UBIOPRED
[Diagram entities: UBIOPRED, Work Package, Cohort, Investigation, Procedure, Researcher, Study, Organisation]
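
One way to picture the entities on this slide is as a small data model that can be queried by study, work package or cohort, mirroring the Analytical Register views. The sketch below is an assumption throughout: the slide names the entities but not their relationships, so the class fields, the helper functions and all sample values are hypothetical placeholders rather than the register's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Researcher:
    name: str
    organisation: str


@dataclass
class Investigation:
    title: str
    procedure: str
    cohort: str
    researchers: List[Researcher] = field(default_factory=list)


@dataclass
class Study:
    title: str
    work_package: str
    investigations: List[Investigation] = field(default_factory=list)


def studies_in_work_package(studies, work_package):
    """Work package view: every study that belongs to one work package."""
    return [s for s in studies if s.work_package == work_package]


def investigations_on_cohort(studies, cohort):
    """Cohort view: every investigation that uses a given cohort."""
    return [inv for s in studies for inv in s.investigations if inv.cohort == cohort]


def organisations_involved(study):
    """Study view: the organisations whose researchers contribute to a study."""
    return sorted(
        {r.organisation for inv in study.investigations for r in inv.researchers}
    )


# Placeholder data; none of these names come from the project.
researcher_a = Researcher("Researcher A", "Organisation 1")
researcher_b = Researcher("Researcher B", "Organisation 2")
studies = [
    Study("Study 1", "WP7", [
        Investigation("Investigation 1", "proteomics", "Cohort X",
                      [researcher_a, researcher_b]),
        Investigation("Investigation 2", "lipidomics", "Cohort Y", [researcher_a]),
    ]),
    Study("Study 2", "WP6", [
        Investigation("Investigation 3", "pre-clinical model", "Cohort X",
                      [researcher_b]),
    ]),
]

print([s.title for s in studies_in_work_package(studies, "WP7")])
print([inv.title for inv in investigations_on_cohort(studies, "Cohort X")])
print(organisations_involved(studies[0]))
```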

Analytical Register: A Study View

Analytical Register: Work Package View

Analytical Register: A Cohort View

Analytical Register: Interfacing to Cohort Selection

Analytical Register Cohort View

Analytical Register: An Investigation View

Analytical Register: Investigation Drill Down

Analytical Register Further Drill Down: Method

Analytical Register: Interfacing to Sample Tracking

Analytical Register: Fully Integrated with TranSMART

UBIOPRED System Biology Study 1. We have developed a p38 pathway model in SBML and performed simulations through MATLAB.

UBIOPRED System Biology Study 2. We have proposed a novel glucocorticoid signalling pathway model in SBML.

UBIOPRED System Biology Study 3. We have proposed some hypotheses to suggest possible crosstalk between the GC and p38 pathways. 4. We then constructed the integrated pathway model in SBML. 5. We tested the stability of this integrated model.
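
The slides do not say how the stability of the integrated model was tested. A common approach for ODE-based pathway models is local stability analysis: find a steady state, linearise around it, and check that every eigenvalue of the Jacobian has a negative real part. The sketch below applies that idea to a toy two-variable system. The equations, the parameter values and the variable names `x` and `y` are invented for illustration and are not the p38 or glucocorticoid models from the project.

```python
import cmath

# Toy parameters (hypothetical): x is repressed by y, y is induced by x.
PARAMS = {"s": 2.0, "K": 1.0, "decay_x": 1.0, "g": 0.5, "c": 0.5, "decay_y": 1.0}


def rhs(state, p):
    """Right-hand side of the toy ODE system dx/dt, dy/dt."""
    x, y = state
    dx = p["s"] / (1.0 + y / p["K"]) - p["decay_x"] * x
    dy = p["g"] + p["c"] * x - p["decay_y"] * y
    return (dx, dy)


def find_steady_state(p, state=(0.1, 0.1), dt=0.01, max_steps=100000, tol=1e-10):
    """Integrate with forward Euler until both derivatives are close to zero."""
    x, y = state
    for _ in range(max_steps):
        dx, dy = rhs((x, y), p)
        if abs(dx) < tol and abs(dy) < tol:
            break
        x, y = x + dt * dx, y + dt * dy
    return (x, y)


def jacobian(state, p, h=1e-6):
    """Numerical 2x2 Jacobian of rhs at state, using central differences."""
    x, y = state
    fx_plus, fx_minus = rhs((x + h, y), p), rhs((x - h, y), p)
    fy_plus, fy_minus = rhs((x, y + h), p), rhs((x, y - h), p)
    return (
        ((fx_plus[0] - fx_minus[0]) / (2 * h), (fy_plus[0] - fy_minus[0]) / (2 * h)),
        ((fx_plus[1] - fx_minus[1]) / (2 * h), (fy_plus[1] - fy_minus[1]) / (2 * h)),
    )


def eigenvalues_2x2(matrix):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)


def is_locally_stable(state, p):
    """Stable if every eigenvalue of the Jacobian has a negative real part."""
    return all(ev.real < 0 for ev in eigenvalues_2x2(jacobian(state, p)))


steady = find_steady_state(PARAMS)
eigs = eigenvalues_2x2(jacobian(steady, PARAMS))
stable = is_locally_stable(steady, PARAMS)
print("steady state:", steady)
print("eigenvalues:", eigs)
print("locally stable:", stable)
```

For a model exported from SBML, the same check would be run on the full rate equations with a numerical eigenvalue routine in place of the closed-form 2x2 formula used here.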

Extending UBIOPRED to Support Multiple Projects
[Diagram labels: Project 1 to Project n; Virtual Containers; DB Instances]

eTRIKS: European translational research and knowledge management services

Conclusion: the end of the beginning
The UBIOPRED knowledge management system supports cross-institutional, large-scale translational research
The UBIOPRED knowledge management system supports the entire knowledge production process
For PPP-based TR projects, collaborative research support in a VO environment is crucial
Modern technologies such as cloud and workflow provide the technical foundation for its implementation
Development is moving forward to support IMI translational projects in general through the eTRIKS project