Data Integration and Decision-Making For Biomarkers Discovery, Validation and Evaluation. D. POLVERARI, CTO October 06-07 2008

Similar documents
An EVIDENCE-ENHANCED HEALTHCARE ECOSYSTEM for Cancer: I/T perspectives

Data and beyond

BIOINFORMATICS Supporting competencies for the pharma industry

Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences

Analance Data Integration Technical Whitepaper

Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR)

Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov

Improve Cooperation in R&D. Catalyze Drug Repositioning. Optimize Clinical Trials. Respect Information Governance and Security

TIBCO Spotfire Helps Organon Bridge the Data Gap Between Basic Research and Clinical Trials

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Business Process Management & Workflow Solutions

A leader in the development and application of information technology to prevent and treat disease.

> Semantic Web Use Cases and Case Studies

Big Data and the Data Lake. February 2015

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

Delivering the power of the world s most successful genomics platform

ORACLE HEALTH SCIENCES INFORM ADVANCED MOLECULAR ANALYTICS

Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN):

Integrating Bioinformatics, Medical Sciences and Drug Discovery

Designing Dashboards and Scorecards for End-User Needs. Jim Hadley

Analance Data Integration Technical Whitepaper

Toronto 26 th SAP BI. Leap Forward with SAP

The main problems of business

Big Data Trends A Basis for Personalized Medicine

ProteinQuest user guide

Luncheon Webinar Series May 13, 2013

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

Find the signal in the noise

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

<Insert Picture Here> The Evolution Of Clinical Data Warehousing

Task definition PROJECT SCENARIOS. The comprehensive approach to data integration

IO Informatics The Sentient Suite

Data Warehouse Architecture for Financial Institutes to Become Robust Integrated Core Financial System using BUID

Leading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

How To Choose A Business Intelligence Toolkit

Dr Alexander Henzing

SAP Healthcare Analytics Solutions Provide physicians and researchers access to patient data from various systems in realtime

DATA QUERY: ADVANCED DATA MODELLING

ON DEMAND ACCESS TO BIG DATA THROUGH SEMANTIC TECHNOLOGIES. Peter Haase fluid Operations AG

ON DEMAND ACCESS TO BIG DATA. Peter Haase fluid Operations AG

SILOBREAKER ENTERPRISE SOFTWARE SUITE

Preparing the scenario for the use of patient s genome sequences in clinic. Joaquín Dopazo

PONTE Presentation CETIC. EU Open Day, Cambridge, 31/01/2012. Philippe Massonet

Big Data Visualization for Genomics. Luca Vezzadini Kairos3D

A Workflow Service for Biomedical Applications

How To Make Data Streaming A Real Time Intelligence

Web-Based Genomic Information Integration with Gene Ontology

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Enterprise Search. Simplified. January 2013

Exploring the Synergistic Relationships Between BPC, BW and HANA

SAP Business One and SAP HANA

MOBILE ARCHITECTURE FOR DYNAMIC GENERATION AND SCALABLE DISTRIBUTION OF SENSOR-BASED APPLICATIONS

ERPConnect Services: Integrating your Nintex Workflow solutions with SAP. Theobald Software Inc.

SAP S/4HANA Embedded Analytics

Presenting data: how to convey information most effectively Centre of Research Excellence in Patient Safety 20 Feb 2015

Sage X3 for Food & Beverage

QLIKVIEW DATA FLOWS TECHNICAL BRIEF

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013

Driving Innovation in Licensing Through Competitive Intelligence and Big Data Analytics

Oracle PharmaGRID Response. Dave Pearson Oracle Corporation UK

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

Data Integration of Bioinformatics and Web-Based Software Development

Research Data Integration of Retrospective Studies for Prediction of Disease Progression A White Paper. By Erich A. Gombocz

Big Data and Analytics in Government

Oracle BI Suite Enterprise Edition For Discoverer Users. Mark Rittman, Rittman Mead Consulting

An Introduction to Genomics and SAS Scientific Discovery Solutions

Big Data and Text Mining

Decision Support and Business Intelligence Systems. Chapter 1: Decision Support Systems and Business Intelligence

<Insert Picture Here> Oracle BI Standard Edition One The Right BI Foundation for the Emerging Enterprise

FROM DATA STORE TO DATA SERVICES - DEVELOPING SCALABLE DATA ARCHITECTURE AT SURS. Summary

AN INTEGRATION APPROACH FOR THE STATISTICAL INFORMATION SYSTEM OF ISTAT USING SDMX STANDARDS

SPATIAL DATA CLASSIFICATION AND DATA MINING

Ganzheitliches Datenmanagement

Three Open Blueprints For Big Data Success

Building Your Company s Data Visualization Strategy

Extending The Value of SAP with the SAP BusinessObjects Business Intelligence Platform Product Integration Roadmap

An agent-based layered middleware as tool integration

TURN YOUR DATA INTO KNOWLEDGE

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Distributed Database for Environmental Data Integration

MDM and Data Warehousing Complement Each Other

ProStix Business Intelligence (BI)

OpenText Content Hub for Publishers

Library page. SRS first view. Different types of database in SRS. Standard query form

Manufacturing CUSTOM CHEMICALS AND SERVICES, SUPPORTING SCIENTIFIC ADVANCES FOR HUMAN HEALTH

The Need for BIG DATA Processing

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials

LOG MANAGEMENT AND SIEM FOR SECURITY AND COMPLIANCE

SAP Data Services 4.X. An Enterprise Information management Solution

Symantec Client Management Suite 7.6 powered by Altiris technology

SAP BusinessObjects Edge BI, Standard Package Preferred Business Intelligence Choice for Growing Companies

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.

Data Grids. Lidan Wang April 5, 2007

TopBraid Insight for Life Sciences

KNOWLEDGENT WHITE PAPER. Big Data Enabling Better Pharmacovigilance

Automating Healthcare Claim Processing

Transcription:

Data Integration and Decision-Making For Biomarkers Discovery, Validation and Evaluation D. POLVERARI, CTO October 06-07 2008

Data integration definition and aims Definition : Data integration consists in combining multiple sources of information to be viewed as one Aims : To Make data easily available for : Sharing Analysis and indicators computation Visualization Data integration aims to reduce decision-making step duration

Biomarkers business chain : from R&D to market dataflow R&D (academic & private) Clinical evaluation Regulatory framework and policy Market Biomarkers Feedbacks Decision-making Information

Biomarkers business chain : from R&D to market dataflow R&D (academic & private) Clinical evaluation R&D data Public data Decision-making R&D data Internal/private data Need of internal and external data for decision-making

Biomarkers business chain : from R&D to market dataflow R&D (academic & private) Clinical evaluation Regulatory framework and policy R&D data Public data Decision-making R&D data Internal/private data Clinical data (CRO, Pharma, Biotech) Clinical data also enhance the knowledge for further discoveries

Biomarkers business chain : from R&D to market dataflow R&D (academic & private) Clinical evaluation Regulatory framework and policy Market R&D data Public data Decision-making Evidence data R&D data Internal/private data Clinical data (CRO, Pharma, Biotech) Evidence base contains informations about commercialized products

Biomarkers business chain : from R&D to market dataflow PIR R&D (academic & private) Clinical evaluation Regulatory framework and policy Market Biomarkers Feedbacks R&D data Public data Decision-making efficiency enhancement DNAchips Evidence data Indicators update for decision-making compounds R&D data Internal/private data Clinical data (CRO, Pharma, Biotech) Feedbacks from commercialized products enhance Evidence DB and R&D : «Biomarker Vigilance»

Data integration issues Data volume and sources diversity (more than 1000 sources only in R&D) Number of Database published in Medline Wren et al. BMC Bioinformatics 2005 6(Suppl 2):S2 Entries growth in several life-science databases http://www.genome.ad.jp High frequency data update Data quality (error rates measures, curation workflow) Data heterogeneity

Data integration issues Data heterogeneity : Many concepts : genes, proteins, diseases, pathways, Social, financial, ethical, Many data formats (objects databases, relational databases, Text files, XML, ), Terms diversity (eg. gene names synonymy => ATP2B1 = PMCA1), Semantic diversity (eg. Population genetics : gene means allele), Syntax diversity (eg. Pr, prot, proteins, )

Data integration issues R&D (academic & private) Clinical evaluation Regulatory framework and policy Market Biomarkers Feedbacks R&D Data Clinical Data Regulatory data Evidence Data Each user is different! Each one needs common and specifics data in the decision making process.

How to access and share data to enhance decisionmaking efficiency?

Integration strategies : data warehouse Build Model : description of data relations and semantic All the data are extracted, transformed, loaded in this unique data model Data Warehouse main advantages : Request Interface - Centralized data storage and maintenance, - Data extraction is simplified. Data Warehouse main drawbacks : - Model complexity Data Model DATA WAREHOUSE - Model weakness - Available queries limited by designed model (Sql technology) DB1 DB2... DBn Ex : Archimed (hôpital universitaire de Genève) Data Warehouses aren t new, they used to be found in financial and industrial fields for years

Example of datawarehouse Integration in a french bank Only several million customers A model containing about 100 tables 10 years old and about 20 major model updates implying ca. 1 to 3 months activity stop each Query limited to simple extraction, or predefined queries Analysis studies can t be done directly through the dataware New analysis projects (commercial efficiency or customer comportemantal study) need at least ten to fifteen persons for one year Datawarehouses are now used for storage purpose Data Warehouses strategy is restricted to well-known data with no/few concepts and model evolution

Integration strategies : databases federation Sources used as Multiple autonomous remote databases User queries are rewritten in sub-queries in the language of each data source by the mediator Request Interface Federation main advantages : - Data sources stay under each team responsibility - Data sources stay independent (Remote or locally) Request mediation Layer connector 1 connector 2... connector n - In theory, a new source = a new connector - User can import his own data Federation main drawbacks : DB1 DB2... DBn - Queries are still limited by mediator semantic Ex : SRS, Tropical Biominer, etc. - Need developers to extend either connectors or mediator - Queries may be time consuming

Exemple of federated databases : SRS, BioWidsom Ltd - Sequence Retrieval System, a web based application dedicated to sequence data integration - Lot of connectors available for life sciences data sources - A user-friendly query interface - Creating new connectors implies good development skills - User queries are imposed by the mediator which cannot be modified - Integration of your own data is tricky - need to handle multiple technologies to extend base capabilities - Not optimized for large volume queries DBs Federation is really more adapted than datawarehouse but still too rigid to face heterogeneous data and user specificity

Integration strategies : Decison support system (DSS) DSS are rapidly growing in industrial and financial field Like federative databases the data sources stay independent The mediation layer is replaced by the decisional layer : - A workflow designer - User and general workflows (connectors and treatment), - Workflow execution engine DSS main advantages : Dashboard Actor 1 Dashboard Actor 2 Workflow 1 Workflow 2 Workflow n Decisional Engine - Graphically created workflow, - Experts don t need informatician to design solution, - Integration time is reduced by 3 to 30 times DB1 DB2 DB3... DBn - Optimized for massive amounts of data, - Now, Mediators can be user specific - No semantic model (Data Heap) DSS platform for Biomarkers

DSS interface example : Amadea connectors and operators list Complex workflow built by simple connector/operator drag and drop steps Connectors / Operator parameters Real-Time visualization of workflow modification (even with milion records) With the authorization of Dr. Cécile Fairhead from Pasteur Institute, Paris DSS allows Business experts to explore their data and to create integration workflows alone in few clics!

HKIS example HKIS : a European R&D and clinical network on cancer (Fr, Ger, Ita) using data from genomics, proteomics, bibliographics, clinical trials etc. c.a. few weeks from data integration to HKIS platform deployment Data integration and analysis wokflows design DECISION Web deployment to share data and analysis workflow

Recommandations Data providers side: Data have to stay stored in independent databases maintained, enriched and verified by each responsible team, Data providers have to structure their databanks based on an existing ontology, Links between data in other databases should be available if they exist, Data must be downloadable in a readable format without using a query interface, DSS side: DSS must make data available, whatever their model and semantic are, in order to allow users to integrate easily new data resources, DSS Workflows should be created without coding skills (Graphically edited) and allow easy integration of new analytics tools, Analytical workflows must be easily shared and used on-line via web portals, Results visualization and indicators have to be simple and adapted to the user discipline (researchers, jurists, etc)

Contacts & Informations : contact@atragene.com Tel. +33 (0)1 77 01 80 65 Fax : +33 (0)1 45 21 17 35

Integration strategies : Decison support system (DSS) DSS are rapidly growing in industrial and financial field Like federative databases the data sources stay independent The mediation layer is replaced by the decisional engine : -Specific data integration, -Analysis workflows on-the-fly design DSS main advantages : Dashboard Actor 1 Dashboard Actor 2 Workflow 1 Workflow 2 Workflow n Decisional Engine - Graphically created workflow, - Experts don t need informatician to design solution, - Integration time is reduced by 3 to 30 times - Optimized for massive amounts of data, DB1 DB2 DB3... DBn - Now, Mediators can be user specific - No semantic model (Data Heap) DSS platform for Biomarkers