Open PHACTS Data integration for all. Andrew Leach

1 Open PHACTS Data integration for all Andrew Leach

2 Task, workflow and results Task: create a focussed set to identify leads against voltage-gated potassium channels AUREUS search targets: voltage-gated potassium channels Apply filters (MW, cLogP, Lipinski + remove undesirable target) ~1000 molecules Series for lead optimisation Similarity searches (RG, TP, Daylight) Cluster analysis ~10000 molecules selected IonWorks single shot screening 5 full curve actives (in at least one test occasion) 240 single shot hits progressed into full curve assay Stefan Senger, ca. 2004

3 We (may) know where the data is, but integrating it is a pain, bespoke, and often only for experts Q: Identify all oxidoreductase inhibitors with an activity <100nM in both mouse and human Q: The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X. Q: For a given interaction profile, give me compounds similar to it. ChEMBL DrugBank Gene Ontology WikiPathways ChEBI UniProt UMLS ConceptWiki ChemSpider Internal etc.
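The first question on this slide can be sketched as a filter over integrated activity records. This is a minimal illustration, not the Open PHACTS implementation: the `Activity` record, the `potent_in_all_species` helper, and the compound names and values are all hypothetical, and it assumes activities have already been normalised to nanomolar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    compound: str
    target_class: str
    species: str
    ic50_nm: float  # assumed to be already normalised to nanomolar


def potent_in_all_species(records, target_class, species, cutoff_nm=100.0):
    """Return compounds with an IC50 below cutoff_nm in every listed species."""
    hits = {}
    for rec in records:
        if rec.target_class == target_class and rec.ic50_nm < cutoff_nm:
            hits.setdefault(rec.compound, set()).add(rec.species)
    wanted = set(species)
    # Keep only compounds that were potent in all requested species.
    return sorted(c for c, seen in hits.items() if wanted <= seen)


# Hypothetical records standing in for data merged from several sources.
RECORDS = [
    Activity("cmpd-A", "oxidoreductase", "human", 12.0),
    Activity("cmpd-A", "oxidoreductase", "mouse", 85.0),
    Activity("cmpd-B", "oxidoreductase", "human", 40.0),
    Activity("cmpd-B", "oxidoreductase", "mouse", 450.0),
    Activity("cmpd-C", "serine protease", "human", 5.0),
    Activity("cmpd-C", "serine protease", "mouse", 7.0),
]

result = potent_in_all_species(RECORDS, "oxidoreductase", ["human", "mouse"])
print(result)
```

The filter itself is trivial; the hard part the slide points at is getting target class, species and activity values into one consistent record in the first place.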

4 The Innovative Medicines Initiative Biggest public-private partnership in area of medicine Collaboration between European Commission and European Federation of Pharmaceutical Industries and Associations (EFPIA) Promotion of medical innovation in Europe Tackle key bottlenecks Recognises in kind contributions Focus on key problems Efficacy, Safety, Education & Training, Knowledge Management

5 Public Domain Drug Discovery Data Pharma are accessing, processing, storing & re-processing. Diagram: each company (GSK, AZ, Pfizer, Merck) separately draws on Literature, Patents, PubChem, Genbank, Databases and Downloads, and each feeds its own Firewalled Databases, Data Integration and Data Analysis. Why repeat at each company?

6 Information Tombs Built for primary use-case Tailored indexes Tailored GUIs Unique language & metadata Poor interoperability/integration In vivo Portfolio Literature HR Synthesis SAR Docs Safety Etc

7 Project Partners Pfizer Limited (Coordinator); Universität Wien (Managing entity); Technical University of Denmark; University of Hamburg, Center for Bioinformatics; BioSolveIT GmbH; Consorci Mar Parc de Salut de Barcelona; Leiden University Medical Centre; Royal Society of Chemistry; Vrije Universiteit Amsterdam; Spanish National Cancer Research Centre; University of Manchester; Maastricht University; Aqnowledge; University of Santiago de Compostela; Rheinische Friedrich-Wilhelms-Universität Bonn; AstraZeneca; GlaxoSmithKline; Esteve; Novartis; Merck Serono; H. Lundbeck A/S; Eli Lilly; Netherlands Bioinformatics Centre; Swiss Institute of Bioinformatics; ConnectedDiscovery; EMBL-European Bioinformatics Institute; Janssen; OpenLink

8 A use-case driven approach, focussed on delivery for the real world Main architecture, technical implementation and primary capabilities driven by a set of prioritised research questions Based on the main research questions, define prioritised data sources Develop three Exemplars to demonstrate the capabilities of the Open PHACTS System and to define interfaces and input/output standards

9 Work Streams Build: Service layer and resource integration. Drive: Development of exemplar work packages & Applications. Sustain: Community engagement and long-term sustainability. Work Stream 1: Open Pharmacological Space (OPS) Service Layer. Standardised software layer to allow public DD resource integration; define standards and construct OPS service layer; develop interface (API) for data access, integration and analysis; develop secure access models; existing Drug Discovery (DD) resource integration. Work Stream 2: Exemplar Drug Discovery Informatics tools. Develop exemplar services to test OPS Service Layer: Target Dossier (Data Integration), Pharmacological Network Navigator (Data Visualisation), Compound Dossier (Data Analysis). Diagram labels: Consumer Firewall; Supplier Firewall; OPS Service Layer; Assertion & Meta Data Mgmt; Transform / Translate; Integrator; Corpus 1; Target Dossier; Db 2; Compound Dossier; Db 3; Db 4; Pharmacological Networks; Std Public Vocabularies; Business Rules; Corpus 5

10 Platform Explorer Apps API Standards

11 Prioritised research questions
Q: All oxidoreductase inhibitors active <100nM in both human and mouse
Q: Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
Q: Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives
Q: For a given interaction profile, give me compounds similar to it
Q: The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
Q: Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not)
Q: A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all cmpds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature
Q: Give me all active compounds on a given target with the relevant assay data
Q: Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
Q: Identify all known protein-protein interaction inhibitors
Kamal Azzaoui et al, DDT in press 2013

12 Pathways Interactions Proteins Genes Transcripts Pharmacological Activities Clinical Drug Applications Biological Processes Pathological Processes Drugs Compounds Chemicals Diseases Indications

13 Open PHACTS will be built upon semantic technologies and standards, providing an opportunity to: Demonstrate that semantic technologies can perform to the same degree as existing systems Provide an open platform to address common drug discovery questions; expose pharma's use-cases and knowledge Create a pre-competitive infrastructure that can be sustained and expanded into new areas; providing the platform for future collaboration Why Semantic Technologies? Rapidly developing technology, powerful algorithms for integration and querying of schema-free data Open standards facilitating sharing: public, private, commercial A community of developers, leverage work going on elsewhere

14 Key architecture components: User Interfaces & Applications; Linked Data API; Linked Data Cache; Identity Mapping Service; Identity Resolution Service; Domain Specific Services; Data

15 Core Platform Open PHACTS Explorer 1st Gen Apps Partner Apps Oct App Framework Identity Resolution Service (ConceptWiki) Identifier Management Service (BridgeDb+) Adenosine receptor 2a P12374 EC CS4532 Linked Data API (RDF/XML, TTL, JSON) Semantic Workflow Engine (LARKC) Data Cache (Triple Store) Chemistry Normalisation & Q/C ChemSpider Domain Specific Services VoID VoID VoID VoID VoID Data Import Nanopub Nanopub Nanopub Public Ontologies Db Db Public Content Db Commercial Db User Annotations

16 Building Quality High quality chemical names and synonyms. Leverage ChemSpider and ConceptWiki curation, Q/C and mapping ChemSpider Validation and Standardization Platform (CVSP) for flagging chemical representation issues Basic curation interface for editing concept terms available through ConceptWiki Data quality issues detected in data sources reported back to depositors for their evaluation

17 Quantitative Data Challenges

STANDARD_TYPE   STANDARD_UNITS    COUNT(*)
IC50            nm
IC50            ug.ml
IC50            ug/ml             2038
IC50            ug ml
IC50            mg kg
IC50            molar ratio       178
IC50            ug                117
IC50            %                 113
IC50            um well-1         52
IC50            p.p.m.            51
IC50            ppm               36
IC50            um-1              25
IC50            nm kg-1           25
IC50            milliequivalent   22
IC50            kj m-2            20

STANDARD_TYPE   UNIT_COUNT
AC50            7
Activity        421
EC50            39
IC50            46
ID50            42
Ki              23
Log IC50        4
Log Ki          7
Potency         11
log IC50        0

>5000 types. Implemented using the Quantities, Dimension, Units, Types Ontology; ~100 units
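To make the unit problem concrete, here is a minimal sketch of normalising reported IC50 values to nanomolar. It is an illustration under stated assumptions, not the platform's ontology-based implementation: the alias table, the `to_nanomolar` helper and the example molecular weight are hypothetical, and mass-per-volume units can only be converted when a molecular weight is supplied.

```python
# Conversion factors from a molar unit to nanomolar.
MOLAR_TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}

# Spelling variants mapped to one canonical label (illustrative, not exhaustive).
UNIT_ALIASES = {
    "nm": "nM",
    "um": "uM",
    "ug/ml": "ug/mL",
    "ug.ml-1": "ug/mL",
    "ug ml-1": "ug/mL",
}


def to_nanomolar(value, unit, mol_weight=None):
    """Convert a concentration to nM; mol_weight (g/mol) is needed for ug/mL."""
    cleaned = unit.strip()
    canonical = UNIT_ALIASES.get(cleaned, cleaned)
    if canonical in MOLAR_TO_NM:
        return value * MOLAR_TO_NM[canonical]
    if canonical == "ug/mL":
        if mol_weight is None:
            raise ValueError("ug/mL needs a molecular weight to become molar")
        # 1 ug/mL = 1e-3 g/L; divided by g/mol gives mol/L; times 1e9 gives nM.
        return value * 1e6 / mol_weight
    # Units such as "%" or "mg kg" are not concentrations and stay unconverted.
    raise ValueError(f"cannot convert unit {unit!r} to nM")


half_micromolar = to_nanomolar(0.5, "um")
one_ug_per_ml = to_nanomolar(1.0, "ug/ml", mol_weight=500.0)
print(half_micromolar, one_ug_per_ml)
```

Rejecting unconvertible units explicitly, rather than guessing, keeps records such as "IC50 %" visible as data quality issues instead of silently mixing them into potency comparisons.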

18 Chemistry within Open PHACTS The challenges associated with handling chemistry data require the support of a publicly accessible platform to integrate, standardise and host the data. ChemSpider, an online database from the Royal Society of Chemistry, hosts the chemical compound collection underpinning Open PHACTS and is responsible for standardising the chemical compounds and providing both regular updates and ongoing data curation. To serve the Open PHACTS platform, a structure validation and standardisation platform (CVSP) has been developed to ensure chemical structures are normalised to rules derived from the FDA structure standardisation guidelines and modified based on input from the EFPIA members.

19 The many challenges of chemistry representation

20 Identities within Open PHACTS Open PHACTS integrates information from multiple different databases, many of which use unique identifiers. The Identity Mapping Service (IMS) ensures these identifiers are linked and available for use interchangeably throughout the Open PHACTS platform. To maintain vocabulary heterogeneity and provide interoperability, the ConceptWiki is used. The ConceptWiki is an open access system that accepts essentially unlimited numbers of synonyms, in multiple languages, and then maps all the terms correctly back to one unique concept identifier, alleviating vocabulary problems and identifier differences. Synonyms: Aspirin; Dispril; 2-Acetoxybenzoic acid; Acetyl salicylic acid; Salicylic acid, acetyl-. DrugBank ID: APRD00264. ChEBI ID: CHEBI:15365. ChemSpider ID: 2157. FDA: Explorer IMS
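The idea of mapping many synonyms and database identifiers back to one concept can be sketched with a small lookup table. This is only an illustration using the aspirin terms from the slide: the `IdentityMap` class and the concept key `concept:aspirin` are hypothetical and do not reflect the actual IMS or ConceptWiki interfaces.

```python
class IdentityMap:
    """Maps synonyms and source identifiers to a single concept key."""

    def __init__(self):
        self._concept_of = {}  # normalised term -> concept key

    @staticmethod
    def _norm(term):
        # Case-insensitive, whitespace-insensitive matching.
        return " ".join(term.lower().split())

    def register(self, concept, terms):
        for term in terms:
            self._concept_of[self._norm(term)] = concept

    def resolve(self, term):
        """Return the concept key for a term, or None if it is unknown."""
        return self._concept_of.get(self._norm(term))


ims = IdentityMap()
ims.register(
    "concept:aspirin",  # hypothetical concept identifier
    [
        "Aspirin",
        "Dispril",
        "2-Acetoxybenzoic acid",
        "Acetyl salicylic acid",
        "DrugBank:APRD00264",
        "CHEBI:15365",
        "ChemSpider:2157",
    ],
)

same_concept = ims.resolve("aspirin") == ims.resolve("chebi:15365")
print(ims.resolve("  acetyl   salicylic acid "), same_concept)
```

A query expressed against any one of these names or identifiers can then be rewritten in terms of the shared concept before it reaches the individual data sources.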

21 Why Provenance Matters Using a community specification known as VoID (Vocabulary of Interlinked Datasets) Record version, author, derivations. Builds trust with users: know what you are querying (and why it might have changed). Provides a mechanism to report usage statistics back to providers and help them understand the value. Easier to track errors and ensure quality. Actively participating in community provenance programme (W3C)
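As an illustration of what such a provenance record can look like, the sketch below emits a small VoID dataset description in Turtle. The property choices (`dcterms:title`, `pav:version`, `prov:wasDerivedFrom`, `void:triples`) are one plausible selection from existing vocabularies, not necessarily the ones Open PHACTS used, and the URIs, version string and triple count are hypothetical placeholders.

```python
import json

PREFIXES = {
    "void": "http://rdfs.org/ns/void#",
    "dcterms": "http://purl.org/dc/terms/",
    "pav": "http://purl.org/pav/",
    "prov": "http://www.w3.org/ns/prov#",
}


def void_description(dataset_uri, title, version, derived_from, triples):
    """Build a Turtle string describing one dataset with basic provenance."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{dataset_uri}> a void:Dataset ;")
    # json.dumps gives a double-quoted, escaped string that is also valid Turtle.
    lines.append(f"    dcterms:title {json.dumps(title)} ;")
    lines.append(f"    pav:version {json.dumps(version)} ;")
    lines.append(f"    prov:wasDerivedFrom <{derived_from}> ;")
    lines.append(f"    void:triples {int(triples)} .")
    return "\n".join(lines)


# All values below are placeholders for illustration only.
ttl = void_description(
    dataset_uri="http://example.org/void/example-dataset",
    title="Example dataset (RDF conversion)",
    version="2013-01",
    derived_from="http://example.org/source/example-download",
    triples=1000,
)
print(ttl)
```

Publishing one such description per loaded dataset is what lets a user answer "which version am I querying, and where did it come from?" without inspecting the data itself.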

22 What does Open PHACTS do? Currently integrated databases

Database                  Number of triples (million)
ACD Labs / ChemSpider
ChEBI                     0.91
ChEMBL_v
ConceptWiki               3.74
DrugBank                  0.52
Enzyme                    0.07
Gene Ontology             0.85
SwissProt
WikiPathways              0.14
TOTAL

Open PHACTS draws together multiple sources of publicly available pharmacological and chemical data, allowing public access to the information via the Open PHACTS Explorer, an intuitive interface.

23 Licensing: 3 public databases All are available as open RDF you can download right now. But: DrugBank OMIM Comparative Toxicogenomics Database

24 CUTTING THE GORDIAN KNOT What are the problems with licensing we had to address? To make the data and software generated by the project usable and reusable Multiplicity of unclear or non-standard licenses on original data sources Public can mean use but not redistribute, use in commercial environment, Legal position on use and reuse extremely unclear Different issues than just linking to data What is the legal status of integrated collections of the above, and of derived knowledge from such a collection? Appropriate software license selection Legal clarity for EFPIA and end users Approaches for commercial data integration, EFPIA in-house data AIM: to enable maximum possible dissemination and usability of the integrated data and architecture generated by the project - with approaches that will be applicable in other data integration projects

25 Data Licensing Solution Chose John Wilbanks as consultant A framework built around STANDARD well-understood Creative Commons licences and how they interoperate Deal with the problems by: Interoperable licences Appropriate terms Declare expectations to users and data publishers One size won't fit all requirements

26 Open PHACTS and the scientific community Associated partners: Support, information; Exchange of ideas, data, technology; Opportunities to demo at [...] community webinars; Need MoU. Development partnerships: Influence on API developments; Opportunities to demo ideas & use cases to core team; Need MoU and annexe. Consortium: 28 current members

27 Example applications Advanced analytics: ChemBioNavigator: Navigating at the interface of chemical and biological data with sorting and plotting options. TargetDossier: Interconnecting Open PHACTS with multiple target centric services; exploring target similarity using diverse criteria. PharmaTrek: Interactive polypharmacology space of experimental annotations. UTOPIA: Semantic enrichment of scientific PDFs. Predictions: GARFIELD: Prediction of target pharmacology based on the Similarity Ensemble Approach. eTOX collector: Automatic extraction of data for building predictive toxicology models in eTOX project

28 ChemBioNavigator Matthias Rarey et al PharmaTrek Jordi Mestres et al

29 Call for expressions of interest Open PHACTS ENSO proposal Open PHACTS intends to submit a proposal for IMI ENSO funding. We are currently drafting our ENSO proposal and invite all EFPIA companies with an interest in Open PHACTS to contact us to discuss opportunities for involvement. The Open PHACTS Foundation Open PHACTS has a successor organisation, the Open PHACTS Foundation. Please register your interest with us for further information on membership and other opportunities to get involved within Open PHACTS. For more information and/or to register interest, contact us at pmu@openphacts.org

30 Acknowledgements Stefan Senger Gerhard Ecker The OpenPHACTS consortium

31

32 SERVICES Application (Knowledge) Fact Visualisation e.g. Target Dossiers; SAR Visualisation Assertions e.g. Gene-to-Disease; Compound-to-Target; Compound-to-ADR Standards Ontology/taxonomy; Minimum information guide; Dictionaries; Interchange mapping Data Targets; Chemistry; Pharmacology; Literature; Patents After Barnes et al, Nature Reviews Drug Discovery 2009 doi /nrd2944

33 Nanopublications Capturing scientific information in the Triple Store


More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

Carlos Iglesias, Open Data Consultant.

Carlos Iglesias, Open Data Consultant. Carlos Iglesias, Open Data Consultant. contact@carlosiglesias.es http://es.linkedin.com/in/carlosiglesiasmoro/en @carlosiglesias mobile: +34 687 917 759 Open Standards enthusiast and Open advocate that

More information

Bio-IT World 2013 Best Practices Awards

Bio-IT World 2013 Best Practices Awards Published Resources for the Life Sciences 250 First Avenue, Suite 300, Needham, MA 02494 phone: 781-972-5400 fax: 781-972-5425 Bio-IT World 2013 Best Practices Awards Celebrating Excellence in Innovation

More information

Using Open Source software and Open data to support Clinical Trial Protocol design

Using Open Source software and Open data to support Clinical Trial Protocol design Using Open Source software and Open data to support Clinical Trial Protocol design Nikolaos Matskanis, Joseph Roumier, Fabrice Estiévenart {nikolaos.matskanis, joseph.roumier, fabrice.estievenart}@cetic.be

More information

IO Informatics The Sentient Suite

IO Informatics The Sentient Suite IO Informatics The Sentient Suite Our software, The Sentient Suite, allows a user to assemble, view, analyze and search very disparate information in a common environment. The disparate data can be numeric

More information

D5.5 Initial EDSA Data Management Plan

D5.5 Initial EDSA Data Management Plan Project acronym: Project full : EDSA European Data Science Academy Grant agreement no: 643937 D5.5 Initial EDSA Data Management Plan Deliverable Editor: Other contributors: Mandy Costello (Open Data Institute)

More information

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS 1. The Technology Strategy sets out six areas where technological developments are required to push the frontiers of knowledge

More information

enanomapper - A Database and Ontology Framework for Nanomaterials Design and Safety Assessment

enanomapper - A Database and Ontology Framework for Nanomaterials Design and Safety Assessment enanomapper - A Database and Ontology Framework for Nanomaterials Design and Safety Assessment ACS Meeting, Boston, USA, 18 August 2015 Presented by Barry Hardy (Douglas Connect) as Coordinator and in

More information

Chemical safety and big data: the industry s demands

Chemical safety and big data: the industry s demands Chemical safety and big data: the industry s demands Richard CURRIE Senior Technical Expert; Group Leader & Global Predictive and Computational Toxicology Lead Valid results Useful results Credit Money/Grants

More information

THE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy 2015-2018. Page 1 of 8

THE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy 2015-2018. Page 1 of 8 THE BRITISH LIBRARY Unlocking The Value The British Library s Collection Metadata Strategy 2015-2018 Page 1 of 8 Summary Our vision is that by 2020 the Library s collection metadata assets will be comprehensive,

More information

EMBL-EBI Industry Programme Workshop, 26th to 27th November 2012. Data Infrastructure for Omics-based Chemical Safety.

EMBL-EBI Industry Programme Workshop, 26th to 27th November 2012. Data Infrastructure for Omics-based Chemical Safety. EMBL-EBI Industry Programme Workshop, 26th to 27th November 2012. Data Infrastructure for Omics-based Chemical Safety Danyel Jennen The systems toxicology approach Cf. Waters & Fostel. Toxicogenomics and

More information

How To Understand Protein-Protein Interaction And Inhibitors

How To Understand Protein-Protein Interaction And Inhibitors Protein-Protein Interactions and Inhibitors Alan Naylor Independent Consultant Optibrium Consultants Meeting Cambridge 27 th November 2012 Why PPI inhibitors? PPIs are involved in many biological / disease

More information

THE BIOTECH & PHARMACEUTICAL INDUSTRY

THE BIOTECH & PHARMACEUTICAL INDUSTRY THE BIOTECH & PHARMACEUTICAL INDUSTRY ESSENTIAL CAREERS INFORMATION CALUM LECKIE KATIE BISARO CAREERS CONSULTANTS What we will cover Sector overview Types of role Graduate recruitment trends and issues

More information

Towards a reference architecture for Semantic Web applications

Towards a reference architecture for Semantic Web applications Towards a reference architecture for Semantic Web applications Benjamin Heitmann 1, Conor Hayes 1, and Eyal Oren 2 1 firstname.lastname@deri.org Digital Enterprise Research Institute National University

More information

Integrating Bioinformatics, Medical Sciences and Drug Discovery

Integrating Bioinformatics, Medical Sciences and Drug Discovery Integrating Bioinformatics, Medical Sciences and Drug Discovery M. Madan Babu Centre for Biotechnology, Anna University, Chennai - 600025 phone: 44-4332179 :: email: madanm1@rediffmail.com Bioinformatics

More information

OpenAIRE Research Data Management Briefing paper

OpenAIRE Research Data Management Briefing paper OpenAIRE Research Data Management Briefing paper Understanding Research Data Management February 2016 H2020-EINFRA-2014-1 Topic: e-infrastructure for Open Access Research & Innovation action Grant Agreement

More information

Research Data Integration of Retrospective Studies for Prediction of Disease Progression A White Paper. By Erich A. Gombocz

Research Data Integration of Retrospective Studies for Prediction of Disease Progression A White Paper. By Erich A. Gombocz Research Data Integration of Retrospective Studies for Prediction of Disease Progression A White Paper By Erich A. Gombocz 2 Research Data Integration of Retrospective Studies for Prediction of Disease

More information

Supporting Change-Aware Semantic Web Services

Supporting Change-Aware Semantic Web Services Supporting Change-Aware Semantic Web Services Annika Hinze Department of Computer Science, University of Waikato, New Zealand a.hinze@cs.waikato.ac.nz Abstract. The Semantic Web is not only evolving into

More information

EBiSC the first European bank for induced pluripotent stem cells

EBiSC the first European bank for induced pluripotent stem cells Press Release EBiSC the first European bank for induced pluripotent stem cells Pharmaceutical companies who are members of the European Federation of Pharmaceutical Industries and Associations (EFPIA)

More information

Anforderungen der Life-Science Industrie an die Hochschulen. Hans Widmer Novartis Institutes for BioMedical Research

Anforderungen der Life-Science Industrie an die Hochschulen. Hans Widmer Novartis Institutes for BioMedical Research Anforderungen der Life-Science Industrie an die Hochschulen Hans Widmer Novartis Institutes for BioMedical Research There s nothing more extraordinary than a normal life 2 What does industry expect from

More information

org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers.

org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers. org.rn.eg.db December 16, 2015 org.rn.egaccnum Map Entrez Gene identifiers to GenBank Accession Numbers org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank

More information

Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR)

Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR) Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR) Enable Science in silico & Provide the Right Knowledge to the Right People at the Right Time to enable the

More information

Cheminformatics and Pharmacophore Modeling, Together at Last

Cheminformatics and Pharmacophore Modeling, Together at Last Application Guide Cheminformatics and Pharmacophore Modeling, Together at Last SciTegic Pipeline Pilot Bridging Accord Database Explorer and Discovery Studio Carl Colburn Shikha Varma-O Brien Introduction

More information

Carlos Iglesias, Open Data Consultant.

Carlos Iglesias, Open Data Consultant. Carlos Iglesias, Open Data Consultant. contact@carlosiglesias.es http://es.linkedin.com/in/carlosiglesiasmoro/en @carlosiglesias mobile: +34 687 917 759 Open Standards enthusiast and Open advocate that

More information

TopBraid Life Sciences Insight

TopBraid Life Sciences Insight TopBraid Life Sciences Insight In the Life Sciences industries, making critical business decisions depends on having relevant information. However, queries often have to span multiple sources of information.

More information

Linked Statistical Data Analysis

Linked Statistical Data Analysis Linked Statistical Data Analysis Sarven Capadisli 1, Sören Auer 2, Reinhard Riedl 3 1 Universität Leipzig, Institut für Informatik, AKSW, Leipzig, Germany, 2 University of Bonn and Fraunhofer IAIS, Bonn,

More information

arxiv:1305.4455v1 [cs.dl] 20 May 2013

arxiv:1305.4455v1 [cs.dl] 20 May 2013 SHARE: A Web Service Based Framework for Distributed Querying and Reasoning on the Semantic Web Ben P Vandervalk, E Luke McCarthy, and Mark D Wilkinson arxiv:1305.4455v1 [cs.dl] 20 May 2013 The Providence

More information

BYODs & FAIR Data Stewardship

BYODs & FAIR Data Stewardship BYODs & FAIR Data Stewardship Luiz Olavo Bonino luiz.bonino@dtls.nl www.elixir-europe.org Summary FAIR Data stewardship Approach in NL BYOD FAIR Data tooling ecosystem Way of working (FAIR) Data Stewardship

More information

Modelling the integration of biobanks into healthcare systems International Biobanking Summit II: Future Directions Graz, 17 September 2013

Modelling the integration of biobanks into healthcare systems International Biobanking Summit II: Future Directions Graz, 17 September 2013 Modelling the integration of biobanks into healthcare systems International Biobanking Summit II: Future Directions Graz, 17 September 2013 Anthony J Brookes University of Leicester Biobank: A comprehensive

More information

How To Build A Cloud Based Intelligence System

How To Build A Cloud Based Intelligence System Semantic Technology and Cloud Computing Applied to Tactical Intelligence Domain Steve Hamby Chief Technology Officer Orbis Technologies, Inc. shamby@orbistechnologies.com 678.346.6386 1 Abstract The tactical

More information

Ingenuity Pathway Analysis (IPA )

Ingenuity Pathway Analysis (IPA ) ProductProfile Ingenuity Pathway Analysis (IPA ) For the analysis and interpretation of omics data IPA is a web-based software application for the analysis, integration, and interpretation of data derived

More information

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer Research Data Alliance: Current Activities and Expected Impact SGBD Workshop, May 2014 Herman Stehouwer The Vision 2 Researchers and innovators openly share data across technologies, disciplines, and countries

More information

CHEM-E4140 Selectivity 12. Pharma Business

CHEM-E4140 Selectivity 12. Pharma Business CHEM-E4140 Selectivity 12. Pharma Business Prof. Ari Koskinen Laboratory of Organic Chemistry C318 Pharma Business Total volume ca 1100 G$ (Shell 421G$; Walmart 486G$; Toyota 252 G$). Annually approx 25

More information

Molecular descriptors and chemometrics: a powerful combined tool for pharmaceutical, toxicological and environmental problems.

Molecular descriptors and chemometrics: a powerful combined tool for pharmaceutical, toxicological and environmental problems. Molecular descriptors and chemometrics: a powerful combined tool for pharmaceutical, toxicological and environmental problems. Roberto Todeschini Milano Chemometrics and QSAR Research Group - Dept. of

More information

The Ontological Approach for SIEM Data Repository

The Ontological Approach for SIEM Data Repository The Ontological Approach for SIEM Data Repository Igor Kotenko, Olga Polubelova, and Igor Saenko Laboratory of Computer Science Problems, Saint-Petersburg Institute for Information and Automation of Russian

More information

Building a Unified Drug Discovery Database

Building a Unified Drug Discovery Database Building a Unified Drug Discovery Database David M Parry Celltech 11 April 2002 Informatics Challenges with Mergers and Acquisitions 1 Celltech R&D Ltd Leading BioPharmaceutical Company 3 Research Sites

More information

CTC Technology Readiness Levels

CTC Technology Readiness Levels CTC Technology Readiness Levels Readiness: Software Development (Adapted from CECOM s Software Technology Readiness Levels) Level 1: Basic principles observed and reported. Lowest level of software readiness.

More information

Building a Collaborative Informatics Platform for Translational Research: Prof. Yike Guo Department of Computing Imperial College London

Building a Collaborative Informatics Platform for Translational Research: Prof. Yike Guo Department of Computing Imperial College London Building a Collaborative Informatics Platform for Translational Research: An IMI Project Experience Prof. Yike Guo Department of Computing Imperial College London Living in the Era of BIG Big Data : Massive

More information

Exploiting the Pathogen box

Exploiting the Pathogen box Exploiting the Pathogen box Dr Richard Gordon Director Strategic Health Innovation Partnerships 9 May 2014 www.ship.mrc.ac.za Background Worked with MMV in many areas Servicing Partner Consultant Collaborator

More information

Dendro: collaborative research data management built on linked open data

Dendro: collaborative research data management built on linked open data Dendro: collaborative research data management built on linked open data João Rocha da Silva João Aguiar Castro Faculdade de Engenharia da Universidade do Porto/INESC TEC, Portugal, {joaorosilva,joaoaguiarcastro}@gmail.com

More information

An industry perspective on deployed semantic interoperability solutions

An industry perspective on deployed semantic interoperability solutions An industry perspective on deployed semantic interoperability solutions Ralph Hodgson, CTO, TopQuadrant SEMIC Conference, Athens, April 9, 2014 https://joinup.ec.europa.eu/community/semic/event/se mic-2014-semantic-interoperability-conference

More information

Linked Longitudinal Medical Record. Susie Stephens Co chair W3C Health Care & Life Science Interest Group

Linked Longitudinal Medical Record. Susie Stephens Co chair W3C Health Care & Life Science Interest Group Linked Longitudinal Medical Record Susie Stephens Co chair W3C Health Care & Life Science Interest Group Outline Secondary Use of Health Care Data Introduction to the Semantic Web Translational Medicine

More information

Customer experiences in implemen0ng SKOS- based vocabulary management systems, Ralph Hodgson, TopQuadrant. CWI, Amsterdam, April 3, 2014

Customer experiences in implemen0ng SKOS- based vocabulary management systems, Ralph Hodgson, TopQuadrant. CWI, Amsterdam, April 3, 2014 LDBC Consor*um Fourth Technical User Community (TUC) mee*ng Customer experiences in implemen0ng SKOS- based vocabulary management systems, and other Seman0c- Technology- Driven Systems. Ralph Hodgson,

More information

Cloud and Big Data Standardisation

Cloud and Big Data Standardisation Cloud and Big Data Standardisation EuroCloud Symposium ICS Track: Standards for Big Data in the Cloud 15 October 2013, Luxembourg Yuri Demchenko System and Network Engineering Group, University of Amsterdam

More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE ACCELERATING PROGRESS IS IN OUR GENES AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE GENESPRING GENE EXPRESSION (GX) MASS PROFILER PROFESSIONAL (MPP) PATHWAY ARCHITECT (PA) See Deeper. Reach Further. BIOINFORMATICS

More information

It s all around the domain ontologies - Ten benefits of a Subject-centric Information Architecture for the future of Social Networking

It s all around the domain ontologies - Ten benefits of a Subject-centric Information Architecture for the future of Social Networking It s all around the domain ontologies - Ten benefits of a Subject-centric Information Architecture for the future of Social Networking Lutz Maicher and Benjamin Bock, Topic Maps Lab at University of Leipzig,

More information