Computational Platform for Compound Identification

Similar documents
Pesticide Analysis by Mass Spectrometry

Accurate Mass Screening Workflows for the Analysis of Novel Psychoactive Substances

GUIDELINES FOR THE VALIDATION OF ANALYTICAL METHODS FOR ACTIVE CONSTITUENT, AGRICULTURAL AND VETERINARY CHEMICAL PRODUCTS.

Agilent s Kalabie Electronic Lab Notebook (ELN) Product Overview ChemAxon UGM 2008 Agilent Software and Informatics Division Mike Burke

bioavailability active transport blood-brain barrier transport absorption volume of distribution drug binding to plasma proteins

Introduction. What Can an Offline Desktop Processing Tool Provide for a Chemist?

Integration of DiscoveryQuant Software into Automated In-Vitro ADME Assay Workflows

ForensicDB: A Web-accessible Spectral Database

Molecular descriptors and chemometrics: a powerful combined tool for pharmaceutical, toxicological and environmental problems.

MassHunter for Agilent GC/MS & GC/MS/MS

Pipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices

Cheminformatics and Pharmacophore Modeling, Together at Last

A Navigation through the Tracefinder Software Structure and Workflow Options. Frans Schoutsen Pesticide Symposium Prague 27 April 2015

DMPK: Experimentation & Data

Dr Alexander Henzing

Mascot Integra: Data management for Proteomics ASMS 2004

Overview. Introduction. AB SCIEX MPX -2 High Throughput TripleTOF 4600 LC/MS/MS System

Thermo Scientific Compound Discoverer Software. A New Generation. of integrated solutions for small molecule structure ID

Daniel M. Mueller, Katharina M. Rentsch Institut für Klinische Chemie, Universitätsspital Zürich, CH-8091 Zürich, Schweiz

Metabolomic Profiling of Accurate Mass LC-MS/MS Data to Identify Unexpected Environmental Pollutants

Scientific Business Intelligence using Pipeline Pilot

Application Note # LCMS-62 Walk-Up Ion Trap Mass Spectrometer System in a Multi-User Environment Using Compass OpenAccess Software

Application of Structure-Based LC/MS Database Management for Forensic Analysis

GENERAL UNKNOWN SCREENING FOR DRUGS IN BIOLOGICAL SAMPLES BY LC/MS Luc Humbert1, Michel Lhermitte 1, Frederic Grisel 2 1

Unique Software Tools to Enable Quick Screening and Identification of Residues and Contaminants in Food Samples using Accurate Mass LC-MS/MS

Strategic Benefits of an Online Clinical Data Repository

is a knowledge based expert decision support tool for predicting the metabolic fate of chemicals in mammals.

Step-by-Step Analytical Methods Validation and Protocol in the Quality System Compliance Industry

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS

Protein Prospector and Ways of Calculating Expectation Values

Quantitative proteomics background

The Facts on Equine Drug Testing

THE CAMBRIDGE CRYSTALLOGRAPHIC DATA CENTRE (CCDC)

INTEGRATED TECHNOLOGY TO SEE THE WHOLE PICTURE. AxION iqt GC/MS/MS

Agilent Metabolomics Workflow

L. Yu et al. Correspondence to: Q. Zhang

Distribution Agreement

Isotope distributions

KINETIC DETERMINATION OF SELENIUM BY VISIBLE SPECTROSCOPY (VERSION 1.8)

A Laboratory Information. Management System for the Molecular Biology Lab

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

Apply with Resume to: Submenu Path Company/Careers/Current Openings/Job Type: Science

A Streamlined Workflow for Untargeted Metabolomics

Bruker ToxScreener TM. Innovation with Integrity. A Comprehensive Screening Solution for Forensic Toxicology UHR-TOF MS

Mass Frontier 7.0 Quick Start Guide

Guidance for Industry

Molecule Projections

Design of Experiments for Analytical Method Development and Validation

DRUG METABOLISM. Drug discovery & development solutions FOR DRUG METABOLISM

Data Visualization in Cheminformatics. Simon Xi Computational Sciences CoE Pfizer Cambridge

AN INTRODUCTION TO THE PHARSIGHT KNOWLEDGEBASE SUITE

Highly Potent APIs The right platform, people and procedures for successful development and manufacturing

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification

An exciting opportunity in scientific knowledge base development

Quantitative Evaluation of Perfume Fleuressence Samples using the znose

FACULTY OF VETERINARY MEDICINE

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

Increasing Lab Efficiency by Automating Sample Test Workflows Using OpenLAB Enterprise Content Manager (ECM) and Business Process Manager (BPM)

AMB-PDM Overview v6.0.5

Waters Core Chromatography Training (2 Days)

Building a Unified Drug Discovery Database

Alterações empresariais sustentadas pelo conceito de engenharia do Produto Patrício Soares da Silva, MD, PhD

Viewpoint ediscovery Services

Thermo Scientific PepFinder Software A New Paradigm for Peptide Mapping

Oracle Siebel Marketing and Oracle B2B Cross- Channel Marketing Integration Guide ORACLE WHITE PAPER AUGUST 2014

Vendor: Brio Software Product: Brio Performance Suite

Simultaneous Metabolite Identification and Quantitation with UV Data Integration Using LightSight Software Version 2.2

Chapter 3. Mass Relationships in Chemical Reactions

AN INTRODUCTION TO THE PHARSIGHT KNOWLEDGEBASE SERVER (PKS )

A Complete Drug Stability Program. Stability Lab Information Manager SL IM

Technology Updates. Presenter: Venkat Krishnan. Presentation to: DHS Board. Date: April 18, Georgia Department of Human Services

Thermo Scientific SIEVE Software for Differential Expression Analysis

Keystone Exams: Chemistry Assessment Anchors and Eligible Content. Pennsylvania Department of Education

STATEMENT OF WORK. NETL Cooperative Agreement DE-FC26-02NT41476

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

13C NMR Spectroscopy

Selectivity in Analytical Chemistry Recommendations for its Use

The FDA/CDER Chemical Informatics Program

Increasing Quality While Maintaining Efficiency in Drug Chemistry with DART-TOF MS Screening

DFN Project Chemie.DE: Building an Internet Information Service for Chemistry

The First Quantitative Analysis of Alkylated PAH and PASH by GCxGC/MS and its Implications on Weathering Studies

QSAR. The following lecture has drawn many examples from the online lectures by H. Kubinyi

Vendor: Crystal Decisions Product: Crystal Reports and Crystal Enterprise

Identification of Surfactants by Liquid Chromatography-Mass Spectrometry

C146-E087F. GCMS-QP2010 Plus. Shimadzu Gas Chromatograph Mass Spectrometer

Bio-IT World 2013 Best Practices Awards

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs

Chemistry Instrumental Analysis Lecture 1. Chem 4631


Metabolomics Software Tools. Xiuxia Du, Paul Benton, Stephen Barnes

Mass Spectra of Select Benzyl- and Phenyl- Piperazine Designer Drugs

Amount of Substance.

Chapter Test B. Chapter: Measurements and Calculations

The Automated SPE Assay of Fipronil and Its Metabolites from Bee Pollen.

ICH Topic Q 1 A Stability Testing Guidelines: Stability Testing of New Drug Substances and Products

Importing pharmaceutical products to China

GENASIS System Architecture

Lead optimization services

Getting the most from this book...4 About this book...5

PeptidomicsDB: a new platform for sharing MS/MS data.

Transcription:

Computational Platform for Compound Identification Aurelien Monge*, Stephane Cano, Stephane Albrecht, Pavel Pospisil, Elyette Martin* Philip Morris International, R&D 19 th June 2013 ACD/Labs Symposium on Laboratory Intelligence (EUM), Neuchatel Switzerland

Introduction Chemoinformatics challenges at PMI: Compound identification from complex matrices Efficient and/or automatized compound registration (including registration of mixtures and stereochemical isomers) Managing spectral data Associating toxicity data to compounds and mixtures Inserting chemical data in corporate Scientific Data Warehouse Reporting chemical data in a relevant way Building R&D chemoinformatics platform using Rapid Development Tools Page: 2

PMI Chemoinformatics platform 3. Chemoinformatics Knowledge Base 1. Unique Compounds & Spectra Database Scientific Data Warehouse 2. Computer Assisted Structure Identification Page: 3

1. Unique Compounds & Spectra Database 3. Chemoinformatics Knowledge Base 1. Unique Compounds & Spectra Database Scientific Data Warehouse 2. Computer Assisted Structure Identification Page: 4

Concept of UCSD Data Caculation software Data Internet Data Data Data Data UCSD Internet Data Internet Unique Compound and Spectra Database: To assemble R&D chemical information into a central repository of chemical substances and analytical spectra. Page: 5

Concept of UCSD UCSD is an internal database with no external access Accelrys chemical cartridge with enhanced stereochemistry ACD/Labs for the analytical spectra part Web interface developed in Oracle Application Express and automatic processes in Accelrys Pipeline Pilot UCSD COMPOUND DB SPECTRAL DB Unique UCSD code (Accelrys cartridge) Unique SpecID (ACD/Labs) Page: 6

UCSD Roles 1. Submitter A chemist who can search, view data and also submit molecules and spectra. 2. Registrar A chemoinformatician that can search, view, submit and also check and register data. Viewer User can search and view registered data (print, export, list creation) Page: 7

UCSD Workflow Spectral database SpecDB_R For daily research SpecDB_U Read-only 1. Add new spectrum Submitter Compound database Submission Area Registration Area Page: 8

SpecDB_R ACD/ChemFolder Enterprise module Page: 9

UCSD Workflow Spectral database SpecDB_R For daily research SpecDB_U Read-only 1. Add new spectrum Submitter 2. Add new batches with SpecID Compound database Submission Area Registration Area 3. Validate new record Note: submitted batches can also correspond to known unknowns that once identified, can be later registered as known molecules. Page: 10

Concept of Molecule, Substance and Batch Assignment Unique means there is no redundancy, ensured by uniqueness of chemical structure and systematic and accurate registration process. Molecule = Neutral, unique chemical entity Every molecule is assigned a unique company code: e.g., PMI01234567 Substance = Molecule + Salt Batch = Substance physically present or projectrelevant as cited in literature Substance ID = e.g., PMI01234567-A PMI01234567-B PMI01234567-C Batch ID = e.g., BC000002152 BC000008641 BC000000560 BC000000320 BC000000122 BC000000154 BC000000176 BC000002504 BC000000894 BC000000121 BC000000158 BC000003174 BC000002598 Page: 11

UCSD Compound Part To each newly registered compound we developed automatic processes that: - attach SMILES strings, InChI codes and IUPAC names - calculate structure-derived physico-chemical and ADMET properties Naming and calculation of compounds after registration (uniqueness check) reduces further ambiguity See more in Martin et al., Journal of Cheminformatics 2012, 4:11 Page: 12

UCSD Compound Part See more in Martin et al., Journal of Cheminformatics 2012, 4:11 Page: 13

UCSD Workflow Spectral database SpecDB_R For daily research SpecDB_U Read-only Submitter 1. Add new spectrum 4. Automatic copy of spectrum for validated batches 5. Automatic copy of data from compound part Read Web interface Viewer 2. Add new batches with SpecID Compound database Submission Area Registration Area 3. Validate new record Registrar Page: 14

SpecDB_U Page: 15

In waiting for Spectrus Search by spectra similarity is not possible in ChemFolder module implementation of a script that copies each entry in Spectrum Database module Add Copy ChemFolder Enterprise Spectrum Database Page: 16

Process for bulk import INPUT for SpecDB_R CSV file OUTPUT Merge and import in the compound part via Pipeline Pilot protocol SD file Page: 17

2. Computer Assisted Structure Identification 3. Chemoinformatics Knowledge Base 1. Unique Compounds & Spectra Database Scientific Data Warehouse 2. Computer Assisted Structure Identification Page: 18

Computer Assisted Structure Identification Our in-house developed Computer Assisted Structure Identification platform is constituted of two software: Software 1 CASI-PA : assists the processing of MS data for given analyses: Automated non-target processing of MS data. Compound Identification Comparison of 2 samples Puff-by-Puff Page: 19

Puff-by-Puff Analysis of Cigarette Aerosols Analysis of the concentration of given chemical compounds per puff. Page: 20

Computer Assisted Structure Identification Our Computer Assisted Structure Identification platform is constituted of two software: Help for the processing of MS data for given analyses: Software 2: Automated non-target processing of MS data. Compound Identification Puff-by-Puff Comparison of two products Page: 21

Computer Assisted Structure Identification Automated platform to accelerate and standardize identification of chemical structures of aerosol constituents with high confidence power. Smoke of a conventional cigarette, measured by GCxGC-EI-TOF-MS Compound? A. Knorr et al., Anal. Chem. 2013, submitted for publication Page: 22

CASI Concept Sorted Hits Confirmed Hits Multi JDX MS file + KI experimental values + relative second retention time Hits 3 CASI Score KI matching 1 GCxGC-EI-TOF-MS 2 Search in Mass Spectra Databases (NIST MS Search) 2 nd column relative retention time matching Boiling Point matching 4 Ranking 5 Submission to UCSD database Page: 23

Predictive QSPR Models for KI and 2DrelRT Kovats Index 2DrelRT Validation n=60 r 2 = 0.981 r 2 = 0.855 GA linear regression, 7 Molecular Descriptors GA Support Vector Regression, 12 molecular descriptors Page: 24

Ranking of hits using CASI Score vs. NIST MS Match Factor CASI Score improves NIST MS Search ranking. Page: 25

CASI 2 Software Page: 26

3. Chemoinformatics Knowledge Base 3. Chemoinformatics Knowledge Base 1. Unique Compounds & Spectra Database Scientific Data Warehouse 2. Computer Assisted Structure Identification Page: 27

CIKB Scope Central platform to report on the presence and concentration of constituents in smoke aerosols and their known or measured toxicities to support the toxicological assessment. Information of interest to be combined with chemical analytical information of UCSD and CASI platforms: Product Lifecycle: Complete specification of developed and tested products. LIMS: Quantitative smoke characterization. In-vitro Assays: Standardized in vitro assays to assess toxicity. Toxicities: Known toxicological information from literature. Metabolites: Known metabolites from literature. Page: 28

Chemistry Based Reporting 1. Unique Compounds & Spectra Database 3. Chemoinformatics Knowledge Base 2. Computer Assisted Structure Identification Scientific Data Warehouse Product Lifecycle LIMS Assays Toxicities Metabolites Page: 29

Used tools Rapid Application Development tools were used to develop software to support the processing of analytical chemistry data (mass spectrometry) and thus to answer quickly to business needs. Web Application User friendly and no desktop installation required Methodology Iterative development Oracle Apex Used for database centric developments (e.g. chemical registration system). Grails Used to develop custom data processing application for analytical chemists. Accelrys Pipeline Pilot Used for complex manipulation of chemical structures and data. Users involved during all the process Accelrys Direct cardridge Used to manage chemical structures in compound database. ACD/Labs modules Used to manage spectra in spectral database. 21 CFR Part 11 requirements Oracle 11gR2 Database Management System R&D Computerized Systems Development best practices Page: 30

Conclusion UCSD developed to facilitate and automatize processes of the unambiguous registration of compounds and their analytical spectra CIKB in development to report on product chemical constituents and the associated toxicity and metabolic data Scientific Data Warehouse CASI platforms developed to increase the confidence in the identification of compounds using GC-MS methods and facilitates to users the comparison of analyses Page: 31

Acknowledgment Stephane Cano, Stephane Albrecht, Pavel Pospisil PMI R&D chemistry analytics teams PMI R&D HPC team PMI R&D SIS team Accelrys Symyx consultants ACD/Labs consultants Thank you for your attention. Page: 32