Introduction to mass spectrometry (MS) based proteomics and metabolomics


Introduction to mass spectrometry (MS) based proteomics and metabolomics
Tianwei Yu
Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University
September 10, 2015

Background
High-throughput profiling of biological samples
- Genome: genotype, copy number, epigenetics...
- Transcriptome: mRNA expression levels, alternative splicing, microRNA, lincRNA
- Proteome: protein concentration, modification, interaction
- Metabolome: metabolite concentration, dynamics, environmental chemicals
- Metabolites and the environment
[Figure: relationships among the omics layers. Red line: central dogma. Blue line: interaction/regulation.]
(Picture edited from http://www.ncbi.nlm.nih.gov/class/mlacourse/modules/molbioreview/)

Background

Background
Proteins:
- Made of 20 amino acids
- Sequence coded in DNA (codon)
- Folded into 3-dimensional structures
- Carry out biological functions like machines
Pictures from http://matznerd.com/amino-acids/, http://www.biogem.org/codon.jpg, http://compbio.cs.toronto.edu/ligalign/desc.html

Why Mass Spectrometry
Proteins/metabolites are harder to measure than DNA/RNA:
- proteins are structured
- metabolites are small
- neither can be identified/quantified using sequence-based methods
mRNA measurements by RNA-seq and microarrays are only proxy measurements of protein levels.

Why Mass Spectrometry
Proteins/metabolites can be separated according to their properties:
- mass/size
- hydrophilicity/hydrophobicity
- binding to specific ligands
- charge
using chromatography or electrophoresis. But these techniques cannot identify what is in the mixture.
http://en.wikibooks.org/

Why Mass Spectrometry
Problems with separation techniques:
- Reproducibility
- Identification / quantification
- Inability to separate tens of thousands of ion species
Mass spectrometry:
- Measurements based on mass/charge ratio (m/z)
- Highly accurate, highly reproducible measurements
- Theoretical values easy to obtain → identification
- Can study protein modifications (small ligands attached)

Principle of Mass Spectrometry (MS)
Proteins, peptides, or small molecules (solution phase) → ionization (gas phase) → mass spectrometer → mass/charge (m/z)
Kindly provided by Dr. Junmin Peng

Mass Spectrometry - getting ions from solution to gas phase
- Matrix-assisted laser desorption ionization (MALDI)
- Electrospray ionization (ESI)
Picture provided by Prof. Junmin Peng (Emory)

Mass Spectrometry - finding m/z
Time-of-flight: when a charged particle is placed in an electric field, its time of flight is
\[ t = k \sqrt{m/z} \]
where k is a constant related to instrument characteristics.
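The time-of-flight relation can be sketched numerically. This is only an illustration: the calibrant ion (m/z 1000 arriving after 50 microseconds) and the function names are hypothetical, and k is obtained from that single reference ion rather than from a real instrument calibration.

```python
import math

def calibrate_k(t_known, mz_known):
    """Solve t = k * sqrt(m/z) for k using one reference ion."""
    return t_known / math.sqrt(mz_known)

def mz_from_time(t, k):
    """Invert t = k * sqrt(m/z) to recover m/z from a flight time."""
    return (t / k) ** 2

# Hypothetical calibrant: m/z 1000 arrives after 50 microseconds.
k = calibrate_k(50.0, 1000.0)

# An ion arriving in half the time has one quarter of the m/z.
mz_unknown = mz_from_time(25.0, k)
print(mz_unknown)
```

Because t grows with the square root of m/z, halving the flight time corresponds to a four-fold smaller m/z.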

Mass Spectrometry - finding m/z
Quadrupole: radio-frequency voltage is applied to opposing pairs of poles. Only ions with a specific m/z can pass to the detector at each frequency.

Mass Spectrometry - finding m/z
Fourier transform MS. Ions are detected not by hitting a detector, but by passing by a detecting plate. Ions are detected simultaneously. Very high resolution. m/z is determined from the frequency of the ion in the cyclotron:
\[ f \propto \frac{z}{m} \]
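As a rough illustration of the frequency relation, the standard cyclotron formula f = zeB / (2πm) can be evaluated directly. The 7 tesla field strength and the function names below are assumptions for the example, not values from the slides.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg

def cyclotron_frequency(mz, b_tesla):
    """Cyclotron frequency in Hz for an ion with the given m/z (Da per charge)."""
    return E_CHARGE * b_tesla / (2 * math.pi * mz * DALTON)

def mz_from_frequency(f_hz, b_tesla):
    """Invert the relation: recover m/z from a measured frequency."""
    return E_CHARGE * b_tesla / (2 * math.pi * f_hz * DALTON)

# Hypothetical 7 T magnet: doubling m/z halves the frequency.
f_1000 = cyclotron_frequency(1000.0, 7.0)
f_500 = cyclotron_frequency(500.0, 7.0)
print(f_1000, f_500)
```

Since frequency can be measured very precisely, the inverse relation is what gives this type of analyzer its high resolution.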

Mass Spectrometry - finding m/z
High resolution: Orbitrap Fusion

Metabolomics - Background
Why profile metabolites?
- Enzymatic diseases directly influence metabolite concentrations
- Detect disease-causing chemicals (e.g. pesticides in Parkinson's) and their interactions with metabolic networks
- Some medicines influence the metabolic pattern, or even directly interact with some metabolites
- Metabolic markers may exist for non-enzymatic diseases
- Elucidate the regulations in the enzyme/metabolite network
(Picture from KEGG PATHWAY)

Metabolomics - Background
Thousands of metabolites exist in the biological system. Combined with other small molecules, high-throughput analysis of the system is not an easy task.

Database                    # entries
Human Metabolome Database   41,993 metabolites
DrugBank                    7,759 drugs (1,602 FDA-approved small molecules)
FooDB                       (projected) 25,000 food components and food additives
Metlin                      240,964 entries

Methods used to profile the metabolome:
- Nuclear magnetic resonance (NMR)
- Gas Chromatography - Mass Spectrometry (GC/MS)
- Liquid Chromatography - Mass Spectrometry (LC/MS)
- LC/MS/MS

Metabolomics - LC/MS
Liquid chromatography separates compounds along the retention time axis. Take slices in retention time and send each slice to MS, which measures mass-to-charge ratio (m/z).
[Figure: LC/MS data. Axes: retention time, mass-to-charge ratio (m/z).]

Metabolomics - LC/MS
An ideal peak should show a Gaussian curve in intensity along the retention time axis, while staying constant along the m/z axis.
Issues in LC/MS data processing:
- Background noise
- Chemical noise (ridges in spectra, ion suppression, ...)
- Peak intensity (along the retention time axis) deviates substantially from a Gaussian curve (fronting, breaks, ...)
- Slight m/z shift across MS spectra (not severe with newer machines)
- Retention time shift across LC/MS spectra
- Multiple charge states of a single metabolite
- Isotopes
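Two of the issues above, isotopes and multiple charge states, are linked: adjacent isotope peaks of an ion with charge z are spaced by roughly 1.00336/z on the m/z axis (the 13C minus 12C mass difference). A minimal sketch of using that spacing to guess the charge follows. The peak values are made up, the function names are hypothetical, and the neutral-mass step assumes protonated ions in positive mode.

```python
C13_C12_DIFF = 1.003355   # mass difference between 13C and 12C, Da
PROTON = 1.007276         # proton mass, Da

def infer_charge(mz_peaks, max_charge=6):
    """Guess the charge from the mean spacing of an isotope cluster."""
    peaks = sorted(mz_peaks)
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    z = round(C13_C12_DIFF / mean_gap)
    return min(max(z, 1), max_charge)

def neutral_mass(mz, z):
    """Neutral monoisotopic mass, assuming the ion is [M + zH]^z+."""
    return (mz - PROTON) * z

# Made-up isotope cluster with ~0.5 spacing, which suggests z = 2.
cluster = [500.0, 500.5017, 501.0034]
z = infer_charge(cluster)
print(z, neutral_mass(cluster[0], z))
```

Grouping such peaks into one feature keeps a single metabolite from being counted several times downstream.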

Metabolomics LC/FTMS data

LC/MS data processing

LC/MS data processing
How do we know which feature is which chemical?
We have three measurements on each feature: m/z value, retention time, and peak intensity.
In high-resolution data, the m/z value can almost pinpoint the chemical composition. However, some chemicals share the same chemical composition. A targeted MS/MS experiment may be necessary to differentiate between them.
http://www.cliffsnotes.com/sciences/chemistry/chemistry/organic-compounds/structural-formulas
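The matching step can be sketched as a lookup within a parts-per-million (ppm) tolerance. The tiny candidate table, the 5 ppm default, and the assumption of a protonated [M+H]+ adduct are all illustrative choices; a real search would query a database such as those listed earlier.

```python
PROTON = 1.007276  # proton mass, Da

# Neutral monoisotopic masses for a few well-known compounds.
CANDIDATES = {
    "glucose (C6H12O6)": 180.06339,
    "leucine (C6H13NO2)": 131.09463,
    "isoleucine (C6H13NO2)": 131.09463,
    "phenylalanine (C9H11NO2)": 165.07898,
}

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def match_feature(mz, tol_ppm=5.0, adduct_mass=PROTON):
    """Return candidate names whose [M+H]+ m/z lies within tol_ppm."""
    hits = []
    for name, mass in CANDIDATES.items():
        if abs(ppm_error(mz, mass + adduct_mass)) <= tol_ppm:
            hits.append(name)
    return sorted(hits)

# Leucine and isoleucine share a composition, so both match.
print(match_feature(132.1019))
print(match_feature(181.0707))
```

The first query returns two isomers, which is exactly the case where m/z alone is not enough and MS/MS or retention time is needed.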

Proteomics - How to identify a single protein by MS?
- Intact protein → mass/charge (m/z) → mass
- Digest into many peptides → mass of many peptides: peptide mass fingerprinting (PMF)
- Mass of many peptide fragments: by tandem mass spectrometry (MS/MS)
Kindly provided by Dr. Junmin Peng
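The theoretical side of peptide mass fingerprinting can be sketched as an in-silico digest followed by a mass calculation. The slide does not name the protease; trypsin (cleaving after K or R unless the next residue is P) is assumed here, and the example sequence is made up.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

# Made-up protein sequence for illustration.
peptides = tryptic_digest("MKLIFAGKQLEDGRPTK")
fingerprint = [(p, round(peptide_mass(p), 4)) for p in peptides]
print(fingerprint)
```

Comparing such a list of theoretical masses against the observed peptide masses is the idea behind PMF.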

Proteomics - How to identify a single protein by MS/MS?
Protein → digestion → peptides → ionization → MS spectrum (m/z) → fragmentation → MS/MS spectrum (m/z) → database searching against theoretical spectra → peptide/protein identification

Example peptide: LIFAGKQLEDGR

     b ions         y ions
 1:  L              IFAGKQLEDGR  :11
 2:  LI             FAGKQLEDGR   :10
 3:  LIF            AGKQLEDGR    :9
 4:  LIFA           GKQLEDGR     :8
 5:  LIFAG          KQLEDGR      :7
 6:  LIFAGK         QLEDGR       :6
 7:  LIFAGKQ        LEDGR        :5
 8:  LIFAGKQL       EDGR         :4
 9:  LIFAGKQLE      DGR          :3
10:  LIFAGKQLED     GR           :2
11:  LIFAGKQLEDG    R            :1

[Figure: annotated MS/MS spectrum of the peptide with b- and y-ion series; m/z axis from 200 to 1200.]
Kindly provided by Dr. Junmin Peng
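The theoretical spectrum for the example peptide can be sketched by summing residue masses from each end. This assumes singly charged fragments and unmodified residues; only the residues that occur in LIFAGKQLEDGR are tabulated, and the function name is hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues in the example peptide.
RESIDUE = {
    "L": 113.08406, "I": 113.08406, "F": 147.06841, "A": 71.03711,
    "G": 57.02146, "K": 128.09496, "Q": 128.05858, "E": 129.04259,
    "D": 115.02694, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276

def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values for every cleavage site."""
    masses = [RESIDUE[aa] for aa in peptide]
    n = len(masses)
    # b ions: N-terminal prefix plus a proton.
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y ions: C-terminal suffix plus water plus a proton.
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y

b_ions, y_ions = fragment_ions("LIFAGKQLEDGR")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i}: {b:9.4f}   y{i}: {y:9.4f}")
```

Database searching then amounts to scoring how well such a theoretical list explains the peaks of the observed MS/MS spectrum.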

Proteomics - Analysis of protein mixture by tandem MS
Protein mixture → digestion → peptides → HPLC (0-30 min) → MS → MS/MS (m/z 400-1600) → database searching → all peptide sequences (e.g. LLTTIADAAK, SAGGNYVVFGEAK, EDDVEEAVQAADR) → identification of many proteins
1 sequencing attempt per 0.5 sec: 3600 sequencing attempts in 30 min.
Kindly provided by Dr. Junmin Peng

Proteomics - Analysis of complex protein mixtures
Difficulty: In a complex biological sample (cell, tissue, serum, ...), there are several thousand protein species → tens of thousands of peptides after digestion; signal from less-abundant species may be suppressed.
Solution: Must reduce complexity to identify and quantify proteins. Incorporate biochemical separation techniques:
- LC-MS/MS
- LC/LC-MS/MS
- 2D gel-MS/MS
- 2D gel/LC-MS/MS
- Affinity column separation → LC-MS/MS
Trade-offs: separating proteins in multiple dimensions sacrifices speed; analyzing a subset of proteins sacrifices coverage.

LC/MS - some example methods
Reviewed by Katajamaa & Oresic (2007) J Chr. A 1158:318
Noise reduction disregarding the time axis:

LC/MS - some example methods
Reviewed by Katajamaa & Oresic (2007) J Chr. A 1158:318
Peak detection:

LC/MS - some example methods
Smith et al. (2006) Analytical Chemistry. 78(3):779-87
Matched Gaussian filter along the time axis after EIC (extracted ion chromatogram) extraction. The filter coefficients are equal to a second-derivative Gaussian function. The filtered chromatogram crosses the x-axis roughly at the peak inflection points.
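A minimal sketch of this kind of filtering on a synthetic chromatogram is shown below. It is not the published implementation: the kernel is the negative second derivative of a Gaussian (so that peaks give a positive response), the filter width, the synthetic peak, and all names are assumptions for illustration.

```python
import numpy as np

def second_derivative_gaussian(sigma, half_width):
    """Negative second-derivative Gaussian kernel, shifted to sum to zero."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    w = (1.0 - x**2 / sigma**2) * np.exp(-x**2 / (2.0 * sigma**2))
    return w - w.mean()  # zero-sum: a flat baseline filters to zero

def matched_filter(eic, sigma=3.0):
    """Convolve an extracted ion chromatogram with the kernel."""
    kernel = second_derivative_gaussian(sigma, int(4 * sigma))
    return np.convolve(eic, kernel, mode="same")

# Synthetic EIC: one Gaussian peak at scan 80 on a constant baseline.
t = np.arange(200)
eic = 100.0 * np.exp(-(t - 80) ** 2 / (2.0 * 4.0**2)) + 5.0
filtered = matched_filter(eic)

apex = int(np.argmax(filtered))
# Walk outward from the apex to the first non-positive values.
left = apex
while left > 0 and filtered[left] > 0:
    left -= 1
right = apex
while right < len(filtered) - 1 and filtered[right] > 0:
    right += 1
print(apex, left, right)
```

The zero crossings on either side of the apex give approximate peak boundaries, and the zero-sum kernel removes the constant baseline away from the edges of the trace.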

LC/MS - some example methods
Reviewed by Katajamaa & Oresic (2007) J Chr. A 1158:318
Retention time alignment:

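LC/MS - some example methods
A model for asymmetric peaks
Bi-Gaussian model:
\[
g(t) =
\begin{cases}
\dfrac{\delta}{\sqrt{2\pi}}\, e^{-\frac{(t-\alpha)^2}{2\sigma_1^2}}, & t < \alpha \\[2ex]
\dfrac{\delta}{\sqrt{2\pi}}\, e^{-\frac{(t-\alpha)^2}{2\sigma_2^2}}, & t \ge \alpha
\end{cases}
\]
Quantities used in peak location estimation:
\[
A(\tau) = \log\left(\int_{-\infty}^{\tau} g(t)\,dt\right) - \frac{1}{3}\log\left(\int_{-\infty}^{\tau} g(t)\,(t-\tau)^2\,dt\right)
\]
\[
B(\tau) = \log\left(\int_{\tau}^{\infty} g(t)\,dt\right) - \frac{1}{3}\log\left(\int_{\tau}^{\infty} g(t)\,(t-\tau)^2\,dt\right)
\]
At the summit,
\[
A(\alpha) = \log\left(\frac{\delta\sigma_1}{2}\right) - \frac{1}{3}\log\left(\frac{\delta\sigma_1^3}{2}\right)
= \log\left(\frac{\delta\sigma_2}{2}\right) - \frac{1}{3}\log\left(\frac{\delta\sigma_2^3}{2}\right) = B(\alpha)
\]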
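The summit condition A(α) = B(α) can be checked numerically. The sketch below evaluates A(τ) and B(τ) with simple Riemann sums on a simulated, noise-free bi-Gaussian peak and picks the τ where they agree best. The parameter values, the grid, and the function names are assumptions for illustration, not the authors' estimation code.

```python
import numpy as np

def bi_gaussian(t, alpha, sigma1, sigma2, delta):
    """Bi-Gaussian curve: sigma1 left of the summit, sigma2 right of it."""
    sigma = np.where(t < alpha, sigma1, sigma2)
    return delta / np.sqrt(2 * np.pi) * np.exp(-(t - alpha) ** 2 / (2 * sigma**2))

def a_and_b(t, g, tau):
    """Riemann-sum approximations of A(tau) and B(tau)."""
    dt = t[1] - t[0]

    def stat(mask):
        mass = np.sum(g[mask]) * dt
        second = np.sum(g[mask] * (t[mask] - tau) ** 2) * dt
        return np.log(mass) - np.log(second) / 3.0

    return stat(t < tau), stat(t >= tau)

# Hypothetical peak: summit at 10, narrower on the left than on the right.
t = np.linspace(-20.0, 40.0, 6001)
g = bi_gaussian(t, alpha=10.0, sigma1=2.0, sigma2=4.0, delta=100.0)

# Scan candidate summit locations and keep the one where A and B agree.
candidates = np.arange(5.0, 15.0, 0.05)
gaps = [abs(a - b) for a, b in (a_and_b(t, g, tau) for tau in candidates)]
tau_hat = float(candidates[int(np.argmin(gaps))])

a_at_summit, b_at_summit = a_and_b(t, g, 10.0)
print(tau_hat, a_at_summit, b_at_summit)
```

Both closed-form expressions at the summit simplify to (2/3) log(δ/2), which depends on neither σ1 nor σ2; that is why equality of the two sides locates the summit even for an asymmetric peak.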

LC/MS - some example methods
Fitting a mixture of bi-Gaussian curves using EM-like algorithms

LC/MS - some example methods

Beyond pre-processing
- Find features (genes/proteins/metabolites) significantly associated with a phenotype: biomarkers
- Find biological pathways or subnetworks associated with a phenotype
- Based on identified biomarkers, build predictive models for diagnosis, prognosis, or predicting treatment response → personalized medicine?
- Identify feature-feature associations and previously unknown structures in the data
- Find regulation patterns, both intrinsic and in relation to a phenotype

Beyond pre-processing
This is the common structure of microarray gene expression data from a simple cross-sectional case-control design. Data from other high-throughput technologies are often similar.

             Control 1  Control 2  ...  Control 25  Disease 1  Disease 2  ...  Disease 40
Gene 1       9.25       9.77       ...  9.4         8.58       5.62       ...  6.88
Gene 2       6.99       5.85       ...  5           5.14       5.43       ...  5.01
Gene 3       4.55       5.3        ...  4.73        3.66       4.27       ...  4.11
Gene 4       7.04       7.16       ...  6.47        6.79       6.87       ...  6.45
Gene 5       2.84       3.21       ...  3.2         3.06       3.26       ...  3.15
Gene 6       6.08       6.26       ...  7.19        6.12       5.93       ...  6.44
Gene 7       4          4.41       ...  4.22        4.42       4.09       ...  4.26
Gene 8       4.01       4.15       ...  3.45        3.77       3.55       ...  3.82
Gene 9       6.37       7.2        ...  8.14        5.13       7.06       ...  7.27
Gene 10      2.91       3.04       ...  3.03        2.83       3.86       ...  2.89
Gene 11      3.71       3.79       ...  3.39        5.15       6.23       ...  4.44
...
Gene 50000   3.65       3.73       ...  3.8         3.87       3.76       ...  3.62

Workflow of feature selection
Raw data
→ Preprocessing, normalization, filtering
→ Feature-level expression in each sample
→ Statistical testing/model fitting
→ Test statistic for every feature
→ Compare to null distribution
→ P-value for every feature
→ Significance of every feature (FWER, FDR, ...)
→ Selected features
Selected features + other information (fold change, biological pathway, ...) → Biological interpretation, ...
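The step from per-feature p-values to significance can be sketched with two standard corrections: Bonferroni for FWER and Benjamini-Hochberg for FDR. The slides do not say which procedures were used, so these are illustrative choices, and the p-values below are made up.

```python
import numpy as np

def bonferroni(pvalues):
    """FWER control: multiply each p-value by the number of tests."""
    p = np.asarray(pvalues, dtype=float)
    return np.minimum(p * p.size, 1.0)

def benjamini_hochberg(pvalues):
    """FDR control: Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out

# Made-up p-values for five features.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(pvals)
selected = [i for i, q in enumerate(fdr) if q <= 0.05]
print(fdr, selected)
```

With tens of thousands of features, as in the table above, Bonferroni is usually very conservative, which is one reason FDR-based selection is common in omics studies.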

Genome-wide metabolic network

Protein interaction network
S. Wuchty, E. Ravasz and A.-L. Barabasi: The Architecture of Biological Networks

Knowledge-guided omics data analysis
Example: analyzing metabolomics data with a genome-scale metabolic network.

Knowledge-guided omics data analysis
An example: selecting cancer-related sub-networks using Network-Guided Random Forests.
Dutkowski J, Ideker T (2011) PLoS Comput Biol 7(9): e1002180.

Workflow of Gene Set Analysis
Raw data
→ Preprocessing, normalization, filtering
→ Feature-level expression in each sample
+ Feature group information (biological pathway, ...)
→ Statistical testing/model fitting
→ Group-level significance
→ Selected features from significant groups
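One simple way to obtain group-level significance is an over-representation test based on the hypergeometric distribution. This is a swapped-in, simpler technique chosen for illustration; it is not necessarily the method used in the examples cited in these slides, and the counts below are made up.

```python
from math import comb

def enrichment_pvalue(n_total, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected features are drawn at random
    from n_total features, of which n_in_set belong to the gene set."""
    upper = min(n_in_set, n_selected)
    total = comb(n_total, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_total - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 features, a set of 3, 4 selected, 2 of them in the set.
p_small = enrichment_pvalue(10, 3, 4, 2)

# Larger made-up case: 20000 genes, pathway of 100, 500 selected, 10 overlap.
p_large = enrichment_pvalue(20000, 100, 500, 10)
print(p_small, p_large)
```

In the larger case only about 2.5 overlapping genes are expected by chance, so an overlap of 10 yields a small p-value; group-level p-values would then need the same multiple-testing treatment as feature-level ones.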

Knowledge-guided omics data analysis
An example: selecting gene sets (pre-defined gene groups) associated with phenotype.
PNAS, vol. 102, no. 43, 15545-15550

Knowledge-guided omics data analysis
How to jointly model multi-layer omics data: an open question.

Analyzing data together with biological network An example application: pathways associated with vaccination response.