Proteomic data analysis for Orbitrap datasets using Resources available at MSI. September 28 th 2011 Pratik Jagtap



Similar documents
泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

ProteinPilot Report for ProteinPilot Software

Proteomics software available in the public domain. Pratik Jagtap Minnesota Supercomputing institute

Aiping Lu. Key Laboratory of System Biology Chinese Academic Society

ProteinScape. Innovation with Integrity. Proteomics Data Analysis & Management. Mass Spectrometry

In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates

MaxQuant User s Guide Version

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF

Global and Discovery Proteomics Lecture Agenda

Advantages of the LTQ Orbitrap for Protein Identification in Complex Digests

Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests

Session 1. Course Presentation: Mass spectrometry-based proteomics for molecular and cellular biologists

The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring

Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification

Metabolomics Software Tools. Xiuxia Du, Paul Benton, Stephen Barnes

Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method

Mass Spectrometry Based Proteomics

Introduction to Proteomics 1.0

A Quadrupole-Orbitrap Hybrid Mass Spectrometer Offers Highest Benchtop Performance for In-Depth Analysis of Complex Proteomes

Database Searching Tutorial/Exercises Jimmy Eng

For the next half hour I m going to be describing some of the different options for peak peaking. The profit is with getting better protein ID or

Challenges in Computational Analysis of Mass Spectrometry Data for Proteomics

Already said. Already said. Outlook. Look at LC-MS data. A look at data for quantitative analysis using MSight and Phenyx. What data for quantitation?

Pep-Miner: A Novel Technology for Mass Spectrometry-Based Proteomics

Research-grade Targeted Proteomics Assay Development: PRMs for PTM Studies with Skyline or, How I learned to ditch the triple quad and love the QE

ProSightPC 3.0 Quick Start Guide

Application Note # LCMS-62 Walk-Up Ion Trap Mass Spectrometer System in a Multi-User Environment Using Compass OpenAccess Software

DRUG METABOLISM. Drug discovery & development solutions FOR DRUG METABOLISM

Protein Prospector and Ways of Calculating Expectation Values

Quantitative proteomics background

Interpretation of MS-Based Proteomics Data

Software for protein identification and quantitative analysis of mass spectrometry data used for protein characterization and proteomics Version 5.

A Streamlined Workflow for Untargeted Metabolomics

Agilent G2721AA/G2733AA Spectrum Mill MS Proteomics Workbench

La Protéomique : Etat de l art et perspectives

HRMS in Clinical Research: from Targeted Quantification to Metabolomics

itraq Tips and Tricks

Tutorial 9: SWATH data analysis in Skyline

Mascot Search Results FAQ

PeptidomicsDB: a new platform for sharing MS/MS data.

Accurate Mass Screening Workflows for the Analysis of Novel Psychoactive Substances

Proteomic Analysis using Accurate Mass Tags. Gordon Anderson PNNL January 4-5, 2005

using ms based proteomics

Overview. Introduction. AB SCIEX MPX -2 High Throughput TripleTOF 4600 LC/MS/MS System

Overview. Triple quadrupole (MS/MS) systems provide in comparison to single quadrupole (MS) systems: Introduction

Building innovative drug discovery alliances. Evotec Munich. Quantitative Proteomics to Support the Discovery & Development of Targeted Drugs

Cliquid ChemoView 3.0 Software Simple automated analysis, from sample to report

# LCMS-35 esquire series. Application of LC/APCI Ion Trap Tandem Mass Spectrometry for the Multiresidue Analysis of Pesticides in Water

Unique Software Tools to Enable Quick Screening and Identification of Residues and Contaminants in Food Samples using Accurate Mass LC-MS/MS

Integrated Data Mining Strategy for Effective Metabolomic Data Analysis

Investigating Biological Variation of Liver Enzymes in Human Hepatocytes

Improving the Metabolite Identification Process with Efficiency and Speed: LightSight Software for Metabolite Identification

NUVISAN Pharma Services

Thermo Scientific PepFinder Software A New Paradigm for Peptide Mapping

Appendix 5 Overview of requirements in English

Mass Frontier Version 7.0

Error Tolerant Searching of Uninterpreted MS/MS Data

Shotgun Proteomic Analysis. Department of Cell Biology The Scripps Research Institute

A Tool To Visualize and Evaluate Data Obtained by Liquid Chromatography-Electrospray Ionization-Mass Spectrometry

MASCOT Search Results Interpretation

Tutorial for proteome data analysis using the Perseus software platform

Choices, choices, choices... Which sequence database? Which modifications? What mass tolerance?

MarkerView Software for Metabolomic and Biomarker Profiling Analysis

SimGlycan Software*: A New Predictive Carbohydrate Analysis Tool for MS/MS Data

Thermo Scientific GC-MS Data Acquisition Instructions for Cerno Bioscience MassWorks Software

Waters Core Chromatography Training (2 Days)

Introduction to Database Searching using MASCOT

LC-MS/MS for Chromatographers

Master course KEMM03 Principles of Mass Spectrometric Protein Characterization. Exam

INTEGRATED TECHNOLOGY TO SEE THE WHOLE PICTURE. AxION iqt GC/MS/MS

Statistical Analysis Strategies for Shotgun Proteomics Data

PEAKS Studio User Manual (v5.3) PEAKS Team

Learning Objectives:

Chapter 14. Modeling Experimental Design for Proteomics. Jan Eriksson and David Fenyö. Abstract. 1. Introduction

CPAS Overview. Josh Eckels LabKey Software

A High Throughput Automated Sample Preparation and Analysis Workflow for Comprehensive Forensic Toxicology Screening using LC/MS/MS

WATERS QUANTITATIVE ANALYSIS solutions

Proteomics in Practice

HR/AM Targeted Peptide Quantitation on a Q Exactive MS: A Unique Combination of High Selectivity, Sensitivity and Throughput

Sub menu of functions to give the user overall information about the data in the file

Mass Frontier 7.0 Quick Start Guide

Using Natural Products Application Solution with UNIFI for the Identification of Chemical Ingredients of Green Tea Extract

Development of computational methods for analysing proteomic data for genome annotation

Challenges and Opportunities in Proteomics Data Analysis*

MassMatrix Web Server User Manual

Bruker ToxScreener TM. Innovation with Integrity. A Comprehensive Screening Solution for Forensic Toxicology UHR-TOF MS

EKSIGENT EKSPERT NANOLC 400. Unmatched flexibility for low flow LC/MS

Q-TOF User s Booklet. Enter your username and password to login. After you login, the data acquisition program will automatically start.

Alignment and Preprocessing for Data Analysis

Practical Analysis of Proteome Data Using Bioinformatics and Statistics

Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data

Introduction to mass spectrometry (MS) based proteomics and metabolomics

MassHunter for Agilent GC/MS & GC/MS/MS

Transcription:

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. September 28 th 2011 Pratik Jagtap The Minnesota http://www.mass.msi.umn.edu/

Proteomics workflow Trypsin Protein Peptides RP HPLC MS/MS Ion Chromatogram Mass Spectrum And Tandem Mass Spectrum Search Against Database Protein Identification

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Orbitrap and precision Proteomics. Data formats. data processing and effects Proteomics workflow - search algorithms - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

Precision proteomics Precision proteomics: The case for high resolution and high mass accuracy ; PNAS (2008) Mann and Kelleher. Need for newer approaches to assign peptides by taking advantage of increase in mass resolution and accuracy of new mass spectrometers.

Accuracy vs Precision

Accuracy and Precision

Importance of high mass accuracy measurements Low Mass Accuracy: 1500 ppm (LTQ) 631 ± 1 630.0 630.1 630.2 630.3... Upon searching the Mus database: 197 proteins have a peptide in the range of 630-632 High Mass Accuracy: 10 ppm (Orbitrap) 631.345 ± 0.006 631.339 630.341 630.343 3 proteins have a peptide in the range of 631.339-631.351

Orbital Trap Mass Analyzer Axial Frequency is Independent of Initial Energy, Angle, And Position of Ions Hu et al. (2005) J Mass Spectrom 40: 430-43 Frequency is Similar to a Simple Harmonic Oscillator ω (m/z)1/2

University of Minnesota Center for Proteomics and Mass Spectrometry currently has: 1 LTQ-OrbitrapXL (ThermoScientific) It will be upgraded to a Velos-Orbitrap (ThermoScientific) within a few months

Orbitrap LTQ-Orbitrap has proven to be a tremendous advance for shotgun proteomics, combining high resolving power, mass accuracy, and reliability in a relatively compact form. The high-resolution part of the instrument consists of an Orbitrap analyzer, which is a new type of mass spectrometer developed by Alexander Makarov. The LTQ offers high sequencing speed, high sensitivity, and very robust performance but low resolving power. The use of LTQ-Orbitrap has lead to higher quality datasets and reduced false positive peptide identifications.

Orbitrap Peptide fragmentation is mostly performed in low-resolution but very sensitive and fast linear ion traps. The Orbitrap is capable of very high mass accuracy because of the axial motion and oscillation of ions along the central spindle. From http://cbsu.tc.cornell.edu

Orbitrap velos Olsen J V et al. Mol Cell Proteomics 2009;8:2759-2769 Schematic of the LTQ Orbitrap Velos MS instrument. A, the stacked ring ion guide (S-Lens) increases the ion flux from the electrospray ion source into the instrument by a factor 5 10. B, the dual linear ion trap design enables efficient trapping and activation in the high-pressure cell (left) and fast scanning and detection in the low pressure cell (right). C, the combo C-trap and HCD collision cell with an applied axial field with improved fragment ion extraction and trapping capabilities. 2009 by American Society for Biochemistry and Molecular Biology 2009 Regents of the University of Minnesota. All rights reserved.

Conclusions 1. High Mass Accuracy Measurements Facilitate Database Matching and Increase Peptide Identification Confidence 2. Orbitrap Mass Spectrometers Have Capability of Performing Robust and Accurate Quantitative Proteomics Experiments

Proteomics workflow Orbitrap Mass spectral data. (.raw) Processing Search Algorithm Sta9s9cal valida9on of Protein Iden9fica9on Visualiza9on Descrip9ve Sta9s9cs Pathway Analysis

Mass Spectrometers & data formats Thermofinnigan AB Sciex Waters Bruker Xcalibur /.raw Analyst /.wiff ;.t2d Masslynx /.raw.baf X! tandem OMSSA Mascot Sequest ProteinPilot

Orbitrap : Data formats mzxml : XML (extensible Markup Language) based common file format for proteomics mass spectrometric data. mzml : Joinf format developed from mzdata and mzxml formats. Sequest (DTA) Files : Convert.RAW to.dta file using extract_msn.exe utility. Name : Myoglobin_digest.0012.0015.2.dta File starts with : 1999 2 Mascot Generic Format (MGF) : ASCII format used as an input for Mascot searches (also OMSSA, X!tandem, Phenyx, ProteinPilot, Myrimatch, TagIdent, etc.). The precursor peptide mass is an observed m/z value, from which relative molecular mass is calculated using the prevailing charge state. PEPMASS=1000 CHARGE=2+ Relative molecular mass M r is 1998.

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Data formats. data processing and effects - search algorithms - de novo analysis : Peaks - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

raw data and peaklist enovatia.com Raw Data Peaks with wider distribution (Area under curve) Additional information (Noise Peaks) Peak detection Noise removal Baseline correction Monoisotoping Charge state PeakList (Processed) Peaks (centroid) with intensity values. Removal of noise peaks.

Orbitrap :Processing and effects Average ppm and Standard deviation improves when MaxQuant processed files are used.

Orbitrap :Processing and effects Peak detection Noise removal Baseline correction Monoisotoping Charge state

raw data conversion tools XRawfile library from ThermoFinnigan Xcalibur software. mzxml.raw ReAdW MSConvert ProteoWizard mzxml Raw2MSM MSM http://tools.proteomecenter.org/wiki/index.php?title=software:readw http://proteowizard.sourceforge.net/ http://tinyurl.com/raw2msm

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Data formats. data processing and effects - search algorithms - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

Proteomics workflow Orbitrap Mass spectral data. (.raw) Processing Search Algorithm

search algorithm Nature Methods - 4, 787-797 (2007)

search algorithm Mass spectral Mass data. Spectrometry Search algorithm Sequest X!tandem OMSSA MaxQuant RDC : sdvlapp32, PC2 and PC3 in SDVL. ProteinPilot CGL 138. Mascot https://sequest7.msi.umn.edu/mascot

Protip / TINT John Chilton Mark Nelson https://tropix.msi.umn.edu/

LTQ-iQuant (itraq Quantication software) in Protip. Getiria Onsongo

Protip Raw Data from Orbitrap msconvert mzxml format Mgf format dta format 14 node SEQUEST cluster X!TANDEM search OMSSA search SEQUEST search MASCOT search Scaffold Analysis Scaffold Viewer Powered by

Protip https://tropix.msi.umn.edu/

MAXQUANT ORBITRAP DATASET Raw Data MAXQUANT QUANT module MSM Mascot.dat file Protein Summary with informa=on that can be used for Pathway Analysis, Func=onal Enrichment, Localiza=on predic=on etc. IDENTIFY module Protein Identification List. Peptide or MS/MS Summary

Improving Mass Accuracy

References / Weblinks MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Cox J, Mann M. Nat Biotechnol. 2008; 26(12):1367-1372. http://www.ncbi.nlm.nih.gov/pubmed/19029910 A practical guide to the MaxQuant computational platform for SILAC-based quantitative proteomics. Cox J, Matic I, Hilger M, Nagaraj N, Selbach M, Olsen JV, Mann M. Nat Protoc. 2009;4 (5):698-705. http://www.ncbi.nlm.nih.gov/pubmed/19373234 Computational principles of determining and improving mass precision and accuracy for proteome measurements in an Orbitrap. Cox J, Mann M. J Am Soc Mass Spectrom. 2009;20(8):1477-1485. http://www.ncbi.nlm.nih.gov/pubmed/19553133 Precision proteomics: the case for high resolution and high mass accuracy. Mann M, Kelleher NL. Proc Natl Acad Sci U S A. 2008 Nov 25;105(47):18132-8. http://www.ncbi.nlm.nih.gov/pubmed/18818311 http://www.maxquant.org http://groups.google.com/group/maxquant-list http://tinyurl.com/maxquant-intro http://tinyurl.com/maxquant-tutorial

MaxQuant : Current availability Currently MaxQuant v1.0.13.13 is available on PC1 and PC3 You can login into these machines using your MSI password. Please store your raw files in U: drive Transfer your.par files to the C: drive after Quant analysis.

Search algorithm PROTEINPILOT https://netfiles.umn.edu/users/pjagtap/proteomics_webinars_at_msi./proteinpilot_web_modules.htm

PROTEINPILOT MGF peaklist generated from an Orbitrap dataset can be used as an input for ProteinPilot searches. Performs normalization and quantification of datasets acquired in HCD mode. Multiple features such as modifications, semitryptic and non-enzymatic digestion are detected simultaneously. Robust FDR estimation. https://netfiles.umn.edu/users/pjagtap/proteomics_webinars_at_msi./proteinpilot_web_modules.htm

Introduction to ProteinPilot ( LeeAnn Higgins and Pratik Jagtap). https://netfiles.umn.edu/users/pjagtap/proteomics_webinars_at_msi./proteinpilot_web_modules.htm Module One : Introduction to ProteinPilot (LeeAnn Higgins) http://mediamill.cla.umn.edu/mediamill/embed/66243 Time : 23 minutes Module Two : ProteinPilot Outputs http://mediamill.cla.umn.edu/mediamill/embed/68879 Time : 27 minutes Module Three : ProteinPilot Peptide Summary (LeeAnn Higgins) http://mediamill.cla.umn.edu/mediamill/embed/68701 Time : 21 minutes Module Four : ProteinPilot Protein Summary and Quantitation. http://mediamill.cla.umn.edu/mediamill/embed/68840 Time : 15 minutes Module Five : False Discovery Rate Analysis and ProteinPilot FDR output. http://mediamill.cla.umn.edu/mediamill/embed/68865 Time : 25 minutes Module Six : Beyond ProteinPilot. (LeeAnn Higgins) http://mediamill.cla.umn.edu/mediamill/embed/68866 Time : 15 minutes

acquisition mode and data analysis. Data Acquisition Conversion Tool Optimal use Software Centroid Mode Any Identification Any Profile Mode Raw2MSM yields better results MS-based quantification MaxQuant CID Any Identification Any ProteinPilot and HCD Any itraq Quantification LTQ-iQuant PQD Any itraq Quantification LTQ-iQuant ETD Any Modifications and c and z ions OMSSA and COMPASS

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Orbitrap and precision Proteomics. Data formats. data processing and effects Proteomics workflow - search algorithms - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

Statistical Validation of Peptide and Protein Identification Orbitrap Mass spectral data. (.raw) Processing Search Algorithm Sta9s9cal valida9on of Protein Iden9fica9on

False Discovery Rate Analysis

# of Matches FDR Global Threshold 1 : 99 % correct; 1% incorrect. Threshold 2 : 95 % correct; 5% incorrect. Threshold 3 : 90 % correct; 10% incorrect. Local Threshold 1 : 95 % correct; 5% incorrect. Threshold 2 : 60 % correct; 40% incorrect. Threshold 3 : 20 % correct; 80% incorrect. Discriminant Score Slide Courtesy : Brian Searle (Proteome Software) Choi H, Ghosh D, Nesvizhskii A. J. Proteome Res. 2008 :7(1):286

Combining results from multiple search algorithms increases the confidence and number of peptide and protein identifications. HUMAN DATASET 8400 7200 8162 6554 6962 7443 6000 5522 5137 5486 4800 3600 2400 1200 401 370 411 491 441 441 462 0 Sequest X! tandem Mascot All Together Sequest + Mascot Sequest + X! tandem X! tandem + Mascot

Scaffold : Current availability Currently Scaffold (v 3.03) is available on app1.msi.umn.edu You can Remote login into app1.msi.umn.edu using your MSI password. Please store your files on U: drive Scaffold v 3.03 is also available on ProTIP.

ProteoIQ Sta9s9cal valida9on of pep9de and protein iden9fica9ons.

ProteoIQ: Current availability Currently ProteoIQ (v 2.1.03) is available on app1.msi.umn.edu You can Remote login into app1.msi.umn.edu using your MSI password. Please store your files on U: drive

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Orbitrap and precision Proteomics. Data formats. data processing and effects Proteomics workflow - search algorithms - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

Visualization, Descriptive Statistics and Pathway analysis Mass spectral Mass data. Spectrometry Search algorithm De novo Tools. Sta9s9cal valida9on of pep9de and protein iden9fica9ons. Visualiza9on Descrip9ve Sta9s9cs Pathway analysis

Spectral Visualization Scaffold Viewer MaxQuant Viewer ProteoIQ ProteinPilot SeeMS Spectra Viewer

The ProteinPilot Descriptive Statistics Template (PDST) Tool Description: A tool for users of ProteinPilot Software that provides a large amount of additional analyses not provided directly by the software. Features: Describe and assure the quality of identification and quantitation results. Optimize acquisition effectiveness with detailed metrics on acquisition redundancy, chromatography, mass accuracy, etc. Characterize digestion, quantitation labeling efficiency, minor artifact modifications, etc. Detect when important features are missing from search space. Provide basic estimation of quantitation false discovery rates for workflows using itraq reagents. Slide content provided by Sean L. Seymour, AB Sciex

PDST : Descriptive Statistics Number of proteins searched. FDR single summary Peptide Summary Protein Summary Number of spectra ProteinPilot Outputs PDST Excel Template Quality Control of Identification and Quantitation. Detailed metrics on acquisition redundancy, chromatography, mass accuracy, etc. Characterize digestion, labeling efficiency, minor artifact modifications, etc. Provide basic estimation of quantitation false discovery rates for workflows using itraq reagents.

PDST : Descriptive Statistics Slide content provided by Sean L. Seymour, AB Sciex

PDST : Descriptive Statistics Slide content provided by Sean L. Seymour, AB Sciex

Pathway Analysis Software : Ingenuity Pathway Analysis

Pathway Analysis Database for Annotation, Visualization and Integrated Discovery (DAVID ) http://david.abcc.ncifcrf.gov/ http://www.nature.com/nprot/journal/v4/n1/abs/nprot.2008.211.html

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. Orbitrap and precision Proteomics. Data formats. data processing and effects Proteomics workflow - search algorithms - Statistical validation of protein identification - Visualization - Descriptive Statistics : PDST (ProteinPilot outputs) - Pathway analysis Applications of workflows : Horses for Courses.

Choosing the workflow outputs to maximise identifications : Horses for courses

Choosing the workflow outputs to maximise identifications : Horses for courses

Orbitrap :Processing and effects

Protein identification : Orbitrap datasets

Conclusions Multiple search algorithms and statistical validation tools are available for analysis. (including Quantitative Analysis Tools) Various visualization, descriptive statistics and pathway tools are available. Outputs from various workflows can be creatively merged to harness the best features of each workflow.

Sequest X! tandem http://sitemaker.umich.edu/iwsmoi http://www.mass.msi.umn.edu/ Questions? pratik@msi.umn.edu