Global and Discovery Proteomics Lecture Agenda




Global and Discovery Proteomics
Christine A. Jelinek, Ph.D.
Johns Hopkins University School of Medicine
Department of Pharmacology and Molecular Sciences
Middle Atlantic Mass Spectrometry Laboratory

Lecture Agenda
Genomics vs. Proteomics
Discovery Proteomics: Basic Mass Spectrometry Techniques
Discovery Proteomics: Basic Bioinformatic Techniques
Database Searching: Mascot, Sequest
Combining Algorithms
Cloud Computing

Human Genome Project
Sequencing the human genome has transformed current biomedical research.
Human Genome Project (http://www.ncbi.nlm.nih.gov/genome): initiated October 1990; working draft 2000; complete 2003.
Ensembl current totals: coding genes 20,476; non-coding genes 22,170; pseudogenes 13,322.

Inferences about biological systems
Completion of genome sequencing inspired a corresponding approach to identify and characterize the proteins comprising the human proteome.
Protein-Coding Genes: Gregory, TR. Nature Reviews Genetics. 6, 699-708. doi:10.1038/nrg1674

Proteomics
A proteome consists of all proteins present in a sample (cell, tissue, body fluid, etc.) at a defined point in time and under defined conditions.
Proteomics is the large-scale study of the expression, localization, function, and interaction of proteins expressed by an organism's genome.

Proteomics: Using Mass Spectrometry for Proteomics Experiments
Surinova S. et al. J. Prot. Res. 2011 10:5-16
Over the last two decades, mass spectrometry-based technologies have undergone rapid advances and a high degree of innovation to fulfill the expectations of the proteomics and life science communities.

Proteomics: Using Mass Spectrometry for Proteomics Experiments
Nagaraj N et al. Mol. Syst. Biol. 2011 7:548
Schwanhäusser B et al. Nature 2011 473:337-342
Beck M et al. Mol. Syst. Biol. 2011 7:549
Leading mass spectrometry-based proteomics laboratories have demonstrated that protein products of up to ~10,000 of the ~20,000 protein-coding human genes can be identified and quantified in a single experimental system.

The -omics Iceberg
http://www.proteome.ru/en/avogadro

Human Plasma Proteome: A case of the -omics Iceberg
Anderson N L, Anderson N G. Mol Cell Proteomics 2002;1:845-867

Challenges in Proteomics
Aebersold R. Nature Methods. 6: 411-412. doi:10.1038/nmeth.f.255

Shotgun Proteomics

Bottom-Up Proteomics: General procedure using LC-MS/MS for proteomic profiling
Aebersold R and Mann M. Nature 2003 422:198-207
Workflow diagram labels: Sample Preparation; Proteins; Peptides; HPLC-MS/MS Analysis; Statistical Analysis; Bioinformatics; MS Data; MS/MS Data

Bottom-Up Proteomics: Common Sample Preparative Steps

Commonly Used Enzymes: Bottom-Up Mass Spectrometry

Bottom-Up Proteomics: Data-Dependent Tandem Mass Spectrometry
Figure labels: Tandem MS/MS Scans; Fourier transformed MS Scan; Fragmentation; panels A, B, C

Bottom-Up Proteomics: Peptide Fragmentation
Biomed. Mass Spectrom. 11 (11): 601. doi:10.1002/bms.1200111109
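As a hedged illustration of the fragmentation slide: under low-energy CID, peptides commonly break along the backbone amide bonds, giving N-terminal b ions and C-terminal y ions. The sketch below computes singly charged b and y ion m/z values from standard monoisotopic residue masses. The function name `by_ions` and the example peptide (ILGFR, one of the tryptic peptides listed in this lecture) are illustrative choices, not something the slides define.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O, added to the C-terminal (y) fragments
PROTON = 1.00728   # charge carrier for singly protonated ions


def by_ions(peptide):
    """Return (b, y): lists of singly charged b1..b(n-1) and y1..y(n-1) m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []

    running = 0.0
    for m in masses[:-1]:            # prefixes from the N-terminus
        running += m
        b_ions.append(running + PROTON)

    running = 0.0
    for m in reversed(masses[1:]):   # suffixes from the C-terminus
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


if __name__ == "__main__":
    b, y = by_ions("ILGFR")
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

A useful sanity check is that each complementary pair (b_i with y_(n-i)) sums to the neutral peptide mass plus two protons.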

Bottom-Up Proteomics: Identifying Post-Translational Modifications

Technical Limitations: LOD
Data Dependent Mass Spectrometry
http://www.proteome.ru/en/avogadro

Technical Limitations: LOD
Data Dependent Mass Spectrometry
Ghaemmaghami S. et al. Nature. 2003. 425: 737-741.

Technical Limitations: Dynamic Range
Data Dependent Mass Spectrometry
Smith R. et al. Advances in Protein Chemistry. 2003. 65: 85-131.

Technical Limitations: Sampling
Data Dependent Mass Spectrometry
Michalski A., Cox J., and Mann M. J. Proteome Res. 10, 1785-1793.

Bottom-Up Proteomics: Protein and Peptide Identification
Diagram labels: Protein; Tryptic Peptides; Experimental Mass Spectrum; Theoretical Mass Spectrum
[Figure: experimental MS/MS spectrum, Relative Abundance (0-100) vs. m/z (200-900). Scan header: BSA_500fmol_02_120601 #1854 RT: 21.52 AV: 1 NL: 1.91E3 T: ITMS + c NSI d Full ms2 457.27@cid35.00 [115.00-925.00]. Labeled peaks (m/z): 132.93, 213.13, 312.35, 391.80, 400.63, 440.22, 448.20, 515.36, 585.04, 602.48, 683.26, 701.38, 782.43, 800.43, 814.37. Slide dated 11/9/2012.]
Protein Sequence: SEMHIKHYTTKILGFREEGDSCPLKQWDDSKILVAVADKLLEYEEKILLFNSAKYLLDESSTYKLMHDDSV
Theoretical Tryptic Peptides: SEMHIKHYTTK, ILGFR, EEGDSCPLK, QWDDSK, ILVAVADK, LLEYEEK, ILLFNSAK, YLLDESSTYK, LMHDDSV

Bioinformatics Resources: Protein and Peptide Identification
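The theoretical tryptic peptides on this slide can be reproduced with a short in silico digest. The sketch below assumes the usual simplified trypsin rule (cleave C-terminal to K or R, except when the next residue is P); the function name `tryptic_digest` is illustrative. Note that a complete digest under this rule splits the slide's first peptide, SEMHIKHYTTK, into SEMHIK and HYTTK, so the slide's version corresponds to one missed cleavage.

```python
# Example protein sequence from the slide, with the line-wrap spaces removed.
PROTEIN = (
    "SEMHIKHYTTKILGFREEGDSCPLKQWDDSKILVAVADKLLEYEEKILLFNSAKY"
    "LLDESSTYKLMHDDSV"
)


def tryptic_digest(sequence):
    """Split a protein after every K or R that is not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    for peptide in tryptic_digest(PROTEIN):
        print(peptide)
```

Search engines usually let the user allow one or two missed cleavages, which is why peptides such as SEMHIKHYTTK are still considered during a search.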

Database Searching: Database Search Algorithms
Protein and Peptide Identification
Comparing raw MS/MS data with molecular sequence databases to identify constituent proteins.
Search engine inputs: peptide molecular masses; fragment ion mass and intensity values; protein/DNA sequence databases.
Search engine output: protein identification and characterization.

Public Proteomic Databases
MSDB: Comprehensive, non-identical protein sequence database maintained by the Proteomics Department at the Hammersmith Campus of Imperial College London.
NCBInr: Comprehensive, non-identical protein database maintained by NCBI. The entries have been compiled from GenBank CDS translations, PIR, SWISS-PROT, PRF, and PDB.
SwissProt: High-quality, curated protein database.
dbEST: Division of GenBank containing "single-pass" cDNA sequences, or Expressed Sequence Tags.
(ThermoFisher)

Public Proteomic Databases: UniProt FASTA file
>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY
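To make the FASTA layout concrete: each record is a header line starting with ">" followed by one or more sequence lines. The sketch below is a minimal reader, assuming well-formed input; `parse_fasta` is an illustrative name, and the example record is the slide's cytochrome b entry truncated to its first two sequence lines for brevity.

```python
# Truncated copy of the slide's FASTA record (first two sequence lines only).
EXAMPLE_FASTA = """\
>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    for header, sequence in parse_fasta(EXAMPLE_FASTA):
        print(header)
        print(len(sequence), "residues")
```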

Mascot Search Algorithm

MASCOT Software: Protein and Peptide Identification
Mascot combines 3 types of searches: Peptide Mass Fingerprinting, MS/MS Ions, Sequence Query
Searches against any FASTA database
Unique, true probability-based scoring
Accepts mass spectrometry data from all leading instrument manufacturers
High-throughput format for single- and multi-processor systems and clusters
Automates search submission without custom programming
Summary of search results in web browser format
Licensed by more than a thousand academic and commercial laboratories

Peptide Mass Fingerprinting (PMF): Peptide Identification using MASCOT
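The idea behind peptide mass fingerprinting can be sketched in a few lines: compute the theoretical mass of each candidate protein's tryptic peptides, then count how many observed masses fall within a tolerance. The example below is only a toy, with assumptions stated plainly: the ranking is a plain match count, not Mascot's probability-based score; the function names, the 50 ppm tolerance, and the two-protein "database" are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, peptides, tol_ppm=50.0):
    """Count observed masses within tol_ppm of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in peptides]
    hits = 0
    for obs in observed:
        if any(abs(obs - t) / t * 1e6 <= tol_ppm for t in theoretical):
            hits += 1
    return hits


def rank_candidates(observed, database, tol_ppm=50.0):
    """Rank proteins (name -> list of peptides) by number of matched masses."""
    scores = {name: count_matches(observed, peps, tol_ppm)
              for name, peps in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical database of pre-digested proteins.
    database = {
        "protein_A": ["ILGFR", "QWDDSK", "LLEYEEK"],
        "protein_B": ["ILVAVADK", "GASPVK"],
    }
    observed = [604.3747, 777.3243, 1234.5678]
    print(rank_candidates(observed, database))
```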

MS/MS Ion Searching: Peptide Identification using MASCOT

Protein Identification using MASCOT: Using MS and MS/MS spectra
Protein identified using both MS and MS/MS spectra

Protein Identification using MASCOT: Combining MS and MS/MS spectra

Protein Identification using MASCOT: Interpreting Results (ThermoFisher)

Protein Identification using MASCOT: Common Settings for Search (ThermoFisher)

Sequest Search Algorithm

Protein Identification using Sequest: MS/MS spectra-based Identification
Data Dependent Mass Spectral Scans: MS/MS depends on the MS
Correlation Analysis
FASTA Protein Database:
>gi 84670 pir B27257 coagulogen II precursor - horseshoe crab (Tachypleus tridentatus)gi 10809 (X04192) coagulogen type II [Tachypleus tridentatus]gi 217395 gnl PID d1000491 (D00077) coagulogen type 2 [Tachypleus tridentatus]gi 356167 prf 1208319A coagulogen [Tachypleus sp.] [MASS=21826]
MEKKLFGIALLLTTVASVLAADTNAPICLCDEPGVLGRTQIVTTEIKDKIEKAVEAVAQESGVSGRGFSIFSHHPVFREC
GKYECRTVRPEHSRCYNFPPFIHFKSECPVSTRDCEPVFGYTVAGEFRVIVQAPRAGFRQCVWQHKCRFGSNSCGYNGRC
TQQRSVVRLVTYNLEKDGFLCESFRTCCGCPCRSF
>gi 585398 sp P28175 LFC_TACTR LIMULUS
Protein / Peptide / Modification Identification

Protein Identification using Sequest: Sequest Search Workflow
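The "Correlation Analysis" step can be illustrated with a simplified cross-correlation score: the overlap between the binned observed and theoretical spectra at zero offset, minus the average overlap across a window of offsets. This is a teaching sketch in the spirit of the commonly described SEQUEST Xcorr, not the vendor implementation; the 1 Da bins, the 75-bin lag window, the unit intensities, and all function names are assumptions made for the example.

```python
def binned(peaks, n_bins, bin_width=1.0):
    """Turn (m/z, intensity) pairs into a fixed-length intensity vector."""
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        if 0 <= index < n_bins:
            vector[index] = max(vector[index], intensity)
    return vector


def dot_at_lag(x, y, lag):
    """Sum of x[i] * y[i + lag] over the indices where both are defined."""
    total = 0.0
    for i, xi in enumerate(x):
        j = i + lag
        if xi and 0 <= j < len(y):
            total += xi * y[j]
    return total


def xcorr_score(observed, theoretical, max_lag=75):
    """Zero-lag correlation minus the mean correlation over [-max_lag, max_lag]."""
    zero_lag = dot_at_lag(observed, theoretical, 0)
    lags = range(-max_lag, max_lag + 1)
    background = sum(dot_at_lag(observed, theoretical, lag) for lag in lags) / len(lags)
    return zero_lag - background


if __name__ == "__main__":
    # y ions of ILGFR as a toy "observed" spectrum with unit intensities.
    observed = binned([(175.1, 1.0), (322.2, 1.0), (379.2, 1.0), (492.3, 1.0)], 600)
    matching = binned([(175.1, 1.0), (322.2, 1.0), (379.2, 1.0), (492.3, 1.0)], 600)
    shifted = binned([(185.1, 1.0), (332.2, 1.0), (389.2, 1.0), (502.3, 1.0)], 600)
    print("matching candidate:", round(xcorr_score(observed, matching), 3))
    print("shifted candidate: ", round(xcorr_score(observed, shifted), 3))
```

Subtracting the background term penalizes candidates that overlap the observed spectrum only by chance at some arbitrary offset.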

Protein Identification using Sequest: Sequest Search Parameters
Extraction Parameters; Search Parameters; Modifications

Comparing Algorithms: Mascot vs. Sequest

Comparing Algorithms: Mascot vs. Sequest
Data file format

ThermoFisher Proteome Discoverer Software

Combining Algorithms: Mascot and Sequest using Proteome Discoverer (ThermoFisher)

Proteome Discoverer: Interpreting Results (ThermoFisher)

Cloud Computing Strategies

Cloud Computing
OLD: 40:00:00
NEW: 01:00:00

Combining Bioinformatic Tools: Integrated Analysis Inc. Pass Software
Workflow diagram labels: Convert MS (.mzXML, .mgf, .mzML); FASTA Merger; Create Decoy FASTA; Peptide Prophet; Isobaric Labeled Quantitation; Refine & Merge Results; OMSSA; X! Tandem; Custom Protein Prophet; Custom Algorithms

Combining Bioinformatic Tools: Customizing Bioinformatic Workflows
Workflow diagram labels: High Resolution MS; OMSSA and X!Tandem; Merge FASTA; Create Reverse Sequence; Convert MS
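The "Create Reverse Sequence" and "Merge FASTA" steps named in the workflow can be sketched as follows: each target protein is reversed to make a decoy entry, and targets and decoys are written to one combined FASTA file for searching. This is a minimal sketch, not the software shown on the slide; the function names, the `REV_` header prefix, and the 60-character line width are hypothetical choices.

```python
def make_decoys(records, prefix="REV_"):
    """Build reversed-sequence decoy records from (header, sequence) tuples."""
    return [(prefix + header, sequence[::-1]) for header, sequence in records]


def merge_target_decoy(records, prefix="REV_"):
    """Return the target records followed by their reversed decoys."""
    return list(records) + make_decoys(records, prefix)


def to_fasta(records, width=60):
    """Format (header, sequence) tuples as FASTA text with wrapped lines."""
    lines = []
    for header, sequence in records:
        lines.append(">" + header)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-entry target database.
    targets = [("protA", "ILGFR"), ("protB", "QWDDSK")]
    print(to_fasta(merge_target_decoy(targets)), end="")
```

Hits to the decoy entries give an empirical estimate of how often the search engine reports a match by chance, which is the usual reason for appending them.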