Protein MW determination and protein identification by mass spectrometry

Similar documents
Master course KEMM03 Principles of Mass Spectrometric Protein Characterization. Exam

Laboration 1. Identifiering av proteiner med Mass Spektrometri. Klinisk Kemisk Diagnostik

In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs

Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests

ProteinScape. Innovation with Integrity. Proteomics Data Analysis & Management. Mass Spectrometry

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF

Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method

Aiping Lu. Key Laboratory of System Biology Chinese Academic Society

Choices, choices, choices... Which sequence database? Which modifications? What mass tolerance?

ProteinPilot Report for ProteinPilot Software

Application Note # MT-90 MALDI-TDS: A Coherent MALDI Top-Down-Sequencing Approach Applied to the ABRF-Protein Research Group Study 2008

Introduction to Database Searching using MASCOT

LABORATÓRIUMI GYAKORLAT SILLABUSZ SYLLABUS OF A PRACTICAL DEMONSTRATION. financed by the program

Proteomics in Practice

PROTEIN SEQUENCING. First Sequence

Mass Spectra Alignments and their Significance

Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays

Definition of the Measurand: CRP

Workshop IIc. Manual interpretation of MS/MS spectra. Ebbing de Jong. Center for Mass Spectrometry and Proteomics Phone (612) (612)

Mass Spectrometry Based Proteomics

Global and Discovery Proteomics Lecture Agenda

Error Tolerant Searching of Uninterpreted MS/MS Data

ProSightPC 3.0 Quick Start Guide

Pep-Miner: A Novel Technology for Mass Spectrometry-Based Proteomics

MASCOT Search Results Interpretation

Session 1. Course Presentation: Mass spectrometry-based proteomics for molecular and cellular biologists

Agilent G2721AA/G2733AA Spectrum Mill MS Proteomics Workbench

Introduction to Proteomics 1.0

SpikeTides TM Peptides for relative and absolute quantification in SRM and MRM Assays

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

Challenges in Computational Analysis of Mass Spectrometry Data for Proteomics

Principles and Applications of Proteomics

Introduction to Proteomics

Standard Mixture. TOF/TOF Calibration Mixture. Calibration Mixture 1 (Cal Mix 1, 1:10) Calibration Mixture 1 (Cal Mix 1, 1:100)

How Mascot Integra helps run a Core Lab

Methods for Protein Analysis

Advantages of the LTQ Orbitrap for Protein Identification in Complex Digests

MaxQuant User s Guide Version

A. A peptide with 12 amino acids has the following amino acid composition: 2 Met, 1 Tyr, 1 Trp, 2 Glu, 1 Lys, 1 Arg, 1 Thr, 1 Asn, 1 Ile, 1 Cys

(c) How would your answers to problem (a) change if the molecular weight of the protein was 100,000 Dalton?

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification

Biochemistry - I. Prof. S. Dasgupta Department of Chemistry Indian Institute of Technology, Kharagpur Lecture-11 Enzyme Mechanisms II

ANALYSIS OF PROTEINS AND PROTEOMES BY MASS SPECTROMETRY

THE ABC S (AND XYZ S) OF PEPTIDE SEQUENCING

Protein Hit1, a novel box C/D snornp assembly factor, controls cellular concentration of protein Rsa1p by direct interaction

SimGlycan Software*: A New Predictive Carbohydrate Analysis Tool for MS/MS Data

Proteomic Analysis using Accurate Mass Tags. Gordon Anderson PNNL January 4-5, 2005

Quantitative mass spec based proteomics

Marmara Üniversitesi Fen-Edebiyat Fakültesi Kimya Bölümü / Biyokimya Anabilim Dalı

Mass spectrometry based proteomics

for mass spectrometry calibration tools Thermo Scientific Pierce Controls and Standards for Mass Spectrometry

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

Protein Prospector and Ways of Calculating Expectation Values

Fast and Automatic Mapping of Disulfide Bonds in a Monoclonal Antibody using SYNAPT G2 HDMS and BiopharmaLynx 1.3

Proteomic data analysis for Orbitrap datasets using Resources available at MSI. September 28 th 2011 Pratik Jagtap

Electrospray Ion Trap Mass Spectrometry. Introduction

Shotgun Proteomic Analysis. Department of Cell Biology The Scripps Research Institute

La Protéomique : Etat de l art et perspectives

Quantitative proteomics background

Appendix 5 Overview of requirements in English

Evaluation of LC-MS data for the absolute quantitative analysis of marker proteins

Retrospective Analysis of a Host Cell Protein Perfect Storm: Identifying Immunogenic Proteins and Fixing the Problem

MS Amanda Standalone User Manual

Introduction to mass spectrometry (MS) based proteomics and metabolomics

Quantification of Multiple Therapeutic mabs in Serum Using microlc-esi-q-tof Mass Spectrometry

--not necessarily a protein! (all proteins are polypeptides, but the converse is not true)

SimGlycan Software*: A New Predictive Carbohydrate Analysis Tool for MS/MS Data

Mascot Search Results FAQ

The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring

Quantitative mass spectrometry in proteomics: a critical review

ALGORITHMS FOR SHOTGUN PROTEOMICS SPECTRAL IDENTIFICATION AND QUALITY ASSESSMENT. Ze-Qiang Ma

In-depth Analysis of Tandem Mass Spectrometry Data from Disparate Instrument Types* S

Lecture Conformation of proteins Conformation of a protein three-dimensional structure native state. native condition

A Guide to the Analysis and Purification of Proteins and Peptides by Reversed-Phase HPLC

Interpretation of MS-Based Proteomics Data

Statistics and Algorithms for Peptide Mass Fingerprinting

itraq Tips and Tricks

Application Note # LCMS-66 Straightforward N-glycopeptide analysis combining fast ion trap data acquisition with new ProteinScape functionalities

PeptidomicsDB: a new platform for sharing MS/MS data.

Mascot Integra: Data management for Proteomics ASMS 2004

1) Technical informations. - a) How does it work? - b) Purification - c) Quality Control. 2) Standard synthesis

STANFORD UNIVERSITY MASS SPECTROMETRY 333 CAMPUS DR., MUDD 175 STANFORD, CA

Carlos III Networked Proteomics Platform

Investigating Biological Variation of Liver Enzymes in Human Hepatocytes

Data mining with Mascot Integra ASMS 2005

Metabolomic Profiling of Accurate Mass LC-MS/MS Data to Identify Unexpected Environmental Pollutants

For the next half hour I m going to be describing some of the different options for peak peaking. The profit is with getting better protein ID or

12/6/12. Dr. Sanjeeva Srivastava. IIT Bombay 2

Application of a New Immobilization H/D Exchange Protocol: A Calmodulin Study

Q-TOF User s Booklet. Enter your username and password to login. After you login, the data acquisition program will automatically start.

Overview. Triple quadrupole (MS/MS) systems provide in comparison to single quadrupole (MS) systems: Introduction

Database Searching Tutorial/Exercises Jimmy Eng

Separation of Amino Acids by Paper Chromatography

Pinpointing phosphorylation sites using Selected Reaction Monitoring and Skyline

Molecular Cell Biology. Prof. D. Karunagaran. Department of Biotechnology. Indian Institute of Technology Madras

Industry Perspective: Advantages of Open Access and Walkup LC/ MS Supporting Protein Drug Discovery and Development

An In-Gel Digestion Protocol

The Organic Chemistry of Amino Acids, Peptides, and Proteins

Mass Spectrometry Signal Calibration for Protein Quantitation

Transcription:

Protein MW determination and protein identification by mass spectrometry Tuula Nyman Protein Chemistry Research Group Institute of Biotechnology tuula.nyman@helsinki.fi Protein MW determination by MS (NOT= identification) -for MW determination the protein needs to be in solution without salts and detergents -usually proteins are first purified with RP chromatography before MW measurement 1

-MALDI TOF MS, linear mode -accuracy is not as good with ESI MS Singly charged ion Doubly charged ion Protein MW determination, ESI MS +24 +23 +22 +21 +20 +19 p 1 +18 p 2 +17 -Protein is observed as a series of different charges -Protein MW can be calculated based on adjacent m/z-values +25 +16 +15 +14 2

Protein MW calculation from ESI spectra p=m/z p 1 =(M r +z 1 )/z 1 p 2 =[M r +(z 1-1)]/(z 1-1) p= a peak in the mass spectrum m= total mass of an ion z= total charge M r = average mass of the protein If p 1 =942.753 and p 2 =998.067 p=m/z p 1 =(M r +z 1 )/z 1 p 2 =[M r +(z 1-1)]/(z 1-1) 942.753 =(M r +z 1 )/z 1 942.753z 1 = Mr+z 1 941.753z 1 = M r 998.067 = (941.753z 1 +z 1-1)/z 1-1 998.067z 1-997.067 = 942.753z 1-1 (998.067-942.753)z 1 = 996.067 55.314z 1 = 996.067 => z 1 =18.0075 M r = 941.753z 1 = 16 951.6 Da 3

Deconvoluted electrospray mass spectrum of myoglobin Protein identification by mass spectrometry protein of interest is cleaved into peptides with a specific enzyme peptides are analyzed by MS (and MS/MS) Protein identification methods: Peptide mass fingerprinting (PMF) Identification based on MS/MS data from one or more peptides 4

Peptide mass fingerprint (PMF) A mass spectrum of the peptide mixture resulting from the digestion of a protein by an enzyme, usually measured by MALDI-TOF Identification based on peptide MW information only Database search engines create theoretical PMFs for all the proteins in the database and compare these to the measured PMF from protein X Protein identification based on MS/MS data from one or more peptides Database search engines create theoretical peptide fragmentation patterns for all the proteins in the database and compare these to the measured MS/MS data 5

Protein identification by MS Protein has to be digested into peptides Disulphide bridges need to be reduced and alkylated before digestion intact protein alkylated protein tryptic fragments from the protein N S S C Reduction Alkylation N C Trypsin K R K K K K R R Protein digestion into peptides before MS analysis Digestion can be done in-solution or in-gel The enzyme has to be as specific as possible Trypsin is most commonly used enzyme: -very specific, quite cheap -cleaves peptide bond after lysines and arginines -tryptic peptides are good for MS analysis because they end up with basic amino acid -works for both in-solution and in-gel digestion 6

Other enzymes: -LysC, cleaves after lysines -LysN, cleaves before lysines -AspN, cleaves before aspartic acid residues -V8 protease, cleaves peptide bonds exclusively on the carbonyl side of aspartate and glutamate residues -possible to do double-digestions Chemical cleavage: Cyanogen bromide, cuts after methionines In-gel digestion -works with low-femtomolar amounts of proteins BUT: -easy to contaminate samples with keratin -the gel staining protocol needs to be compatible with MS -silver-staining: fixing with glutaraldehy cross-links protein into gel matrix after which identification is not possible 7

Peptide mass fingerprinting -usually peptides need to be desalted and concentrated before MALDI analysis -> ZipTips, peptide elution directly onto MALDI target plate -MALDI matrix included in the elution solution Peptide mass fingerprinting MALDI, Positive ion reflector mode -silver-stained spots from 2-DE gels, in-gel digestion+ziptipping 8

5 4 3 2 1 1574.775 0 1568 15 70 1572 15 74 15 76 1578 1580 1582 m /z Intens. [a.u.] x10 4 Peak picking parameters important in PMF Intens. [a.u.] x10 4 1590.760 1574.775 5 Label monoisotopic masses Check for correct isotope! 4 1434.642 3 2 1 1606.761 1450.637 1475.728 1657.772 0 1425 1450 1475 1500 1525 1550 1575 1600 1625 1650 m/z Publicly available search engines for PMF Mascot ProteinProspector/ MS-Fit PROWL/ ProFound Aldente Databases 9

PMF/ database search www.matrixscience.com 10

11

Usually not many missed cleavages Mass accuracy critical -check for known contaminants (these should be removed before search) -possibility to do 2nd pass search Normal search, max 1missed cleavage allowed Max 4 missed cleavages allowed 12

PEPTIDE MASS FINGERPRINTING quick and easy requires a very spesific enzyme optimized digestion+ desalting protocols internal/close external calibration of MALDI spectra works only for proteins which are already in the databases as protein sequences not suitable for complex protein mixtures Protein identification based on MS/MS data from one or more peptides -MALDI-TOF/TOF or nanolc-esi-ms/ms analysis -suitable for (complex) protein mixtures, especially when combined with LC separation of peptides before MS/MS 13

Tandem mass spectrometry scan types Product ion scanning MS scan: The ions are first separated according to their m/z ratios MS/MS scan: Peptides with certain m/z-ratio are selected and fragmented inside the mass analyzer, and the m/z-ratios of the fragment ions are measured Intensity, counts TOF product 745,4 C H I L L P L I H m/z, amu 14

Peptide Fragmentation Nomenclature Peptides do not fragment sequentially, the fragmentation events are somewhat random. The most common peptide fragments observed in low energy collisions are a, b and y ions. The b ions appear to extend from the amino terminus (N-terminus), and y ions appear to extend from the carboxyl terminus (C-terminus). Protein identification with MALDI-TOF/TOF -first, PMF with MALDI-TOF -next, selected precursor ions can be fragmented (TOF/TOF analysis) -MALDI produces singly charged parent ions product ion spectra not as easy to interpret as in ESI-MS/MS -database search with both PMF and MS/MS information 15

250 500 750 1000 1250 1500 1750 2000 2250 m/z 1.2 1.0 0.8 0.6 0.4 0.2 0.0 1060.282 1040 1050 10 60 10 70 1080 1090 m /z MALDI-TOF/TOF fragment ion spectra from parent ion m/z 1952 Intens. [a.u.] x10 5 874.168 Intens. [a.u.] x10 5 1.25 340.943 1060.282 1.00 0.75 0.50 243.966 1953.011 194.917 0.25 111.963 479.932 1825.969 664.960 746.087 972.188 1158.227 1448.409 1613.623 0.00 Sequence Query: One or more peptide mass values associated with information such as partial or ambiguous sequence strings, amino acid composition information, MS/MS fragment ion masses, etc. 16

Protein identification using nanolc-ms/ms -50-75 um i.d. RP-columns, 200 nl/min no need to split the effluent before MS -DDA= data dependent analysis, can be fully automated -suitable for complex protein mixtures, possibility to identify hundreds of proteins in one run 17

DDA =Data Dependent Acquisition (IDA =Information Dependent Acquisition) -fully automated experiment, first MS scan followed by two or more product ion scans -the acquisition software is set to choose certain types of ions for fragmentation and to use suitable collision energy for this ion -in ESI tryptic peptides have usually 2-4 charges DDA experiment Exp 1 = TOF MS Exp 2 = Product ion scan from parent I Exp 3 = Product ion scan from parent II 18

TIC = total ion current Exp 1 = TOF MS Exp 2 = Product ion scan I Exp 3 = Product ion scan II 19

Search engines for (LC-)MS/MS data -Mascot, Sequest, OMSSA etc -the programs take the fragment ion spectrum of a peptide as input and score it against theoretical fragmentation patterns constructed for peptides from the searched database. -in practise the user is often limited to use those search engines which accept the data format from the mass spec used -mzxml is a open data format for storage and exchange of mass spec data -raw, proprietary file formats from most vendors can be converted to the open mzxml format 20

Ion scores from individual MS/MS spectra Protein identification from complex mixtures using nanolc-ms/ms -produces huge amounts of raw data -requires efficient data processing tools -different database search programs can produce different results from the same raw data -false discovery rate estimation 21

Comparative proteome cataloging of Lactobacillus rhamnosus strains GG and Lc705 -in-depth proteome analysis of two Lactobacillus rhamnosus strains, the well-known probiotic strain GG and the dairy strain Lc705 -GeLC-MS/MS: proteins are separated using SDS_PAGE and identified using nanolc-ms/ms -to maximize the number of identifications, all data sets were searched against the target databases using two search engines, Mascot and Paragon Savijoki et al, JPR 2011 22

Data analyses Database Searches Protein Pilot (Mascot & Paragon) Compid analyses Compilation of Mascot and Paragon results Bioinformatics Cellular location of the identified proteins Savijoki et al, JPR 2011 Natri et al, JPR 2010 23

Comparison of protein ID results from the same raw data with different search engines ProICAT SP2 Spectrum Mill Sequest Moulder et al, Proteomics. 2005;5:2748-60. Comparison of the ID results with different search engines a)peptide identifications, b) Proteins identifications with two or more peptides given as a proportion of all protein identifications. 24

False discovery rate (FDR) estimation FDR = FP/(TP+FP) TP= true positive matches FP =false positive matches Previously: search the raw data first against the normal database and then against the decoy database -> calculate FDR based on these NOW the preferred method: One search against a database in which the target and decoy sequences have been concatenated. This means that you will only record a false positive when a match from the decoy sequences is better than any match from the target sequences. Elias, J. E., et al, Nature Methods 2 667-675 (2005). Now you re protein is identified What else can we find in the data? Protein fragmentation -protein separation by SDS-PAGE/2-DE followed by in-gel digestion and MS analysis 25

Cytokeratin-18 is cleaved during viral infection, and fragments localize onto mitochondria Mitochondrial protemes Based on 2-DE and MS data: N-terminal fragment of the identified protein 26

Post-translational modifications TAP-tag purification of protein complexes, on-beads digestion with double enzyme, first LysC 2h, then trypsin o/n 27

KIN2 phosphopeptide 28

De novo sequencing De novo =peptide sequencing performed without prior knowledge of the amino acid sequence -if enough material is available classical Edman degradation is still a vey good method -partial peptide sequencing is possible based on MS/MS data 29