1 Genzyme Corp., Framingham, MA, 2 Positive Probability Ltd, Isleham, U.K.



Similar documents
Thermo Scientific PepFinder Software A New Paradigm for Peptide Mapping

Functional Data Analysis of MALDI TOF Protein Spectra

Q-TOF User s Booklet. Enter your username and password to login. After you login, the data acquisition program will automatically start.

Alignment and Preprocessing for Data Analysis

Learning Objectives:

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs

Micromass LCT User s Guide

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF

Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method

Tutorial 9: SWATH data analysis in Skyline

Pesticide Analysis by Mass Spectrometry

Application of a New Immobilization H/D Exchange Protocol: A Calmodulin Study

Mass Frontier 7.0 Quick Start Guide

AMD Analysis & Technology AG

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification

In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates

MASCOT Search Results Interpretation

Using NIST Search with Agilent MassHunter Qualitative Analysis Software James Little, Eastman Chemical Company Sept 20, 2012.

A Streamlined Workflow for Untargeted Metabolomics

SELDI-TOF Mass Spectrometry Protein Data By Huong Thi Dieu La

Copyright 2007 Casa Software Ltd. ToF Mass Calibration

Appendix 5 Overview of requirements in English

Accurate Mass Screening Workflows for the Analysis of Novel Psychoactive Substances

Monoclonal Antibody Fragment Separation and Characterization Using Size Exclusion Chromatography Coupled with Mass Spectrometry

# LCMS-35 esquire series. Application of LC/APCI Ion Trap Tandem Mass Spectrometry for the Multiresidue Analysis of Pesticides in Water

The Use of Micro Flow LC Coupled to MS/MS in Veterinary Drug Residue Analysis

Enhancing GCMS analysis of trace compounds using a new dynamic baseline compensation algorithm to reduce background interference

UHPLC/MS: An Efficient Tool for Determination of Illicit Drugs

Analysis of the Vitamin B Complex in Infant Formula Samples by LC-MS/MS

VTT TECHNICAL RESEARCH CENTRE OF FINLAND

[ Care and Use Manual ]

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

Analyzing Small Molecules by EI and GC-MS. July 2014

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

NUVISAN Pharma Services

PosterREPRINT AN LC/MS ORTHOGONAL TOF (TIME OF FLIGHT) MASS SPECTROMETER WITH INCREASED TRANSMISSION, RESOLUTION, AND DYNAMIC RANGE OVERVIEW

Integrated Data Mining Strategy for Effective Metabolomic Data Analysis

Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data

STANFORD UNIVERSITY MASS SPECTROMETRY 333 CAMPUS DR., MUDD 175 STANFORD, CA

Application Note # LCMS-66 Straightforward N-glycopeptide analysis combining fast ion trap data acquisition with new ProteinScape functionalities

Choices, choices, choices... Which sequence database? Which modifications? What mass tolerance?

Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays

Mascot Search Results FAQ

Quick Start Guide Chromeleon 7.2

Tuning & Mass Calibration

How To Use An Ionsonic Microscope

Thermo Scientific GC-MS Data Acquisition Instructions for Cerno Bioscience MassWorks Software

Retrospective Analysis of a Host Cell Protein Perfect Storm: Identifying Immunogenic Proteins and Fixing the Problem

Application Note # MT-90 MALDI-TDS: A Coherent MALDI Top-Down-Sequencing Approach Applied to the ABRF-Protein Research Group Study 2008

Advantages of the LTQ Orbitrap for Protein Identification in Complex Digests

Waters Core Chromatography Training (2 Days)

STANDARD OPERATING PROCEDURES

Protein Prospector and Ways of Calculating Expectation Values

Unique Software Tools to Enable Quick Screening and Identification of Residues and Contaminants in Food Samples using Accurate Mass LC-MS/MS

Chemometric Analysis for Spectroscopy

Master course KEMM03 Principles of Mass Spectrometric Protein Characterization. Exam

RAPID MARKER IDENTIFICATION AND CHARACTERISATION OF ESSENTIAL OILS USING A CHEMOMETRIC APROACH

Application of TargetView software in the fragrance industry: Accurate identi cation of allergens with fast GC

Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests

Extraction Heights for STIS Echelle Spectra

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.

Overview. Purpose. Methods. Results

1 st day Basic Training Course

DYNAMIC LIGHT SCATTERING COMMON TERMS DEFINED

MassHunter for Agilent GC/MS & GC/MS/MS

Aiping Lu. Key Laboratory of System Biology Chinese Academic Society

The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring

Resolving physical line profiles from Bragg X-ray spectra using Withbroe-Sylwester deconvolution

Analysis of Phthalate Esters in Children's Toys Using GC-MS

Doppler. Doppler. Doppler shift. Doppler Frequency. Doppler shift. Doppler shift. Chapter 19

Audit Report and Requirements of Mantid Software from the Molecular Spectroscopy Group (2013)

Strategies for Developing Optimal Synchronous SIM-Scan Acquisition Methods AutoSIM/Scan Setup and Rapid SIM. Technical Overview.

Performance and advantages of qnmr measurements

Electrospray Ion Trap Mass Spectrometry. Introduction

Overview. Introduction. AB SCIEX MPX -2 High Throughput TripleTOF 4600 LC/MS/MS System

Mass Frontier Version 7.0

ProteinPilot Report for ProteinPilot Software

Introduction. What Can an Offline Desktop Processing Tool Provide for a Chemist?

FTIR Instrumentation

Analysis of Polyphenols in Fruit Juices Using ACQUITY UPLC H-Class with UV and MS Detection

Crowdclustering with Sparse Pairwise Labels: A Matrix Completion Approach

Correlation of the Mass Spectrometric Analysis of Heat-Treated Glutaraldehyde Preparations to Their 235nm / 280 nm UV Absorbance Ratio

Increasing Quality While Maintaining Efficiency in Drug Chemistry with DART-TOF MS Screening

Building Agilent GC/MSD Deconvolution Reporting Libraries for Any Application Technical Overview

Mass Spectrometry Signal Calibration for Protein Quantitation

Agilent 6000 Series LC/MS Solutions CONFIDENCE IN QUANTITATIVE AND QUALITATIVE ANALYSIS

Application of Structure-Based LC/MS Database Management for Forensic Analysis

Application Note # LCMS-62 Walk-Up Ion Trap Mass Spectrometer System in a Multi-User Environment Using Compass OpenAccess Software

using ms based proteomics

Background Information

Laboration 1. Identifiering av proteiner med Mass Spektrometri. Klinisk Kemisk Diagnostik

Overview. Triple quadrupole (MS/MS) systems provide in comparison to single quadrupole (MS) systems: Introduction

Accurate calibration of on-line Time of Flight Mass Spectrometer (TOF-MS) for high molecular weight combustion product analysis

Tackling the data analysis challenge for characterisation of biotherapeutics

Noise reduction of fast, repetitive GC/MS measurements using principal component analysis (PCA)

VALIDATION OF ANALYTICAL PROCEDURES: TEXT AND METHODOLOGY Q2(R1)

Daniel M. Mueller, Katharina M. Rentsch Institut für Klinische Chemie, Universitätsspital Zürich, CH-8091 Zürich, Schweiz

ProteinScape. Innovation with Integrity. Proteomics Data Analysis & Management. Mass Spectrometry

ProteinChip Energy Absorbing Molecules (EAM)

MarkerView Software for Metabolomic and Biomarker Profiling Analysis

Transcription:

Overview Fast and Quantitative Analysis of Data for Investigating the Heterogeneity of Intact Glycoproteins by ESI-MS Kate Zhang 1, Robert Alecio 2, Stuart Ray 2, John Thomas 1 and Tony Ferrige 2. 1 Genzyme Corp., Framingham, MA, 2 Positive Probability Ltd, Isleham, U.K. Aim: To compare related probability-based methods for charge deconvolution including a novel methodology for the unambiguous analysis of highly complex ESI-MS data of large, heterogeneous glycoprotein samples. Methods: The methods compared were Bayesian transformations, maximum entropy and the novel ReSpect algorithm. Results: The ReSpect algorithm provided much cleaner and more highly resolved results that allowed unambiguous analysis of the data. Introduction The investigation of the heterogeneity of intact proteins larger than 20 kda has always been difficult by ESI-MS due to instrument resolution limits. With the latest generation of instruments (e.g. Qq-TOF), the resolution now exceeds 10,000 for MW less than 20 kda, provides a possible means of characterizing intact proteins. However, the characterization of larger glycoproteins, ~65 kda is still a challenge, since the traditional deconvolution algorithms often fail to resolve complex features in the data even when the ion intensity of the proteins is respectable. In the present study, data for human recombinant glycoproteins, which have MW of ~65 kda, were acquired by QSTAR and the data then processed using several charge deconvolution methods in an attempt to obtain detailed information on their glycosylation. The Algorithms a) Bayesian Transformation Advantage: Probability-based method. Superior to algebraic transformations. Disadvantages: Slow. No resolution enhancement. No error assessments. Prone to produce artefacts. b) Maximum Entropy Advantages: Mathematically plausible results often free of artefacts. Substantial gains in resolution. Disadvantages: Very slow. Works best with a section of the ESI-MS envelope because calibration errors reduce result quality. Extraneous signals - solvents and chemical noise - allow unwanted correlations and artifacts. The methods require a random number seed on which the quantified errors are dependent. c) ReSpect Advantages: Very fast. Gives the most plausible results based on the physics of the experiment. Results are virtually free of artifacts. Accommodates varying noise levels. Provides highly resolved zero-charge spectra and a single set of robust error assessments. Disadvantages: Slower than algebraic methods. The work described here is aimed at comparing the performance of Bayesian transformations, maximum entropy and the new ReSpect method for obtaining zero-charge spectra from complex ESI-MS data.

Experimental Conditions & Raw Data The glycoprotein samples A and B were dissolved in methanol/water (50/50) with 0.5% acetic acid at a concentration of 10 pmol/;l and infused into QSTAR at a flow rate of 2 ;l/min. The spectra were acquired using MCA mode. The data for the two samples were acquired under identical conditions. Glycoprotein A has a molecular weight around 65 kda and four N-linked glycosylation sites. Glycoprotein B is the recombinant form of the human glycoprotein A. Probably due to different buffer conditions, the S/N for the acquired spectrum of Glycoprotein A was inferior to that of Glycoprotein B. The processing parameters determined for Glycoprotein B were therefore used for processing both samples.the enhanced resolution of the QSTAR instrument gave a clear separation for many of the different combinations of glycoforms on the proteins. However, a close examination of the spectra showed that some peaks were broader than others, suggesting the presence of severely overlapped peaks. These data therefore represent a severe test for the different charge deconvolution methods. Data Processing 1. Data Preparation The different methods were presented with the same baseline corrected data so that the comparison was as objective as possible. The method employed used the novel Nadir baseline correction algorithm provided by Positive Probability Limited. Therefore, any differences between the results could not be attributed to different baseline corrections. The full spectra and the corresponding baseline corrected spectra are shown in Figures 1 & 2. Figure 3 shows the quality of the computed baseline for a single charge state ([M+20H] 20+ ) for Glycoprotein B. The peaks at m/z 3001, 3011 and 3021 are clearly broader and are indicative of severe peak overlap. Figure 1: Top: Glycoprotein A raw data; Bottom: Baseline corrected with Nadir.

Figure 2: Top: Glycoprotein B raw data; Bottom: Baseline corrected with Nadir. Figure 3: Black: Glycoprotein B (M+20H) 20+ ; Red: Baseline computed by Nadir. Note: The peaks at m/z 3001, 3011 & 3021 are broad, indicating overlapped peaks.

2. Charge Deconvolution with Peak Model The charge deconvolutions were performed on both the whole data (m/z ~800-4000) using an output mass range of 50-70 kda and on a reduced range from m/z 2025-3450 with an output range of 58-62 kda. These experiments were performed to determine how the input parameters affected the results. Bayesian transformation: Does not require a user-set model. Maximum entropy: The peak model width was measured from the first peak of the most intense clusters. Output resolution was set to 2 Da/point so that the computation would not be excessively long. ReSpect: This used the same model width as for maximum entropy but with the shape parameters measured from the data included. The method first performs a spectral deconvolution that is then analyzed to compute the most probable zero-charge spectrum, along with the quantified mass errors. Output resolution limits do not apply. 3. Zero-charge Results Figure 4 compares the zero-charge results for Glycoprotein A for the different methods using all the data and an output mass range of 50-70 kda. Expansions of the results for Glycoproteins A & B are compared in Figures 5 & 6 respectively. Also shown are the peak clusters of [M+20H] 20+ for each sample, to demonstrate the evidence for the found masses in the raw data. Figure 4: Zero-charge results for glycoprotein A. Data range m/z ~400-4000; Output mass range 50-70 kda. Top: Bayesian transformation; Centre: Maximum entropy; Bottom: ReSpect.

Figure 5: Glycoprotein A results. Input raw data m/z ~400-4000; Output mass range 50-70 kda. The raw data cluster (M+20H) 20+ is scaled to overlay the output mass range of the zero-charge spectra. Figure 6: Glycoprotein B results. Input raw data m/z ~400-4000; Output mass range 50-70 kda. The raw data cluster (M+20H) 20+ is scaled to overlay the output mass range of the zero-charge spectra.

Figures 7 & 8 compare the results using the reduced data range from m/z 2025-3450. Figure 7: Glycoprotein A results. Input raw data m/z 2025-3450; Output mass range 58-62 kda. The raw data cluster (M+20H) 20+ - scaled to overlay the output mass range of the zero-charge spectra. Figure 8: Glycoprotein B results. Input raw data m/z 2025-3450; Output mass range 58-62 kda. The raw data cluster (M+20H) 20+ - scaled to overlay the output mass range of the zero-charge spectra.

Data Analysis The well-resolved zero-charge spectra allow unambiguous identification of combinations of the different glycosylation on the four N-glycosylation sites of the proteins (Figs 9 & 10). It also illustrates the clear glycosylation difference for the two proteins. Table 1 shows the results interpretation for glycoprotein A. Figure 9: The zero-charge spectrum of glycoprotein A using all the data. Table 1: Glycoprotein A Results Found M* Predicted M Dif Proposed CHO Structures 59167.9 59168.9 1.0 4Core** 59314.1 59315.0 0.9 4Core+1Fuc 59460.9 59461.2 0.3 4Core+2Fuc 59478.9 59477.2-1.8 4Core+1Fuc+1Hex 59516.9 59518.2 1.3 4Core+1Fuc+1HexNAc 59623.3 59623.3 0.0 4Core+2Fuc+1Hex 59639.3 59639.3 0.0 4Core+1Fuc+2Hex 59663.6 59664.4 0.7 4Core+2Fuc+1HexNAc 59784.9 59785.4 0.5 4Core+2Fuc+2Hex 59804.4 59801.4-2.9 4Core+1Fuc+3Hex 59845.7 59842.5-3.2 4Core+1Fuc+2Hex+1HexNAc 59926.7 59931.6 4.9 4Core+3Fuc+2Hex 59946.5 59947.6 1.1 4Core+2Fuc+3Hex 59967.5 59963.6-3.9 4Core+1Fuc+4Hex 59987.3 59988.6 1.3 4Core+2Fuc+2Hex+1HexNAc 60006.5 60004.6-1.9 4Core+1Fuc+3Hex+1HexNAc 60044.1 60045.7 1.6 4Core+1Fuc+2Hex+2HexNAc *Calibrated masses; **Core = (HexNAc) 2 Hex 3

Discussion Figure 10: The mass differences in the zero-charge spectrum of glycoprotein B. Note: Spectrum using all the data (see Fig. 6). Bayesian transformation generates numerous harmonics and artefacts (Fig. 4) and there is no improvement in resolution (Figs 5 & 6). It is also clear that the method works poorly on low S/N data since there is substantial residual noise in the result compared with the data (Fig. 5). Although using a reduced data range would normally be beneficial, the smaller output window forces all the harmonics and artefacts into the result causing further degradation to the result (Figs 5 & 7). Maximum entropy provides substantial gains in resolution and fewer artefacts (Figs 5 & 6) but overlapped peaks are unresolved due to program model limitations and a small calibration error. The results are also compromised by using a global noise estimate and over- and under-deconvolution occurs. The smaller data and output ranges improves the resolution but overlapped peaks are still unresolved. A serious deficiency is that some peaks in the results are clearly displaced from their expected masses! ReSpect produces almost no artefacts, if any (Fig. 4). There are substantial gains in resolution and overlapped peaks are well resolved (Figs 5 & 6). The improvement is due to several program features that include the ability to accommodate a calibration error and the incorporation of the "rules of electrospray" so that only masses consistent with the physics of the experiment are reported - mathematically plausible masses inconsistent with the "rules" are not computed. The smaller output range cannot have any effect but the smaller data range improves the quality and resolution of the results (Figs 5-8).

Although not reported here, the mass errors assigned by maximum entropy were confusing as they were smaller than the observed peak displacements. Therefore, only a partial analysis was possible. Of the three methods, ReSpect produced the cleanest and most highly resolved results, allowing unambiguous data interpretation. Conclusions A comparison with existing methods has shown that the novel ReSpect algorithm provides a much more useful reconstructed zero-charge spectrum with very few (if any) artefacts and powerful resolution that allows unambiguous data analysis on complex ESI-MS data of heterogeneous glycoproteins. Nadir and ReSpect are trademarks of Positive Probability Limited, UK.