High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy

Application Note: 555

Claire Dauly 1, Manuel Fréret 2, Laurent Drouot 2, Jenny T.C. Ho 3, Pascal Cosette 4, Olivier Boyer 2
1 Thermo Fisher Scientific, Courtaboeuf cedex, France; 2 Inserm, U905, Université de Rouen, Faculté de médecine et de pharmacie, Rouen cedex, France; 3 Thermo Fisher Scientific, Hemel Hempstead, UK; 4 Plate-forme de protéomique, Université de Rouen, Mont-Saint-Aignan, France

Key Words
LTQ Orbitrap Velos; Label-Free Protein Quantitation; Protein Identification; Biological Annotation

Introduction
Normal muscle fibers do not express detectable levels of class I major histocompatibility complex (MHC-I). In contrast, high-level expression of MHC-I is a hallmark of muscle autoimmune diseases such as polymyositis, inclusion-body myositis or dermatomyositis. It is generally considered that damage to muscle fibers in myositis results from an autoimmune attack by autoreactive cytotoxic T lymphocytes (CTLs) that recognize muscle autoantigens presented by MHC-I molecules (Figure 1).

We used an MHC-I transgenic mouse model in which a controllable muscle-specific promoter governs the conditional up-regulation of an MHC-I molecule (H-2Kb) in skeletal muscle, resulting in severe myopathy. A proteomic study was performed on mouse muscle tissue to evaluate the consequences of this forced H-2Kb expression. An experimental workflow was designed that included protein digestion and label-free qualitative and quantitative analysis by high-resolution, accurate-mass LC-MS/MS. Three muscle tissues from MHC-I inducible mice were compared to two healthy controls using a differential analysis and a targeted quantitation approach.

Goal
Identification and confirmation of candidate biomarkers of a mouse transgenic model of myopathy by high-resolution LC-MS/MS.
Experimental

Sample Preparation
Mice (kindly provided to Inserm by NIAMS, NIH, Bethesda, MD) were sacrificed and muscle tissues were collected, snap-frozen in liquid nitrogen for 30 seconds, and stored at -80 °C. Then, 150 mg of muscle tissue was manually homogenized in liquid nitrogen and solubilized in 5.2 ml of lysis buffer (urea 7 M, thiourea 2 M, DTT 65 mM, CHAPS 4%, Tris 40 mM, 50 µl of ampholytes and 100 µl of protease inhibitor (Sigma)). Samples were ultra-centrifuged for 30 minutes (84,000 rpm at 4 °C; Beckman, rotor TLA-120). Supernatants were aliquoted in 200 µl fractions and stored at -80 °C.

Digestion
The sample (50 µg) was cleaned up with a 2-D clean-up kit (GE Healthcare) to remove interfering molecules, and proteins were separated by SDS-PAGE. Gel migration was stopped in the stacking gel. In-gel digestion was performed after reduction with DTT and alkylation with iodoacetamide. Supernatants from the digestion were collected and peptide-containing fractions were combined. Samples were dried and resuspended in water / 0.1% formic acid prior to LC-MS/MS analysis.

Figure 1. Schematic of the MHC-I molecules displaying fragments of cytoplasmic proteins on the cell surface in order to be presented to immune cytotoxic T cells. (Labels in the figure: CTL, MHC-I, cell lysis.)

LC-MS/MS Analysis
Samples were separated by online reversed-phase chromatography using a Thermo Scientific Proxeon Easy-nLC system equipped with a peptide trap cartridge (CapTrap, Michrom Bioresources) and a C18 packed tip column (100 µm ID x 15 cm, Nikkyo Technos Co. Ltd). Peptides were separated using an increasing amount of acetonitrile (5%-40% over 100 minutes) at a flow rate of 300 nl/min. The LC eluent was electrosprayed directly from the analytical column, and voltage was applied via a liquid junction of the nanospray source. The chromatography system was coupled to a Thermo Scientific LTQ Orbitrap Velos mass spectrometer. About 1 µg of peptides was used for each LC-MS/MS experiment.

Peptides were analyzed with full-scan MS detection in the Orbitrap mass analyzer at 60,000 (FWHM) resolving power. Precursors were selected on the fly for collision-induced dissociation (CID) fragmentation with ion trap detection. The method was set to analyze the 20 most intense ions from the survey scan. Details of chromatographic and mass spectrometric settings are listed in Tables 1 and 2.

Data Processing
Tandem mass spectra were processed with Thermo Scientific Proteome Discoverer software version 1.2. Spectra were searched against Swiss-Prot Mouse (version 57.9) using the SEQUEST algorithm. Differential analysis was performed between the two sample groups with Thermo Scientific SIEVE software. LC-MS files were aligned in retention time (RT) and peak detection was performed. Frames consisting of a 2-minute RT window and a 0.015 m/z window were created around each detected feature, and peaks were integrated for each frame and each LC-MS/MS file. Peak areas were normalized against the signal of the peptide FTQGSEVSLLGR, which was used as an internal standard to account for injection or ionization bias. Results from the database search were imported into SIEVE software. Database search parameters and SIEVE software parameters are detailed in Table 3.

Table 1. LC parameter settings

LC Separation
  HPLC System: Easy-nLC
  Column: C18 Reprosil analytical column (100 µm id x 15 cm packed tip column, Nikkyo Technos Co. Ltd)
  Pre-column: Peptide trap cartridge (CapTrap, Michrom Bioresources)
  Mobile Phases: 0.1% formic acid in water (eluent A); 0.1% formic acid in acetonitrile (eluent B)
  Gradient: 5%-40% B in 100 min
  Flow: 300 nl/min

Table 2. Mass spectrometer parameter settings

Mass Spectrometer Settings
  Source: nano-ESI
  Capillary temperature: 200 °C
  S-lens RF level: 60%
  Source voltage: 1.75 kV
  Full MS mass range: m/z 400-2000
Resolution settings (FWHM at m/z 400)
  Full MS: 60,000
Target value
  Full MS: 1E6
  CID (Ion Trap): 5E3
Max. injection times [ms]
  Full MS: 500
  MSn (Ion Trap): 100
Dynamic Exclusion
  Repeat count: 1
  Exclusion list size: 500
  Exclusion duration: 30 s
  Exclusion mass width low/high: 5 ppm
MSn parameters
  Isolation width: 2 u
  Minimum signal required: 500
  Collision energy (CID): 30%
  Activation time: [value missing in source]
  Top n MS/MS: 20
  Charge state screening: on; +1 rejected
  Monoisotopic precursor selection: enabled

Table 3. Data analysis parameter settings

Database Search Parameters (Proteome Discoverer Software)
Peak list generation conditions
  Total intensity threshold: 100
  Minimum peak count: 8
SEQUEST search 10 ms
  Database: Swissprot_MOUSE (57.9)
  Mass tolerance (precursor): 5 ppm
  Mass tolerance (fragment): 0.5 Da
  Search against decoy database: True
  Dynamic modifications: Oxidation (M)
  Static modifications: Carbamidomethyl (C)
SIEVE Parameters
  m/z width: 0.015 Da
  RT width: 2 min
  Intensity threshold: 10000
  Maximum frames: 100000
  SEQUEST probability threshold: 10
  SEQUEST Xcorr threshold: 2.5 (charge state = 2); 3 (charge state ≥ 3)
  SEQUEST maximum rank: 1
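The framing and normalization steps described under Data Processing can be illustrated with a minimal Python sketch. This is not SIEVE code: the `Frame` class, the `normalize_areas` function and the example peak areas are hypothetical, and treating the 0.015 m/z and 2 min values from Table 3 as full window widths centred on the detected feature is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A rectangular m/z x RT window around one detected feature (hypothetical model)."""
    mz_center: float
    rt_center: float        # minutes
    mz_width: float = 0.015  # full width, assumed from Table 3
    rt_width: float = 2.0    # full width in minutes, assumed from Table 3

    def contains(self, mz: float, rt: float) -> bool:
        # A signal belongs to the frame if it lies within half a width of the centre.
        return (abs(mz - self.mz_center) <= self.mz_width / 2
                and abs(rt - self.rt_center) <= self.rt_width / 2)


def normalize_areas(areas: dict, standard: str) -> dict:
    """Divide every peak area of one run by the area of the internal standard."""
    reference = areas[standard]
    if reference <= 0:
        raise ValueError("internal standard was not detected in this run")
    return {peptide: area / reference for peptide, area in areas.items()}


# Made-up areas for one LC-MS run; only the standard's sequence comes from the text.
run = {"FTQGSEVSLLGR": 2.0e6, "PEPTIDE_X": 1.0e6}
print(normalize_areas(run, "FTQGSEVSLLGR"))
print(Frame(mz_center=500.0, rt_center=30.0).contains(500.005, 30.9))
```

Normalizing each run separately before comparing groups is what lets a single reference peptide absorb run-to-run differences in injected amount or ionization efficiency.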

Thermo Scientific Pinpoint software was used for the targeted quantitative analysis of proteins identified with a ratio < 0.5 or > 2 using SIEVE software (see details of protein/peptide selection and data processing in the Results and Discussion section). Accession numbers of differentially regulated proteins were submitted to Thermo Scientific ProteinCenter software, and statistical analysis was performed against the Swiss-Prot Mouse database for KEGG pathway annotation.

Results and Discussion

Label-Free, Untargeted Approach
In this study, a complete biomarker research workflow was used to identify and confirm proteins that are involved in the development of myopathy (Figure 2). A database search was performed for the five LC-MS/MS runs separately. After merging the result files, 1693 proteins (1471 protein groups, of which 1096 were identified in at least 2 samples) and 5948 unique peptides were identified with an FDR below 1%.

Differential analysis was performed between the two sample groups to find proteins that are differentially expressed in the context of the disease. A total of 1623 proteins could be quantified based on peptide MS traces. Protein ratios were calculated as the variance-weighted average of the peptide ratios. The protein-ratio distribution followed an S-type plot with a median of 1.01, indicating that there was no bias in the quantitation (Figure 3).

Figure 2. Complete analytical workflow for biomarker discovery including non-targeted label-free differential analysis and biomarker confirmation with targeted quantitation based on high-resolution MS measurements.

Figure 3. S-plot of the protein ratios from the label-free approach with SIEVE software. (Y-axis: Log Ratio, -5 to 6; x-axis: 0 to 1800.)
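The protein-level ratio described above, a variance-weighted average of peptide ratios, could be sketched as follows. The exact formula used by SIEVE software is not given in this note, so the sketch assumes inverse-variance weights applied on a log2 scale; the function names and example numbers are hypothetical.

```python
import math


def weighted_protein_ratio(peptide_ratios, variances):
    """Inverse-variance weighted mean of log2 peptide ratios, returned as a ratio.

    Assumption: weight = 1 / variance, averaging on the log2 scale.
    """
    if len(peptide_ratios) != len(variances) or not peptide_ratios:
        raise ValueError("need one variance per peptide ratio")
    log_ratios = [math.log2(r) for r in peptide_ratios]
    weights = [1.0 / v for v in variances]
    mean_log = sum(w * x for w, x in zip(weights, log_ratios)) / sum(weights)
    return 2.0 ** mean_log


def is_candidate(ratio, low=0.5, high=2.0):
    """Discovery-phase selection used in the text: ratio < 0.5 or > 2."""
    return ratio < low or ratio > high


# Made-up example: the noisier peptide (larger variance) contributes less.
ratio = weighted_protein_ratio([0.25, 0.40], [0.01, 0.09])
print(round(ratio, 3), is_candidate(ratio))
```

Working on the log scale keeps a two-fold increase and a two-fold decrease symmetric around zero, which is also how the S-plot in Figure 3 is drawn.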

SIEVE software lists all protein references for each candidate peptide and treats each protein candidate as an equally valid result. That is, if a candidate peptide is found in three different proteins, then each of the three proteins is considered for display in the protein report. Protein ratios can therefore be biased by peptides that belong to several proteins. It is thus important to qualify potential biomarkers identified with SIEVE software using a targeted approach that considers only peptides that are unique to the target protein. Here, 335 proteins were identified with at least two peptides and observed with ratios > 2 or < 0.5 with SIEVE software. These proteins were subsequently selected for validation with a targeted quantitation approach.

Targeted Quantitation Using Pinpoint Software
Generation of a targeted assay begins with selecting the targeted protein(s) and/or peptide(s) required for the biological study. Here, all targeted protein sequences were simultaneously uploaded from a FASTA file exported from Proteome Discoverer software.

The second step of the workflow is to create a list of peptides to serve as surrogates for the target proteins. A list of theoretical peptides was created through the in silico digestion of the targeted protein sequences (Figure 4). Trypsin was selected because it was the enzyme used in the experimental protocol. The resulting in silico peptide list was refined by restricting the sequence length to 7 to 50 amino acids and excluding peptides containing potentially modified amino acids such as cysteine and methionine. Proteotypic peptides were selected as being unique to the target protein in a reference database containing all known mouse protein sequences in Swiss-Prot. Finally, the peptide list was restricted to peptides that had already been identified by SEQUEST and were therefore present in the Proteome Discoverer software result files (Figure 4).
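The peptide selection rules listed above (tryptic digestion, length 7 to 50, no cysteine or methionine, unique to one protein, previously identified) can be written out as a short Python sketch. It is an illustration only and not Pinpoint code. The function names and the toy sequences are hypothetical, the "no cleavage before proline" rule is an assumption the note does not state, and uniqueness is checked only among the proteins passed in rather than against the full Swiss-Prot mouse database.

```python
def tryptic_peptides(sequence):
    """Fully tryptic peptides: cleave after K or R, except before P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def passes_filters(peptide, min_len=7, max_len=50, excluded="CM"):
    """Length 7-50 and no potentially modified residues (C, M), as in the text."""
    return (min_len <= len(peptide) <= max_len
            and not any(residue in excluded for residue in peptide))


def select_targets(proteins, identified):
    """Map each accession to its proteotypic, filtered, previously identified peptides."""
    digests = {acc: tryptic_peptides(seq) for acc, seq in proteins.items()}
    owners = {}  # peptide -> set of accessions that produce it
    for acc, peptides in digests.items():
        for peptide in peptides:
            owners.setdefault(peptide, set()).add(acc)
    return {
        acc: [p for p in peptides
              if passes_filters(p) and len(owners[p]) == 1 and p in identified]
        for acc, peptides in digests.items()
    }


# Toy sequences, not real proteins.
proteins = {"P1": "AAAAAAAKCCCCCCCKGGGGGGGR", "P2": "GGGGGGGRTTTTTTTK"}
identified = {"AAAAAAAK", "CCCCCCCK", "GGGGGGGR"}
print(select_targets(proteins, identified))
```

In this toy case P2 ends up with no usable peptide, which mirrors the situation described in the text where some target proteins fail the selection criteria.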
A total of 975 peptides corresponding to 289 proteins were selected for targeted quantitation. Some proteins failed to fulfill the peptide selection criteria described above, mostly because they lacked proteotypic peptides resulting from a complete digestion with trypsin. One solution would be to target peptides with one missed cleavage site and verify that the digestion efficiency is similar in all samples. It is also possible to target theoretical proteotypic peptides and use the hydrophobicity factor of the peptide for retention time validation.

Figure 4. Method setup for targeted quantitation with Pinpoint software. Selection of proteotypic peptides based on in silico digestion of target proteins (A) and previous identification with the SEQUEST algorithm through the import of Proteome Discoverer spectral libraries (B).

Quantitation was performed based on full-scan MS measurements, with retention times confirmed in Proteome Discoverer software. Peptides were quantified by extracting the first, second and third isotopes of each charge state of each identified peptide (Figure 5). For each mass, an extracted ion chromatogram was generated with a 5-ppm mass tolerance and a retention-time-shift tolerance of 2 minutes between the sample runs. Peak areas with a minimum of 10,000 counts were integrated and normalized against the signal of the internal standard peptide FTQGSEVSLLGR. Peptides and proteins were retained only if the coefficient of variation (CV) of the normalized signal area was below 50% in each sample group.

Figure 5. Method setup for targeted quantitation with Pinpoint software: batch selection of the peptides and isotopes used for quantitation (975 peptides, 289 proteins). Only peptides following the in silico digestion rules and present in spectral libraries were selected. The first three isotopes of each identified charge state were used for the quantitation.

In total, 230 proteins were successfully analyzed and relative quantitation was performed between MHC-I inducible mice and healthy controls. Of these, 84 proteins were validated as being down-regulated with ratios ≤ 0.5, and 103 proteins were found to be up-regulated with protein expression levels at least two times higher in MHC-I mice (ratios ≥ 2). Figure 6 displays two myosin forms that were down-regulated. High consistency was observed between all the peptide ratios. This can be explained by the pre-selection of the peptides used for quantitation, which was based on their previous identification with the SEQUEST algorithm and their uniqueness to the target proteins. Some proteins were found to be highly up-regulated with ratios > 100, such as protein-glutamine gamma-glutamyltransferase 2 (Figure 7).

In addition, 59 proteins could not be quantified because of excessive biological variation or inconsistency of the peptide responses. Finally, 42 proteins were found not to be differentially expressed between the two sample groups, although they had appeared to be in the discovery phase. This emphasizes the need for a targeted approach on proteotypic peptides when qualifying biomarkers.

The accession numbers of potential biomarkers were submitted to ProteinCenter data interpretation software in order to retrieve relevant biological information from publicly available protein databases (including gene ontology references and correlation with metabolic pathways). Six of the down-regulated proteins were found to be involved in the Dilated cardiomyopathy KEGG pathway (Figure 8), and six of the up-regulated proteins showed a correlation with the Antigen processing and presentation pathway (Figure 8). The latter was anticipated because of the high expression level of H-2Kb protein in MHC-I inducible mice.
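The targeted quantitation rules described in this section (three isotopes per charge state, 5-ppm extraction windows, a 50% CV limit per group, and ratio thresholds of 0.5 and 2) can be sketched in Python as follows. This is a hedged illustration, not Pinpoint code: the function names and example values are hypothetical, the isotope spacing of about 1.00335 Da is the usual 13C-12C approximation rather than a value from the note, and the CV is computed here with the sample standard deviation, which is an assumption.

```python
import statistics

C13_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference (assumption)


def isotope_windows(mono_mz, charge, n_isotopes=3, tol_ppm=5.0):
    """m/z extraction windows for the first n isotopes of one charge state."""
    windows = []
    for k in range(n_isotopes):
        mz = mono_mz + k * C13_SPACING / charge
        delta = mz * tol_ppm / 1e6
        windows.append((mz - delta, mz + delta))
    return windows


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def classify(control, transgenic, max_cv=50.0):
    """Apply the CV filter per group, then the ratio thresholds from the text."""
    if cv_percent(control) >= max_cv or cv_percent(transgenic) >= max_cv:
        return "not quantified"
    ratio = statistics.mean(transgenic) / statistics.mean(control)
    if ratio <= 0.5:
        return "down-regulated"
    if ratio >= 2.0:
        return "up-regulated"
    return "unchanged"


# Made-up normalized areas: two controls, three transgenic animals.
print(isotope_windows(500.0, charge=2))
print(classify([10.0, 12.0], [2.5, 3.0, 2.8]))
```

With only two controls and three transgenic animals per comparison, a CV limit is a coarse filter, which is consistent with the note reporting some proteins as not quantifiable because of biological variation.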
Figure 6. Example of quantitation of Myosin regulatory light chain 2 and Myosin light chain 1/3 at the protein level (A and C) and at the peptide level (B and D). R = Ratio.
Myosin regulatory light chain 2: protein ratio R = 0.25; peptides EFTVIDQNR, GDPEDVITGFK and FSQEEIK, with peptide ratios R = 0.24, 0.27 and 0.24.
Myosin light chain 1/3: protein ratio R = 0.28; peptides EFLLFDR, ITLSQVGDVLR, DQGGYEDFVEGLR and HVLTLGEK, with peptide ratios R = 0.22, 0.29, 0.29 and 0.22.
Series: Control and Transgenic mice (protein level); Control_2, imhc_1 and imhc_2 (peptide level).

Figure 7. Example of quantitation of protein-glutamine gamma-glutamyltransferase 2 at the protein level (A) and the peptide level (B). Peptides shown: GYESVDSLTFGVTGPDPSEEGTK, NEFGELESNK, YPEGSPEER, GLLIEPNSYLLER, DLYLENPEIK and SVEVSDPVPGDLVK. Series: Control and Transgenic mice (protein level); Control_2, imhc_1 and imhc_2 (peptide level).

Figure 8. Biological annotation with ProteinCenter. A statistical analysis was performed against the Swiss-Prot Mouse database for KEGG pathway annotations. (A) Down-regulated proteins with ratios ≤ 0.5 (R = 0.24, 0.23, 0.34, 0.34, 0.34, 0.21). (B) Up-regulated proteins with ratios ≥ 2 (R > 100, R = 23.3, 4.1, 2.7, R > 100, R = 4.3). R = Ratio.

Conclusion
A complete biomarker discovery workflow, including label-free analysis followed by targeted quantitation of potential biomarkers, was applied to the study of MHC-I induced mice developing myopathy. Pinpoint software was used to perform targeted quantitation based on full-scan MS measurements from five separately acquired LC-MS/MS runs, corresponding to two control samples and three transgenic mice with severe myopathy. A total of 230 proteins were quantified at high stringency based on the discovery data. Several candidate biomarkers were identified and confirmed with protein expression ratios below 0.5 or above 2. These biomarkers could be correlated to metabolic pathways that are known to be associated with the disease.

www.thermoscientific.com

Legal Notices: © 2011 Thermo Fisher Scientific Inc. All rights reserved. SEQUEST is a registered trademark of the University of Washington. Swiss-Prot is a registered trademark of SIB Institut Suisse de Bioinformatique Foundation Switzerland. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information results from not-for-profit research and is presented solely as an example of the capabilities of Thermo Fisher Scientific Inc. products.
It is not intended to encourage use of these products in any manner that might infringe the intellectual property rights of others. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult your local sales representative for details. Thermo Fisher Scientific, San Jose, CA USA is ISO Certified. AN63495_E 10/11S