Tackling the data analysis challenge for characterisation of biotherapeutics
Carsten P Sönksen, Ph.D., Novo Nordisk
CASSS AT 2015, Berlin, March 2015
Personal background: Carsten P Sönksen
Experience:
- ~19 years with mass spectrometry
- ~14 years in an industrial setting
Responsibility: Senior Research Scientist, responsible for MS-based protein characterisation of biopharmaceuticals
Dept.: CMC - analytical support, Novo Nordisk
Implemented Genedata Expressionist for vendor-independent data processing and analysis
Case study: Characterisation of the charged isoforms of stressed IgG4
- Stressed IgG4 sample was analysed by icIEF
- Preparative isoelectric focusing (Agilent) to isolate fractions for characterisation by MS
- MS characterisation by:
  - Intact mass analysis: LC-MS (Waters)
  - Tryptic peptide map: LC-MS/MS (Thermo)
- Raw data processing, analysis, and reporting: Genedata Expressionist for Mass Spectrometry
icIEF electropherograms of stressed sample and isolated preparative fractions
Challenge: Several peaks in each fraction
[Figure: overlaid electropherograms; x-axis: pI (6.4 to 7.4); traces: Load, Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2, Basic 3]
Intact LC-MS data: Single sample view vs. overlay
Weak patterns become clearer
Single sample view: Acidic 3. Main form: +2 G0F, -2 K, 2 Pyro-glu
Overlay: Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2
Annotated mass shifts (x-axis: Mass):
- -F: -146 Da
- +K: +128 Da
- +2 K: +256 Da
- -2 Pyro-glu: +34 Da
- +G: +162 Da
- +2 G: +324 Da
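The mass shifts annotated on this slide can be used to label peaks in a deconvoluted intact-mass spectrum relative to the main form. The following is a minimal sketch of that idea, not the workflow used in Genedata Expressionist: the function names, the 2 Da matching tolerance, and the example masses in the usage note are assumptions made for illustration. Only the nominal shift values come from the slide.

```python
# Nominal mass shifts (Da) relative to the main form, as annotated on the slide.
SLIDE_SHIFTS_DA = {
    "-F": -146.0,
    "+K": 128.0,
    "+2 K": 256.0,
    "-2 Pyro-glu": 34.0,
    "+G": 162.0,
    "+2 G": 324.0,
}


def annotate_shift(delta_da, tolerance_da=2.0):
    """Return the slide labels whose nominal shift is within tolerance_da of delta_da.

    tolerance_da is an assumed value; choose it to match the mass accuracy
    of the deconvoluted intact-mass data.
    """
    return sorted(
        label
        for label, shift in SLIDE_SHIFTS_DA.items()
        if abs(delta_da - shift) <= tolerance_da
    )


def annotate_peaks(main_mass, peak_masses, tolerance_da=2.0):
    """Map each observed peak mass to the labels matching its offset from main_mass."""
    return {
        mass: annotate_shift(mass - main_mass, tolerance_da)
        for mass in peak_masses
    }
```

For example, with a hypothetical main-form mass of 148000.0 Da, `annotate_peaks(148000.0, [148128.5, 148500.0])` labels the first peak as `+K` and leaves the second unlabelled, which is the kind of peak that would then go on to manual inspection.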
Quality check of tryptic peptide maps (LC-MS/MS): Box plot analysis
1: Average signal intensity analysis
- check for differences
- check for abnormal runs
2: Set Acidic 2 and Basic 3 on the watch list
[Figure: box plots for Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2, Basic 3]
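The box plot check compares the signal intensity distribution of each peptide map and flags runs that stand apart from the rest. A minimal sketch of such a check is given below, using only the Python standard library. It is an illustration under stated assumptions, not the Genedata implementation: the log10 transform, the function names, and the 0.5 log-unit threshold are all assumed.

```python
import math
import statistics


def box_stats(intensities):
    """Five-number summary of log10 intensities for one run (needs >= 2 positive values)."""
    logs = sorted(math.log10(x) for x in intensities if x > 0)
    q1, median, q3 = statistics.quantiles(logs, n=4)
    return {"min": logs[0], "q1": q1, "median": median, "q3": q3, "max": logs[-1]}


def flag_abnormal_runs(runs, max_median_offset=0.5):
    """Return names of runs whose median log10 intensity deviates from the
    median of all run medians by more than max_median_offset.

    max_median_offset (in log10 units) is an assumed threshold.
    """
    medians = {name: box_stats(values)["median"] for name, values in runs.items()}
    centre = statistics.median(medians.values())
    return sorted(
        name for name, med in medians.items() if abs(med - centre) > max_median_offset
    )
```

Here `runs` would be a dictionary mapping each fraction name to its list of peak intensities. Flagged runs are candidates for a watch list rather than automatic exclusions, which mirrors how the slide treats Acidic 2 and Basic 3.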
Guided vs. blind analysis: Only a fraction of the peaks is identified (red crosses)
[Figure panels: Acidic 3 peptide map TIC chromatogram; Acidic 3 peptide map TIC 2D heat map]
Many unidentified high-intensity peaks still need identification
- ~345 of 530 charge clusters are not identified
- Further identification analysis is needed
[Figure labels: Mass 5556.89, Charge 4; Mass 5556.89, Charge 3]
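The two figure labels show the same mass (5556.89) observed at charge 3 and charge 4, i.e. one species forming a charge cluster. The relation between neutral mass and m/z can be sketched as follows, assuming the reported value is the neutral mass and that the ions are protonated in positive mode. The function names are illustrative.

```python
PROTON_MASS_DA = 1.007276466  # mass of a proton in Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a given neutral mass and positive charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS_DA) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from the m/z of an [M + zH]z+ ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS_DA)


if __name__ == "__main__":
    for z in (3, 4):
        print(f"charge {z}: m/z = {mz_from_mass(5556.89, z):.4f}")
```

Under these assumptions, two peaks belong to the same charge cluster when `mass_from_mz` returns the same neutral mass for both within the instrument's mass accuracy.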
Are the detected modifications valid?
Check modifications in the mass spectral data
[Figure panels: Cleaned Data; Raw Data. Labels: Dominant form HC aa 309-324; Dehydrated form?; Deamidated form; Succinimide form]
Absence of the C-terminal lysine (K)
[Figure panels: Peptide with C-terminal lysine on the HC; Peptide without C-terminal lysine on the HC. Each panel shows Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2, Basic 3]
Unbiased statistical analysis of charged isoforms
Contrast analysis: Only the HC C-terminal lysine peptides describe the difference between the preparative fractions, at a significance level of approximately P < 0.05
Distribution of deamidated and succinimide forms
Check signal in raw data
[Figure panels: Deamidated; Succinimide. Each panel shows Acidic 3, Acidic 2, Acidic 1, Major 1, Major 2, Basic 1, Basic 2, Basic 3. Raw-data annotations: Δ = 0.984 Da; Δ = 1.003 Da]
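The two spacings annotated in the raw data differ by only about 0.02 Da, which is why checking the raw signal matters. Deamidation adds 0.984 Da, while 1.003 Da matches the spacing between neighbouring isotope peaks (13C vs. 12C). The slide does not state which annotation belongs to which peak pair, so reading 1.003 Da as the isotope spacing is an assumption here. A minimal sketch of classifying an observed peak spacing follows, with an assumed tolerance of 0.005 Da and illustrative names.

```python
# Reference spacings in Da, using the values annotated on the slide.
# Interpreting 1.003 Da as the isotope spacing is an assumption.
REFERENCE_SPACINGS_DA = {
    "deamidation": 0.984,
    "isotope_spacing": 1.003,
}


def classify_spacing(delta_mz, charge, tolerance_da=0.005):
    """Classify the spacing between two peaks of the same charge state.

    delta_mz is the observed m/z difference; multiplying by the charge
    converts it to a mass difference in Da. Returns the nearest reference
    label, or None if nothing lies within tolerance_da (assumed value).
    """
    delta_da = delta_mz * charge
    label, reference = min(
        REFERENCE_SPACINGS_DA.items(), key=lambda item: abs(delta_da - item[1])
    )
    return label if abs(delta_da - reference) <= tolerance_da else None
```

The tolerance has to stay well below half the 0.019 Da gap between the two references, which in practice requires high-resolution data. With wider tolerances the two cases cannot be told apart.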
Distribution of N-glycosylation forms
[Figure panels: G0F; G1F; G0; No N-glyco]
Conclusion
- Loss of lysine at the C-terminus of the heavy chain explains the basic isoforms
- Deamidation and succinimide explain only part of the acidic isoforms
- Neutral modifications like G0F are fractionated as well
[Figure: icIEF electropherogram, pI 6.4 to 7.4, annotated with: +2 G0F, -2 K, 2 Pyro-glu; G1F; G0; Deamidation; C-term Lysine; Deamidation]
Genedata Expressionist has become our standard platform for biopharmaceutical characterisation
Why did we look beyond vendor software?
- So much data, so little time
- Instruments from 5 vendors -> 5 software packages to learn
- Automating standardized work and using the freed-up time to dig deeper into interesting peaks
Workflow-based system enables:
- Reporting of data, results and parameters
- Swift analysis of samples in parallel
- Fast iterations for reanalysis
- Excellent visualisation of data
- Confirmation of results in raw data
Data analytical recommendations
- Overlaying data processed in parallel is a simple, powerful approach to verify patterns
- Box plot analysis is a simple and fast way to check peptide load integrity between samples
- Automation saves time on routine tasks, which can then be spent on high-intensity unidentified peaks
- Always verify conclusions by looking at both the processed and the raw data
- Sophisticated visualization aids unbiased and complete characterization
Thanks to
- Brian Kristensen, Novo Nordisk
- Ingelise Fabrin, Novo Nordisk
- Leif H. Bagger, Novo Nordisk
- Arnd Brandenburg, Genedata