OplAnalyzer: A Toolbox for MALDI-TOF Mass Spectrometry Data Analysis

Thang V. Pham and Connie R. Jimenez

OncoProteomics Laboratory, Cancer Center Amsterdam, VU University Medical Center
De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands

Abstract. We present a software package for the analysis of MALDI-TOF mass spectrometry data. The software is designed to facilitate a complete exploratory workflow: pre-processing of raw spectral data, specification of study groups for comparison, statistical differential analysis, visualization of peptide peaks, and classification. The software supports various external tools for these tasks. We also pay special attention to the iterative nature of a typical analysis. Finally, we present two proteomics studies where the software has been used for data analysis.

Keywords: data analysis, differential analysis, bio-marker discovery, MALDI-TOF, mass spectrometry, OplAnalyzer, proteomics.

1 Introduction

Mass spectrometry is an attractive method in proteomics research because of its ability to identify and quantify a large number of proteins in complex biological samples [1]. However, the pre-processing and analysis of mass spectrometry data are fast becoming a bottleneck in the discovery process. This paper describes a software platform developed in our laboratory called OplAnalyzer, which supports the pre-processing and analysis of proteomics mass spectrometry data. Specifically, we deal with MALDI-TOF mass spectrometry, a standard high-throughput platform that can potentially be used for various diagnostic purposes.

There are a number of tasks involved in a typical analysis: pre-processing of raw spectral data, specification of study groups for comparison, statistical differential analysis, visualization of peptide peaks, and classification [2]. Instead of integrating all these components into a single tool for a complete analysis, we develop a flexible platform where various existing tools for different tasks are accommodated. Our design also supports the interactive nature of the analysis process.

Currently, the software supports the analysis of MALDI-TOF MS-1 data only. Tools for the analysis of MS/MS data with protein identification, as well as data from another mass spectrometry platform, namely LC-FTMS, are under active development.

P. Perner and O. Salvetti (Eds.): MDA 2008, LNAI 5108, pp , © Springer-Verlag Berlin Heidelberg 2008

The analysis workflow and the system are described in Section 2. In Section 3 we present two proteomics studies where the software has been employed for data analysis.

2 The System

Fig. 1 shows a typical workflow in proteomics mass spectrometry data analysis. The four main steps are: data pre-processing, sample grouping, exploratory analysis, and batch processing.

Fig. 1. An analysis workflow: (a) data pre-processing; (b) sample grouping; (c) exploratory analysis (differential analysis, classification, visualization); (d) batch processing.

2.1 Data Pre-processing

The data pre-processing step includes the preparation of metadata and the processing of raw mass spectrometry signals, which consists of peak detection, alignment, normalization, and deisotoping. To facilitate the use of existing tools, we define a common data format between this step and the subsequent steps, which is simply based on tab-separated text.

For our instrument, a 4800 MALDI-TOF/TOF mass spectrometer (Applied Biosystems, Foster City, USA), we found that the MarkerView software (Applied Biosystems) works well for data produced in the reflectron mode. For data produced in the linear mode we have implemented a new method.

To detect peaks in an individual spectrum, we search for locations of maximal value within a local m/z window. The size of the window is 11 discrete sampling points. This method is similar to the peak detection method employed in [4].
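As an illustration, the local-maximum rule described above can be sketched in a few lines of Python. This is not the authors' implementation (the paper relies on Matlab and external tools). The function name `detect_peaks`, the tie-breaking rule, and the exclusion of perfectly flat windows are assumptions made for this sketch.

```python
def detect_peaks(intensities, window=11):
    """Return indices whose intensity is the maximum of a centred window.

    `window` is the window size in discrete sampling points (11 in the text).
    Assumptions of this sketch: the window is truncated at the spectrum
    edges, ties are resolved in favour of the left-most point, and
    perfectly flat windows are not reported as peaks.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd number >= 3")
    half = window // 2
    n = len(intensities)
    peaks = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        segment = intensities[lo:hi]
        value = intensities[i]
        is_window_max = value == max(segment)
        is_first_max = segment.index(value) == i - lo  # left-most tie wins
        is_not_flat = value > min(segment)
        if is_window_max and is_first_max and is_not_flat:
            peaks.append(i)
    return peaks


# Toy spectrum with two clear maxima.
spectrum = [0, 1, 3, 7, 3, 1, 0, 0, 0, 0, 0, 0, 1, 2, 9, 2, 1, 0, 0, 0]
peak_indices = detect_peaks(spectrum, window=11)
print(peak_indices)
```

With a fixed window of 11 points, two maxima closer than six sampling points cannot both be reported; only the larger one survives.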

Fig. 2. Peak alignment. For each common peak p_M in the mean spectrum, the closest peak p_I in each individual spectrum is located. If the distance d between the two peaks is less than 5 Da, the value at point A is registered for the common peak p_M in this particular spectrum. Otherwise, the value at B is registered.

To find peaks that are common to all spectra, we apply peak detection to the mean spectrum, analogously to [5]. Subsequently, peaks in an individual spectrum are aligned to this set of common peaks as follows. For each common peak, its value in an individual spectrum is that of the closest detected peak in that spectrum if the distance between the common peak and the closest peak (along the m/z axis) is less than 5 Da. (A better choice is likely to be based on the actual mass accuracy of the measurement and on the m/z value.) If there is no such peak, the value is simply set to the value of the spectrum at the m/z location of the common peak. Figure 2 illustrates the procedure. By visual inspection, we found that the quality of our alignment method is comparable to that of the more computationally expensive clustering method in [4] (data not shown).

2.2 Sample Grouping

Typically, researchers are interested in several comparisons in each experiment, for example, comparisons based on gender, age, and clinical outcomes. Also, in an interactive analysis the user might want to modify the sample groups, for instance to include or exclude certain samples. To enable efficient sample grouping, we define a text-based sample selection based on metadata. The strategy is easy to use and particularly suited for batch processing. For example, to specify two groups, Healthy, consisting of samples from healthy individuals, and Cancer, consisting of samples from cancer patients before treatment, the selection is as follows.
Healthy:Cancer-type=Healthy;Cancer:Cancer-type=NSCLC,Time=PreTx
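To make the selection string above concrete, here is a minimal Python sketch of how it might be parsed and applied to sample metadata. The grammar is inferred from the single example in the text and is an assumption: groups are separated by `;`, the group name precedes the first `:`, and conditions are comma-separated `key=value` pairs that must all hold. The function names and the metadata records are hypothetical.

```python
def parse_selection(selection):
    """Parse 'Name:key=value,key=value;Name:...' into {name: {key: value}}.

    Assumed grammar (inferred from one example, not documented by the paper):
    ';' separates groups, ':' separates the group name from its conditions,
    and ',' separates conditions that must all be satisfied.
    """
    groups = {}
    for part in selection.split(";"):
        name, _, conditions = part.partition(":")
        criteria = {}
        for condition in conditions.split(","):
            key, _, value = condition.partition("=")
            criteria[key.strip()] = value.strip()
        groups[name.strip()] = criteria
    return groups


def select_samples(metadata, criteria):
    """Return the ids of samples whose metadata match every criterion."""
    return [
        sample_id
        for sample_id, fields in metadata.items()
        if all(fields.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical metadata table: sample id -> metadata fields.
metadata = {
    "S01": {"Cancer-type": "Healthy", "Time": "NA"},
    "S02": {"Cancer-type": "NSCLC", "Time": "PreTx"},
    "S03": {"Cancer-type": "NSCLC", "Time": "EndTx"},
}

selection = "Healthy:Cancer-type=Healthy;Cancer:Cancer-type=NSCLC,Time=PreTx"
for group_name, criteria in parse_selection(selection).items():
    print(group_name, select_samples(metadata, criteria))
```

Because a selection is a single line of text, a list of such strings can be kept in a script and replayed, which fits the batch processing step.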

Fig. 3. A screenshot of the output of the statistical testing module

2.3 Exploratory Analysis

For data analysis we exploit existing tools in Matlab (The MathWorks, Inc). A typical first step is unsupervised analysis with principal component analysis (PCA) using all peptide intensities. Here all data points are projected onto a two- or three-dimensional space for visualization. The projection does not use any information about group labels. The purpose is two-fold. First, one can observe whether the data are clustered in a low-dimensional space according to group labels. Second, one can detect possible outliers or unusual patterns in the data by visual inspection.

For differential analysis, we provide interfaces for the t-test, the Mann-Whitney U test, and the Kruskal-Wallis test. The p-values can be adjusted for multiple testing. The peptides are further subjected to intensity filtering, requiring that the median intensity of at least one group must be greater than 80 units and the fold change of the median intensities of the two groups must be greater than 1.5. (The numbers can be tuned for each study.) Fig. 3 depicts a screenshot of the result of a comparative study.

The candidate peaks are examined visually by spectra overlay. Again, we use the visualization capability of Matlab for this purpose.

Finally, we provide classification model selection with a support vector machine [3]. A grid search method is used to find the optimal parameter values. For each value in the grid, the generalization error is estimated by either leave-one-out cross-validation or repeatedly splitting the data into two partitions randomly, one for training and one for testing. The grid point with the lowest estimated generalization error is selected as our model for classification.
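The intensity filter used in the differential analysis of this section can be written down compactly. The Python sketch below is illustrative only; the function name, the symmetric definition of fold change (larger median divided by smaller median), and the handling of a zero median are assumptions, since the text gives only the two thresholds.

```python
from statistics import median


def passes_intensity_filter(group_a, group_b, min_median=80.0, min_fold_change=1.5):
    """Apply the two intensity criteria to one peptide peak.

    group_a, group_b: intensities of the peak in the two sample groups.
    Criteria from the text: at least one group median must exceed
    `min_median`, and the fold change of the medians must exceed
    `min_fold_change`. Fold change is taken here as larger/smaller
    median (an assumption of this sketch).
    """
    median_a = median(group_a)
    median_b = median(group_b)
    if max(median_a, median_b) <= min_median:
        return False
    low, high = sorted((median_a, median_b))
    if low <= 0:
        # Ratio undefined; treated as an arbitrarily large fold change here.
        return True
    return high / low > min_fold_change


print(passes_intensity_filter([100, 120, 110], [60, 50, 70]))
print(passes_intensity_filter([100, 120, 110], [90, 95, 100]))
```

Both thresholds are exposed as parameters because the text notes that the numbers can be tuned for each study.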

2.4 Batch Processing

We consider batch processing an important step in data analysis, especially with regard to reproducibility of figures and other results. In addition, batch processing helps produce a large number of figures of peptide peaks in a convenient format for visual examination. Again, we make use of the scripting capability of Matlab for this purpose.

3 Examples

In the following, we describe two studies where the current software has been employed for data analysis.

3.1 Time-Course MALDI-TOF-MS Serum Peptide Profiling of Non-small Cell Lung Cancer Patients Treated with Bortezomib, Cisplatin and Gemcitabine

This study performs serum peptide profiling of non-small cell lung cancer (NSCLC) patients treated with gemcitabine, cisplatin and bortezomib combinations before, during, and at the end of treatment to discover peptide patterns associated with treatment-related effects and clinical outcomes [7]. Fig. 4 shows a three-dimensional PCA plot of serum peptide spectra of 13 healthy individuals and the pre-treatment serum spectra of 27 NSCLC patients.

Fig. 4. Principal component analysis (PCA) of healthy versus NSCLC comparison

Fig. 5. (a) Spectra overlay of the eight most differential peaks in the healthy (red) versus NSCLC (blue) comparison according to p-values of the Mann-Whitney U test. All peaks have a p-value less than [...]. (b) Heatmap of the 47 differential peaks in the healthy versus NSCLC comparison shown in the natural log scale. The peaks are ordered by median fold change between the two groups.

Here, the MarkerView software was used for pre-processing, resulting in 682 peptide peaks per raw spectrum. The Mann-Whitney U test is carried out on each of the 682 peptides, resulting in 47 differential peptides. Fig. 5(a) shows the spectra overlay of the eight most differential peaks in the healthy versus NSCLC comparison. Fig. 5(b) shows a heatmap of the 47 differential peaks.

We carried out classification analysis using a support vector machine. A grid search for parameters was employed to find the best model according to leave-one-out cross-validation (LOOCV). Using all 682 peptides, a LOOCV accuracy of 93% was achieved. When the 47 peptides selected by the Mann-Whitney U test were used, the LOOCV accuracy was 98% with 100% sensitivity and 96% specificity.

The software has also been used for a large number of other comparisons such as gender, age, short and long progression-free survival, and clinical treatment responses.
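For readers unfamiliar with the per-peptide test used above, the following Python sketch computes a two-sided Mann-Whitney U test for a single peak. It is not the code used in the study, which relied on Matlab. It uses the normal approximation with tie and continuity corrections rather than the exact distribution, so p-values for very small groups are approximate. The function name is an assumption.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p) where U is the smaller of the two U statistics.
    Ties receive midranks and the variance is tie-corrected; a
    continuity correction of 0.5 is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, remembering the group (0 for x, 1 for y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        tie_size = j - i
        tie_term += tie_size ** 3 - tie_size
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return min(u1, u2), 1.0
    z = max(abs(u1 - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    p_value = math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))
    return min(u1, u2), p_value


healthy = [1.0, 2.0, 3.0, 4.0, 5.0]
cancer = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = mann_whitney_u(healthy, cancer)
print(u_stat, p_value)
```

In a study like the one above, such a function would be called once per peptide peak, giving one p-value per peak for subsequent ranking or multiple-testing adjustment.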

Fig. 6. Mean spectrum and detected peaks in the [...] Da range (horizontal axis: m/z; vertical axis: intensity, transformed value)

3.2 Breast Cancer Study with MALDI-TOF Mass Spectrometry Data of Serum Samples

This study is part of the international competition on mass spectrometry proteomic diagnosis [8][9]. The dataset consists of 153 mass spectra of blood samples drawn from control individuals and patients with breast cancer. The aim is to construct a classification rule separating the two groups with a low generalization error.

For this dataset, the baseline correction had been performed by the competition organizer. We used the software to perform further pre-processing: peak detection and alignment. Fig. 6 shows an example of the result of the pre-processing algorithm.

Again, a Mann-Whitney U test was performed to select features that discriminate significantly between the two classes. Furthermore, the Benjamini-Hochberg false discovery rate correction [6] was employed to correct for multiple testing. This results in on average 117 peaks with a false discovery rate of less than 1%. Fig. 7 shows the distribution of the values of the 16 most discriminative peaks.

We employed grid search with exponential spacing to find the optimal values for support vector machine model selection. The generalization error is estimated by averaging over 200 runs of randomly splitting the given data into two partitions, where the size of the test set is roughly a tenth of the size of the whole dataset. The feature selection was performed for each random splitting procedure, so that fair estimates of classification accuracy were obtained. The final accuracy on a separate validation set of 78 samples is 83%.
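The Benjamini-Hochberg correction [6] mentioned above is short enough to sketch directly. The Python code below computes adjusted p-values with the standard step-up procedure and then keeps the peaks below a chosen false discovery rate. The function names and the toy p-values are illustrative assumptions, not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    The adjusted value for the p-value of rank k (1-based, ascending)
    is min over all ranks j >= k of p_(j) * m / j, where m is the
    number of tests.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_by_fdr(p_values, fdr=0.01):
    """Return indices of tests whose adjusted p-value is below `fdr`."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(adjusted) if q < fdr]


# Toy p-values, one per peak (hypothetical).
p_values = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(p_values))
print(select_by_fdr(p_values, fdr=0.05))
```

When feature selection is repeated inside each random split, as described above, the correction would be applied to the p-values computed from the training partition only.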

Fig. 7. Top 16 differential peaks

4 Summary

The paper has introduced a software toolbox for the pre-processing and statistical analysis of MALDI-TOF mass spectrometry data. Our current development focuses on support for the analysis of MS/MS data with protein identification and of data from another mass spectrometry platform, namely LC-FTMS.

References

1. Jimenez, C.R., Piersma, S., Pham, T.V.: High-throughput and targeted in-depth mass spectrometry-based approaches for biofluid profiling and biomarker discovery. Biomarkers in Medicine 1(4), (2007)
2. Villanueva, J., Martorella, A.J., Lawlor, K., Philip, J., Fleisher, M., Robbins, R.J., Tempst, P.: Serum peptidome patterns that distinguish metastatic thyroid carcinoma from cancer-free controls are unbiased by gender and age. Mol. Cell Proteomics 5, (2006)
3. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, Heidelberg (1999)
4. Tibshirani, R., Hastie, T., Narasimhan, B., Soltys, S., Shi, G., Koong, A., Le, Q.-T.: Sample classification from protein mass spectroscopy, by peak probability contrasts. Bioinformatics 20(17), (2004)
5. Karpievitch, Y.V., Hill, E.G., Smolka, A.J., Morris, J.S., Coombes, K.R., Baggerly, K.A., Almeida, J.S.: PrepMS: TOF MS data graphical preprocessing tool. Bioinformatics 23(2), (2007)
6. Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. B 57, (1995)
7. Voortman, J., Pham, T.V., Knol, J.C., Giaccone, G., Jimenez, C.R.: Time-course MALDI-TOF-MS serum peptide profiling of non-small cell lung cancer patients treated with bortezomib, cisplatin and gemcitabine. In: Proceedings of American Society of Clinical Oncology (ASCO) 2008 Annual Meeting, Chicago, USA (2008)
8. Mertens, B.: International competition on mass spectrometry proteomic diagnosis. Statistical Applications in Genetics and Molecular Biology 7(2), Article 1 (2008)
9. Pham, T.V., van de Wiel, M.A., Jimenez, C.R.: Support vector machine approach to separate control and breast cancer serum samples. Statistical Applications in Genetics and Molecular Biology 7(2), Article 11 (January 2008)

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant Statistical Analysis NBAF-B Metabolomics Masterclass Mark Viant 1. Introduction 2. Univariate analysis Overview of lecture 3. Unsupervised multivariate analysis Principal components analysis (PCA) Interpreting

More information

MarkerView Software 1.2.1 for Metabolomic and Biomarker Profiling Analysis

MarkerView Software 1.2.1 for Metabolomic and Biomarker Profiling Analysis MarkerView Software 1.2.1 for Metabolomic and Biomarker Profiling Analysis Overview MarkerView software is a novel program designed for metabolomics applications and biomarker profiling workflows 1. Using

More information

Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments

Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments Mario Cannataro, Pietro Hiram Guzzi, Tommaso Mazza, and Pierangelo Veltri University Magna Græcia of Catanzaro, 88100

More information

Tutorial for proteome data analysis using the Perseus software platform

Tutorial for proteome data analysis using the Perseus software platform Tutorial for proteome data analysis using the Perseus software platform Laboratory of Mass Spectrometry, LNBio, CNPEM Tutorial version 1.0, January 2014. Note: This tutorial was written based on the information

More information

Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data

Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data M. Cannataro, P. H. Guzzi, T. Mazza, and P. Veltri Università Magna Græcia di Catanzaro, Italy 1 Introduction Mass Spectrometry

More information

Biomarker Discovery and Data Visualization Tool for Ovarian Cancer Screening

Biomarker Discovery and Data Visualization Tool for Ovarian Cancer Screening , pp.169-178 http://dx.doi.org/10.14257/ijbsbt.2014.6.2.17 Biomarker Discovery and Data Visualization Tool for Ovarian Cancer Screening Ki-Seok Cheong 2,3, Hye-Jeong Song 1,3, Chan-Young Park 1,3, Jong-Dae

More information

Integrated Data Mining Strategy for Effective Metabolomic Data Analysis

Integrated Data Mining Strategy for Effective Metabolomic Data Analysis The First International Symposium on Optimization and Systems Biology (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 45 51 Integrated Data Mining Strategy for Effective Metabolomic

More information

Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays

Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays Increasing the Multiplexing of High Resolution Targeted Peptide Quantification Assays Scheduled MRM HR Workflow on the TripleTOF Systems Jenny Albanese, Christie Hunter AB SCIEX, USA Targeted quantitative

More information

Functional Data Analysis of MALDI TOF Protein Spectra

Functional Data Analysis of MALDI TOF Protein Spectra Functional Data Analysis of MALDI TOF Protein Spectra Dean Billheimer dean.billheimer@vanderbilt.edu. Department of Biostatistics Vanderbilt University Vanderbilt Ingram Cancer Center FDA for MALDI TOF

More information

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs

AB SCIEX TOF/TOF 4800 PLUS SYSTEM. Cost effective flexibility for your core needs AB SCIEX TOF/TOF 4800 PLUS SYSTEM Cost effective flexibility for your core needs AB SCIEX TOF/TOF 4800 PLUS SYSTEM It s just what you expect from the industry leader. The AB SCIEX 4800 Plus MALDI TOF/TOF

More information

Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests

Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests Effects of Intelligent Data Acquisition and Fast Laser Speed on Analysis of Complex Protein Digests AB SCIEX TOF/TOF 5800 System with DynamicExit Algorithm and ProteinPilot Software for Robust Protein

More information

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics 泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics 2014 Training Course Wei-Hung Chang ( 張 瑋 宏 ) ABRC, Academia

More information

Session 1. Course Presentation: Mass spectrometry-based proteomics for molecular and cellular biologists

Session 1. Course Presentation: Mass spectrometry-based proteomics for molecular and cellular biologists Program Overview Session 1. Course Presentation: Mass spectrometry-based proteomics for molecular and cellular biologists Session 2. Principles of Mass Spectrometry Session 3. Mass spectrometry based proteomics

More information

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification

MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification MultiQuant Software 2.0 for Targeted Protein / Peptide Quantification Gold Standard for Quantitative Data Processing Because of the sensitivity, selectivity, speed and throughput at which MRM assays can

More information

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE ACCELERATING PROGRESS IS IN OUR GENES AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE GENESPRING GENE EXPRESSION (GX) MASS PROFILER PROFESSIONAL (MPP) PATHWAY ARCHITECT (PA) See Deeper. Reach Further. BIOINFORMATICS

More information

Learning Objectives:

Learning Objectives: Proteomics Methodology for LC-MS/MS Data Analysis Methodology for LC-MS/MS Data Analysis Peptide mass spectrum data of individual protein obtained from LC-MS/MS has to be analyzed for identification of

More information

Pep-Miner: A Novel Technology for Mass Spectrometry-Based Proteomics

Pep-Miner: A Novel Technology for Mass Spectrometry-Based Proteomics Pep-Miner: A Novel Technology for Mass Spectrometry-Based Proteomics Ilan Beer Haifa Research Lab Dec 10, 2002 Pep-Miner s Location in the Life Sciences World The post-genome era - the age of proteome

More information

A Streamlined Workflow for Untargeted Metabolomics

A Streamlined Workflow for Untargeted Metabolomics A Streamlined Workflow for Untargeted Metabolomics Employing XCMS plus, a Simultaneous Data Processing and Metabolite Identification Software Package for Rapid Untargeted Metabolite Screening Baljit K.

More information

In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates

In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates In-Depth Qualitative Analysis of Complex Proteomic Samples Using High Quality MS/MS at Fast Acquisition Rates Using the Explore Workflow on the AB SCIEX TripleTOF 5600 System A major challenge in proteomics

More information

Quantitative proteomics background

Quantitative proteomics background Proteomics data analysis seminar Quantitative proteomics and transcriptomics of anaerobic and aerobic yeast cultures reveals post transcriptional regulation of key cellular processes de Groot, M., Daran

More information

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF Tutorial for Proteomics Data Submission Katalin F. Medzihradszky Robert J. Chalkley UCSF Why Have Guidelines? Large-scale proteomics studies create huge amounts of data. It is impossible/impractical to

More information

SELDI-TOF Mass Spectrometry Protein Data By Huong Thi Dieu La

SELDI-TOF Mass Spectrometry Protein Data By Huong Thi Dieu La SELDI-TOF Mass Spectrometry Protein Data By Huong Thi Dieu La References Alejandro Cruz-Marcelo, Rudy Guerra, Marina Vannucci, Yiting Li, Ching C. Lau, and Tsz-Kwong Man. Comparison of algorithms for pre-processing

More information

Sequential projection pursuit principal component analysis dealing with missing data associated with new -omics technologies

Sequential projection pursuit principal component analysis dealing with missing data associated with new -omics technologies Supplementary Material for: Sequential projection pursuit principal component analysis dealing with missing data associated with new -omics technologies Bobbie-Jo M. Webb-Robertson 1*, Melissa M. Matzke

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

Using MATLAB: Bioinformatics Toolbox for Life Sciences

Using MATLAB: Bioinformatics Toolbox for Life Sciences Using MATLAB: Bioinformatics Toolbox for Life Sciences MR. SARAWUT WONGPHAYAK BIOINFORMATICS PROGRAM, SCHOOL OF BIORESOURCES AND TECHNOLOGY, AND SCHOOL OF INFORMATION TECHNOLOGY, KING MONGKUT S UNIVERSITY

More information

ProteinScape. Innovation with Integrity. Proteomics Data Analysis & Management. Mass Spectrometry

ProteinScape. Innovation with Integrity. Proteomics Data Analysis & Management. Mass Spectrometry ProteinScape Proteomics Data Analysis & Management Innovation with Integrity Mass Spectrometry ProteinScape a Virtual Environment for Successful Proteomics To overcome the growing complexity of proteomics

More information

Un (bref) aperçu des méthodes et outils de fouilles et de visualisation de données «omics»

Un (bref) aperçu des méthodes et outils de fouilles et de visualisation de données «omics» Un (bref) aperçu des méthodes et outils de fouilles et de visualisation de données «omics» Workshop «Protéomique & Maladies rares» 25 th September 2012, Paris yves.vandenbrouck@cea.fr CEA Grenoble irtsv

More information

DeCyder Extended Data Analysis module Version 1.0

DeCyder Extended Data Analysis module Version 1.0 GE Healthcare DeCyder Extended Data Analysis module Version 1.0 Module for DeCyder 2D version 6.5 User Manual Contents 1 Introduction 1.1 Introduction... 7 1.2 The DeCyder EDA User Manual... 9 1.3 Getting

More information

Maschinelles Lernen mit MATLAB

Maschinelles Lernen mit MATLAB Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical

More information

Aiping Lu. Key Laboratory of System Biology Chinese Academic Society APLV@sibs.ac.cn

Aiping Lu. Key Laboratory of System Biology Chinese Academic Society APLV@sibs.ac.cn Aiping Lu Key Laboratory of System Biology Chinese Academic Society APLV@sibs.ac.cn Proteome and Proteomics PROTEin complement expressed by genome Marc Wilkins Electrophoresis. 1995. 16(7):1090-4. proteomics

More information

La Protéomique : Etat de l art et perspectives

La Protéomique : Etat de l art et perspectives La Protéomique : Etat de l art et perspectives Odile Schiltz Institut de Pharmacologie et de Biologie Structurale CNRS, Université de Toulouse, Odile.Schiltz@ipbs.fr Protéomique et Spectrométrie de Masse

More information

Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method

Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method Application Note # LCMS-81 Introducing New Proteomics Acquisiton Strategies with the compact Towards the Universal Proteomics Acquisition Method Introduction During the last decade, the complexity of samples

More information

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics With Unique QTRAP and TripleTOF 5600 System Technology Targeted peptide quantification is a rapidly growing application

More information

The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring

The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring The Scheduled MRM Algorithm Enables Intelligent Use of Retention Time During Multiple Reaction Monitoring Delivering up to 2500 MRM Transitions per LC Run Christie Hunter 1, Brigitte Simons 2 1 AB SCIEX,

More information

ProteinPilot Report for ProteinPilot Software

ProteinPilot Report for ProteinPilot Software ProteinPilot Report for ProteinPilot Software Detailed Analysis of Protein Identification / Quantitation Results Automatically Sean L Seymour, Christie Hunter SCIEX, USA Pow erful mass spectrometers like

More information

Statistical Analysis Strategies for Shotgun Proteomics Data

Statistical Analysis Strategies for Shotgun Proteomics Data Statistical Analysis Strategies for Shotgun Proteomics Data Ming Li, Ph.D. Cancer Biostatistics Center Vanderbilt University Medical Center Ayers Institute Biomarker Pipeline normal shotgun proteome analysis

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

Alignment and Preprocessing for Data Analysis

Alignment and Preprocessing for Data Analysis Alignment and Preprocessing for Data Analysis Preprocessing tools for chromatography Basics of alignment GC FID (D) data and issues PCA F Ratios GC MS (D) data and issues PCA F Ratios PARAFAC Piecewise

More information

Bruker ToxScreener TM. Innovation with Integrity. A Comprehensive Screening Solution for Forensic Toxicology UHR-TOF MS

Bruker ToxScreener TM. Innovation with Integrity. A Comprehensive Screening Solution for Forensic Toxicology UHR-TOF MS Bruker ToxScreener TM A Comprehensive Screening Solution for Forensic Toxicology Innovation with Integrity UHR-TOF MS ToxScreener - Get the Complete Picture Forensic laboratories are frequently required

More information

Global and Discovery Proteomics Lecture Agenda

Global and Discovery Proteomics Lecture Agenda Global and Discovery Proteomics Christine A. Jelinek, Ph.D. Johns Hopkins University School of Medicine Department of Pharmacology and Molecular Sciences Middle Atlantic Mass Spectrometry Laboratory Global

More information

PeptidomicsDB: a new platform for sharing MS/MS data.

PeptidomicsDB: a new platform for sharing MS/MS data. PeptidomicsDB: a new platform for sharing MS/MS data. Federica Viti, Ivan Merelli, Dario Di Silvestre, Pietro Brunetti, Luciano Milanesi, Pierluigi Mauri NETTAB2010 Napoli, 01/12/2010 Mass Spectrometry

More information

using ms based proteomics

using ms based proteomics quantification using ms based proteomics lennart martens Computational Omics and Systems Biology Group Department of Medical Protein Research, VIB Department of Biochemistry, Ghent University Ghent, Belgium

More information
