Deep profiling of multitube flow cytometry data Supplemental information

Kieran O'Neill et al.
December 19,

Table S1: Markers in simulated multitube data. The data was split into three tubes, each containing CD3, CD4 and CD8 in addition to FSC and SSC. The remaining nine markers were distributed across the tubes, three per tube.

Marker type                 Tube 1    Tube 2    Tube 3
Common (scatter)            FSC       FSC       FSC
Common (scatter)            SSC       SSC       SSC
Common (fluorescent)        CD3       CD3       CD3
Common (fluorescent)        CD4       CD4       CD4
Common (fluorescent)        CD8       CD8       CD8
Phenotyping (fluorescent)   KI67      CD57      CD27
Phenotyping (fluorescent)   CD28      CCR5      CCR7
Phenotyping (fluorescent)   CD45RO    CD19      CD127

Figure S1: Overview of the flowBin pipeline, applied to one multitube sample. 1) FCM data from individual aliquot tubes is quantile normalised in terms of the common population markers present in every tube. 2) The tubes are then binned in terms of these population markers, using either K-means or flowFP. 3) The bins from the first tube are mapped to the other tubes (by nearest-neighbour mapping for K-means bins, or directly for flowFP bins). 4) The expression of each bin in terms of each phenotyping marker (those markers differing across tubes) is measured. This may be done by taking the median fluorescence intensity, the normalised median fluorescence intensity, or the proportion of cells exceeding the 98th percentile of a negative control. The final result is a high-dimensional matrix containing expression levels for each bin in terms of each unique marker.
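Step 4 of the pipeline can be illustrated with a minimal Python sketch. It is not the flowBin implementation: the function names, the linear-interpolation percentile, and the input layout (plain lists of intensities for one bin and one marker) are assumptions made for illustration. The normalised median variant is left out because the caption does not define the normalisation.

```python
from statistics import median


def percentile(values, q):
    """Return the q-th percentile (0-100) of values by linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def bin_expression(intensities, negative_control=None, method="mfi"):
    """Summarise one phenotyping marker within one bin.

    method="mfi": median fluorescence intensity of the bin's cells.
    method="prop_positive": fraction of the bin's cells brighter than the
    98th percentile of a negative control (hypothetical input here).
    """
    if method == "mfi":
        return median(intensities)
    if method == "prop_positive":
        if negative_control is None:
            raise ValueError("prop_positive needs a negative control")
        cutoff = percentile(negative_control, 98)
        return sum(1 for x in intensities if x > cutoff) / len(intensities)
    raise ValueError(f"unknown method: {method}")
```

Calling `bin_expression` once per bin and per phenotyping marker would fill one cell of the bins-by-markers matrix described in the caption.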

Figure S2: Pipeline used to determine NPM1-associated immunophenotypes in AML. Steps taken are denoted by arrows, while the data consumed/produced is indicated in boxes. FCM was performed in the clinic historically; all other steps were computational. The end result was a list of 801 cell types which showed a significant difference in abundance between NPM1 mutated and wild-type patients.

[Diagram labels, in pipeline order. Bone marrow: aliquot, stain, run flow cytometry. FCS data: 129 patients, 7-10 tubes/patient, 20,000 cells, 3-4 markers/tube. Combine tubes using flowBin. Cell clusters: 128 clusters/patient, 17 markers each. Type clusters using flowType (1:6-combinations). Cell type proportions: 616,285 cell types per patient. Wilcoxon rank sum vs patient NPM1 status with Holm correction: 801 cell types associated with NPM1. Markers: FSC, SSC, CD45, HLA-DR, CD13, CD34, CD20, CD19, CD10, CD61, CD56, CD33, CD64, CD117, CD14, CD7, CD2, CD4, CD3.]
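The Holm correction named in the pipeline can be sketched as follows. This is a generic step-down Holm adjustment written for illustration, not code from the study; the function name `holm_adjust` is an assumption, and the per-cell-type P-values would come from a separate Wilcoxon rank sum test that is not shown.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P-values in the same order as the input.

    The i-th smallest raw P-value (i starting at 0) is multiplied by
    (m - i), capped at 1, and forced to be non-decreasing along the
    sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A cell type would then be reported as associated with NPM1 status when its adjusted P-value falls below the chosen significance level.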

Figure S3: One, two and three-dimensional representations of quantile normalisation of population markers. Empirical cumulative distribution function (ECDF) plots are shown for all tubes and for forward scatter (FS), the most variant marker. Following normalisation, the ECDF for all tubes is identical, as is expected from quantile normalisation. Two-dimensional scatter plots for representative tubes show the improvement in two-dimensional registration. Lastly, flowFP plots show the improvement in three-dimensional registration, measured by the standard deviation of the number of cells falling within each bin, after bins have been fitted to the consensus of all tubes.
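The reason the ECDFs coincide after normalisation can be seen in a small sketch of quantile normalisation for one marker. This is an illustrative version, not the flowBin code: it assumes every tube holds the same number of events, uses the mean across tubes at each rank as the reference distribution, and breaks ties by original order.

```python
def quantile_normalise(tubes):
    """Quantile normalise one marker across tubes.

    tubes: list of equal-length lists, one list of intensities per tube.
    Each value is replaced by the mean, across tubes, of the values that
    share its rank, so every tube ends up with the same sorted values.
    """
    n = len(tubes[0])
    if any(len(t) != n for t in tubes):
        raise ValueError("this sketch assumes equal event counts per tube")
    sorted_tubes = [sorted(t) for t in tubes]
    # Reference distribution: mean of the r-th smallest value over all tubes.
    reference = [sum(col) / len(tubes) for col in zip(*sorted_tubes)]
    normalised = []
    for t in tubes:
        order = sorted(range(n), key=lambda i: t[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalised.append(out)
    return normalised
```

Because every tube is mapped onto the same reference values, sorting any normalised tube returns the reference itself, which is what an identical ECDF means.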

Figure S4: The two options for binning within flowBin: K-means and flowFP, as applied to a 7-tube sample. a. and b. show comparisons between the bin labels themselves. K-means creates roughly spherical bins, which conform to the location of cell populations. flowFP creates grid-like bins, which may not conform to the true underlying shape of cell populations. c. shows the number of cells per bin across all tubes, for every bin. flowFP has approximately the same variation in bin density across tubes as K-means (mean SD: 24.6 vs 28.5). However, flowFP yields a far more constant number of cells per bin across bins (SD of means: 0.07 vs 255).
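The nearest-neighbour mapping of K-means bins to other tubes, and the per-bin spread reported in panel c, can be sketched as below. The function names are hypothetical, the centroids are assumed to come from a K-means fit on the first tube (not shown), and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from math import dist
from statistics import stdev


def assign_to_bins(events, centroids):
    """Label each event with the index of its nearest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda k: dist(event, centroids[k]))
        for event in events
    ]


def cells_per_bin(labels, n_bins):
    """Count how many events fell into each bin."""
    counts = [0] * n_bins
    for label in labels:
        counts[label] += 1
    return counts


def sd_across_tubes(counts_by_tube):
    """Per-bin sample SD of the cell count across tubes (needs >= 2 tubes)."""
    return [stdev(per_bin) for per_bin in zip(*counts_by_tube)]
```

Averaging the output of `sd_across_tubes` over bins would give a figure comparable in spirit to the "mean SD" quoted in the caption.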

Figure S5: Comparison between nearest-neighbours merging and flowBin for two tubes computationally sampled from a real data set. a. Raw data (compensated, transformed and filtered for debris), gated for CD3+ cells, and showing the true CD4 and CD8 distribution. b. The two sampled tubes, one containing CD4 and the other CD8. The CD4+ population has slightly higher average CD3 than the CD8+ population, but the two have substantially overlapping CD3 distributions. c. Results of merging by nearest neighbours and by flowBin, including the proportion of resulting cells falling within each quadrant. The nearest-neighbours merging created a substantial CD4+CD8+ population not present in the original sample. Both nearest neighbours and flowBin slightly overestimate the CD4-CD8- population. flowBin is more accurate than nearest neighbours at reproducing the CD4+CD8- and CD4-CD8+ populations. d. and e. This analysis was repeated 100 times for each number of bins, with a separate sampling of 5,000 events each time. d. Representative results (those with median RMSD) for selected numbers of bins. e. All results for all numbers of bins and NN merging. The best result (lowest RMSD) was for 128 bins, whereafter increasing bin number caused RMSD to tend towards that of NN.
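The RMSD used to compare merging methods can be sketched as follows, under the assumption (not stated in the caption) that it is computed between the estimated and true proportions of cells in the four CD4/CD8 quadrants. The function name and argument layout are illustrative.

```python
from math import sqrt


def rmsd(estimated, truth):
    """Root-mean-square deviation between two equal-length sequences,
    for example quadrant proportions from a merged sample and from the
    original sample."""
    if len(estimated) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - t) ** 2 for e, t in zip(estimated, truth))
    return sqrt(squared / len(truth))
```

A lower value means the merged data reproduce the original quadrant proportions more closely.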

Figure S6: nu-SVM separation of normal and abnormal cell populations in AML samples. a. Heatmap of all populations within the AML samples that were predicted to be normal. Most can readily be identified as having the properties of common blood and bone marrow cell populations: myeloid cells expressing CD16 and/or CD64, lymphoid cells (dominated by CD3-expressing T-lymphocytes/precursors), and erythroid cells not expressing any of the markers in the panel, including CD45. b. Heatmap of all populations predicted to be abnormal. In contrast to the cells predicted to be normal, many of these express CD34 and CD117, primitive markers typical of stem cells and of AML.

Figure S7: Schema for a voting classifier using flowBin output. a. Training. Every bin from every patient is treated as an individual measurement, labelled with the class of the patient. A classifier is then trained on the entire dataset at once (all bins from all patients). b. Prediction. To predict the class of a new patient, a prediction is made by the trained classifier for every one of the bins from the patient. The majority vote from these predictions is then taken as the overall prediction for the patient.
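The voting scheme of Figure S7 can be sketched in a few lines. The per-bin classifier here is a nearest-centroid rule chosen only to keep the example self-contained; it stands in for whatever classifier is actually trained, and all function names are hypothetical.

```python
from collections import Counter
from math import dist


def train_nearest_centroid(patients):
    """Train on every bin from every patient.

    patients: list of (patient_class, bins), where bins is a list of
    feature vectors. Each bin inherits the class of its patient.
    Returns a function that classifies a single bin.
    """
    sums, counts = {}, Counter()
    for label, bins in patients:
        for vector in bins:
            if label not in sums:
                sums[label] = [0.0] * len(vector)
            sums[label] = [s + x for s, x in zip(sums[label], vector)]
            counts[label] += 1
    centroids = {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }

    def classify(vector):
        return min(centroids, key=lambda label: dist(vector, centroids[label]))

    return classify


def predict_patient(classify, bins):
    """Classify every bin, then return the majority class.

    Ties go to the class encountered first among the bin predictions.
    """
    votes = Counter(classify(vector) for vector in bins)
    return votes.most_common(1)[0][0]
```

With two training patients, one AML and one healthy, a new patient whose bins mostly resemble the AML bins would be called AML even if a minority of bins look healthy.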

Figure S8: Schema for a voting classifier for flowBin output incorporating balanced bagging. a. Training. This is similar to the base classifier (Fig. S7), except that multiple classifiers are trained, each on a bootstrap subsample of patients. Each bootstrap sample is set to contain equal numbers of patients from each class. b. Prediction. To predict the class of a new patient, predictions for each bin from that patient are made by each of the trained classifiers. Final per-bin predictions are taken by majority vote of those predictions. Then, the prediction for the patient is made based on a majority vote of the per-bin predictions.
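The balanced bagging variant can be sketched in the same style. Several details are assumptions rather than facts from the figure: each bootstrap sample draws, with replacement, as many patients per class as the smallest class contains; the base classifier is again a stand-in nearest-centroid rule; and the names and defaults (`n_classifiers=5`, `seed=0`) are hypothetical.

```python
import random
from collections import Counter
from math import dist


def train_centroids(patients):
    """Stand-in base learner: nearest class centroid over all bins."""
    grouped = {}
    for label, bins in patients:
        grouped.setdefault(label, []).extend(bins)
    centroids = {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in grouped.items()
    }
    return lambda v: min(centroids, key=lambda lab: dist(v, centroids[lab]))


def balanced_bootstrap(patients, rng):
    """Sample patients with replacement, the same number from each class."""
    by_class = {}
    for patient in patients:
        by_class.setdefault(patient[0], []).append(patient)
    per_class = min(len(group) for group in by_class.values())
    sample = []
    for group in by_class.values():
        sample.extend(rng.choices(group, k=per_class))
    return sample


def train_bagged(patients, n_classifiers=5, seed=0):
    """Train one classifier per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [
        train_centroids(balanced_bootstrap(patients, rng))
        for _ in range(n_classifiers)
    ]


def majority(labels):
    """Most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def predict_patient(classifiers, bins):
    """Vote across classifiers for each bin, then across bins."""
    per_bin = [majority(clf(v) for clf in classifiers) for v in bins]
    return majority(per_bin)
```

Balancing at the patient level keeps a majority class from dominating every bootstrap sample, at the cost of using fewer patients per classifier.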

Figure S9: An example of RchyOptimyx analysis of one cluster of cell types. As 801 cell types are too many to visualise meaningfully with RchyOptimyx, we clustered the cell types and visualised each cluster in turn. In this example, the addition of CD10- or CD20- makes little difference to the P-value of the cell type CD34+CD61-CD14-. As this was a general trend and in line with reported AML biology, we chose to exclude cell types defined over these markers from further analysis.

[Plot: RchyOptimyx hierarchy running from All Cells through CD34+ to cell types additionally defined over CD61, CD14, CD10, CD20 and CD3; scale: log10(p-value).]

Figure S10: Selected classes of cell types showing significant differences in abundance between NPM1-mt and NPM1-wt. P-values are given after Holm correction. a. Gating for the presence of myeloid lineage markers CD13 and CD33 within the CD34- compartment yields much stronger differences in abundance between NPM1-wt and NPM1-mt than CD34- alone. b. Gating for CD2- within the CD34- compartment yields a slightly better separation than CD34- alone, but gating down further to CD4- and CD13+ yields a cell type that, while present in most NPM1-mt, is absent or below 20% abundance in nearly all NPM1-wt. c. Gating for CD61- and CD14- within the CD34+ compartment leads to a cell type which is common in NPM1-wt but almost entirely absent in NPM1-mt. d. Gating for HLA-DR+ and CD64- within the CD34+ compartment leads to a cell type that occurs in a subset of NPM1-wt but is entirely absent in NPM1-mt.

[Plots, panels a to d: proportion of all cells in wt and mt patients for each cell type. Cell types shown: CD34- (P>0.05), CD34-CD2-, CD34-CD13+, CD34-CD2-CD4+, CD34-CD33+, CD13+CD34-CD33+, CD13+CD34-CD2-CD4+ (P=1.94e-06), CD34+CD61-CD14- (P=0.015), HLA+CD34+CD33-CD64-, CD34+CD61-CD14-CD2+, CD34+CD61-CD2+CD4-, HLA+CD34+CD4-CD64-.]


More information

Selected Topics in Electrical Engineering: Flow Cytometry Data Analysis

Selected Topics in Electrical Engineering: Flow Cytometry Data Analysis Selected Topics in Electrical Engineering: Flow Cytometry Data Analysis Bilge Karaçalı, PhD Department of Electrical and Electronics Engineering Izmir Institute of Technology Outline Compensation and gating

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

Cluster Analysis: Advanced Concepts

Cluster Analysis: Advanced Concepts Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means

More information

Analyzing Samples for CD34 Enumeration Using the BD Stem Cell Enumeration Kit

Analyzing Samples for CD34 Enumeration Using the BD Stem Cell Enumeration Kit 2 Analyzing Samples for CD34 Enumeration Using the BD Stem Cell Enumeration Kit Presented by Ellen Meinelt, MS MLS(ASCP) CM, Technical Applications Specialist and Calin Yuan, Product Course Developer,

More information

COC131 Data Mining - Clustering

COC131 Data Mining - Clustering COC131 Data Mining - Clustering Martin D. Sykora m.d.sykora@lboro.ac.uk Tutorial 05, Friday 20th March 2009 1. Fire up Weka (Waikako Environment for Knowledge Analysis) software, launch the explorer window

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information

BD FACSComp Software Tutorial

BD FACSComp Software Tutorial BD FACSComp Software Tutorial This tutorial guides you through a BD FACSComp software lyse/no-wash assay setup run. If you are already familiar with previous versions of BD FACSComp software on Mac OS

More information

Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode

Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode Iris Sample Data Set Basic Visualization Techniques: Charts, Graphs and Maps CS598 Information Visualization Spring 2010 Many of the exploratory data techniques are illustrated with the Iris Plant data

More information

sample median Sample quartiles sample deciles sample quantiles sample percentiles Exercise 1 five number summary # Create and view a sorted

sample median Sample quartiles sample deciles sample quantiles sample percentiles Exercise 1 five number summary # Create and view a sorted Sample uartiles We have seen that the sample median of a data set {x 1, x, x,, x n }, sorted in increasing order, is a value that divides it in such a way, that exactly half (i.e., 50%) of the sample observations

More information

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it KNIME TUTORIAL Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it Outline Introduction on KNIME KNIME components Exercise: Market Basket Analysis Exercise: Customer Segmentation Exercise:

More information

Scatter Plots with Error Bars

Scatter Plots with Error Bars Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each

More information

Emerging New Prognostic Scoring Systems in Myelodysplastic Syndromes 2012

Emerging New Prognostic Scoring Systems in Myelodysplastic Syndromes 2012 Emerging New Prognostic Scoring Systems in Myelodysplastic Syndromes 2012 Arjan A. van de Loosdrecht, MD, PhD Department of Hematology VU University Medical Center VU-Institute of Cancer and Immunology

More information

Using CyTOF Data with FlowJo Version 10.0.7. Revised 2/3/14

Using CyTOF Data with FlowJo Version 10.0.7. Revised 2/3/14 Using CyTOF Data with FlowJo Version 10.0.7 Revised 2/3/14 Table of Contents 1. Background 2. Scaling and Display Preferences 2.1 Cytometer Based Preferences 2.2 Useful Display Preferences 3. Scale and

More information

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/2004 Hierarchical

More information

Exiqon Array Software Manual. Quick guide to data extraction from mircury LNA microrna Arrays

Exiqon Array Software Manual. Quick guide to data extraction from mircury LNA microrna Arrays Exiqon Array Software Manual Quick guide to data extraction from mircury LNA microrna Arrays March 2010 Table of contents Introduction Overview...................................................... 3 ImaGene

More information

Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

More information

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

More information

Local outlier detection in data forensics: data mining approach to flag unusual schools

Local outlier detection in data forensics: data mining approach to flag unusual schools Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential

More information

II. RELATED WORK. Sentiment Mining

II. RELATED WORK. Sentiment Mining Sentiment Mining Using Ensemble Classification Models Matthew Whitehead and Larry Yaeger Indiana University School of Informatics 901 E. 10th St. Bloomington, IN 47408 {mewhiteh, larryy}@indiana.edu Abstract

More information

Uses of Flow Cytometry

Uses of Flow Cytometry Uses of Flow Cytometry 1. Multicolour analysis... 2 2. Cell Cycle and Proliferation... 3 a. Analysis of Cellular DNA Content... 4 b. Cell Proliferation Assays... 5 3. Immunology... 6 4. Apoptosis... 7

More information

An Introduction to Point Pattern Analysis using CrimeStat

An Introduction to Point Pattern Analysis using CrimeStat Introduction An Introduction to Point Pattern Analysis using CrimeStat Luc Anselin Spatial Analysis Laboratory Department of Agricultural and Consumer Economics University of Illinois, Urbana-Champaign

More information

Exploratory Data Analysis for Ecological Modelling and Decision Support

Exploratory Data Analysis for Ecological Modelling and Decision Support Exploratory Data Analysis for Ecological Modelling and Decision Support Gennady Andrienko & Natalia Andrienko Fraunhofer Institute AIS Sankt Augustin Germany http://www.ais.fraunhofer.de/and 5th ECEM conference,

More information

Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP. Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study.

Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP. Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study. Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study Prepared by: Centers for Disease Control and Prevention National

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

BD FACSCalibur. The flow cytometer for your routine cell analysis needs

BD FACSCalibur. The flow cytometer for your routine cell analysis needs BD FACSCalibur The flow cytometer for your routine cell analysis needs A system with a rich application basis and a modular approach that continues to meet evolving needs of cell analysis worldwide. The

More information

Gates/filters in Flow Cytometry Data Visualization

Gates/filters in Flow Cytometry Data Visualization Gates/filters in Flow Cytometry Data Visualization October 3, Abstract The flowviz package provides tools for visualization of flow cytometry data. This document describes the support for visualizing gates

More information

Dongfeng Li. Autumn 2010

Dongfeng Li. Autumn 2010 Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

More information

A. FSC and SSC gating of total BM cells. B. Gating strategy used to identify the Lin -

A. FSC and SSC gating of total BM cells. B. Gating strategy used to identify the Lin - Supplementary Figure legends Figure S1. Multiparametric analysis of HSC Populations. A. FSC and SSC gating of total BM cells. B. Gating strategy used to identify the Lin - cell population. BM cells were

More information

Course on Functional Analysis. ::: Gene Set Enrichment Analysis - GSEA -

Course on Functional Analysis. ::: Gene Set Enrichment Analysis - GSEA - Course on Functional Analysis ::: Madrid, June 31st, 2007. Gonzalo Gómez, PhD. ggomez@cnio.es Bioinformatics Unit CNIO ::: Contents. 1. Introduction. 2. GSEA Software 3. Data Formats 4. Using GSEA 5. GSEA

More information

Supplementary Material. Free-radical production after post-thaw incubation of ram spermatozoa is related to decreased in vivo fertility

Supplementary Material. Free-radical production after post-thaw incubation of ram spermatozoa is related to decreased in vivo fertility 10.1071/RD14043_AC CSIRO 2015 Supplementary Material: Reproduction, Fertility and Development, 2015, 27(8), 1187 1196. Supplementary Material Free-radical production after post-thaw incubation of ram spermatozoa

More information

Compensation Basics - Bagwell. Compensation Basics. C. Bruce Bagwell MD, Ph.D. Verity Software House, Inc.

Compensation Basics - Bagwell. Compensation Basics. C. Bruce Bagwell MD, Ph.D. Verity Software House, Inc. Compensation Basics C. Bruce Bagwell MD, Ph.D. Verity Software House, Inc. 2003 1 Intrinsic or Autofluorescence p2 ac 1,2 c 1 ac 1,1 p1 In order to describe how the general process of signal cross-over

More information

Supplementary Materials for

Supplementary Materials for www.sciencesignaling.org/cgi/content/full/7/339/ra80/dc1 Supplementary Materials for Manipulation of receptor oligomerization as a strategy to inhibit signaling by TNF superfamily members Julia T. Warren,

More information

Distances, Clustering, and Classification. Heatmaps

Distances, Clustering, and Classification. Heatmaps Distances, Clustering, and Classification Heatmaps 1 Distance Clustering organizes things that are close into groups What does it mean for two genes to be close? What does it mean for two samples to be

More information

Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software

Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software Page 327 Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software Andrew Choo, Thorsten Saeger TriQuint Semiconductor Corporation 2300 NE Brookwood Parkway, Hillsboro, OR 97124

More information

There are a number of different methods that can be used to carry out a cluster analysis; these methods can be classified as follows:

There are a number of different methods that can be used to carry out a cluster analysis; these methods can be classified as follows: Statistics: Rosie Cornish. 2007. 3.1 Cluster Analysis 1 Introduction This handout is designed to provide only a brief introduction to cluster analysis and how it is done. Books giving further details are

More information