flowtrans: A Package for Optimizing Data Transformations for Flow Cytometry



Similar documents
FlowMergeCluster Documentation

Automated Quadratic Characterization of Flow Cytometer Instrument Sensitivity (flowqb Package: Introductory Processing Using Data NIH))

Analyzing Flow Cytometry Data with Bioconductor

THE BIOCONDUCTOR PACKAGE FLOWCORE, A SHARED DEVELOPMENT PLATFORM FOR FLOW CYTOMETRY DATA ANALYSIS IN R

>


Using CyTOF Data with FlowJo Version Revised 2/3/14

Copyright Soleran, Inc. esalestrack On-Demand CRM. Trademarks and all rights reserved. esalestrack is a Soleran product Privacy Statement

Gates/filters in Flow Cytometry Data Visualization

COMPENSATION MIT Flow Cytometry Core Facility

Immunophenotyping Flow Cytometry Tutorial. Contents. Experimental Requirements...1. Data Storage...2. Voltage Adjustments...3. Compensation...

flowcore: data structures package for flow cytometry data

Selected Topics in Electrical Engineering: Flow Cytometry Data Analysis

CyTOF2. Mass cytometry system. Unveil new cell types and function with high-parameter protein detection

CELL CYCLE BASICS. G0/1 = 1X S Phase G2/M = 2X DYE FLUORESCENCE

Introduction to the BioConductor framework Algorithmic Analysis of Flow Cytometry Data (Part 1)

CELL CYCLE BASICS. G0/1 = 1X S Phase G2/M = 2X DYE FLUORESCENCE

Data exploration with Microsoft Excel: analysing more than one variable

Spline Toolbox Release Notes

Technical Bulletin. Threshold and Analysis of Small Particles on the BD Accuri C6 Flow Cytometer

Andrew Ilyas and Nikhil Patil - Optical Illusions - Does Colour affect the illusion?

No-wash, no-lyse detection of leukocytes in human whole blood on the Attune NxT Flow Cytometer

--Verity Software House, ModFit for Proliferation offers online tutorials

FCAP Array v3.0 Software: A New Tool to Analyze BD Cytometric Bead Array (CBA) Data. Monisha Sundarrajan, PhD BD Biosciences

Lecture 2. Summarizing the Sample

BD FACSComp Software Tutorial

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

Method To Solve Linear, Polynomial, or Absolute Value Inequalities:

Principal Component Analysis

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Cell Cycle Tutorial. Contents

Canonical Correlation Analysis

Linear Algebra Review. Vectors

Parametric Technology Corporation. Pro/ENGINEER Wildfire 4.0 Tolerance Analysis Extension Powered by CETOL Technology Reference Guide

Imputing Missing Data using SAS

INSIDE THE BLACK BOX

CyAn : 11 Parameter Desktop Flow Cytometer

DELPHI 27 V 2016 CYTOMETRY STRATEGIES IN THE DIAGNOSIS OF HEMATOLOGICAL DISEASES

Using self-organizing maps for visualization and interpretation of cytometry data

Flow Cytometry for Everyone Else Susan McQuiston, J.D., MLS(ASCP), C.Cy.

Structural Analysis of Network Traffic Flows Eric Kolaczyk

Chapter 3. The Normal Distribution

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Biggar High School Mathematics Department. National 5 Learning Intentions & Success Criteria: Assessing My Progress

MarkerView Software for Metabolomic and Biomarker Profiling Analysis

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

BD FACSDiva TUTORIAL TSRI FLOW CYTOMETRY CORE FACILITY

The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA

Module 3: Correlation and Covariance

Variables Control Charts

an introduction to VISUALIZING DATA by joel laumans

MODULE 7: FINANCIAL REPORTING AND ANALYSIS

Introduction to flow cytometry

Introduction to Matrix Algebra

ARE YOU A RADICAL OR JUST A SQUARE ROOT? EXAMPLES

FACTOR ANALYSIS. Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables.

Compensation Basics - Bagwell. Compensation Basics. C. Bruce Bagwell MD, Ph.D. Verity Software House, Inc.

MBA 611 STATISTICS AND QUANTITATIVE METHODS

+ The Killer Question For Every Manager

Viewing Ecological data using R graphics

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

FIVE STEPS TO MANAGE THE CUSTOMER JOURNEY FOR B2B SUCCESS. ebook

Characteristics of Binomial Distributions

Statistics for Retail Finance. Chapter 8: Regulation and Capital Requirements

Multivariate Analysis of Variance (MANOVA)

An Introduction to the Use of Bayesian Network to Analyze Gene Expression Data

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Research Methodology: Tools

Week 1. Exploratory Data Analysis

VDPs. Setting Up Google Analytics to Measure Your Dealership s Website Performance. A Digital Marketer s Guide

CADENCE LAYOUT TUTORIAL

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

Exhibit memory of previously-learned materials by recalling facts, terms, basic concepts, and answers. Key Words

Data logger and analysis tools

Create a PivotTable or PivotChart report

123count ebeads Catalog Number: Also known as: Absolute cell count beads GPR: General Purpose Reagents. For Laboratory Use.

Estimating Industry Multiples

Data Analysis Tools. Tools for Summarizing Data

Bill Burton Albert Einstein College of Medicine April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Tolerance Charts. Dr. Pulak M. Pandey.

SPSS Introduction. Yi Li

Article from: Risk Management. June 2009 Issue 16

If there is not a Data Analysis option under the DATA menu, you will need to install the Data Analysis ToolPak as an add-in for Microsoft Excel.

App coverage. ericsson White paper Uen Rev B August 2015

What Personality Measures Could Predict Credit Repayment Behavior?

Channel Bonding in DOCSIS 3.0. Greg White Lead Architect Broadband Access CableLabs

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

Katharina Lückerath (AG Dr. Martin Zörnig) adapted from Dr. Jörg Hildmann BD Biosciences,Customer Service

IMPROVING THE CRM SYSTEM IN HEALTHCARE ORGANIZATION

The Visual Statistics System. ViSta

Dongfeng Li. Autumn 2010

Chapter 17. System Adoption

Cognos 8 Best Practices

Microeconomic Theory: Basic Math Concepts

Transcription:

flowtrans: A Package for Optimizing Data Transformations for Flow Cytometry Greg Finak, Raphael Gottardo October 13, 2014 greg.finak@ircm.qc.ca, raphael.gottardo@ircm.qc.ca Contents 1 Licensing 2 2 Overview 2 3 Example: Optimizing Data Transformations for the GvHD Data Set 2 4 Future Improvements 4 1

1 Licensing Under the Artistic License, you are free to use and redistribute this software. Greg Finak, Juan Manuel Perez, Andrew Weng, Raphael Gottardo. Optimizing Data Transformations for Flow Cytometry. (In Preparation) 2 Overview flowtrans is an R package for optimizing the parameters of the most commonly used flow cytometry data transformations with the goal of making the data appear more normally distributed, have decreased skewness or kurtosis. The end result is often a transformation that makes flow cytometry cell populations appear more symmetric, better resolved, thus simplifying population gating. In high throughput experiments, flowtrans reduces the variability in the location of discovered cell populations when compared to untransformed data, or data transformed with default transformation parameters. The package implements a single user callable function flowtrans, which takes a flowframe as input, as well as the names of the dimensions to be transformed, and the name of the transformation function to be applied. There are four transformations available in flowtrans. The transforms are multivariate, are meant to transform two or more dimensions, and will estimate a common set of transformation parameters for all dimensions simultaneously via profile maximum likelihood, with the global distribution of the transformed data tending towards multivariate normal. All the transformation functions in the package are summarized in Table 1. Function Name Transformation Dimensionality Parameters Optimization Criteria mclmultivboxcox Box Cox Multivariate Common Normality mclmultivarcsinh ArcSinh Multivariate Common Normality mclmultivbiexp Biexponential Multivariate Common Normality mclmultivlinlog LinLog Multivariate Common Normality Table 1: A summary of the transformations that can be called via the flowtrans function. 3 Example: Optimizing Data Transformations for the GvHD Data Set We present a simple example, transforming the samples in the GvHD data set. We begin by attaching the data, then transforming each sample in the forward and side scatter dimensions with the parameter optimized multivariate arcsinh transform. 2

> library(flowtrans) > data(gvhd) > transformed<-lapply(as(gvhd[1:4],"list"),function(x)flowtrans(dat=x,fun="mclmultivarcsinh" We can extract the parameters used for the transformation of each dimension in each sample. The returned object is a list of transformed samples, each sample is itself composed of a list with the first element corresponding to the transformed flowframe, and the second element a list of the transformation parameters applied to each dimension. For example, to extract the parameters used to transform sample 2, dimension 1 (FSC) we would call extractparams(x=transformed[[2]],dims=colnames(transformed[[2]])[1]). > parameters<-do.call(rbind,lapply(transformed,function(x)extractparams(x)[[1]])) > parameters; a b c s5a01 0.9999977 0.99999390 0 s5a02 0.0000000 0.28811803 0 s5a03 0.0000000 0.05392773 0 s5a04 1.0000000 0.12207816 0 We see that the parameters, a and b, of arcsinh vary considerably across the different samples, due to the data-dependence of the optimal transformation. We can plot the transformed and untransformed samples for comparison (samples 2 and 4 in this case). > par(mfrow = c(2, 2)) > plot(gvhd[[2]], c("fsc-h", "SSC-H"), main = "Untransformed sample") > contour(gvhd[[2]], c("fsc-h","ssc-h"),add=true); > plot(transformed[[2]]$result, c("fsc-h", "SSC-H"), main = "Transformed sample") > contour(transformed[[2]]$result,c("fsc-h","ssc-h"),add=true); > plot(gvhd[[4]], c("fsc-h", "SSC-H"), main = "Untransformed sample") > contour(gvhd[[4]], c("fsc-h","ssc-h"),add=true); > plot(transformed[[4]]$result, c("fsc-h", "SSC-H"), main = "Transformed sample") > contour(transformed[[4]]$result, c("fsc-h","ssc-h"),add=true); 3

Untransformed sample Transformed sample 0 400 800 1 2 3 4 5 6 0 200 400 600 800 3.5 4.0 4.5 5.0 5.5 6.0 Untransformed sample Transformed sample 0 400 800 2 3 4 5 0 200 400 600 800 3.0 3.5 4.0 4.5 5.0 5.5 Similarly, we can apply the multivariate transformation of the forward and side scatter channels, but return only the parameters, for inclusion in a flowcore workflow. > transformed2<-flowtrans(dat=gvhd[[2]],fun="mclmultivarcsinh",dims=c("fsc-h","ssc-h"),n2f=f > transformed2 a b c 0.000000 0.288118 0.000000 4 Future Improvements In the near future, we plan to implement inverse transformation functions via S4 classes and objects, such that transformed data can be inverse transformed 4

without the need to call internal flowtrans functions. 5