Comparing Functional Data Analysis Approach and Nonparametric Mixed-Effects Modeling Approach for Longitudinal Data Analysis
1 Comparing Functional Data Analysis Approach and Nonparametric Mixed-Effects Modeling Approach for Longitudinal Data Analysis
Hulin Wu, PhD, Professor (with Dr. Shuang Wu)
Department of Biostatistics & Computational Biology
University of Rochester Medical Center
October, 1
Hulin Wu, PhD, Professor (with Dr. Shuang Wu) FDA (UR) and NPME for Longitudinal Data Analysis October, 1 1 / 31
2 Table of contents
1 Introduction
2 Comparisons: NPME vs. FPCA-PACE
3 Comparisons: Individual Smoothing vs. FPCA-integration Method
4 Summary and Conclusion
3 Question to Address
Nonparametric longitudinal data analysis methods:
- Nonparametric mixed-effects models
- Functional PCA
4 Analysis of longitudinal studies
Parametric mixed-effects models: LME and NLME models, e.g.
$$y_i = X_i\beta + Z_i b_i + \epsilon_i, \quad b_i \sim N(0, D), \quad \epsilon_i \sim N(0, R_i), \quad i = 1, 2, \ldots, n$$
- Parametric
- Restrictive
5 Nonparametric mixed-effects (NPME) model
$$y_i(t) = \mu(t) + \nu_i(t) + \epsilon_i(t) = \sum_{j=1}^{p} \beta_j B_j(t) + \sum_{k=1}^{q} b_{ik} B_k(t) + \epsilon_i(t)$$
- Regression splines: the basis functions are known, with various choices available
- Mixed-effects modeling: borrow information across subjects (curves), shrink to the mean
- Estimation: MLE or REML (SAS, R)
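The "shrink to the mean" behaviour can be illustrated with the standard best linear unbiased predictor (BLUP) of the random effects in a linear mixed model. This is a minimal sketch, not the estimation routine used in the talk: it assumes the fixed effects `beta`, the random-effects covariance `D`, and an i.i.d. error variance `sigma2` are already known (in practice they would be estimated by MLE or REML), and the function name `blup` is hypothetical. For the NPME model, `X` and `Z` would hold the spline basis functions evaluated at a subject's observation times.

```python
import numpy as np

def blup(y, X, Z, beta, D, sigma2):
    """Predict one subject's random effects, assuming known variance components.

    Uses b_hat = D Z' (Z D Z' + sigma2 I)^{-1} (y - X beta).
    """
    y = np.asarray(y, dtype=float)
    V = Z @ D @ Z.T + sigma2 * np.eye(y.size)   # marginal covariance of y
    resid = y - X @ beta                        # deviation from the population mean
    return D @ Z.T @ np.linalg.solve(V, resid)

# Random-intercept example (illustrative numbers only):
# 4 observations, each 2 units above the population mean of 1.
X = np.ones((4, 1))
Z = np.ones((4, 1))
beta = np.array([1.0])
y = np.array([3.0, 3.0, 3.0, 3.0])
b_hat = blup(y, X, Z, beta, D=np.array([[1.0]]), sigma2=1.0)
print(b_hat)  # about 1.6: the raw deviation of 2.0 is shrunk toward 0
```

With few or noisy observations per subject the shrinkage factor is small and the individual curve is pulled strongly toward the population mean; as the random-effects variance grows relative to the noise, the prediction approaches the subject's own least-squares fit.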
6 Functional approach based on principal component analysis
$$Y_{ij} = X_i(t_{ij}) + \epsilon_{ij} = \mu(t_{ij}) + \sum_{k=1}^{K} \xi_{ik}\phi_k(t_{ij}) + \epsilon_{ij}$$
- Mean function $\mu(t)$: any nonparametric smoothing method
- Between-subject (curve) variation $\sum_{k=1}^{K} \xi_{ik}\phi_k(t_{ij})$: Karhunen-Loève approximation
- Both the PC scores ($\xi_{ik}$) and the basis functions (eigenfunctions $\phi_k(t)$) need to be estimated from data
- PC scores (coefficients) are estimated by either:
  - PACE: mixed-effects modeling idea to borrow information across subjects (curves)
  - Integration method: individual estimate for each subject (curve)
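The integration method can be sketched for the simplest setting. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes every curve is observed on the same dense, equally spaced grid `t`, estimates $\mu(t)$ by the pointwise sample mean rather than a smoother, and approximates integrals by Riemann sums. The function name `fpca_integration` is hypothetical, and PACE (which replaces the integral by a conditional expectation) is not implemented here.

```python
import numpy as np

def fpca_integration(Y, t, K):
    """FPCA on a common, equally spaced grid with scores computed by integration.

    Y : (n, m) array, one curve per row; t : (m,) grid; K : number of components.
    Returns mu (m,), eigenvalues (K,), eigenfunctions phi (K, m), scores (n, K).
    """
    Y = np.asarray(Y, dtype=float)
    dt = t[1] - t[0]                        # assumes an equally spaced grid
    mu = Y.mean(axis=0)                     # pointwise mean function
    Yc = Y - mu
    G = Yc.T @ Yc / (Y.shape[0] - 1)        # sample covariance surface G(s, t)
    evals, evecs = np.linalg.eigh(G * dt)   # discretized covariance operator
    top = np.argsort(evals)[::-1][:K]       # indices of the K largest eigenvalues
    lam = evals[top]
    phi = evecs[:, top].T / np.sqrt(dt)     # rescale so that sum(phi_k^2) * dt = 1
    scores = Yc @ phi.T * dt                # xi_ik ~ integral of (Y_i - mu) * phi_k
    return mu, lam, phi, scores

# Illustrative use on noise-free curves built from two components.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
xi = rng.normal(size=(30, 2)) * np.array([2.0, 1.0])
Y = 1.0 + xi[:, :1] * np.cos(2 * np.pi * t) + xi[:, 1:] * np.sin(2 * np.pi * t)
mu, lam, phi, scores = fpca_integration(Y, t, K=2)
Y_hat = mu + scores @ phi                   # truncated Karhunen-Loeve reconstruction
```

Each score depends only on that subject's own curve, which is why the integration method borrows information across subjects only through the estimated mean and eigenfunctions.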
7 Simulation Comparisons: NPME and FPCA-PACE
$$y_i(t) = a_{i0} + a_{i1}\cos(\pi t) + a_{i2}\sin(\pi t) + \epsilon_i(t)$$
$$a_i = [a_{i0}, a_{i1}, a_{i2}]^T \sim N[(1,\ ,\ 1),\ \mathrm{diag}(\sigma_0^2, \sigma_1^2, \sigma_2^2)], \quad \epsilon_i(t) \sim N[0, \sigma_\epsilon^2 (1 + t)], \quad i = 1, 2, \ldots, n$$
$$t_j = j/(m + 1), \quad j = 1, 2, \ldots, m$$
- n = , m =
- Unbalanced data: $r_{miss}$ = ., ., .8
$$\mathrm{ISE} = \int (\hat\mu(t) - \mu(t))^2\,dt, \qquad \mathrm{MISE} = \frac{1}{n}\sum_{i=1}^{n} \int (\hat y_i(t) - y_i(t))^2\,dt$$
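The two error criteria can be computed numerically once the true and estimated functions are evaluated on a grid. The sketch below assumes a trapezoidal rule on a fine evaluation grid; the helper names `trapezoid`, `ise`, and `mise` are hypothetical, and the example values are illustrative rather than taken from the simulation.

```python
import numpy as np

def trapezoid(f, t):
    """Trapezoidal-rule integral of f over the grid t (along the last axis)."""
    f = np.asarray(f, dtype=float)
    return np.sum((f[..., 1:] + f[..., :-1]) * np.diff(t) / 2.0, axis=-1)

def ise(mu_hat, mu, t):
    """Integrated squared error of a mean-function estimate."""
    return float(trapezoid((np.asarray(mu_hat) - np.asarray(mu)) ** 2, t))

def mise(Y_hat, Y, t):
    """Average over subjects of the integrated squared error of individual fits."""
    per_curve = trapezoid((np.asarray(Y_hat) - np.asarray(Y)) ** 2, t)
    return float(np.mean(per_curve))

# Illustrative check: constant offsets give closed-form answers on [0, 1].
t = np.linspace(0.0, 1.0, 1001)
mu = np.cos(np.pi * t)
ise_value = ise(mu + 0.1, mu, t)                           # integral of 0.1^2
Y = np.vstack([mu, 2.0 * mu])
mise_value = mise(Y + np.array([[0.1], [0.2]]), Y, t)      # mean of 0.01 and 0.04
```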
8 Simulation I: small variation, $(\sigma_0, \sigma_1, \sigma_2)$ = ( , 1, 1)
[Figure: simulated curves $y_i$ versus $t$]
9 Simulation I: small variation, $(\sigma_0, \sigma_1, \sigma_2)$ = ( , 1, 1)
r_miss   Model   Mean function   Individual fits
%        LPME    .1 (.19)        .3733 (.88)
%        RSME    .13 (.118)      .3733 (.88)
%        PACE    .177 (.133)     .38 (.118)
%        LPME    .169 (.116)     .618 (.813)
%        RSME    .19 (.98)       .618 (.813)
%        PACE    .177 (.18)      .693 (.18)
8%       LPME    . (.19)         1.3 (.76)
8%       RSME    .131 (.11)      1.3 (.76)
8%       PACE    .1 (.189)       (.691)
Winner: Nonparametric mixed-effects (NPME) models
10 Simulation II: large variation, $(\sigma_0, \sigma_1, \sigma_2)$ = (3, 3, 3)
[Figure: simulated curves $y_i$ versus $t$]
11 Simulation II: large variation, $(\sigma_0, \sigma_1, \sigma_2)$ = (3, 3, 3)
r_miss   Model   Mean function   Individual fits
%        LPME    .31 (.77)       1.96 (.31)
%        RSME    .36 (.797)      1.96 (.31)
%        PACE    .3639 (.31)     .11 (.67)
%        LPME    .31 (.6)        (.66)
%        RSME    .397 (.7)       (.66)
%        PACE    .388 (.97)      (.6)
8%       LPME    .16 (.3)        (1.36)
8%       RSME    .69 (.6)        (1.36)
8%       PACE    .616 (.36)      (.31)
- Mean function estimate winner: NPME model
- Individual function estimate winner: FPCA-PACE
12 Example 1: Viral load in AIDS clinical trials
[Figure: viral load versus time (day)]
- n = 6 patients; $n_i$ is 1, with a median of 8
- Mean function estimates: RSME (blue), FPCA (red)
13 Viral load: individual fits
[Figure: panels of individual fits for nine patients]
14 Example 2: Yeast cell cycle gene expressions
[Figure: gene expression versus time (min)]
- 67 genes, $t_j = 7(j - 1)$ minutes, $j = 1, 2, \ldots, 18$
- Gene expressions are centered by the mean of each gene; the data contain missing values
15 Yeast gene expressions: individual fits
[Figure: panels of individual fits for nine genes]
16 Time-course microarray gene expressions
- Independent sampling: one measurement from each subject, e.g. mice
- Longitudinal sampling: repeated measurements from the same subject, e.g. humans
- Features of the data:
  - number of genes $n$ very large, usually several thousand
  - number of time points $m$ small (m 1)
  - very few replications at each time point, usually 2 or 3
  - noisy, possibly with missing data
17 Time-course microarray gene expressions
Problem of interest: identify differentially expressed genes
- One group: difference from baseline; variation over time
- Two or more groups: difference between groups
Methods:
- ANOVA approach: treat time as an experimental factor (a direct extension of static microarray experiments)
- Continuous approach: treat gene expressions as noisy measurements from an underlying function; estimate the underlying function nonparametrically (possibly with random effects)
18 Time-course microarray gene expressions
$$y_{ijk} = x_i(t_j) + \epsilon_{ijk}, \quad i = 1, \ldots, n; \ j = 1, 2, \ldots, m; \ k = 1, \ldots, K$$
$$x_i(t) = \sum_{l=0}^{L} \beta_{il}\phi_l(t), \quad \epsilon_{ijk} \sim (0, \sigma^2)$$
$$H_0: x_i(t) = 0, \quad i = 1, \ldots, n$$
- $\phi_l(t)$: spline basis or PC basis
- In real data, there is no clear cut-off
- Statistics that provide a good ranking
- Multiple testing adjustment to control the error rate, e.g. the False Discovery Rate (FDR)
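The slides do not say which FDR-controlling procedure is used. As one common option, the Benjamini-Hochberg step-up rule can be sketched as follows; this is named here as an assumption for illustration (EDGE itself is built around q-values), and the function name `benjamini_hochberg` is hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha):
    """Benjamini-Hochberg step-up rule: boolean rejection mask at FDR level alpha."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices of p-values, ascending
    thresholds = alpha * np.arange(1, n + 1) / n   # alpha * k / n for rank k
    below = p[order] <= thresholds
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest rank meeting its threshold
        reject[order[: k + 1]] = True              # reject everything up to that rank
    return reject

# Illustrative p-values for four genes.
rejected = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], alpha=0.05)
```

In this example the two smallest p-values exceed their own thresholds, yet they are still rejected because the third p-value meets its threshold; that is the step-up behaviour.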
19 Methods
- Individual nonparametric smoothing (EDGE)
  - $\phi_l(t)$ as a fixed basis
  - statistics: goodness-of-fit (F statistic); area under curve (AUC)
- FPCA-integration method (individual estimates of PC scores)
  - $\phi_l(t)$ as eigenfunctions, estimated from the entire sample
  - statistic: area under curve (AUC)
- Both use the bootstrap to calculate the null distribution of the statistics
- Significance cut-off by controlling the FDR
- Applicable to both the independent and longitudinal sampling cases
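A goodness-of-fit F statistic for one gene can be sketched as a comparison between the null model $x_i(t) = 0$ and a least-squares fit on a fixed basis. This is the textbook nested-model F ratio, offered as an assumption about the general form rather than EDGE's exact statistic. `Phi` is a hypothetical design matrix holding the basis functions evaluated at the observation times, and only the statistic is computed because the methods above obtain its null distribution by bootstrap.

```python
import numpy as np

def f_statistic(y, Phi):
    """Nested-model F ratio for H0: x(t) = 0 against a basis expansion fit.

    y : (N,) pooled observations for one gene; Phi : (N, p) basis matrix, N > p.
    """
    y = np.asarray(y, dtype=float)
    N, p = Phi.shape
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # least-squares basis coefficients
    rss_alt = np.sum((y - Phi @ coef) ** 2)          # residual sum of squares under the fit
    rss_null = np.sum(y ** 2)                        # residual sum of squares under x(t) = 0
    return ((rss_null - rss_alt) / p) / (rss_alt / (N - p))

# Illustrative example with a constant-only basis (p = 1).
y = np.array([1.0, 2.0, 3.0, 4.0])
Phi = np.ones((4, 1))
F = f_statistic(y, Phi)   # (30 - 5) / 1 divided by 5 / 3
```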
20 Simulation study
- n = 1, m = 1, K = 3
- observations equidistant in [0, 1]
- proportion of significant genes p = .1
- Under $H_0$: $y_{ijk} = \epsilon_{ijk}$, $\epsilon_{ijk} \sim N(0, .\ )$
- Under $H_1$: $y_{ijk} = a_i \sin(\omega_i \pi (t_j - b_i)) + \epsilon_{ijk}$, where $a_i, \omega_i \sim U(.,\ )$, $b_i \sim U(\ , 1)$
- simulations
21 Simulation I
Error under $H_1$: $\epsilon_{ijk} \sim N(0, .\ )$
[Table: number rejected, number correctly rejected, FDR, and FNR for EDGE and PCA at three nominal FDR levels]
22 Simulation II
Error under $H_1$: $\epsilon_{ijk} \sim N(0, (.\ v_i)^2)$, where $v_i$ is a dispersion factor
[Table: number rejected, number correctly rejected, FDR, and FNR for EDGE and PCA at three nominal FDR levels]
23 Gene data from lungs of mice
- number of probes: n = 37
- days post infection (DPI): 0, 1, ..., 10 (m = 11)
- repetition: 3 for DPI = 1, ..., 10; 6 for DPI = 0 (3 no flu virus, 3 killed immediately after receiving flu virus)
- normalized by the Welle lab using the PLIER normalization method; log-transformation
- $H_0$: $x_i(t) = \text{baseline}$, $\forall t$
  - Baseline 1: gene expression for DPI = 0, no flu virus
  - Baseline 2: gene expression for DPI = 0, immediately after receiving flu virus
24 Gene data from lungs of mice: Baseline 1
EDGE (F)        EDGE (AUC)   PCA (AUC)
397 (FDR=.1)    (FDR=.)      7133 (FDR=.)
- EDGE fails: oversmoothed
- Observe an increase in gene expression between DPI = 0 (no flu virus) and DPI = 0 (immediately after receiving flu virus) ⇒ stress genes
25 Baseline 1: top 9 genes selected by PCA, not by EDGE (AUC)
[Figure: nine panels of gene expression profiles]
26 Baseline 1: top 9 genes selected by EDGE (AUC), not by PCA
[Figure: nine panels of gene expression profiles]
27 Gene data from lungs of mice: Baseline 2
EDGE (F)        EDGE (AUC)   PCA (AUC)
119 (FDR=.1)    1 (FDR=.)    3 (FDR=.)
[Figure: p-values by PCA and p-values by EDGE (AUC)]
28 Baseline 2: top 9 genes selected by PCA, not by EDGE (AUC)
[Figure: nine panels of gene expression profiles]
29 Baseline 2: top 9 genes selected by EDGE (AUC), not by PCA
[Figure: nine panels of gene expression profiles]
30 Summary
Nonparametric longitudinal data analysis methods:
- Individual nonparametric smoothing
  - Does not borrow information across subjects (curves) at all
  - Handles completely different curves for different subjects
- FPCA with individual estimates of PC scores
  - Weakly borrows information across subjects via the PC basis estimate
  - PC basis: adaptive for some between-subject (curve) variation
- FPCA-PACE
  - Borrows information across subjects via the mixed-effects PC score estimate
  - PC basis: adaptive for large between-subject (curve) variation
- Nonparametric mixed-effects (NPME) modeling
  - Strongly borrows information across subjects (curves)
  - Handles longitudinal data with similar patterns
31 References
- Storey et al. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences, 102.
- Wu, H. and Zhang, J.-T. (2006) Nonparametric regression methods for longitudinal data analysis: mixed-effects modeling approaches. John Wiley & Sons, New York.
- Yao, F., Müller, H.-G., and Wang, J.-L. (2005) Functional linear regression analysis for longitudinal data. The Annals of Statistics, 33.
More informationbusiness statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
More informationTrust, Job Satisfaction, Organizational Commitment, and the Volunteer s Psychological Contract
Trust, Job Satisfaction, Commitment, and the Volunteer s Psychological Contract Becky J. Starnes, Ph.D. Austin Peay State University Clarksville, Tennessee, USA starnesb@apsu.edu Abstract Studies indicate
More informationFunctional Data Analysis for Volatility
Functional Data Analysis for Volatility July 211 Hans-Georg Müller 1 University of California, Davis Rituparna Sen 2 University of California, Davis Ulrich Stadtmüller 3 University of Ulm Abstract We introduce
More informationStatistical Analysis Strategies for Shotgun Proteomics Data
Statistical Analysis Strategies for Shotgun Proteomics Data Ming Li, Ph.D. Cancer Biostatistics Center Vanderbilt University Medical Center Ayers Institute Biomarker Pipeline normal shotgun proteome analysis
More information1) The table lists the smoking habits of a group of college students. Answer: 0.218
FINAL EXAM REVIEW Name ) The table lists the smoking habits of a group of college students. Sex Non-smoker Regular Smoker Heavy Smoker Total Man 5 52 5 92 Woman 8 2 2 220 Total 22 2 If a student is chosen
More informationSample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
More informationTutorial for proteome data analysis using the Perseus software platform
Tutorial for proteome data analysis using the Perseus software platform Laboratory of Mass Spectrometry, LNBio, CNPEM Tutorial version 1.0, January 2014. Note: This tutorial was written based on the information
More informationUNIVERSITY OF NAIROBI
UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER
More informationCourse Agenda. First Day. 4 th February - Monday 14.30-19.00. 14:30-15.30 Students Registration Polo Didattico Laterino
Course Agenda First Day 4 th February - Monday 14.30-19.00 14:30-15.30 Students Registration Main Entrance Registration Desk 15.30-17.00 Opening Works Teacher presentation Brief Students presentation Course
More informationBioavailability / Bioequivalence
Selection of CROs Selection of a Reference Product Metrics (AUC, C max /t max, Shape of Profile) Acceptance Ranges (0.80 1.25 and beyond) Sample Size Planning (Literature References, Pilot Studies) Steps
More informationNominal and ordinal logistic regression
Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome
More informationFrom Reads to Differentially Expressed Genes. The statistics of differential gene expression analysis using RNA-seq data
From Reads to Differentially Expressed Genes The statistics of differential gene expression analysis using RNA-seq data experimental design data collection modeling statistical testing biological heterogeneity
More informationMeasuring the HIV Reservoir BINGO Review Activity
Measuring the HIV Reservoir BINGO Review Activity Objectives Describe the differences in current technologies available for measuring the HIV reservoir Discuss the risk and benefit of each technology Methods
More informationSPSS Tests for Versions 9 to 13
SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationProbabilistic Forecasting of Medium-Term Electricity Demand: A Comparison of Time Series Models
Fakultät IV Department Mathematik Probabilistic of Medium-Term Electricity Demand: A Comparison of Time Series Kevin Berk and Alfred Müller SPA 2015, Oxford July 2015 Load forecasting Probabilistic forecasting
More informationFunctional Data Analysis of MALDI TOF Protein Spectra
Functional Data Analysis of MALDI TOF Protein Spectra Dean Billheimer dean.billheimer@vanderbilt.edu. Department of Biostatistics Vanderbilt University Vanderbilt Ingram Cancer Center FDA for MALDI TOF
More informationPrinciples of Hypothesis Testing for Public Health
Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationHighlights the connections between different class of widely used models in psychological and biomedical studies. Multiple Regression
GLMM tutor Outline 1 Highlights the connections between different class of widely used models in psychological and biomedical studies. ANOVA Multiple Regression LM Logistic Regression GLM Correlated data
More informationProgramme du parcours Clinical Epidemiology 2014-2015. UMR 1. Methods in therapeutic evaluation A Dechartres/A Flahault
Programme du parcours Clinical Epidemiology 2014-2015 UR 1. ethods in therapeutic evaluation A /A Date cours Horaires 15/10/2014 14-17h General principal of therapeutic evaluation (1) 22/10/2014 14-17h
More informationANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups
ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of
More information