



How many subjects? The importance of sample size calculations
Office of Research Protections Brown Bag Series
KB Boomer, Ph.D., Director, boomer@stat.psu.edu
March 30, 2006

Consider a study in which:
A researcher conducts an experiment comparing two new methods with a well-established method. Eight subjects are randomly assigned to one of the three methods. The data analysis reveals a p-value of 0.08 for the effect of method. What now?

An insignificant effect: two possibilities
- There may truly be no effect.
- There may truly be an effect.
If this experiment is repeated, what is the probability of detecting a significant effect, given that it truly exists?
POWER
- What is the power of this experiment?
- Based on these results, how many subjects are required to increase the probability of detecting a significant effect in the future?

Overview
- Definition of power and related terms
- Estimating parameters needed in power calculations
- Overview of available software

Power is a critical design component
- Level of significance (alpha): power increases as alpha increases.
- Effect size: what is a meaningful change in the response? Power increases as effect size increases.
- Sample size: how many subjects are required? Power increases as sample size increases.

Power as defined by hypothesis testing

                        Decision
State of Nature         Fail to reject Null    Reject Null
Null is True            Correct                Type I Error (α)
Alternative is True     Type II Error (β)      Correct (power)

Typical values: α = 0.05, 1 - β = 0.8 to 0.9

Relationship between Power and Alpha
As alpha increases, power increases.
[Figure: β and the critical value marked]

Relationship between Power and Effect Size
- Effect size quantifies what we are hoping to detect: change in treatment means, group proportions, etc.
- If $\bar{x}_A = 80$ and $\bar{x}_B = 100$, then the difference = 20.
- Standardize to remove units: (difference) divided by standard deviation.
- The researcher determines the effect size: what change would be of scientific interest?
- Statistical significance doesn't imply practical significance.
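The three relationships above (power rises with alpha, with effect size, and with sample size) can be checked numerically. The following is a minimal sketch, not the presenter's method: it assumes a two-sided one-sample z test with a known standard deviation, and the function name `z_test_power` and the example inputs (effect size 0.5, n = 20) are hypothetical choices for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def z_test_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z test (assumed known standard deviation).

    effect_size is the standardized difference: (difference) / (standard deviation).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect_size * n ** 0.5               # mean of the test statistic under the alternative
    upper = 1 - STD_NORMAL.cdf(z_crit - shift)   # probability of rejecting in the upper tail
    lower = STD_NORMAL.cdf(-z_crit - shift)      # probability of rejecting in the lower tail
    return upper + lower


# Vary one input at a time while holding the other two fixed (hypothetical values).
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha={alpha:.2f}  ES=0.5  n=20  power={z_test_power(0.5, 20, alpha):.3f}")
for es in (0.2, 0.5, 0.8):
    print(f"alpha=0.05  ES={es}  n=20  power={z_test_power(es, 20):.3f}")
for n in (10, 20, 40):
    print(f"alpha=0.05  ES=0.5  n={n}  power={z_test_power(0.5, n):.3f}")
```

With an effect size of zero the function returns alpha itself, which matches the table above: when the null is true, the probability of rejecting is the Type I error rate.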

Relationship between Power and Effect Size
As the standardized difference between the null and alternative means increases, power increases.
[Figure: β marked]

Relationship between Power and Sample Size
- Sample size, n, is used to estimate the reliability of our statistics:
  $$s^2 = \frac{\sum_i (y_i - \bar{y})^2}{n - 1}$$
- When creating a confidence interval, we use a standard error:
  $$SE_{\bar{X}} = \sqrt{s^2 / n}$$
- As the sample size increases, the standard error decreases, so we are more confident in our results.

Relationship between Power and Sample Size
As variance decreases, beta (green) shrinks and power increases.
[Figure: panels for σ = 0.9 and σ = 1.5]

What questions will a power analysis answer?
We can estimate power, effect size, or sample size; given any two, the third can be calculated.
1. Experiment detected ES = 0.45 with n = 20 subjects; what is the power?
2. If we have 20 subjects and a power of 0.85, what ES can we detect?
3. An ES of 0.45 would be of scientific interest and we desire a power of 0.85; how many subjects are required?
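The three questions above can be sketched as three small functions, each solving for one quantity given the other two. This is an illustration under stated assumptions, not the calculation the software packages perform: it assumes a two-sided one-sample z test at alpha = 0.05, uses the normal approximation, and ignores the negligible chance of rejecting in the wrong tail. The function names are hypothetical. Dedicated software uses t or F distributions for the specific design and will generally report somewhat different values.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power_from(es: float, n: int, alpha: float = 0.05) -> float:
    """Question 1: power, given effect size and sample size."""
    return Z.cdf(es * sqrt(n) - Z.inv_cdf(1 - alpha / 2))


def effect_size_from(n: int, power: float, alpha: float = 0.05) -> float:
    """Question 2: detectable effect size, given sample size and power."""
    return (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) / sqrt(n)


def sample_size_from(es: float, power: float, alpha: float = 0.05) -> int:
    """Question 3: subjects required (rounded up), given effect size and power."""
    return ceil(((Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) / es) ** 2)


print(f"1. power for ES=0.45, n=20:        {power_from(0.45, 20):.2f}")
print(f"2. detectable ES for n=20, 0.85:   {effect_size_from(20, 0.85):.2f}")
print(f"3. subjects for ES=0.45, 0.85:     {sample_size_from(0.45, 0.85)}")
```

Because the second and third functions invert the first, feeding the answer to question 2 back into `power_from` returns the requested power.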

Calculating power
The next step is to estimate values needed to conduct a power analysis.

Power calculations are part art, part science
- Catch-22: formulae need means and the population variance. But if we knew these values, we wouldn't need to do the study!
- Remember that sample size calculations require estimates and assumptions. While these need to be close to the true values, they do not need to be perfect.

Estimating the effect size
Consider an ANOVA with three methods (treatments).
$$ES = \frac{\bar{x}_{\max} - \bar{x}_{\min}}{\sigma}$$
1. Use means from previous studies.
2. Estimate the mean of each new method.
3. Estimate what will be the largest mean and the smallest mean.
4. Estimate what percent change in the means will be of interest. For example, will a 15% difference be scientifically significant?

Estimate the population standard deviation
1. Use values from previous studies, from the control method.
2. From previous work, estimate the magnitude of the variance.
3. Consider what the maximum and minimum variance values may be, and use the average of these two values.
4. Consider the possible range of data values, and estimate $\sigma \approx \text{range} / 4$.
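The estimation steps above reduce to a few lines of arithmetic. The sketch below is only an illustration: the group means (80, 90, 100), the plausible data range (60 to 120), and the variance bounds are hypothetical inputs, and the function names are invented for this example.

```python
def sigma_from_range(low: float, high: float) -> float:
    """Rough standard deviation: the plausible data range divided by 4."""
    return (high - low) / 4


def average_variance(var_min: float, var_max: float) -> float:
    """Midpoint of the smallest and largest variances considered plausible."""
    return (var_min + var_max) / 2


def effect_size(means: list[float], sigma: float) -> float:
    """ES = (largest mean - smallest mean) / sigma."""
    return (max(means) - min(means)) / sigma


# Hypothetical planning values for a three-method study.
planned_means = [80.0, 90.0, 100.0]
sigma = sigma_from_range(60.0, 120.0)

print(f"sigma from range: {sigma:.1f}")
print(f"effect size:      {effect_size(planned_means, sigma):.2f}")
print(f"average variance: {average_variance(150.0, 300.0):.1f}")
```

Rerunning the calculation with a smaller and a larger sigma is a quick way to see how sensitive the effect size is to the standard deviation estimate.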

Standard Effect Sizes
Cohen has suggested standard effect sizes:

          ANOVA   Correlation   Regression
Small     0.10    0.10          0.02
Medium    0.25    0.30          0.15
Large     0.40    0.50          0.35

- Potentially misleading
- Does not incorporate knowledge about the specific study parameters
- Cohen urges caution; recommends using only when study-specific values are not available

Conducting the analysis: choice of software
- Minitab: basic tests (t-tests, one proportion, one way ANOVA)
- GPower: basic tests, two way ANOVA, multiple regression, chi-square
- SAS: basic tests, two and higher order ANOVA
- PASS: most extensive: repeated measures, random effects ANOVA, logistic regression, multiple regression, survival analysis

Two sample t-test in Minitab
- ES entered as differences and standard deviation
- Enter multiple values

2-Sample t Test
Alpha = 0.05  Assumed standard deviation = 2

Difference   Sample Size   Target Power   Actual Power
2.5          12            0.80           0.83
2.5          13            0.85           0.86
2.5          15            0.90           0.91
3.0           9            0.80           0.85
3.0          10            0.85           0.89
3.0          11            0.90           0.92

The sample size is for each group.

G-Power
- A priori, post-hoc, and compromise power
- More options

G-Power
- A priori: before conducting the experiment.
- Post-hoc: after data analysis; lower than a priori power. Test shows insignificant result; what was the power?
- Compromise: when N is really large or really small (Cohen). Based on the researcher's view of whether a Type I or Type II error is more serious.

Accuracy versus speed
The speed option is fast but inaccurate and the accuracy option is very accurate (up to five significant digits at least). The accuracy option may take a little longer to compute, but it usually is only a couple of seconds.
Download from: www.psycho.uni-duesseldorf.de/aap/projects/gpower/

One way ANOVA in G-Power
- Click Tests -> F-test (ANOVA)
- Calculate effect size in another window
- Nice option: graphs

One way ANOVA in G-Power: Cohen's effects

G-Power: Calculate effects

G-Power: Specify graphs
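For a one-way ANOVA, the effect size Cohen defines is f: the standard deviation of the group means divided by the within-group standard deviation. The sketch below shows that arithmetic for equally sized groups and compares the result with Cohen's ANOVA benchmarks (0.10, 0.25, 0.40). It is an illustration only: the group means and sigma are hypothetical, the function names are invented here, and it is an assumption that this matches what the effect-size window in G-Power computes.

```python
from statistics import pstdev


def cohens_f(group_means: list[float], sigma: float) -> float:
    """Cohen's f for equally sized groups: SD of the group means / within-group SD."""
    # pstdev divides by the number of groups, which is the form Cohen uses.
    return pstdev(group_means) / sigma


def cohen_label(f: float) -> str:
    """Compare f with Cohen's suggested ANOVA benchmarks."""
    if f >= 0.40:
        return "large"
    if f >= 0.25:
        return "medium"
    if f >= 0.10:
        return "small"
    return "below small"


# Hypothetical planning values: three method means and a within-group SD.
f = cohens_f([80.0, 90.0, 100.0], sigma=15.0)
print(f"Cohen's f = {f:.3f} ({cohen_label(f)})")
```

Computing f from study-specific means and variability, rather than picking a benchmark directly, follows Cohen's own caution about the standard values.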

G-Power: Power curves
[Figure: power curve; y-axis Power (0.65 to 1.00), x-axis Total Sample Size (20 to 70)]

Power Analyses in SAS
Several procedures:
- UnifyPow Macro -> Proc Power (v9.1): many procedures
- Proc GLMPower: calculates interactions in two-way and higher ANOVA models

SAS Proc GLMPower code

    /* Assumed cell means for each gender x condition x level combination */
    data one;
      input gender $ condition $ level mean @@;
      datalines;
    m A 1 6.5  m A 2 4.8  m B 1 7.0  m B 2 5.5
    f A 1 4.5  f A 2 3.6  f B 1 4.8  f B 2 5.5
    ;
    run;

    proc glmpower data=one;
      class gender condition level;
      /* Enter a valid GLM model */
      model mean = gender condition level gender*level;
      /* ntotal = . asks SAS to solve for the total sample size */
      power
        alpha  = 0.05
        stddev = 2
        power  = 0.80 0.90
        ntotal = .;
    run;

SAS Proc GLMPower Output

The GLMPOWER Procedure
Dependent Variable          mean
Alpha                       0.05
Error Standard Deviation    2

Computed N Total

Source         Nominal Power   Test DF   Error DF   Actual Power   N Total
gender         0.8             1          67        0.806           72
gender         0.9             1          91        0.905           96
condition      0.8             1         171        0.800          176
condition      0.9             1         235        0.906          240
level          0.8             1         171        0.800          176
level          0.9             1         235        0.906          240
gender*level   0.8             1         227        0.812          232
gender*level   0.9             1         299        0.903          304

In Summary
- Proper planning of a study, including a solid power analysis, is an essential step of a good research study.
- Run the analyses several times, with varying input parameters.
- Remember that you need good estimates, not perfect ones.
- One advantage of being a statistician is that we only need to be right 95% of the time.

References
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd ed. New Jersey: Lawrence Erlbaum Associates.
- Faul, F. and Erdfelder, E. (1992). GPOWER: A priori, post-hoc and compromise power analysis for MS-DOS [Computer program]. Bonn, FRG: Bonn University, Dept. of Psychology.

SCC Workshops (Fall 2006)
www.stat.psu.edu/~scc
Workshop Name: SAS Data Management, SAS Introduction to Procedures, Overview of Minitab, SPSS (Regression, ANOVA, ANCOVA), EDA, Proc Summary, GLM vs. Mixed, Categorical Data Analysis, Power Analysis
Dates: 1) September 12th 2) October 10; 1) September 19th & 21st 2) October 17th & 19th; 1) September 12th & 14th 2) October 10th & 12th; October 6th; October 6th; October 7th; October 14th
More information on our web site after 8/1/06