Statistics: An introduction to sample size calculations
Rosie Cornish. 2006.

1 Introduction

One crucial aspect of study design is deciding how big your sample should be. If you increase your sample size, you increase the precision of your estimates, which means that, for any given size of effect, the greater the sample size, the more statistically significant the result will be. In other words, if an investigation is too small it will fail to detect effects that are in fact important. Conversely, if a very large sample is used, even tiny deviations from the null hypothesis will be statistically significant, even if they are not practically important.

In practice, this means that before carrying out any investigation you should have an idea of what kind of departure from the null hypothesis would be regarded as practically important. The smaller the difference you regard as important to detect, the greater the sample size required.

Factors such as time, cost, and how many subjects are actually available are constraints that often have to be taken into account when designing a study, but they should not dictate the sample size: there is no point in carrying out a study that is too small, only to come up with inconclusive results, since you will then need to carry out another study to confirm or refute your initial findings.

There are two approaches to sample size calculations:

Precision-based: with what precision do you want to estimate the proportion, mean difference, ... (or whatever it is you are measuring)?

Power-based: how small a difference is it important to detect, and with what degree of certainty?

2 Precision-based sample size calculations

Suppose you want to estimate an unknown parameter with a certain degree of precision. What you are essentially saying is that you want your confidence interval to be a certain width.
In general, a 95% confidence interval is given by the formula

    estimate ± 2 × SE (approximately)

where SE is the standard error of whatever you are estimating. (The multiplier of 2 is an approximation: 95% confidence intervals are usually based on the normal distribution or a t-distribution. For the normal distribution the value is 1.96; for t-distributions it is generally just over 2.)

The formula for any standard error always contains n, the sample size. Therefore, if you specify the width of the 95% confidence interval, you have an equation that you can solve to find n.

Example 1

Suppose you wish to carry out a trial of a new treatment for hypertension (high blood pressure) among men aged between 50 and 60. You randomly select 2n subjects: n of these receive the new treatment and n receive the standard treatment, and you measure each subject's systolic blood pressure. You will analyse your data by comparing the mean blood pressure in the two groups, i.e. carrying out an unpaired t-test and calculating a 95% confidence interval for the true difference in means. You would like your 95% confidence interval to have width 10 mmHg (i.e. you want to be 95% sure that the true difference in means is within ±5 mmHg of your estimated difference in means). How many subjects will you need to include in your study?

We know that the 95% confidence interval for a difference in means is given (approximately) by

    (x̄1 − x̄2) ± 2 × s_p × √(1/n1 + 1/n2)

Hence, we want 2 × s_p × √(1/n1 + 1/n2) to be equal to 5, i.e.

    s_p × √(1/n1 + 1/n2) = s_p × √(2/n) = 2.5

(since we are aiming for two groups of the same size, n1 = n2 = n). In order to work out our sample sizes we therefore need to know what s_p, the pooled standard deviation, is likely to be. This is obtained from (a) previous experience (i.e. knowledge of the distribution of systolic blood pressure among men with hypertension in this age group), (b) other published papers on blood pressure studies in a similar group of people, or (c) a pilot study. I have used option (b) to get a likely value for s_p of 20 mmHg. This gives

    20 × √(2/n) = 2.5    ⟹    n = 2 × (20/2.5)² = 128 (in each group)

If you wanted the true difference in means to be within ±2.5 mmHg rather than ±5 mmHg of your estimate, this would become

    n = 2 × (20/1.25)² = 512

i.e. if you want to increase your precision by a factor of 2, you have to increase your sample size by a factor of 4.
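The rearrangement above can be written as a few lines of Python. This is a sketch: the function name and the use of 2 as the approximate 95% multiplier are my own choices, not part of the handout.

```python
from math import ceil

def n_per_group_mean_diff(sd_pooled, half_width, multiplier=2.0):
    """Per-group sample size so that the confidence interval for a
    difference in means, estimate +/- multiplier * sd_pooled * sqrt(2/n),
    has the requested half-width."""
    # multiplier * sd_pooled * sqrt(2/n) = half_width  =>  solve for n
    return ceil(2 * (multiplier * sd_pooled / half_width) ** 2)

# Blood pressure example: s_p = 20 mmHg, CI width 10 mmHg (i.e. +/- 5)
print(n_per_group_mean_diff(20, 5))    # 128 per group
print(n_per_group_mean_diff(20, 2.5))  # halving the width quadruples n: 512
```

Note the `ceil`: a fractional answer is always rounded up, since rounding down would give slightly less precision than requested.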
In general, if you want to increase your precision by a factor k, you will need to increase your sample size by a factor k². This applies across the board, whether you are estimating a proportion, a mean, a difference in means, etc.

Example 2

Suppose you are investigating a particular intervention to reduce the risk of malaria mortality among young children under the age of five in The Gambia, in West Africa. You know that the risk of dying from malaria in this age group is about 10% and you want the risk difference to be estimated to within ±2%. A 95% confidence interval for a difference in proportions is

given by

    (p1 − p2) ± 1.96 × √( p1(1−p1)/n1 + p2(1−p2)/n2 )
        = (p1 − p2) ± 1.96 × √( [p1(1−p1) + p2(1−p2)] / n )

if the sample size in each group is the same. As stated previously, we normally approximate 1.96 by 2. We therefore want

    √( [p1(1−p1) + p2(1−p2)] / n ) = 0.02/2 = 0.01

To work out the required sample size, we usually take p1 = p2 = the value closer to 0.5, since this gives rise to a larger standard error and therefore a larger sample size. (It is always better to err on the side of caution in sample size calculations, because (a) you often get drop-outs, so it is better to have too many rather than too few in your sample to start with, and (b) they are never 100% exact anyway, since you base them on estimates of the standard error, not on known values.) So, in this case we have

    √( 2(0.1)(0.9) / n ) = 0.01    ⟹    n = 2(0.1)(0.9) / 0.01² = 1800 (in each group)

To summarise, in order to carry out any precision-based sample size calculation you need to decide how wide you want your confidence interval to be and you need to know the formula for the relevant standard error. Putting these together gives an equation which can be rearranged to find n.

3 Power-based sample size calculations

We have seen above that precision-based sample size calculations relate to estimation. Power-based sample size calculations, on the other hand, relate to hypothesis testing. In this handout, the formulae for power-based sample size calculations will not be derived, just presented.

Definitions

Type I error (false positive): concluding that there is an effect (e.g. that two treatments differ) when there is none.
    α = P(type I error) = level of statistical significance = P(reject H0 | H0 true)

Type II error (false negative): concluding that there is no effect (e.g. that there is no difference between treatments) when there actually is one.
    β = P(type II error) = P(accept H0 | H1 true)

Power: the (statistical) power of a trial is defined to be 1 − β = P(reject H0 | H1 true)

3.1 Power calculations: quantitative data

Suppose you want to compare the mean in one group to the mean in another (i.e. carry out an unpaired t-test). The number, n, required in each group is given by

    n = f(α, β) × 2s² / δ²

where:
    α is the significance level (using a two-sided test), i.e. your cut-off for regarding the result as statistically significant;
    1 − β is the power of your test;
    f(α, β) is a value calculated from α and β — see the table below;
    δ is the smallest difference in means that you regard as important to be able to detect;
    s is the standard deviation of whatever it is you are measuring — this will need to be estimated from previous studies.

f(α, β) for the most commonly used values of α and β:

                        β
    α        0.05    0.1    0.2    0.5
    0.05     13.0   10.5    7.9    3.8
    0.01     17.8   14.9   11.7    6.6

Example

Returning to the blood pressure example, suppose we want to be 90% sure of detecting a difference in mean blood pressure of 10 mmHg as significant at the 5% level (i.e. power = 0.9, β = 0.1, α = 0.05). We have, from above, s = 20 mmHg. Using the table, we get f(α, β) = 10.5. This gives

    n = f(α, β) × 2s² / δ² = 10.5 × 2(20)² / 10² = 84

You would need 84 subjects in each group. Obviously, if you increase the power or use a lower value of α as your cut-off for statistical significance, you will need to increase the sample size.

3.2 Power calculations: categorical data

Suppose we are comparing a binary outcome in two groups of size n. Let

    p1 = proportion of events (deaths/responses/recoveries etc.) in one group
    p2 = proportion of events in the other group

We need to choose a value for p1 − p2, the smallest practically important difference in proportions that we would like to be able to detect (as significant). We also need some estimate of the proportion of events expected; this can often be obtained from routinely collected data or previous studies. The number of subjects required in each group is given by

    n = [p1(1−p1) + p2(1−p2)] / (p1 − p2)² × f(α, β)
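The f(α, β) values in the table are not derived in the handout; they come from the standard result f(α, β) = (z₁₋α/₂ + z₁₋β)², where z_q is the standard normal quantile. A sketch using only Python's standard library (the function names are mine):

```python
from math import ceil
from statistics import NormalDist

def f_alpha_beta(alpha, beta):
    """f(alpha, beta) = (z_{1-alpha/2} + z_{1-beta})^2, the multiplier
    used in the power-based sample size formulae."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2

print(round(f_alpha_beta(0.05, 0.1), 1))  # 10.5, as in the table
print(round(f_alpha_beta(0.01, 0.2), 1))  # 11.7, as in the table

def n_per_group_means_power(s, delta, alpha=0.05, beta=0.1):
    """Per-group n to detect a difference in means delta with power
    1 - beta at two-sided significance level alpha (quantitative data).
    Uses the one-decimal table value of f, as the handout does."""
    f = round(f_alpha_beta(alpha, beta), 1)
    return ceil(f * 2 * s ** 2 / delta ** 2)

# Blood pressure example: s = 20 mmHg, delta = 10 mmHg
print(n_per_group_means_power(20, 10))  # 84 per group
```

Using the exact f(α, β) ≈ 10.51 instead of the table value would round n up to 85; the handout's 84 reflects the one-decimal table entry.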

Example

A new treatment has been developed for patients who have had a heart attack. It is known that 10% of people who have suffered a heart attack die within one year. It is thought that a reduction in deaths from 10% to 5% would be clinically important to detect. Again we will use α = 0.05 and β = 0.1. We have p1 = proportion of deaths in the placebo group = 0.1 and p2 = proportion of deaths in the treatment group = 0.05. This gives

    n = [0.1(0.9) + 0.05(0.95)] / (0.1 − 0.05)² × 10.5 = 578

Thus, 578 patients would be needed in each treatment group to be 90% sure of detecting a reduction in mortality from 10% to 5% as significant at the 5% level.

The size of difference regarded as clinically important to detect has a strong effect on the required sample size. For example, to detect a reduction from 10% to 8% as significant (using the same α and β as above), 4295 patients would be needed in each treatment group. This may be clinically reasonable (since any reduction in mortality would probably be regarded as important) but perhaps too expensive. Conversely, to detect a reduction from 10% to 1% as significant, only 130 patients would be needed in each group; but this kind of reduction would perhaps be over-optimistic, and a trial of this size would therefore be unlikely to be conclusive.

4 Further notes

1. Sample size formulae in other situations. This handout has only presented sample size calculations for certain types of analysis. Similar formulae can be obtained for other types of analysis by reference to appropriate texts.

2. Increasing power or precision without having to increase sample size. As indicated previously, certain constraints (cost etc.) may mean that the required sample size is impossible to achieve. In these situations it may be possible to increase precision or power by changing the design of the study.
For example, using a paired design can often be more efficient, since you are considering individual (within-subject) differences rather than between-subject differences; the former are usually less variable than the latter (i.e. the standard error is lower), so the sample size required to detect a given difference will be smaller.

3. Allowing for non-response / withdrawals. In most studies, particularly those involving humans, a certain amount of data is likely to be lost (or never gathered) from the original sample. This can happen for a variety of reasons: non-response (e.g. to a survey); subjects deliberately withdrawing from a study or being lost in some other way (e.g. they cannot be traced); subjects in a clinical trial not following their allocated treatment; or missing data (e.g. on a questionnaire). Allowance should be made for this when determining the sample size, i.e. the sample size should be increased accordingly. The extent to which this is needed should be guided by previous experience or a pilot study.
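Putting Section 3.2 and note 3 together, a short script can reproduce the heart-attack example figures and then inflate n for expected losses. This is a sketch: the function names are mine, and the 20% drop-out rate is purely illustrative, not from the handout.

```python
from math import ceil

# f(alpha, beta) for alpha = 0.05, beta = 0.1, from the handout's table
F_TABLE = {(0.05, 0.1): 10.5}

def n_per_group_props_power(p1, p2, alpha=0.05, beta=0.1):
    """Per-group n to detect a difference between proportions p1 and p2
    with power 1 - beta at two-sided significance level alpha."""
    f = F_TABLE[(alpha, beta)]
    return ceil((p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2 * f)

def inflate_for_dropout(n, dropout_rate):
    """Increase n so the target group size survives the expected losses."""
    return ceil(n / (1 - dropout_rate))

print(n_per_group_props_power(0.1, 0.05))  # 578: reduction from 10% to 5%
print(n_per_group_props_power(0.1, 0.08))  # 4295: reduction from 10% to 8%
print(n_per_group_props_power(0.1, 0.01))  # 130: reduction from 10% to 1%
print(inflate_for_dropout(578, 0.2))       # 723: recruit this many per group
                                           # if 20% drop-out is expected
```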