Power and Sample Size Concepts and Calculations


For a variety of statistical settings, one commonly approaches power and sample size issues from one of three starting points: What sample size is required to achieve a confidence interval of specified width? What sample size is necessary to achieve a specified power? What power is achieved given a specified sample size?

In order to do any of these calculations, an estimate of the standard deviation in your sample (or possibly in each of several samples) is required. In many situations, power and sample size calculations are done before collecting data (in fact, it is usually a good idea to do so). Fortunately, it is possible to provide an estimate of the SD without having seen any data. If you can estimate the range (largest minus smallest) of values you will likely see, that range divided by three or four is a decent guess for the SD.

Sample Size Determined by Precision Goals

The width of a confidence interval, which we will characterize by the margin of error, is linked directly to sample size. A smaller desired margin of error requires greater precision (i.e. a smaller standard error), which in turn requires a larger sample size. In these notes, we assume that the sampling distribution of the relevant statistic is normal. We'll address the concepts using as an example one of the simplest of settings: a confidence interval (CI) for a population mean, based on a single sample mean, assuming Normality for the sampling distribution of the mean. These notes assume familiarity with basic confidence interval construction and terminology.

There are three essentially equivalent ways you can impose some chosen precision on the CI. The one we'll use here is the margin of error, m.e., shown here for a 95% CI:

$$\mathrm{m.e.} = t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}}$$

The margin of error is called by some the confidence interval half-width. Another common approach (technically equivalent) is to state the desired precision in terms of the CI width w:

$$w = 2\,(\mathrm{m.e.})$$

A third way is as a percent, as in, "We want to be within plus or minus 10%." Here the desired precision (we'll call it relative margin of error, and label it rme) is relative to the size of the mean; the connection to m.e. is

$$\mathrm{rme} = 100\% \times \frac{\mathrm{m.e.}}{\bar{y}}$$

The take-home point: confidence interval precision in terms of CI margin of error, width, or relative margin of error are all equivalent. It is easy to translate from any one style to either of the other two.

Given a choice of margin of error, the connection to sample size is straightforward conceptually, and, at first glance, looks like it leads to an easy computation. Square both sides and swap n and m.e.:

$$\mathrm{m.e.} = t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}} \quad\Longrightarrow\quad n = \left(\frac{t_{n-1,\,0.975}\; s}{\mathrm{m.e.}}\right)^{2}$$

The problem with this formula is that one needs to know n in order to know t (since the correct t-distribution depends on the degrees of freedom), in order to calculate n, in order to compute t, in order to ... headache material.

A way out of the problem: it is a fact that as n increases, the multiplier $t_{n-1,\,1-\alpha/2}$ gets closer and closer to $z_{1-\alpha/2}$ (the corresponding value from a standard Normal distribution). In fact, $t_{n-1,\,1-\alpha/2}$ is always going to be larger than $z_{1-\alpha/2}$ for any sample size. If we do an initial sample size estimate using $z_{1-\alpha/2}$ in the formula (which we can do easily), we will have a sample size that is smaller (likely not by much) than actually required. We then iteratively increase our estimated sample size until we reach the desired precision. An example may make this clearer.

Example. A biologist wishes to make a 95% CI for a population mean, and wants the interval to be plus or minus 10% of the target mean. She estimates, after conferring with colleagues, that the mean will be about 50. These together mean that her desired CI margin of error is 5. She and her colleagues also estimate that the standard deviation will be about 10.

Step 1. Use the basic formula with z to get a (slight under-) estimate:

$$n = \left(\frac{z_{0.975}\; s}{\mathrm{m.e.}}\right)^{2} = \left(\frac{1.96 \times 10}{5}\right)^{2} = 15.37$$

Step 2. Use n = 16 (rounded up from 15.37) to see what m.e. is attained:

$$\mathrm{m.e.} = t_{15,\,0.975}\,\frac{s}{\sqrt{n}} = \frac{2.13 \times 10}{\sqrt{16}} = 5.325$$

That's bigger than our target.

Step 3. Add 1 to the sample size (n = 17) and try again:

$$\mathrm{m.e.} = t_{16,\,0.975}\,\frac{s}{\sqrt{n}} = \frac{2.12 \times 10}{\sqrt{17}} = 5.142$$

That's still bigger than our target.

Step 4. Add 1 to the sample size (n = 18) and try again. This time it works:

$$\mathrm{m.e.} = t_{17,\,0.975}\,\frac{s}{\sqrt{n}} = \frac{2.11 \times 10}{\sqrt{18}} = 4.973$$

Clearly this would be pretty tedious to do by hand routinely.

Sample Size to Achieve Specified Power

The power of a statistical test is the probability of correctly detecting an existing effect (we'll define effect shortly). Power is a function of

1. the effect size (power increases as effect size increases);
2. the chosen alpha level (power increases with larger alpha levels); and
3. the sample size (power increases as sample size increases).

Two types of errors can occur when doing a hypothesis test. One, a so-called false significance (or false rejection of the null hypothesis), has a specified chance, alpha, of occurring. The other, a false retention of the null, occurs when one fails to reject the null hypothesis, but in fact some alternate is true. This probability goes by the name beta. Power is 1 − beta.

Power of a Statistical Test

In this discussion, we will assume that the sampling distribution of the relevant statistic is normal. The power of a test is the probability of correctly rejecting the null hypothesis when a specified alternate is true.

Swift Fox Example (adapted from a study by Olson (1999)). One question of interest in a study of swift foxes was whether home range size increased from summer to fall. A sample of 10 summer home ranges yielded a mean of 1000 ha, with a SD of 320 ha. The sampling distribution of the mean is assumed to be normal, with standard deviation estimated from the standard error of the summer sample to be

$$se(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{320}{\sqrt{10}} \approx 100$$

This is a one-tailed hypothesis test (they are only interested in increases), for which they chose to use the conventional α = 0.05.
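Returning to the precision goals discussed earlier in these notes, the "start from z, then add 1 until the t-based margin of error meets the target" search in the biologist example can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the t quantile is obtained by numerically integrating the t density and bisecting, a stand-in chosen so the sketch needs only the standard library (a statistics package or a t-table would normally supply it).

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """CDF for x >= 0: one half (by symmetry) plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p: float, df: int) -> float:
    """Quantile for 0.5 < p < 1, found by bisection on t_cdf (assumed helper)."""
    if not 0.5 < p < 1:
        raise ValueError("p must lie strictly between 0.5 and 1")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_size_for_margin(sd: float, target_me: float, conf: float = 0.95) -> int:
    """Smallest n (searching upward from the z-based estimate) whose
    t-based margin of error is no larger than target_me."""
    p = 1 - (1 - conf) / 2
    z = NormalDist().inv_cdf(p)
    n = max(2, math.ceil((z * sd / target_me) ** 2))   # slight under-estimate
    while t_quantile(p, n - 1) * sd / math.sqrt(n) > target_me:
        n += 1
    return n


# Biologist example: SD about 10, desired margin of error 5, 95% CI.
n_required = sample_size_for_margin(sd=10, target_me=5)
```

With the example's inputs the search starts at 16 and stops at 18, matching the hand calculation.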

Effect Size

In order to do a power analysis, one needs to establish effect sizes for which to do the calculations. Effect size is the difference between the value of a parameter under the null hypothesis and its value under an alternate. Effect size is a commonly used generic term, the context-dependent meaning of which will hopefully be made clear through the following examples.

Example 1. In a fisheries study, 15 net sites were randomly chosen. A one-sided, one-sample t-test was used to test whether the mean number of trout per net was 6 (the historical average), or if, in fact, the mean number appears to be declining. The biologist wanted to be sure to have high power to detect a decline of 2 fish per net, if indeed the decline was that large. Here, the effect size of interest was a change of 2 in the mean number of trout caught per net.

Example 2. A study was done to compare male and female salaries among biologists of similar rank and seniority in a federal agency. They performed a two-sided, two-sample t-test for the null hypothesis of no difference. In discussion, it was decided that if the true difference was more than $2,000 per year, they wanted to be sure to detect it. Here, the effect size was a difference of $2,000 in the mean salaries.

In a regression study, the effect size might be a difference in slopes (that is, some chosen difference from the value being tested in the null hypothesis). Effect size is a generic term that truly gains its meaning from the particular context.

One way to approach the question of effect size is to choose a range of effect sizes. Pick an effect that is sufficiently small so that any effects (should they exist) that are any smaller are not of much interest. Pick an effect that is sufficiently large so that any effects that are any larger are quite important to detect. Analyze power/sample sizes for that range of effects.

Swift Fox Example. For this study, an increase of 10% (100 ha) was chosen for the smaller value, and 50% (500 ha) for the larger.

Statistical Decision Rule

For the swift fox example, the researchers used α = 0.05, which dictates that p-values less than 0.05 would lead to rejection of the null hypothesis, while for p-values larger than 0.05, we would fail to do so (at least, this is the formal, mechanistic way to use p-values and alpha). In a Normal distribution, 5% of the values are larger than 1.65 standard deviations above the mean. Given our SE of 100, we would expect means larger than 1165 (that is, 1000 + 1.65 × 100) about 5% of the time if the null hypothesis (no increase in home range size) were true. The key point to keep in mind is that a mean home range size larger than 1165 will have a p-value less than α = 0.05, and will lead to rejection of the null hypothesis.

Let's consider the consequences of this decision rule under the assumption that, in fact, the average home range in the fall is 1100 ha (for an effect size of 100); and let's examine the consequences for 1500 ha (effect size is 500). The other pieces we need for the calculations are: SD = 320; n = 10; α = 0.05; and the fact that the test is one-tailed. Write those figures on a scrap of paper.

What power is achieved given a specified sample size?

The power of a statistical test is a function of the effect size (power increases as effect size increases); the alpha level (power increases with larger alpha levels); and the sample size (power increases as sample size increases). This question is the complement of calculating the sample size required to achieve a given power. Usually, this question is asked for existing studies, to seek a better understanding of their limitations and to plan future studies. The swift fox example had home range size estimates from n = 10 fox pairs. What size of change (if any) in home range size can they detect with that sample size?
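The decision rule above translates directly into a power calculation, sketched here in Python for the swift fox numbers. The function names are hypothetical. The sketch uses the unrounded Normal cutoff (about 1.645 rather than 1.65) and the rounded SE of 100 from the text, and the 80% power target in the second function is an assumption introduced for illustration; the notes do not specify one.

```python
from statistics import NormalDist


def one_sided_power(mu_null: float, mu_alt: float, se: float,
                    alpha: float = 0.05) -> float:
    """Power of an upper-tailed test when the sample mean is Normal with
    standard error se and the true mean is mu_alt."""
    # Reject the null when the sample mean exceeds this cutoff.
    critical_mean = mu_null + NormalDist().inv_cdf(1 - alpha) * se
    # Probability of landing above the cutoff under the alternate.
    return 1 - NormalDist(mu_alt, se).cdf(critical_mean)


def detectable_effect(se: float, alpha: float = 0.05,
                      power: float = 0.80) -> float:
    """Effect size detected with the requested power (0.80 is an assumed
    target, not a figure from the notes) in an upper-tailed test."""
    std_normal = NormalDist()
    return (std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(power)) * se


# Swift fox example: null mean 1000 ha, SE of 100 ha, alpha = 0.05, one-tailed.
power_small = one_sided_power(1000, 1100, 100)   # effect size 100 ha
power_large = one_sided_power(1000, 1500, 100)   # effect size 500 ha
min_effect = detectable_effect(100)              # effect at the assumed 80% power
```

Under these assumptions the power is roughly 0.26 for the 100 ha effect and above 0.999 for the 500 ha effect, and an increase of roughly 250 ha is the smallest one detected with 80% power at n = 10.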

Calculations for Paired Data

A paired t-analysis proceeds as an analysis of a single sample, after having subtracted the values in one sample from those in the other. If you have existing data to work with, then use the sample of differences to directly estimate the standard deviation of the differences, $SD_d$, and proceed following the single-sample instructions.

The difficulty in working through power and sample size calculations without existing data is that the standard deviation of the differences is a function of the standard deviation in each sample and the correlation between the samples, which is really hard to estimate without data. That said, if you are blessed with enough insight into your data, you can use as an estimate

$$SD_d = S\,\sqrt{2 - 2r}$$

where S is the standard deviation within each sample (assumed to be the same for both), and r is the estimated correlation between the samples.
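As a small illustration of the paired-data estimate, the sketch below computes the standard deviation of the differences from a common within-sample SD and a guessed correlation. The function name and the example inputs are hypothetical.

```python
import math


def sd_of_differences(sd_within: float, r: float) -> float:
    """Estimated SD of paired differences, assuming both samples share the
    same standard deviation sd_within and have correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation r must lie between -1 and 1")
    return sd_within * math.sqrt(2 - 2 * r)


# Illustrative inputs only: within-sample SD of 10 and a guessed correlation of 0.5.
sd_d = sd_of_differences(10, 0.5)
```

The resulting value can then be used as the SD in the single-sample calculations. Note how strongly it depends on r: a higher correlation between the paired measurements shrinks the SD of the differences and so reduces the sample size needed.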