# Understanding the Statistical Power of a Test

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 1 Understanding the Statistical Power of a Test Hun Myoung Park Software Consultant UITS Center for Statistical and Mathematical Computing How powerful is my study (test)? How many observations do I need to have for what I want to get from the study? The statistical power analysis estimates the power of the test to detect a meaningful effect, given sample size, test size (significance level), and standardized effect size. Sample size analysis determines the sample size required to get a significant result, given statistical power, test size, and standardized effect size. These analyses examine the sensitivity of statistical power and sample size to other components, enabling researchers to efficiently use the research resources. 1. What Is a Hypothesis? A hypothesis is a specific conjecture (statement) about a property of population. There is a null hypothesis and an alternative (or research) hypothesis. Researchers often expect that evidence supports the alternative hypothesis. The null hypothesis, a specific baseline statement to be tested, usually takes such forms as no effect or no difference. 1 A hypothesis is either two-tailed (e.g., H 0 : µ = 0 ) or one-tailed (e.g., H 0 : µ 0 or H : µ 0 ). 2 0 Three points deserve being taken into account in making a hypothesis. A hypothesis should be specific enough to be falsifiable; otherwise, the hypothesis cannot be tested successfully. Second, a hypothesis is a conjecture about a population (parameter), not about a sample (statistic). Thus, H 0 : x = 0 is not valid because we can compute and know the sample mean x from a sample. Finally, a valid hypothesis is not based on the sample to be used to test the hypothesis. This tautological logic does not generate any productive information Size and Power of a Test The size of a test, often called significance level, is the probability of Type I error. The Type I error occurs when a null hypothesis is rejected when it is true (Table 1). This test size is denoted by α (alpha). The 1- α is called the confidence level. 1 Because it is easy to calculate test statistics (standardized effect sizes) and interpret the test results (Murphy 1998). 2 µ (Mu) represents population mean, while x denotes sample mean. 3 This behavior, often called data fishing, just hunts a model that best fits the sample, not the population.

2 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 2 In a two-tailed test, the test size (significance level) is the sum of the two symmetric areas at the tails of a probability distribution. See the shaded areas of two standard normal distributions in Figure 1. These areas are called null hypothesis rejection regions in the sense that we reject the null hypothesis if a test statistic falls into these regions. The test size is a subjective criterion, although the.10,.05, and.01 levels are conventionally used. Table 1. Size and Power of a Test Do not reject H 0 Reject H 0 H 0 is true Correct Decision 1-α: Confidence level Type I Error Size of a test α: Significance level H 0 is false Type II Error β Correct Decision 1-β: Power of a test Think about the.05 test size (see B in Figure 1). We need to know a particular value from which the sum of the areas up to the both infinities is.05. The value is called the critical value of the significance level. Critical values depend on test statistics, probability distributions, and test types (one-tailed versus two-tailed). Figure 1. Test Size (Significance Level) and Critical Value For example, the 1.96 (-1.96) in the standard normal distribution is the critical value of the two-tailed test at the.05 test size (significance level). Thus, the test size is the sum of probabilities that a sample statistic goes beyond the critical value (larger than 1.96 and less than -1.96). In the standard normal distribution, the critical value of a two-tailed test at the.10 significance level is (see A in Figure 1) and 2.58 for the test size.01. As

3 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 3 test size (significance level) decreases, the critical value is shifted to the extremes; the rejection areas become smaller; it is less likely to reject the null hypothesis. How do we substantively understand the test size or significance level? It is the extent that we are willing to take a risk of making wrong conclusion. The.05 level means that we are taking a risk of being wrong five times per 100 trials. A hypothesis test using a lenient test size like the.10 is more likely to reject the null hypothesis, but its conclusion is less convincing. In contrast, a stringent test size like.01 reports significant effects only when the effect size (deviation from the baseline) is large. Instead, the conclusion is more convincing (less risky). For example, the 1.90 in Figure 1 is considered exceptional at the.10 level (A), but not statistically discernable at the.05 level (B). What is the power of a test? The power of a statistical test is the probability that it will correctly lead to the rejection of a false null hypothesis (Greene 2000). The statistical power is the ability of a test to detect an effect, if the effect actually exists (High 2000). Cohen (1988) says, it is the probability that it will result in the conclusion that the phenomenon exists (p.4). A statistical power analysis is either retrospective (post hoc) or prospective (a priori). The statistical power is denoted by 1 ß, where ß (beta) is the Type II error, the probability of failing to reject the null hypothesis when it is false (Table 1). See the shaded areas of B and B in Figure 2. Conventionally a test with power greater than.8 level (or ß<=.2) is considered statistically powerful. 3. Components of the Statistical Power Analysis What do we need to consider when conducting the statistical power analyses? There are six components: 1. Model (test) 2. Standardized effect size: (1) effect size and (2) variation (variability) 3. Sample size (n) 4. Test size (significance level α) 5. Power of the test (1 ß) A research design contains specific models (tests) on which their statistical powers are based. Different models (tests) have different formulas to compute test statistics. For example, the T-test uses the T distribution to determine its statistical power, while ANOVA depends on the F distribution. A standardized effect size, a test statistic (e.g., T and F scores) is computed by combining the effect size and variation. 4 An effect size in actual units of the response is the degree to which the phenomenon exists (Cohen 1988). Alternatively, an effect size is the deviation of hypothesized value in the alternative hypothesis from the baseline in the null 4 In T-test, for example, the deviation (effect size) is divided by the standard error.

4 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 4 hypothesis. Variation (variability) is the standard deviation of population. Cohen (1988) calls it the reliability of sample results. This variation usually comes from previous research or pilot studies; otherwise, it needs to be estimated. Sample size (N) is the number of observations (cases) in a sample. As mentioned in the previous section, the test size or significance level (α) is the probability of rejecting the null hypothesis that is true. The power of the test (1 ß) is the probability of correctly rejecting a false null hypothesis. 4. Power Analysis Using an Example The computation of statistical power depends on specific a model (or test). The easiest model is the T-test with a relatively simple formula. Imagine a random variable like the number of deaths per 100 thousand people from lung cancer. Suppose that the variable is known to be normally distributed with a mean of 20 and a standard deviation of 4 ( H 0 : µ = 20 ). See the probability distribution A in Figure 2. Now, we believe that the mean is not 20, but 22 with the same standard deviation. See the probability distribution B in Figure 2. We also think that test size.05 will be fine. So, we are going to conduct a two-tailed T-test at the.05 significance level with the alternative hypothesis that the population mean is 22 ( H a : µ = 22 ). How can we test our conjecture (alternative hypothesis)? Suppose we took a random sample with 44 observations from the population. Think about the ordinary T-test first. We need to know how far the 22 is deviated from the baseline 20. Of course, the distance (effect size) is 2 (=22-20). But we do not know how big the effect size 2 is. Put differently, we do not know exactly how likely such a sample mean 22 can be observed if the true mean is 20. This is why people try to take advantage of using standardized probability distributions (e.g., T, F, and Chi-squared). By looking through these distribution tables, we are able to know the likelihood of observing a sample statistic in fairly easy manner. The A in Figure 2 is a standardized probability distribution of A. The 20 in A corresponds to 0 in A and in A is equivalent to in A. x µ The t statistic here is : t = n = 44. This value is located all s 4 the way to the right in A, indicating the p-value is extremely small. 5 If the null hypothesis of population mean 20 is true, it is quite unlikely to observe the sample mean 22 (p<=.01). Obviously, the conjecture of population mean 20 is not likely. Thus, the null hypothesis is undoubtedly rejected in favor of the alternative hypothesis. 5 The p-value of a test statistic is the sum of probabilities that the statistic and more exceptional (closer to both extremes) sample statistics are observed when the null hypothesis is true.

5 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 5 However, this test does not tell anything about the power of the test. We may want to know the extent that a test can detect the effect, if any. So, the question here is, How powerful is this T-test? There are four steps to compute the statistical power. Figure 2. Statistical Power of a T-test First, find the critical value in the original probability distribution with mean 20 and standard deviation 4. Let us use the standardized probability distribution A to know the corresponding critical value in the original distribution A. By looking at the T distribution table (df=43 at the.05 level), we know that the probability of being greater than or equal to is.025 (see A in Figure 2). Note that another.025 area is located at the

6 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 6 opposite direction because it is a two-tailed test. The is equivalent to = * in the original probability distribution A. 6 Next, imagine an alternative probability distribution with mean 22 and standard deviation 4 (see B in Figure 2). It is the original probability distribution shifted to the right by 2 units. What is the probability of being less than the in the alternative probability distribution B? The probability is ß. In order to know ß, again let us convert the original value into the standardized probability distribution of the alternative probability distribution. The question is, What is the T value for the in the alternative probability distribution? (B and B in Figure 2) The computations are made by one of the two approaches. We got t a t a = = = 44 = Third, find ß from the T distribution table with 43 (=N-1) degree of freedom. We can read This ß is the probability of falling into the region less than or equal to the in the standardized alternative probability distribution (see B in Figure 2). Finally, compute the statistical power, 1 β = P( t ta). The power is (= ). See the shaded areas of B and B in Figure 2. This high statistical power indicates that this T-test is highly likely to detect the effect or reject the null hypothesis that the population mean is Relationship among the Components Now, let us talk about how the components mentions in section 3 are related to each other. First, a model (or a test) dictates the formula for standardized effect sizes (test statistics). x µ s For example, T-test computes a standardized T score as t = n, where is an s n estimated population standard deviation (variation). How do standardized effect sizes affect the statistical power? As a standardized effect size increases, the power increases (positive relationship). Imagine an alternative probability distribution with mean 23. In order word, shift the alternative probability distribution B to the right by one unit, holding original probability distribution constant. Effect size becomes larger (3=23-20). It is obvious that ß becomes smaller and the statistical power, in turn, increases. By the same token, the power becomes smaller if we 6? = 44 4

7 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 7 shift the alternative distribution to the left by one to have mean 21. If the standardized effect size is small, it is difficult to detect effects even when they exist; it is less powerful. Now, how does the size of a test (significance level) affect the statistical power? If a researcher has a lenient significance level like the.10 rather than the.05 or.01, the statistical power of the test becomes larger (positive relationship). Imagine a line connecting in A, in A and B, and in B. And then shift the line to the left, leaving all probability distributions untouched. What will happen? The test size (alpha=significance level) increases; critical values in A and A are shifted to the left, increasing rejection areas; ß decreases; and finally statistical power increases. 7 Conversely, if we have a stringent significance level, the power of the test decreases; we are moving the line to the right! Figure 3. Relationships among the Components How about the sample size? A larger sample size generally leads to a parameter estimate with smaller variances, a larger standardized effect size, eventually, a greater ability to detect a significant difference (positive relationship). Look at the T statistic formula. In general, the most important component affecting the statistical power is the sample size. In fact, there is a little room to change a test size (significance level). It is also 7 There is a trade-off between the Type I error (alpha) and Type II error (beta). Moving the line to the left increases the Type I error, reducing the Type II error.

8 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 8 difficult to control effect sizes in many cases. It is costly and time-consuming to get more observations, of course. But the frequently asked question in practice is how many observations need to be collected. However, if too many observations are used (or if a test is too powerful with a large sample size), even a trivial effect will be mistakenly detected as a significant one. Thus, virtually anything can be proved regardless of actual effects. (High 2000). By contrast, if too few observations are used, the hypothesis test will result in low statistical power. 8 There may be little chance to detect a meaningful effect even when it exists there. How do we know if the number of our observations is reasonable? Sample size analysis can answer. 6. Applications We have conceptually discussed what the statistical power is. Now, let us analyze statistical power and sample size in several models using the SAS POWER procedure. a. Power analysis of a one-sample T-test Go back to the example mentioned in section four. Let me summarize the components of the statistical power analysis first. The goal is to compute statistical power of information about other components. Table 2. Summary Information for a Statistical Power Analysis Test Sample Size Test Size Power Effect Size Variation T-test 44.05? 2 (=22-20) 4 Let us use the SAS POWER procedure to conduct the same power analysis. You have to specify the type of a test first. The ONESAMPLEMEANS statement in the following example indicates the one-sample T-test. PROC POWER; ONESAMPLEMEANS ALPHA=.05 SIDES=2 NULLM=20 MEAN=22 STDDEV=4 NTOTAL=44 POWER=.; RUN; The ALPHA and SIDES options say a two-tailed test at the.05 significance level (test size). The NULLM option specifies the mean value of the null hypothesis (H 0 ), while MEAN option specifies hypothesized mean value (H a ). The STDDEV and the NTOTAL options respectively 8 Note that there is no clear cut-point of too many and too few, since it depends on models and specifications. For instance, if a model has many parameters to be estimated, or if a model uses the maximum likelihood estimation method, the model needs more observations than otherwise.

9 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 9 indicate the standard deviation (variation) and sample size (N). Finally, the POWER option ending with a period (.) asks SAS to compute the statistical power of this test. Take a look at the SAS output. The POWER Procedure One-sample t Test for Mean Fixed Scenario Elements Distribution Normal Method Exact Number of Sides 2 Null Mean 20 Alpha 0.05 Mean 22 Standard Deviation 4 Total Sample Size 44 Computed Power Power SAS summarizes the information about core components, and then returns the value of statistical power of the test. b. Sample size analysis of a one-sample T-test Let us change the scenario for a sample size analysis. Suppose we realize that some observations have unreliable information due to measurement errors, and that the population standard deviation is 3, not 4. Boss s guideline requires the.01 significance level in the upper one-tailed test and a lower power level.8. Now, we want to know the minimum sample size that can satisfy these conditions. Table 3. Summary Information for a Sample Size Analysis Test Sample Size Test Size Power Effect Size Variation T-test? (=22-20) 3 Look at the following SAS POWER procedure and its output. Note that the U of the SIDES option represents the upper one-tailed test with alternative value greater than the null value (L means a lower one-sided test). The test size (significance level) and standard deviation are corrected as requested. The NTOTAL option has a period (.), while the POWER option specifies the target level of statistical power. PROC POWER; ONESAMPLEMEANS ALPHA=.01 SIDES=U NULLM=20 MEAN=22 STDDEV=3 NTOTAL=.

10 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 10 RUN; POWER=.8; The following output says that only 26 observations are needed to reach the targeting power. The POWER Procedure One-sample t Test for Mean Fixed Scenario Elements Distribution Normal Method Exact Number of Sides U Null Mean 20 Alpha 0.01 Mean 22 Standard Deviation 3 Nominal Power 0.8 Computed N Total Actual Power N Total c. Power analysis of a paired-sample T-test Let us move on to the paired sample T-test. The PAIREDMEANS statement and the TEST=DIFF option are needed. The CORR option is also required to specify the correlation coefficient of the two paired variables, while the NPAIRS option specifies the number of pairs. You may list hypothesized numbers of pairs to see how statistical powers are sensitive to the number of pairs. PROC POWER; PAIREDMEANS TEST=DIFF ALPHA=.01 SIDES=2 MEANDIFF=3 STDDEV=3.5 CORR=.2 NPAIRS= POWER=.; RUN; Look at the following SAS output that produces three statistical powers according to the hypothesized sample sizes. The POWER Procedure Paired t Test for Mean Difference Fixed Scenario Elements Distribution Normal

11 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 11 Method Exact Number of Sides 2 Alpha 0.01 Mean Difference 3 Standard Deviation 3.5 Correlation 0.2 Null Difference 0 Computed Power N Index Pairs Power d. Power analysis of a two independent samples T-test The following example conducts a statistical power analysis for the two independent samples T-test. The TWOSAMPLEMEANS statement indicates the two independent samples T- test. The GROUPMEANS option species the means of two groups. The PLOT statement with X=N option draws a plot of statistical powers as N in the X-axis changes. PROC POWER; TWOSAMPLEMEANS ALPHA=.01 SIDES=2 GROUPMEANS= ( ) STDDEV= NTOTAL=44 POWER=.; PLOT X=N MIN=20 MAX=100 KEY=BYCURVE(NUMBERS=OFF POS=INSET); RUN; You may list more than one hypothesized standard deviation in the STDDEV option in order to know how statistical power of the test is sensitive to the standard deviations. The output is as follows. The POWER Procedure Two-sample t Test for Mean Difference Fixed Scenario Elements Distribution Normal Method Exact Number of Sides 2 Alpha 0.01 Group 1 Mean 2.79 Group 2 Mean 4.12 Total Sample Size 44 Null Difference 0 Group 1 Weight 1 Group 2 Weight 1

12 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 12 Computed Power Std Index Dev Power A plot of statistical powers and sample sizes visualizes the sensitivity of a test. The following plot shows that the power for standard deviation 1.5 is more sensitive to sample size than those for larger standard deviations when N is less than 60. Figure 4. A Plot of Statistical Power and Sample Size 1. 0 St d Dev=1. 5 St d Dev=2 St d Dev= Tot al Sampl e Si ze e. Power analysis of a one-way ANOVA Now, consider a one-way ANOVA. The POWER procedure has the ONEWAYANOVA statement for its analysis. Note that the NPERGROUP option specifies the number of observations of a group, assuming balanced data. 9 9 When each group has the same number of observations, we call them balanced data; otherwise, unbalanced data.

13 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 13 PROC POWER; ONEWAYANOVA TEST=OVERALL ALPHA=.05 GROUPMEANS= ( ) STDDEV= NPERGROUP= POWER=.; RUN; The above POWER procedure analyzes ANOVA with four different groups. Listing several standard deviations and numbers of observations conducts sensitivity analysis of the statistical power, producing the six combinations (=3 X 2). The POWER Procedure Overall F Test for One-Way ANOVA Fixed Scenario Elements Method Exact Alpha 0.05 Group Means Computed Power Std N Per Index Dev Group Power The SAS GLMPOWER procedure also conducts a power analysis for one-way ANOVA. Unlike the POWER procedure, this procedure requires an existing SAS data set. Like the ANOVA procedure, the CLASS and the MODEL statements are required in this GLMPOWER procedure. Note that the POWER is not an option, but a statement in this procedure. PROC GLMPOWER DATA=power.car; CLASS mode; MODEL credits=mode; POWER STDDEV=10 ALPHA=.01 NTOTAL=1000 POWER=.; RUN; The output of the GLMPOWER procedure is similar to that of the POWER procedure. The GLMPOWER Procedure Fixed Scenario Elements

14 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 14 Dependent Variable credits Source mode Alpha 0.01 Error Standard Deviation 10 Total Sample Size 1000 Test Degrees of Freedom 3 Error Degrees of Freedom 996 Computed Power Power f. Power analysis of a two-way ANOVA In order to conduct power analysis of the two-way ANOVA in SAS, we need to use the GLMPOWER procedure because the POWER procedure does not support this model. Again, note that the CLASS and MODEL statements are requited. PROC GLMPOWER DATA=power.car; CLASS owncar sex; MODEL credits= owncar sex; POWER STDDEV=10 ALPHA=.01 NTOTAL=. POWER=.8; RUN; Here is the output of the GLMPOWER procedure. The GLMPOWER Procedure Fixed Scenario Elements Dependent Variable credits Alpha 0.01 Error Standard Deviation 10 Nominal Power 0.8 Computed N Total Test Actual Index Source DF Error DF Power N Total 1 owncar sex Software Issues There are various software packages for statistical power and sample size analyses. Among them are the POWER and GLMPOWER of SAS/STAT, SPSS SamplePower 2.0,

15 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 15 and G*Power. 10 See the following for the list and review of statistical power analysis software. These software packages vary in scope, accuracy, flexibility, and interface (Thomas and Krebs 1997). Some packages may support a test, while others may not. They may be general purpose statistical software with such functions embedded (e.g., SAS) or professional software with specialty in statistical power and/or sample size analysis (e.g., G*Power). Some software package like SAS/STAT Power and Sample Size (PSS) is not stand-alone, but Web-based. Following Web site has a statistical power calculator than enables you to compute power online. Some software like G*Power runs under DOS mode. The majority of existing power analysis software works in Microsoft Windows. Some are written for UNIX machine. Most software uses the point-and-click interface, while others depend on command line. They may adopt different algorithms so that their results may be different. Figure 5 illustrates how to conduct the power analysis of a one-way ANOVA (Section 6.e) using G*Power. The result.9723 corresponds to the first case that has standard deviation 4 and 10 observations in each group. Figure 5. Power Analysis of a one-way ANOVA in G*Power Figure 6 is a screenshot of SamplePower 2.0, replicating the power analysis of a onesample T-test (Section 6.a). The summary pop-up window wraps up the power analysis. 10 G*Power is available on

16 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 16 Figure 6. Power Analysis of a one-sample T-test in SamplePower 2.0

17 2004 by Jeeshim and KUCC625 (11/28/2004) Understanding the Statistical Power: 17 References Cohen, Jacob Statistical Power Analysis for the Behavioral Sciences, 2 nd ed. Hillsdale, NJ: L. Erlbaum Associates. Greene, William H Econometric Analysis, 4th ed. Prentice Hall. High, Robin Kirk, Roger E Experimental Design: Procedures for the Behavioral Science, 3 rd ed. Pacific Grove, CA: Brooks/Cole Publishing. Murphy, Kevin R Statistical Power Analysis: A Simple and General Model for Traditional and Modern Hypothesis Tests. Mahwah, NJ: L. Erlbaum Associates. SAS Institute Getting Started with the SAS Power and Sample Size Application. Cary, NC: SAS Institute. SAS Institute SAS/STAT User s Guide, Version 9.1. Cary, NC: SAS Institute. Thomas, Len, and Charlies J. Krebs A Review of Statistical Power Analysis Software, Bulletin of the Ecological Society of America, 78 (2): Available at

### Hypothesis Testing and Statistical Power of a Test *

2004-2008, The Trustees of Indiana University Hypothesis Testing and Statistical Power: 1 Hypothesis Testing and Statistical Power of a Test * Hun Myoung Park (kucc625@indiana.edu) This document provides

### 5. 가설과통계추론 5.1 가설이란무엇인가

2008-2009 Jeeshim & KUCC625 (08/04/2009) Statistical Data Analysis Using R:16 5. 가설과통계추론 이장에서는가설이무엇이고, 통계추론을어떻게하는지세가지방법을설명한다. 5.1 가설이란무엇인가 A hypothesis is a specific conjecture (statement) about a characteristic

### OVERVIEW AND EXAMPLES OF POWER AND SAMPLE SIZE DETERMINATION USING SAS PROC POWER

OVERVIEW AND EXAMPLES OF POWER AND SAMPLE SIZE DETERMINATION USING SAS PROC POWER Design, Epidemiology, and Biostatistics Core Pennington Biomedical Research Center Baton Rouge, LA Supported by 1 U54 GM104940

### Chapter 56 The POWER Procedure (Experimental) Chapter Contents

Chapter 56 The POWER Procedure (Experimental) Chapter Contents OVERVIEW................................... 3171 GETTING STARTED.............................. 3172 Computing Power for a One-Sample t Test..................

### Testing Hypotheses using SPSS

Is the mean hourly rate of male workers \$2.00? T-Test One-Sample Statistics Std. Error N Mean Std. Deviation Mean 2997 2.0522 6.6282.2 One-Sample Test Test Value = 2 95% Confidence Interval Mean of the

### Hypothesis Testing. Chapter 7

Hypothesis Testing Chapter 7 Hypothesis Testing Time to make the educated guess after answering: What the population is, how to extract the sample, what characteristics to measure in the sample, After

### UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST A dependent-samples t test (a.k.a. matched or paired-samples, matched-pairs, samples, or subjects, simple repeated-measures or within-groups, or correlated groups)

### Lecture Topic 6: Chapter 9 Hypothesis Testing

Lecture Topic 6: Chapter 9 Hypothesis Testing 9.1 Developing Null and Alternative Hypotheses Hypothesis testing can be used to determine whether a statement about the value of a population parameter should

### Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### SAMPLE SIZE ESTIMATION USING KREJCIE AND MORGAN AND COHEN STATISTICAL POWER ANALYSIS: A COMPARISON. Chua Lee Chuan Jabatan Penyelidikan ABSTRACT

SAMPLE SIZE ESTIMATION USING KREJCIE AND MORGAN AND COHEN STATISTICAL POWER ANALYSIS: A COMPARISON Chua Lee Chuan Jabatan Penyelidikan ABSTRACT In most situations, researchers do not have access to an

### Chapter 8 Introduction to Hypothesis Testing

Chapter 8 Student Lecture Notes 8-1 Chapter 8 Introduction to Hypothesis Testing Fall 26 Fundamentals of Business Statistics 1 Chapter Goals After completing this chapter, you should be able to: Formulate

### Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

Consider a study in which How many subjects? The importance of sample size calculations Office of Research Protections Brown Bag Series KB Boomer, Ph.D. Director, boomer@stat.psu.edu A researcher conducts

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### 9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

Chapter 8 Hypothesis Tests Chapter Table of Contents Introduction...157 One-Sample t-test...158 Paired t-test...164 Two-Sample Test for Proportions...169 Two-Sample Test for Variances...172 Discussion

### Hypothesis Testing. Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University

Hypothesis Testing Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 AMU / Bon-Tech, LLC, Journi-Tech Corporation Copyright 2015 Learning Objectives Upon successful

### STT 430/630/ES 760 Lecture Notes: Chapter 6: Hypothesis Testing 1. February 23, 2009 Chapter 6: Introduction to Hypothesis Testing

STT 430/630/ES 760 Lecture Notes: Chapter 6: Hypothesis Testing 1 February 23, 2009 Chapter 6: Introduction to Hypothesis Testing One of the primary uses of statistics is to use data to infer something

### Tests for Two Correlated Proportions (McNemar Test)

Chapter 150 Tests for Two Correlated Proportions (McNemar Test) Introduction McNemar s test compares the proportions for two correlated dichotomous variables. These two variables may be two responses on

### Hypothesis testing - Steps

Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

### ANOVA MULTIPLE CHOICE QUESTIONS. In the following multiple-choice questions, select the best answer.

ANOVA MULTIPLE CHOICE QUESTIONS In the following multiple-choice questions, select the best answer. 1. Analysis of variance is a statistical method of comparing the of several populations. a. standard

### HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

### SUGI 29 Statistics and Data Analysis

Paper 195-29 "Proc Power in SAS 9.1" Debbie Bauer, MS, Sanofi-Synthelabo, Russell Lavery, Contractor, About Consulting, Chadds Ford, PA ABSTRACT Features were added to SAS V9.1 STAT that will allow many

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### c. The factor is the type of TV program that was watched. The treatment is the embedded commercials in the TV programs.

STAT E-150 - Statistical Methods Assignment 9 Solutions Exercises 12.8, 12.13, 12.75 For each test: Include appropriate graphs to see that the conditions are met. Use Tukey's Honestly Significant Difference

### Hypothesis Testing hypothesis testing approach formulation of the test statistic

Hypothesis Testing For the next few lectures, we re going to look at various test statistics that are formulated to allow us to test hypotheses in a variety of contexts: In all cases, the hypothesis testing

About Hypothesis Testing TABLE OF CONTENTS About Hypothesis Testing... 1 What is a HYPOTHESIS TEST?... 1 Hypothesis Testing... 1 Hypothesis Testing... 1 Steps in Hypothesis Testing... 2 Steps in Hypothesis

### Point Biserial Correlation Tests

Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

### Week 7 Lecture: Two-way Analysis of Variance (Chapter 12) Two-way ANOVA with Equal Replication (see Zar s section 12.1)

Week 7 Lecture: Two-way Analysis of Variance (Chapter ) We can extend the idea of a one-way ANOVA, which tests the effects of one factor on a response variable, to a two-way ANOVA which tests the effects

### One-Sample t-test. Example 1: Mortgage Process Time. Problem. Data set. Data collection. Tools

One-Sample t-test Example 1: Mortgage Process Time Problem A faster loan processing time produces higher productivity and greater customer satisfaction. A financial services institution wants to establish

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

### Box plots & t-tests. Example

Box plots & t-tests Box Plots Box plots are a graphical representation of your sample (easy to visualize descriptive statistics); they are also known as box-and-whisker diagrams. Any data that you can

### General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics

### Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

### Multiple One-Sample or Paired T-Tests

Chapter 610 Multiple One-Sample or Paired T-Tests Introduction This chapter describes how to estimate power and sample size (number of arrays) for paired and one sample highthroughput studies using the.

### Statistical design of experiments

Faculty of Health Sciences Statistical design of experiments Basic statistics for experimental researchers 2015 Julie Lyng Forman Department of Biostatistics, University of Copenhagen Contents Considerations

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### HYPOTHESIS TESTING AND TYPE I AND TYPE II ERROR

HYPOTHESIS TESTING AND TYPE I AND TYPE II ERROR Hypothesis is a conjecture (an inferring) about one or more population parameters. Null Hypothesis (H 0 ) is a statement of no difference or no relationship

### Homework #3 is due Friday by 5pm. Homework #4 will be posted to the class website later this week. It will be due Friday, March 7 th, at 5pm.

Homework #3 is due Friday by 5pm. Homework #4 will be posted to the class website later this week. It will be due Friday, March 7 th, at 5pm. Political Science 15 Lecture 12: Hypothesis Testing Sampling

Chapter 12 Sample Size and Power Calculations Chapter Table of Contents Introduction...253 Hypothesis Testing...255 Confidence Intervals...260 Equivalence Tests...264 One-Way ANOVA...269 Power Computation

### Biostatistics Lab Notes

Biostatistics Lab Notes Page 1 Lab 1: Measurement and Sampling Biostatistics Lab Notes Because we used a chance mechanism to select our sample, each sample will differ. My data set (GerstmanB.sav), looks

### Correlational Research

Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

### Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### Sample Size Calculation and Power Analysis for Design of Experiments Using Proc GLMPOWER Chii-Dean Joey Lin, SDSU, San Diego, CA

Sample Size Calculation and Power Analysis for Design of Experiments Using Proc GLMPOWER Chii-Dean Joey Lin, SDSU, San Diego, CA ABSTRACT How many samples do I need? When a statistician or a quantitative

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### TRANSCRIPT: In this lecture, we will talk about both theoretical and applied concepts related to hypothesis testing.

This is Dr. Chumney. The focus of this lecture is hypothesis testing both what it is, how hypothesis tests are used, and how to conduct hypothesis tests. 1 In this lecture, we will talk about both theoretical

### Hypothesis testing: Examples. AMS7, Spring 2012

Hypothesis testing: Examples AMS7, Spring 2012 Example 1: Testing a Claim about a Proportion Sect. 7.3, # 2: Survey of Drinking: In a Gallup survey, 1087 randomly selected adults were asked whether they

### MCQ TESTING OF HYPOTHESIS

MCQ TESTING OF HYPOTHESIS MCQ 13.1 A statement about a population developed for the purpose of testing is called: (a) Hypothesis (b) Hypothesis testing (c) Level of significance (d) Test-statistic MCQ

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 9 - FUNDAMENTALS OF HYPOTHESIS TESTING: ONE-SAMPLE TESTS

The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 302) Spring Semester 20 Chapter 9 - FUNDAMENTALS OF HYPOTHESIS

### UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### The Chi-Square Distributions

MATH 183 The Chi-Square Distributions Dr. Neal, WKU The chi-square distributions can be used in statistics to analyze the standard deviation " of a normally distributed measurement and to test the goodness

### Statistical Inference and t-tests

1 Statistical Inference and t-tests Objectives Evaluate the difference between a sample mean and a target value using a one-sample t-test. Evaluate the difference between a sample mean and a target value

### Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction

Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments - Introduction

### Chapter 8: Introduction to Hypothesis Testing

Chapter 8: Introduction to Hypothesis Testing We re now at the point where we can discuss the logic of hypothesis testing. This procedure will underlie the statistical analyses that we ll use for the remainder

### Introduction to Power and Sample Size Analysis (Book Excerpt)

SAS/STAT 9.22 User s Guide Introduction to Power and Sample Size Analysis (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.22 User s Guide. The correct bibliographic

### Two Related Samples t Test

Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### I. Basics of Hypothesis Testing

Introduction to Hypothesis Testing This deals with an issue highly similar to what we did in the previous chapter. In that chapter we used sample information to make inferences about the range of possibilities

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### Hypothesis Testing or How to Decide to Decide Edpsy 580

Hypothesis Testing or How to Decide to Decide Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Hypothesis Testing or How to Decide to Decide

### EXST SAS Lab Lab #7: Hypothesis testing with Paired t-tests and One-tailed t-tests

EXST SAS Lab Lab #7: Hypothesis testing with Paired t-tests and One-tailed t-tests Objectives 1. Infile two external data sets (TXT files) 2. Calculate a difference between two variables in the data step

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### PASS Sample Size Software. Linear Regression

Chapter 855 Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression analysis is to test hypotheses about the slope (sometimes

### Testing: is my coin fair?

Testing: is my coin fair? Formally: we want to make some inference about P(head) Try it: toss coin several times (say 7 times) Assume that it is fair ( P(head)= ), and see if this assumption is compatible

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### Chapter 6: t test for dependent samples

Chapter 6: t test for dependent samples ****This chapter corresponds to chapter 11 of your book ( t(ea) for Two (Again) ). What it is: The t test for dependent samples is used to determine whether the

### Chapter 2 Probability Topics SPSS T tests

Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the One-Sample T test has been explained. In this handout, we also give the SPSS methods to perform

### PASS Sample Size Software

Chapter 250 Introduction The Chi-square test is often used to test whether sets of frequencies or proportions follow certain patterns. The two most common instances are tests of goodness of fit using multinomial

### Chapter 16 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 16 Multiple Choice Questions (The answers are provided after the last question.) 1. Which of the following symbols represents a population parameter? a. SD b. σ c. r d. 0 2. If you drew all possible

### Statistics II Final Exam - January Use the University stationery to give your answers to the following questions.

Statistics II Final Exam - January 2012 Use the University stationery to give your answers to the following questions. Do not forget to write down your name and class group in each page. Indicate clearly

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Null and Alternative Hypotheses. Lecture # 3. Steps in Conducting a Hypothesis Test (Cont d) Steps in Conducting a Hypothesis Test

Lecture # 3 Significance Testing Is there a significant difference between a measured and a standard amount (that can not be accounted for by random error alone)? aka Hypothesis testing- H 0 (null hypothesis)

### Statistics for Clinical Trial SAS Programmers 1: paired t-test Kevin Lee, Covance Inc., Conshohocken, PA

Statistics for Clinical Trial SAS Programmers 1: paired t-test Kevin Lee, Covance Inc., Conshohocken, PA ABSTRACT This paper is intended for SAS programmers who are interested in understanding common statistical

### Hypothesis Testing Summary

Hypothesis Testing Summary Hypothesis testing begins with the drawing of a sample and calculating its characteristics (aka, statistics ). A statistical test (a specific form of a hypothesis test) is an

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### JMP for Basic Univariate and Multivariate Statistics

JMP for Basic Univariate and Multivariate Statistics Methods for Researchers and Social Scientists Second Edition Ann Lehman, Norm O Rourke, Larry Hatcher and Edward J. Stepanski Lehman, Ann, Norm O Rourke,

### Title. Description. Quick start. Menu. stata.com. ztest z tests (mean-comparison tests, known variance)

Title stata.com ztest z tests (mean-comparison tests, known variance) Description Quick start Menu Syntax Options Remarks and examples Stored results Methods and formulas References Also see Description

### Power & Effect Size power Effect Size

Power & Effect Size Until recently, researchers were primarily concerned with controlling Type I errors (i.e. finding a difference when one does not truly exist). Although it is important to make sure

### Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one

Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win \$1 if head and lose \$1 if tail. After 10 plays, you lost \$2 in net (4 heads

### Hypothesis Testing 1

Hypothesis Testing 1 Statistical procedures for addressing research questions involves formulating a concise statement of the hypothesis to be tested. The hypothesis to be tested is referred to as the

### Water Quality Problem. Hypothesis Testing of Means. Water Quality Example. Water Quality Example. Water quality example. Water Quality Example

Water Quality Problem Hypothesis Testing of Means Dr. Tom Ilvento FREC 408 Suppose I am concerned about the quality of drinking water for people who use wells in a particular geographic area I will test

### SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS)

SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS) State of the course address: The Final exam is Aug 9, 3:30pm 6:30pm in B9201 in the Burnaby Campus. (One

### Chapter 8. Hypothesis Testing

Chapter 8 Hypothesis Testing Hypothesis In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### MAT X Hypothesis Testing - Part I

MAT 2379 3X Hypothesis Testing - Part I Definition : A hypothesis is a conjecture concerning a value of a population parameter (or the shape of the population). The hypothesis will be tested by evaluating

### One-sample normal hypothesis Testing, paired t-test, two-sample normal inference, normal probability plots

1 / 27 One-sample normal hypothesis Testing, paired t-test, two-sample normal inference, normal probability plots Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis

### Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

### Power and Sample Size

Power and Sample Size John M. Boltri, M.D. Mercer University School of Medicine Cindy Passmore, M.S. Waco Faculty Development Center Robert L. Vogel, Ph.D. Jiann-Ping Hsu School of Public Health Georgia

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Single sample hypothesis testing, II 9.07 3/02/2004

Single sample hypothesis testing, II 9.07 3/02/2004 Outline Very brief review One-tailed vs. two-tailed tests Small sample testing Significance & multiple tests II: Data snooping What do our results mean?

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Chapter 7. Section Introduction to Hypothesis Testing

Section 7.1 - Introduction to Hypothesis Testing Chapter 7 Objectives: State a null hypothesis and an alternative hypothesis Identify type I and type II errors and interpret the level of significance Determine