G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences




Running Head: G*Power 3

G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences (in press, Behavior Research Methods)

Franz Faul, Christian-Albrechts-Universität Kiel, Kiel, Germany
Edgar Erdfelder, Universität Mannheim, Mannheim, Germany
Albert-Georg Lang and Axel Buchner, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany

Please send correspondence to: Prof. Dr. Edgar Erdfelder, Lehrstuhl für Psychologie III, Universität Mannheim, Schloss Ehrenhof Ost 255, D-68131 Mannheim, Germany. Email: erdfelder@psychologie.uni-mannheim.de, Phone: +49 621 181-2146, Fax: +49 621 181-2137

Abstract

G*Power (Erdfelder, Faul, & Buchner, Behavior Research Methods, Instruments, & Computers, 1996) was designed as a general stand-alone power analysis program for statistical tests commonly used in social and behavioral research. G*Power 3 is a major extension of, and improvement over, the previous versions. It runs on widely used computer platforms (Windows XP, Windows Vista, and Mac OS X 10.4) and covers many different statistical tests of the t, F, and χ² test families. In addition, it includes power analyses for z tests and some exact tests. G*Power 3 provides improved effect size calculators and graphic options, supports both a distribution-based and a design-based input mode, and offers all types of power analyses in which users might be interested. Like its predecessors, G*Power 3 is free.

G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences

Statistics textbooks in the social, behavioral, and biomedical sciences typically stress the importance of power analyses. By definition, the power of a statistical test is the probability of rejecting its null hypothesis given that it is in fact false. Obviously, significance tests lacking statistical power are of limited use because they cannot reliably discriminate between the null hypothesis (H0) and the alternative hypothesis (H1) of interest. However, although power analyses are indispensable for rational statistical decisions, it took until the late 1980s until power charts (e.g., Scheffé, 1959) and power tables (e.g., Cohen, 1988) were supplemented by more efficient, precise, and easy-to-use power analysis programs for personal computers (Goldstein, 1989). G*Power (Erdfelder, Faul, & Buchner, 1996) can be seen as a second-generation power analysis program designed as a stand-alone application to handle several types of statistical tests commonly used in social and behavioral research. In the past ten years, this program has been found useful not only in the social and behavioral sciences but also in many other disciplines that routinely apply statistical tests, for example, biology (Baeza & Stotz, 2003), genetics (Akkad et al., 2006), ecology (Sheppard, 1999), forest and wildlife research (Mellina, Hinch, Donaldson, & Pearson, 2005), the geosciences (Busbey, 1999), pharmacology (Quednow et al., 2004), and medical research (Gleissner, Clusmann, Sassen, Elger, & Helmstaedter, 2006).
G*Power was evaluated positively in the reviews we are aware of (Kornbrot, 1997; Ortseifen, Bruckner, Burke, & Kieser, 1997; Thomas & Krebs, 1997), and it has been used in several power tutorials (e.g., Buchner, Erdfelder, & Faul, 1996, 1997; Erdfelder, Buchner, Faul, & Brandt, 2004; Levin, 1997; Sheppard, 1999) as well as in statistics textbooks (e.g., Field, 2005; Keppel & Wickens, 2004; Myers & Well, 2003; Rasch, Friese, Hofmann, & Naumann, 2006a, 2006b). Nevertheless, the user feedback that we received converged with our own experience in showing some limitations and weaknesses of G*Power that required a major extension and revision. The present article describes G*Power 3, a program that was designed to address the problems of G*Power. We will first outline the major improvements in G*Power 3 (Section 1) before we discuss the types of power analyses covered by this program (Section 2). Next, we

G*Power 3 (BSC70 Pae 4 describe proram handlin (Section 3 and the types of statistical tests to which it can be applied (Section 4. The fifth section is devoted to the statistical alorithms of G*Power 3 and their accuracy. Finally, proram availability and some internet resources supportin users of G*Power 3 are described in Section 6. Improvements in G*Power 3 compared to G*Power G*Power 3 is an improvement over G*Power in five major aspects. First, whereas G*Power requires the DOS and Mac-OS 7- operatin systems that were common in the 0s but are now outdated, G*Power 3 runs on the currently most widely used personal computer platforms, that is, Windows XP, Windows Vista, and Mac OS X 0.4. The Windows and the Mac versions of the proram are essentially equivalent. They use the same computational routines and share very similar user interfaces. For this reason, we will not differentiate between both versions in the sequel; users simply have to make sure to download the version appropriate for their operatin system. Second, whereas G*Power is limited to three types of power analyses, G*Power 3 supports five different ways to assess statistical power. In addition to a priori analyses, post hoc analyses, and compromise power analyses that were already covered by G*Power, the new proram also offers sensitivity analyses and criterion analyses. Third, G*Power 3 provides dedicated power analysis options for a variety of frequentlyused t tests, F tests, z tests,! tests, and eact tests, rather than just the standard tests covered by G*Power. The tests captured by G*Power 3 are described in Section 3 alon with their effect size parameters. Importantly, users are not limited to these tests because G*Power 3 also offers power analyses for eneric t-, F-, z-,!, and binomial tests for which the noncentrality parameter of the distribution under H may be entered directly. 
In this way, users are provided with a flexible tool for computing the power of basically any statistical test that uses t, F, z, χ², or binomial reference distributions. Fourth, statistical tests can be specified in G*Power 3 using two different approaches, the distribution-based approach and the design-based approach. In the distribution-based approach,

users select (a) the family of the test statistic (i.e., t, F, z, χ², or exact test) and (b) the particular test within this family. This is the way in which power analyses were specified in G*Power. Additionally, a separate menu in G*Power 3 provides access to power analyses via the design-based approach: Users select (a) the parameter class the statistical test refers to (i.e., correlations, means, proportions, regression coefficients, or variances) and (b) the design of the study (e.g., number of groups, independent vs. dependent samples, etc.). Based on the feedback we received on G*Power, we expect that some users might find the design-based input mode more intuitive and easier to use. Fifth, G*Power 3 supports users with enhanced graphics features. The details of these features will be outlined in Section 3, along with a description of program handling.

2 Types of Statistical Power Analyses

The power (1 − β) of a statistical test is the complement of β, which denotes the type-2 or beta error probability of falsely retaining an incorrect H0. Statistical power depends on three classes of parameters: (1) the significance level (or, synonymously, the type-1 error probability) α of the test, (2) the size(s) of the sample(s) used for the test, and (3) an effect size parameter defining H1 and thus indexing the degree of deviation from H0 in the underlying population. Depending on the available resources, the actual phase of the research process, and the specific research question, five different types of power analysis can be reasonable (cf. Erdfelder et al., 2004; Erdfelder, Faul, & Buchner, 2005). We describe these methods and their uses in turn. (1) In a priori power analyses (Cohen, 1988), the sample size N is computed as a function of the required power level (1 − β), the pre-specified significance level α, and the population effect size to be detected with probability (1 − β).
A priori analyses provide an efficient method of controlling statistical power before a study is actually conducted (e.g., Bredenkamp, 1969; Hager, 2006) and can be recommended whenever resources such as time and money required for data collection are not critical. (2) In contrast, post hoc power analyses (Cohen, 1988) often make sense after a study has already been conducted. In post hoc analyses, the power (1 − β) is computed as a function of α, the

population effect size parameter, and the sample size(s) used in a study. It thus becomes possible to assess whether a published statistical test in fact had a fair chance to reject an incorrect H0. Importantly, post hoc analyses, like a priori analyses, require an H1 effect size specification for the underlying population. They should not be confused with so-called retrospective power analyses, in which the effect size is estimated from sample data and used to calculate the observed power, a sample estimate of the true power. Retrospective power analyses are based on the highly questionable assumption that the sample effect size is essentially identical to the effect size in the population from which it was drawn (Zumbo & Hubley, 1998). Obviously, this assumption is likely to be false, the more so the smaller the sample. In addition, sample effect sizes are typically biased estimates of their population counterparts (Richardson, 1996). For these reasons, we agree with other critics of retrospective power analyses (e.g., Gerard, Smith, & Weerakkody, 1998; Hoenig & Heisey, 2001; Kromrey & Hogarty, 2000; Lenth, 2001; Steidl, Hayes, & Schauber, 1997). Rather than using retrospective power analyses, researchers should specify population effect sizes on a priori grounds. Effect size specification simply means to define the minimum degree of violation of H0 a researcher would like to detect with a probability not less than (1 − β). Cohen's (1988) definitions of small, medium, and large effects can be helpful in such effect size specifications (see, e.g., Smith & Bayen, 2005). However, researchers should be aware of the fact that these conventions may have different meanings for different tests (cf. Erdfelder et al., 2005). (3) In compromise power analyses (Erdfelder, 1984; Erdfelder et al., 1996; Müller, Manz, & Hoyer, 2001), both α and (1 − β) are computed as functions of the effect size, N, and an error probability ratio q = β/α.
To illustrate, q = 1 would mean that the researcher prefers balanced type-1 and type-2 error risks (α = β), whereas q = 4 would imply that β = 4α (cf. Cohen, 1988). Compromise power analyses can be useful both before and after data collection. For example, an a priori power analysis might result in a sample size that exceeds the available resources. In such a situation, a researcher could specify the maximum affordable sample size and, using a compromise power analysis, compute α and (1 − β) associated with, say, q = β/α = 4. Alternatively, if a study has already been conducted but not yet been analyzed, a researcher could ask for a reasonable decision criterion that guarantees perfectly balanced error risks (i.e., α = β) given the size of this

sample and a critical effect size she is interested in. Of course, compromise power analyses can easily result in unconventional significance levels larger than α = .05 (in the case of small samples or effect sizes) or smaller than α = .001 (in the case of large samples or effect sizes). However, we believe that the benefit of balanced type-1 and type-2 error risks often offsets the costs of violating significance level conventions (cf. Gigerenzer, Krauss, & Vitouch, 2004). (4) In sensitivity analyses, the critical population effect size is computed as a function of α, (1 − β), and N. Sensitivity analyses may be particularly useful for evaluating published research. They provide answers to questions like the following: What is the effect size a study was able to detect with a power of (1 − β) = .80, given its sample size and the α specified by the author? In other words, what is the minimum effect size to which the test was sufficiently sensitive? In addition, sensitivity analyses may be useful before conducting a study to see whether, given a limited N, the size of the effect that can be detected is at all realistic (or, for instance, way too large to be expected realistically). (5) Finally, criterion analyses compute α (and the associated decision criterion) as a function of (1 − β), the effect size, and a given sample size. Criterion analyses are alternatives to post hoc power analyses after a study has already been conducted. They may be reasonable whenever the control of α is less important than the control of β. In the case of goodness-of-fit tests for statistical models, for example, it is most important to minimize the β-risk of wrong decisions in favor of the model (H0). Researchers could thus use criterion analyses to compute the significance level α compatible with β = .05 for a small effect size. Whereas G*Power was limited to the first three types of power analysis, G*Power 3 now covers all five types.
Based on the feedback we received from G*Power users, we believe that any question related to statistical power that occurs in research practice can be translated into one of these analysis types.
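To make the least familiar of the five types concrete: a compromise analysis amounts to searching for the decision criterion at which β/α equals the requested ratio q. The sketch below does this for a one-tailed two-groups t test with scipy; the function and parameterization are ours for illustration, not G*Power's actual implementation.

```python
# Compromise power analysis (sketch): find the critical value c such that
# beta(c)/alpha(c) = q for a one-tailed two-sample t test with effect size d
# and n subjects per group.
from scipy import optimize, stats

def compromise(d, n, q):
    """Return (alpha, beta, crit) with beta/alpha equal to q."""
    df = 2 * n - 2                    # degrees of freedom
    delta = d * (n / 2) ** 0.5        # noncentrality parameter under H1

    def ratio(c):
        alpha = stats.t.sf(c, df)             # P(T > c | H0)
        beta = stats.nct.cdf(c, df, delta)    # P(T <= c | H1)
        return beta / alpha - q

    crit = optimize.brentq(ratio, 0.0, 10.0)  # solve beta/alpha = q
    return stats.t.sf(crit, df), stats.nct.cdf(crit, df, delta), crit

# q = 1 requests perfectly balanced error risks (alpha = beta)
alpha, beta, crit = compromise(d=0.5, n=30, q=1.0)
```

Note how, exactly as described above, the resulting α need not equal a conventional level such as .05; it is whatever value balances the two error risks for the given design.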

3 Program Handling

Using G*Power 3 typically involves the following four steps: (1) select the statistical test appropriate for your problem, (2) choose one of the five types of power analysis defined in the previous section, (3) provide the input parameters required for the analysis, and (4) click on "Calculate" to obtain the results. In the first step, the statistical test is chosen using the distribution-based or the design-based approach. G*Power users probably have adapted to the distribution-based approach: One first selects the family of the test statistic (i.e., t, F, z, χ², or exact test) using the "Test family" menu in the main window. The "Statistical test" menu adapts accordingly, showing a list of all tests available for the chosen test family. For the two-groups t test, for example, one would first select the t family of distributions and then "Means: Differences between two independent means (two groups)" in the "Statistical test" menu (see Figure 1). Alternatively, one might use the design-based approach of test selection. With the "Tests" pull-down menu in the top row it is possible to select (a) the parameter class the statistical test refers to (i.e., correlations, means, proportions, regression coefficients, or variances) and (b) the design of the study (e.g., number of groups, independent vs. dependent samples, etc.). For example, researchers would select "Means → Two independent groups" to specify the two-groups t test (see Figure 2). The design-based approach has the advantage that test options referring to the same parameter class (e.g., means) are located in close proximity, whereas they may be scattered across different distribution families in the distribution-based approach.

Please insert Figures 1 and 2 about here.

In the second step, the "Type of power analysis" menu in the center of the main window should be used to choose the appropriate analysis type. In the third step, the power analysis input parameters are specified in the lower left of the main window.
To illustrate, an a priori power analysis for a two-groups t test would require a decision between a one-tailed and a two-tailed test,

a specification of Cohen's (1988) effect size measure d under H1, the significance level α, the required power (1 − β) of the test, and the preferred group size allocation ratio n2/n1. The final step consists of clicking "Calculate" to obtain the output in the lower right of the main window. For instance, input parameters specifying a one-tailed t test, a medium effect size of d = .5, α = .05, (1 − β) = .95, and an allocation ratio of n2/n1 = 1 would result in a total sample size of N = 176 (88 observation units in each group; see Figures 1 and 2). The noncentrality parameter δ defining the t distribution under H1, the decision criterion to be used (i.e., the critical value of the t statistic), the degrees of freedom of the t test, and the actual power value are also displayed. Note that the actual power will often be slightly larger than the pre-specified power in a priori power analyses. The reason is that non-integer sample sizes are always rounded up by G*Power to obtain integer values consistent with a power level not less than the pre-specified one. In addition to the numerical output, G*Power 3 displays the central (H0) and the noncentral (H1) test statistic distributions along with the decision criterion and the associated error probabilities in the upper part of the main window (see Figure 3). This supports understanding the effects of the input parameters and is likely to be a useful visualization tool in the teaching of, or the learning about, inferential statistics. The distributions plot may be printed, saved, or copied by clicking the right mouse button inside the plot area. The input and output of each power calculation in a G*Power session is automatically written to a protocol that can be displayed by selecting the "Protocol of power analyses" tab in the main window. It is possible to clear the protocol, or to print, save, and copy it in the same way as the distributions plot.
Because Cohen's (1988) book on power analysis appears to be well known in the social and behavioral sciences, we made use of his effect size measures whenever possible. Researchers not familiar with these measures, and users preferring to compute Cohen's measures from more basic parameters, can click on the "Determine" button to the left of the effect size input field (see Figures 1 and 2). A drawer will open next to the main window and provide access to an effect size calculator tailored to the selected test (see Figure 2). For the two-groups t test, for example, users can specify the means (µ1, µ2) and the common standard deviation (σ) in the populations underlying the groups

to calculate Cohen's d = |µ1 − µ2|/σ. Clicking the "Calculate and transfer to main window" button copies the computed effect size to the appropriate field in the main window.

Please insert Figure 3 about here.

Another useful option is the Power Plot window (see Figure 3), which is opened by clicking "X-Y plot for a range of values" in the lower right corner of the main window (see Figures 1 and 2). By selecting the appropriate parameters for the y- and the x-axis, one parameter (α, power (1 − β), effect size, or sample size) can be plotted as a function of any other parameter. Of the remaining two parameters, one can be chosen to draw a family of graphs, while the fourth parameter is kept constant. For instance, power (1 − β) can be drawn as a function of the sample size for several different population effect sizes, keeping α at a particular value. The plot may be printed, saved, or copied by clicking the right mouse button inside the plot area. Selecting the "Table" tab reveals the data underlying the plot; they may be copied to other applications. The Power Plot window inherits all input parameters of the analysis that is active when the "X-Y plot for a range of values" button is pressed. Only some of these parameters can be directly manipulated in the Power Plot window. For instance, switching from the plot of a two-tailed test to that of a one-tailed test requires choosing the "Tail(s): One" option in the main window, followed by pressing the "X-Y plot for a range of values" button.

4 Types of Statistical Tests

G*Power 3 provides power analyses for test statistics following t, F, χ², or standard normal distributions under H0 (either exact or asymptotic) and noncentral distributions of the same test families under H1. In addition, it includes power analyses for some exact tests. In Tables 2 to 9 we briefly describe the tests currently covered by G*Power 3. Table 1 lists the symbols and their meanings as used in Tables 2 to 9.
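The two-groups t test example from the program-handling walkthrough (one-tailed, d = 0.5, α = .05, 1 − β = .95, allocation ratio 1) can be checked outside G*Power. The sketch below uses the statsmodels power module rather than G*Power's own routines, so tiny numerical differences are possible; it also illustrates the sensitivity-analysis direction by solving the same relation for the effect size instead of the sample size.

```python
# A priori and sensitivity analyses for the two-groups t test with
# statsmodels; effect size is Cohen's d.
from math import ceil
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# A priori: sample size per group for d = 0.5, alpha = .05, power = .95,
# one-tailed ("larger"), allocation ratio n2/n1 = 1.
n1 = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.95,
                          ratio=1.0, alternative='larger')
n_per_group = ceil(n1)   # G*Power reports 88 observation units per group

# Sensitivity: the minimum d detectable with 88 subjects per group at the
# same alpha and power level (it should come back near 0.5).
d = analysis.solve_power(nobs1=88, alpha=0.05, power=0.95,
                         ratio=1.0, alternative='larger')
```

Rounding the fractional sample size up, as G*Power does, guarantees that the achieved power is not below the requested level.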

Tests for Correlation and Regression

Table 2 summarizes the procedures supported for testing hypotheses on correlation and regression. One-sample tests are provided for the point-biserial model, i.e., correlations between a binary variable and a continuous variable, and for correlations between two normally distributed variables (Cohen, 1988, Chapter 3). The latter test uses the exact sample correlation coefficient distribution (Barabesi & Greco, 2002) or, optionally, a large-sample approximation based on Fisher's r-to-z transformation. The two-sample test for differences between two correlations uses Cohen's effect size q (Cohen, 1988, Chapter 4) and is based on Fisher's r-to-z transformation. Cohen defines q = .10, q = .30, and q = .50 as small, medium, and large effects, respectively. The two procedures available for the multiple regression model handle the cases of (a) a test of an overall effect, i.e., the hypothesis that the population value of R² is different from zero, and (b) a test of the hypothesis that adding additional predictors increases the value of R² (Cohen, 1988, Chapter 9). According to Cohen's criteria, effects of size f² = 0.02, f² = 0.15, and f² = 0.35 are considered small, medium, and large, respectively.

Tests for Means (univariate case)

Table 3 summarizes the power analysis procedures for tests on means. G*Power 3 supports all cases of the t test for means described by Cohen (1988, Chapter 2): the test for independent means, the test of the null hypothesis that the population mean equals some specified value (one-sample case), and the test on the means of two dependent samples (matched pairs). Cohen's d and dz are used as effect size indices. Cohen defines d = 0.2, d = 0.5, and d = 0.8 as small, medium, and large effects, respectively. Effect size dialogs are available to compute the appropriate effect size parameter from means and standard deviations.
For example, assume we want to compare visual search times for targets embedded in rare versus frequent local contexts in a within-subject design (cf. Hoffmann & Sebald, 2005, Exp. 1). It is expected that the mean search time for targets in rare contexts (say, 600 ms) should decrease by at least 10 ms in frequent contexts (i.e., to 590 ms) as a consequence of local contextual cuing. If prior evidence suggests population standard deviations

of, say, σ = 25 ms in each of the conditions and a correlation of ρ = .70 between search times in both conditions, we can use the effect size drawer of G*Power 3 for the matched-pairs t test to calculate the effect size dz = 0.516 (see Table 3, row 2, for the formula). By selecting a post hoc power analysis for one-tailed matched-pairs t tests, we easily see that for dz = 0.516, α = .05, and N = 16 participants the power is only (1 − β) = .47. Thus, provided that the above assumptions are appropriate, the nonsignificant statistic t(15) = 1.475 obtained by Hoffmann and Sebald (2005, Exp. 1, p. 34) might in fact be due to a type-2 error. This interpretation would be consistent with the fact that Hoffmann and Sebald (2005) observed significant local contextual cuing effects in all other four experiments they reported. The procedures provided by G*Power 3 to test effects in between-subjects designs with more than two groups (i.e., one-way ANOVA designs) and general main effects and interactions in factorial ANOVA designs of any order are identical to those in G*Power (Erdfelder et al., 1996). In all these cases the effect size f as defined by Cohen (1988) is used. In a one-way ANOVA, the effect size drawer can be used to compute f from the means and group sizes of k groups and a standard deviation common to all groups. For tests of effects in factorial designs, the effect size drawer offers the possibility to compute the effect size f from the variance explained by the tested effect and the error variance. Cohen defines f = 0.1, f = 0.25, and f = 0.4 as small, medium, and large effects, respectively. New in G*Power 3 are procedures for analyzing main effects and interactions for A × B mixed designs, where A is a between-subjects factor (or an enumeration of the groups generated by cross-classification of several between-subjects factors) and B is a within-subjects factor (or an enumeration of the repeated measures generated by cross-classification of several within-subjects factors).
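The matched-pairs calculation above can be checked with any noncentral t implementation. The sketch below assumes the standard formula for the standard deviation of the difference scores, σ_diff = σ·√(2(1 − ρ)), so that dz = 10/(25·√0.6) ≈ 0.516, and then computes one-tailed post hoc power with scipy; small deviations from G*Power's displayed values are to be expected.

```python
# Post hoc power for a matched-pairs t test: d_z is the mean difference
# divided by the SD of the difference scores.
from math import sqrt
from scipy import stats

mu_diff, sigma, rho, n = 10.0, 25.0, 0.70, 16

sd_diff = sigma * sqrt(2 * (1 - rho))   # SD of difference scores
dz = mu_diff / sd_diff                  # about 0.516

df = n - 1
delta = dz * sqrt(n)                    # noncentrality parameter under H1
crit = stats.t.isf(0.05, df)            # one-tailed critical value
power = stats.nct.sf(crit, df, delta)   # P(T > crit | H1)
```

The same three ingredients (α, N, and the H1 effect size) drive every univariate power computation in this section; only the reference distributions change.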
Both the univariate and the multivariate approach to repeated measures (O'Brien & Kaiser, 1985) are supported. The multivariate approach will be discussed below. The univariate approach is based on the sphericity assumption. This assumption is correct if (in the population) all variances of the repeated measurements are equal and all correlations between pairs of repeated measurements are equal. If all the distributional assumptions are met, then the univariate approach is the most powerful method (Muller & Barton, 1989; O'Brien & Kaiser, 1985). Unfortunately, especially the assumption of equal correlations is often violated, which can lead to very misleading

results. In order to compensate for such adverse effects in tests of within effects or between-within interactions, the noncentrality parameter and the degrees of freedom of the F distribution can be multiplied by a correction factor ε (Geisser & Greenhouse, 1958; Huynh & Feldt, 1970). ε is 1 if the sphericity assumption is met and approaches 1/(m − 1) with increasing degrees of violation of sphericity, where m denotes the number of repeated measurements. G*Power provides three separate yet very similar routines to calculate power in the univariate approach for between effects, within effects, and interactions. If the to-be-detected effect size f is known, these procedures are very easy to apply. To illustrate, Berti, Münzer, Schröger, and Pechmann (2006) compared the pitch discrimination ability of 10 musicians and 10 control subjects (between-subjects factor A) for 10 different interference conditions (within-subjects factor B). Assuming that A, B, and A × B effects of medium size (f = .25; cf. Cohen, 1988; Table 3) should be detected given a correlation of ρ = .50 between repeated measures and a significance level of α = .05, the power values of the F tests for the A main effect, the B main effect, and the A × B interaction are easily computed as .30, .95, and .95, respectively, by inserting f = .25, α = .05, the total sample size (20), the number of groups (2), the number of repetitions (10), and ρ = .50 into the appropriate input fields of the procedures designed for these tests. If the to-be-detected effect size f is unknown, it is necessary to compute f from more basic parameters characterizing the expected population scenario under H1. To demonstrate the general procedure, we will show how to do post hoc power analyses in the scenario illustrated in Figure 4, assuming the variance and correlation structure defined in matrix SR. We first consider the power of the within effect: We select the F tests family, the "ANOVA: Repeated measures, within factors" test, and "Post hoc" as the type of power analysis.
Both the Number of groups and the Repetitions fields are set to 3. The Total sample size is set to 90 and the α error probability to 0.05. Referring to matrix SR, we insert 0.3 in the input field Corr among rep measures and -- since sphericity obviously holds in this case -- we set the nonsphericity correction ε to 1. To determine the effect size f, we first calculate σµ², the variance of the within effect. From the three column means µj of matrix M and the grand mean µ = 2.889 we get σµ² = ((0 − 2.889)² + (3 − 2.889)² + (5.667 − 2.889)²)/3 = 5.3567. Clicking the Determine button next to the effect size label opens the effect size drawer. We choose From variances and set the

Variance explained by special effect to 5.357 and the Variance within groups to 81. Clicking the Calculate and transfer to main window button calculates an effect size f = 0.2571 and transfers f to the effect size field in the main window. Clicking Calculate yields the results: The power is 0.997, the critical F value with df1 = 2 and df2 = 174 is 3.048, and the noncentrality parameter λ is 25.51. The procedure for tests of between-within interaction effects (ANOVA: Repeated measures, within-between interaction) is almost identical to that just described. The only difference lies in how the effect size f is computed: Here we first calculate the variance of the residual values µij − µi − µj + µ of matrix M, which yields σµ² = 1.9013. Using the effect size drawer in the same way as above, we get an effect size f = 0.1532, which results in a power of 0.653. To test between effects, we choose ANOVA: Repeated measures, between factors and set all parameters to the same values as before. Note that in this case we do not need to specify ε: No correction is necessary because tests of between factors do not require the sphericity assumption. To calculate the effect size, we use Effect size from means in the effect size drawer. We select 3 groups, set the SD σ within each group to 9, and insert for each group the corresponding row mean µi of M (5, 2.3333, 1.3333) and an equal group size of 30. An effect size f = 0.1757 is calculated, and the resulting power is 0.488. Note that G*Power 3 can easily handle pure repeated measures designs without any between-subjects factors (e.g., Frings & Wentura, 2005; Schwarz & Müller, 2006) by choosing the ANOVA: Repeated measures, within factors procedure and setting the number of groups to 1.

Tests for Means (multivariate case)

G*Power 3 contains several procedures for performing power analyses in multivariate designs (see Table 3). All these tests belong to the F test family.
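The univariate repeated measures computation just described can be reproduced outside G*Power with a few lines of Python. This is a minimal sketch, assuming the usual noncentrality parameter λ = f²·N·m/(1 − ρ) for the within effect and ε-corrected degrees of freedom df1 = (m − 1)ε and df2 = (N − k)(m − 1)ε, which are consistent with the values reported in the example (λ = 25.51, df2 = 174); the scipy noncentral F distribution is used:

```python
import math
from scipy.stats import f as f_dist, ncf

def rm_within_power(f_eff, N, k, m, rho, alpha=0.05, eps=1.0):
    """Post hoc power for the within effect of a k-group, m-repetition
    repeated measures ANOVA (univariate approach)."""
    df1 = (m - 1) * eps                       # numerator df, epsilon-corrected
    df2 = (N - k) * (m - 1) * eps             # denominator df
    lam = f_eff**2 * N * m / (1 - rho) * eps  # noncentrality parameter
    crit = f_dist.ppf(1 - alpha, df1, df2)    # critical F under H0
    power = ncf.sf(crit, df1, df2, lam)       # P(F' > crit) under H1
    return crit, lam, power

# Worked example from the text: f = sqrt(5.3567/81), N = 90, 3 groups,
# 3 repetitions, rho = .3, sphericity assumed (eps = 1)
f_eff = math.sqrt(5.3567 / 81)
crit, lam, power = rm_within_power(f_eff, 90, 3, 3, 0.3)
```

The same routine handles a nonsphericity correction by passing eps < 1, which shrinks both the degrees of freedom and the noncentrality parameter, as described above.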
The Hotelling T² tests are extensions of univariate t tests to the multivariate case, in which more than one dependent variable is measured: Instead of two single means, two mean vectors are compared, and instead of a single variance, a variance-covariance matrix is considered (Rencher, 1998). In the one-sample case, H0 posits that the vector of population means is identical to a

specified constant mean vector. The effect size drawer can be used to calculate the effect size from the difference µ − c and the expected variance-covariance matrix under H1. For example, assume that we have two variables, a difference vector µ − c = {1.88, 1.88} under H1, variances σ1² = 56.7 and σ2² = .8, and a covariance of .8 (Rencher, 1998, p. 106). To perform a post hoc power analysis, choose F tests, then Hotelling's T²: One group mean vector, and set the analysis type to Post hoc. Enter 2 in the Response variables field and then press the Determine button next to the effect size label. In the effect size drawer, set Input method: Means and variance-covariance matrix and press Specify/Edit input values. Under the Means tab, insert 1.88 in both input fields; under the Cov Sigma tab, insert 56.7 and .8 in the main diagonal and .8 as the off-diagonal element in the lower left cell. Pressing the Calculate and transfer to main window button initiates the calculation of the effect size (0.380) and transfers it to the main window. For this effect size, α = 0.05, and a total sample size of N = 100, the power amounts to .8. The procedure in the two-group case is exactly the same, with the following exceptions: First, in the effect size drawer, two mean vectors have to be specified. Second, the group sizes may differ.

The MANOVA tests in G*Power 3 refer to the multivariate general linear model (O'Brien & Muller, 1993; O'Brien & Shieh, 1999): Y = XB + E, where Y is N × p of rank p, X is N × r of rank r, and the r × p matrix B contains fixed coefficients. The rows of E are taken to be independent p-variate normal random vectors with mean 0 and p × p positive-definite covariance matrix Σ. The multivariate general linear hypothesis is H0: CBA = Θ0, where C is c × r with full row rank and A is p × a with full column rank (in G*Power 3, Θ0 is assumed to be zero). H0 has df1 = a · c degrees of freedom. All tests of the hypothesis H0 refer to the matrices H = N(CBA − Θ0)ᵀ[C(X̃ᵀWX̃)⁻¹Cᵀ]⁻¹(CBA − Θ0) = NH* and E = AᵀΣA(N − rX), where X̃ is a q × r essence model matrix, W is a q × q diagonal matrix containing weights wj = nj/N, and XᵀX = N(X̃ᵀWX̃) (see O'Brien & Shieh, 1999, p. 14). Let {λ1, ..., λs}, s = min(a, c), be the eigenvalues of E⁻¹H* and {λ1*, ..., λs*} the s eigenvalues of E⁻¹H/(N − rX), i.e., λi* = λi · N/(N − rX).
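The one-sample Hotelling T² power computation described above can be sketched directly: the effect size is the Mahalanobis distance Δ = sqrt((µ − c)ᵀ Σ⁻¹ (µ − c)), and T² is exactly F-distributed with df1 = p, df2 = N − p and noncentrality λ = N·Δ². The numeric values below (difference vector and covariance matrix) are hypothetical, not taken from the Rencher example:

```python
import math
import numpy as np
from scipy.stats import f as f_dist, ncf

def hotelling_one_sample_power(delta, sigma, N, alpha=0.05):
    """Post hoc power for the one-sample Hotelling T^2 test.
    delta: hypothesized difference vector mu - c; sigma: covariance matrix."""
    p = len(delta)
    d2 = float(delta @ np.linalg.solve(sigma, delta))  # squared Mahalanobis distance
    lam = N * d2                        # noncentrality parameter
    df1, df2 = p, N - p                 # exact F distribution of T^2
    crit = f_dist.ppf(1 - alpha, df1, df2)
    return math.sqrt(d2), ncf.sf(crit, df1, df2, lam)

# Hypothetical example with two response variables
delta = np.array([2.0, 1.0])
sigma = np.array([[4.0, 1.0], [1.0, 9.0]])
effect, power = hotelling_one_sample_power(delta, sigma, N=20)
```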

G*Power 3 offers power analyses for the multivariate model following either the approach outlined in Muller and Peterson (1984; Muller, LaVange, Landesman-Ramey, & Ramey, 1992) or, alternatively, the approach of O'Brien and Shieh (1999; Shieh, 2003). Both approaches approximate the exact distributions of Wilks' U (Rao, 1951), the Hotelling-Lawley T1 (Pillai & Samson, 1959), the Hotelling-Lawley T2 (McKeon, 1974), and Pillai's V (Pillai & Mijares, 1959) by F distributions and are asymptotically equivalent. Table 5 outlines details of both approximations. The type of statistic (U, T1, T2, V) and the approach (Muller-Peterson or O'Brien-Shieh) can be selected in an Options dialog that can be evoked by clicking the Options button at the bottom of the main window. The approach of Muller and Peterson (1984) has found widespread use; for instance, it has been adopted in the SPSS software package. We nevertheless recommend the approach of O'Brien and Shieh (1999) because it has a number of advantages: (a) Unlike the method of Muller and Peterson, it provides the exact noncentral F distribution whenever the hypothesis involves at most s = 1 positive eigenvalues; (b) the approximations for s > 1 eigenvalues are almost always more accurate than those of the Muller and Peterson method (which systematically underestimates power); and (c) it provides a simpler form of the noncentrality parameter, i.e., λ = λ* · N, where λ* is not a function of the total sample size. G*Power 3 provides procedures to calculate the power for global effects in a one-way MANOVA and for special effects and interactions in factorial MANOVA designs. These procedures are the direct multivariate analogues of the ANOVA routines described above. Table 5 summarizes the information that is needed, in addition to the formulae given above, to calculate the effect size f from hypothesized values for the mean matrix M (corresponding to matrix B in the model), the covariance matrix Σ, and the contrast matrix C describing the effect under scrutiny.
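One commonly used conversion from a multivariate test statistic to Cohen's f can be sketched as follows. This assumes the relation f² = (V/s)/(1 − V/s) for Pillai's V with s = min(a, c) positive eigenvalues (a standard convention; the document's Table 5, not reproduced here, gives the authoritative formulas). The numeric value used is hypothetical:

```python
import math

def f_from_pillai(V, s=1):
    """Convert Pillai's trace V into Cohen's effect size f, assuming
    f^2 = (V/s) / (1 - V/s), where s = min(a, c)."""
    return math.sqrt((V / s) / (1 - V / s))

# Hypothetical value: V = 0.1979 with a single positive eigenvalue (s = 1)
f_v = f_from_pillai(0.1979)
```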
The effect size drawer can be used to calculate f from known values of the statistic U, T1, T2, or V. Note, however, that the transformation of T2 to f depends on the sample size; thus, this test statistic seems not very well suited for a priori analyses. In line with Bredenkamp and Erdfelder (1985), we recommend V as the multivariate test statistic. Another group of procedures in G*Power 3 supports the multivariate approach to power analyses of repeated measures designs. G*Power provides separate but very similar routines for

the analysis of between effects, within effects, and interactions in simple A × B designs, where A is a between-subjects factor and B a within-subjects factor. To illustrate the general procedure, we describe in some detail a post hoc analysis of the within effect for the scenario illustrated in Fig. 4, assuming the variance and correlation structure defined in matrix SR. We first choose F tests, then MANOVA: Repeated measures, within factors. In the Type of power analysis menu we choose Post hoc. We click the Options button to open a dialog in which we deselect the Use mean correlation in effect size calculation option. We choose the Pillai V statistic and the O'Brien and Shieh algorithm. Back in the main window, we set both Number of groups and Repetitions to 3, the Total sample size to 90, and the α error probability to 0.05. To compute the effect size f(V) for the Pillai statistic, we open the effect size drawer by clicking on the Determine button next to the effect size label. In the effect size drawer we select, as procedure, Effect size from mean and variance-covariance matrix and, as input method, SD and correlation matrix. Clicking on Specify/Edit matrices opens another window in which we specify the hypothesized parameters. In the Means tab we insert our means matrix M; in the Cov Sigma tab we choose SD and Correlation and insert the values of SR. Because this matrix is always symmetric, it suffices to specify the lower diagonal values. After closing the dialog and clicking on Calculate and transfer to main window, we get a value of 0.1979 for Pillai's V and the effect size f(V) = 0.4967. Clicking on Calculate shows that the power is 0.980. The analysis of between effects and interaction effects is done in an analogous way.

Tests for Proportions

The support for tests on proportions has been greatly enhanced in G*Power 3. Table 6 summarizes the tests that are currently implemented.
In particular, all tests on proportions considered by Cohen (1988) are now available: (a) the sign test (Cohen, 1988, Chapter 5), (b) the z test for the difference between two proportions (Cohen, 1988, Chapter 6), and (c) the χ² test for goodness of fit and contingency tables (Cohen, 1988, Chapter 7). The sign test is implemented as a special case (c = 0.5) of the more general binomial test (also available in G*Power 3) that a single proportion has a specified value c. In both procedures

Cohen's effect size g is used, and exact power values based on the binomial distribution are calculated. Note, however, that due to the discrete nature of the binomial distribution, the nominal value of α usually cannot be realized. Since the tables in Chapter 5 of Cohen's book use the α value closest to the nominal α, even if it is higher than the nominal value, the tabulated power values are sometimes larger than those calculated by G*Power 3. G*Power 3 always requires the actual α not to be larger than the nominal value. Numerous procedures have been proposed to test the null hypothesis that two independent proportions are identical (D'Agostino, Chase, & Belanger, 1988; Cohen, 1988; Suissa & Shuster, 1985; Upton, 1982), and G*Power 3 implements several of them. The simplest procedure is a z test with optional arcsin transformation and optional continuity correction. Besides these two computational options, it can also be chosen whether Cohen's effect size measure h or, alternatively, two proportions are used to specify the alternative hypothesis. Using the options Use continuity correction off and Use arcsin transform on, the procedure calculates power values close to those tabulated in Cohen (1988, Chapter 6). The options Use continuity correction off and Use arcsin transform off compute the uncorrected χ² approximation (Fleiss, 1981), whereas Use continuity correction on and Use arcsin transform off computes the corrected χ² approximation (Fleiss, 1981). A second variant is Fisher's exact conditional test (Haseman, 1978). Normally, G*Power 3 calculates the exact unconditional power. However, despite the highly optimized algorithm used in G*Power 3, long computation times may result for large sample sizes (say, N > 1000). Therefore, a limiting N can be specified in the Options dialog that determines at which sample size G*Power 3 switches to a large sample approximation.
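The exact binomial power computation underlying the sign test can be sketched as follows. The one-sided (upper) convention and the rule that the actual α may never exceed the nominal α follow the description above; the example values (N = 30 pairs, a true success probability of .7) are hypothetical:

```python
from scipy.stats import binom

def binomial_power(N, p0, p1, alpha=0.05):
    """Exact power of a one-sided (upper) binomial test of H0: pi = p0,
    with the critical value chosen so that actual alpha <= nominal alpha."""
    # smallest c with P(X >= c | p0) <= alpha; binom.sf(k-1, ...) is P(X >= k)
    c = next(k for k in range(N + 2) if binom.sf(k - 1, N, p0) <= alpha)
    actual_alpha = binom.sf(c - 1, N, p0)   # realized type I error
    power = binom.sf(c - 1, N, p1)          # P(X >= c) under H1
    return c, actual_alpha, power

# Sign test is the special case p0 = 0.5
c, actual_alpha, power = binomial_power(30, 0.5, 0.7)
```

Because the binomial distribution is discrete, actual_alpha here is below the nominal .05, which is exactly why tabulated power values that round α upward can exceed those reported by an exact routine.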
A third variant calculates the exact unconditional power for approximate test statistics T (Table 7 summarizes the supported statistics). The logic underlying this procedure is to enumerate all possible outcomes for the 2 × 2 binomial table, given fixed sample sizes n1 and n2 in the two groups. This is done by choosing as success frequencies x1 and x2 in each group any combination of the values 0 ≤ x1 ≤ n1 and 0 ≤ x2 ≤ n2. Given the success probabilities π1 and π2 in each group, the probability of observing a table X with success frequencies x1 and x2 is:

P(X | π1, π2) = C(n1, x1) · π1^x1 · (1 − π1)^(n1 − x1) · C(n2, x2) · π2^x2 · (1 − π2)^(n2 − x2),

where C(n, x) denotes the binomial coefficient "n choose x".

To calculate the power (1 − β) and the actual type I error α*, the test statistic T is computed for each table and compared with the critical value T(α). If A denotes the set of all tables X rejected by this criterion, i.e., those with T > T(α), then the power and the actual α level are given by:

1 − β = Σ(X ∈ A) P(X | π1, π2)  and  α* = Σ(X ∈ A) P(X | π0, π0),

where π0 denotes the success probability in both groups as assumed in the null hypothesis. Please note that the actual α level can be larger than the nominal level! The preferred input method (proportions, difference, risk ratio, odds ratio; see Table 6) and the test statistic to use (see Table 7) can be changed in the Options dialog. Note that the test statistic actually used to analyze the data should be chosen. For large sample sizes the exact computation may take too much time. Therefore, a limiting N can be specified in the Options dialog that determines at which sample size G*Power switches to large sample approximations. G*Power 3 also provides a group of procedures to test the hypothesis that the difference/risk ratio/odds ratio of a proportion with respect to a specified reference proportion is different under H1 from the difference/risk ratio/odds ratio with respect to the same reference proportion assumed in H0. These procedures are available in the exact test family as Proportions: Inequality (offset), two independent groups (unconditional). The enumeration procedure described above for the tests on differences between proportions without offset is also used in this case. In the tests without offset, the different input parameters (difference, risk ratio, etc.) are equivalent ways of specifying two proportions, and the specific choice has no influence on the results. In the case of tests with offset, however, each input method has a different set of available test statistics. The preferred input method (proportions, difference, risk ratio, odds ratio; see Table 6) and the test statistic to use (see Table 8) can be changed in the Options dialog.
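The enumeration logic just described can be sketched compactly. The pooled two-sample z statistic used below is one plausible choice for T (Table 7 lists the statistics actually offered), and the example proportions and the choice π0 = 0.5 are hypothetical:

```python
import math
from itertools import product

def table_prob(x1, n1, pi1, x2, n2, pi2):
    """Probability of observing success counts (x1, x2) in two binomial groups."""
    return (math.comb(n1, x1) * pi1**x1 * (1 - pi1)**(n1 - x1)
            * math.comb(n2, x2) * pi2**x2 * (1 - pi2)**(n2 - x2))

def z_stat(x1, n1, x2, n2):
    """Pooled two-sample z statistic (one candidate test statistic T)."""
    p = (x1 + x2) / (n1 + n2)
    denom = p * (1 - p) * (1 / n1 + 1 / n2)
    return 0.0 if denom == 0 else (x1 / n1 - x2 / n2) / math.sqrt(denom)

def exact_unconditional(n1, n2, pi1, pi2, pi0, z_crit=1.959964):
    """Enumerate all tables; sum P over the rejection region A to obtain
    the exact power and, under (pi0, pi0), the actual alpha."""
    power = alpha = 0.0
    for x1, x2 in product(range(n1 + 1), range(n2 + 1)):
        if abs(z_stat(x1, n1, x2, n2)) > z_crit:       # table is in A
            power += table_prob(x1, n1, pi1, x2, n2, pi2)
            alpha += table_prob(x1, n1, pi0, x2, n2, pi0)
    return power, alpha

power, actual_alpha = exact_unconditional(15, 15, 0.2, 0.8, pi0=0.5)
```

Because the cost grows with (n1 + 1)(n2 + 1) tables, this brute-force sketch also illustrates why a limiting N, beyond which a large sample approximation is used, is a sensible option.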
As in the other exact procedures, the computation may be time consuming, and a limiting N can be specified in the Options dialog that determines at which sample size G*Power switches to large sample approximations. Also new in G*Power 3 is an exact procedure to calculate the power for the McNemar test. The null hypothesis of this test states that the proportions of successes are identical in two dependent samples. Figure 5 shows the structure of the underlying design: A binary response is sampled from the same subject or a matched pair in a standard and in a treatment condition. The

null hypothesis πs = πt is formally equivalent to the hypothesis for the odds ratio: OR = π12/π21 = 1. To fully specify H1, we not only need to specify the odds ratio, but also the proportion πD of discordant pairs, that is, the expected proportion of responses that differ in the standard and the treatment condition. The exact procedure used in G*Power 3 calculates the unconditional power for the exact conditional test, which calculates the power conditional on the number nD of discordant pairs. Let p(nD = i) be the probability that the number of discordant pairs is i. Then the unconditional power is the sum over all i ∈ {0, ..., N} of the conditional power for nD = i, weighted with p(nD = i). This procedure is very efficient, but for very large sample sizes the exact computation nevertheless may take too much time. Again, a limiting N can be specified in the Options dialog that determines at which sample size G*Power switches to a large sample approximation. The large sample approximation calculates the power based on an ordinary one-sample binomial test with Bin(ND, 0.5) as the distribution under H0 and Bin(ND, OR/(OR + 1)) as the H1 distribution.

Tests for Variances

Table 9 summarizes important properties of the two procedures for testing hypotheses on variances that are currently supported by G*Power 3. In the one-group case, the null hypothesis is tested that the population variance σ² has a specified value c. The variance ratio σ²/c is used as the effect size. The central and noncentral distributions, corresponding to H0 and H1, respectively, are central χ² distributions with N − 1 degrees of freedom (because H0 and H1 are based on the same mean). To compare the variance distributions under both hypotheses, the H1 distribution is scaled with the value r postulated for the ratio σ²/c in the alternative hypothesis; i.e., the noncentral distribution is r·χ²(N − 1) (Ostle & Malone, 1988). In the two-group case, H0 states that the variances in two populations are identical (σ1²/σ2² = 1).
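The one-group variance test just described reduces to comparing a central χ² distribution with a scaled copy of itself. A minimal sketch of the one-sided (upper) version, with hypothetical values for N and the variance ratio:

```python
from scipy.stats import chi2

def variance_power_one_group(N, ratio, alpha=0.05):
    """Power of a one-sided (upper) one-group test of H0: variance = c,
    where ratio = sigma^2 / c is the postulated effect size. Under H1
    the test statistic is distributed as ratio * chi2(N - 1)."""
    df = N - 1
    crit = chi2.ppf(1 - alpha, df)      # critical value under H0
    return chi2.sf(crit / ratio, df)    # P(ratio * X > crit) under H1

power = variance_power_one_group(50, 2.0)          # hypothetical: N=50, ratio=2
null_rate = variance_power_one_group(50, 1.0)      # ratio 1 recovers alpha
```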
Analogous to the one-sample case, two central F distributions are compared, the H1 distribution being scaled by the value of the variance ratio σ1²/σ2² postulated in H1.

Generic tests

Besides the specific routines described in Tables 2 to 9 that cover a considerable part of the tests commonly used, G*Power 3 provides generic power analysis routines that may be used for any test based on the t, F, χ², z, or binomial distribution. In the generic routines, the parameters of the central and noncentral distributions are specified directly. To demonstrate the uses and limitations of these generic routines, we will show how to do a two-tailed power analysis for the one-sample t test by using the generic routine. You may compare the results with those of the specific routine available in G*Power for that test. First, we select the t tests family and then Generic t test (the generic test option is always located at the end of the list of tests). Next, we select Post hoc as the type of power analysis. We choose a two-tailed test and 0.05 as the α error probability. We now need to specify the noncentrality parameter δ and the degrees of freedom for our test. We look up the definitions for the one-sample test in Table 3 and find δ = d·√N and df = N − 1. Assuming a medium effect of d = 0.5 and N = 25, we arrive at δ = 0.5·√25 = 2.5 and df = 24. Inserting these values and clicking Calculate, we obtain a power of (1 − β) = 0.6697. The critical value t = 2.0639 corresponds to the specified α. In this post hoc power analysis, the generic routine is almost as simple as the specific routine. The main disadvantage of the generic routines is, however, that the dependence of the noncentrality parameter on the sample size is implicit. As a consequence, we cannot perform a priori analyses automatically. Rather, we need to iterate N by hand until we find an appropriate power value.

5 Statistical Methods and Numerical Algorithms

The subroutines used to compute the distribution functions (and their inverses) of the noncentral t, F, and χ² distributions, as well as of the z and binomial distributions, are based on the C version of DCDFLIB (available from http://www.netlib.org/random/), which was slightly modified for our purposes.
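The generic one-sample t test example above is easy to reproduce with any library that exposes the noncentral t distribution (here scipy, standing in for DCDFLIB; d = 0.5 and N = 25 as in the example):

```python
import math
from scipy.stats import t as t_dist, nct

def generic_t_power(d, N, alpha=0.05):
    """Two-tailed post hoc power for the one-sample t test via the
    generic noncentral t route: delta = d * sqrt(N), df = N - 1."""
    df = N - 1
    delta = d * math.sqrt(N)               # noncentrality parameter
    t_crit = t_dist.ppf(1 - alpha / 2, df) # two-tailed critical value under H0
    # mass of the noncentral t beyond either critical value
    power = nct.sf(t_crit, df, delta) + nct.cdf(-t_crit, df, delta)
    return t_crit, power

t_crit, power = generic_t_power(0.5, 25)
```

The limitation noted in the text is visible here as well: because δ depends on N, an a priori analysis would have to iterate N by hand (or in a loop) until the target power is reached.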
G*Power 3 no longer provides the approximate power analyses that were available in the speed mode of G*Power 2. Two arguments guided us in supporting exact power calculations only. First, four-digit precision of power calculations may be mandatory in many applications. For example, both compromise power analyses for very large samples and error probability adjustments in the case of multiple tests of significance may result in very small values of α or β (Westermann & Hager, 1986). Second, as a

consequence of improved computer technology, exact calculations have become so fast that the speed gain associated with approximate power calculations is not even noticeable. Thus, from a computational standpoint, there is little advantage to using approximate rather than exact methods (cf. Bradley, Russell, & Reeve, 1998).

6 Program Availability and Internet Support

To summarize, G*Power 3 is a major extension of, and improvement over, G*Power 2 in that it offers easy-to-apply power analyses for a much larger variety of common statistical tests. Program handling is more flexible, easier to understand, and more intuitive compared to G*Power 2, reducing the risk of erroneous applications. The added graphical features should be useful for both research and teaching purposes. Thus, G*Power 3 is likely to become a useful tool for empirical researchers and students of applied statistics. Like its predecessor, G*Power 3 is a noncommercial program that can be downloaded free of charge. Copies of the Mac and Windows versions are available at http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/ only. Users interested in distributing the program in another way must ask for permission from the authors. Commercial distribution is strictly forbidden. The G*Power 3 webpage will offer an expanding web-based tutorial describing how to use the program, along with examples. Users who let us know their e-mail addresses will be informed of updates. Although considerable effort has been put into program development and evaluation, there is no warranty whatsoever. Users are kindly asked to report possible bugs and difficulties in program handling to gpower-feedback@uni-duesseldorf.de.

References

Akkad, D. A., Jagiello, P., Szyld, P., Goedde, R., Wieczorek, S., Gross, W. L., & Epplen, J. T. (2006). Promoter polymorphism rs3087456 in the MHC class II transactivator gene is not associated with susceptibility for selected autoimmune diseases in German patient groups. International Journal of Immunogenetics, 33, 5-6.

Back, M. D., Schmukle, S. C., & Egloff, B. (2005). Measuring task-switching ability in the Implicit Association Test. Experimental Psychology, 52, 167-179.

Baeza, J. A., & Stotz, W. (2003). Host-use and selection of differently colored sea anemones by the symbiotic crab Allopetrolisthes spinifrons. Journal of Experimental Marine Biology and Ecology, 284, 25-31.

Barabesi, L., & Greco, L. (2002). A note on the exact computation of the Student t, Snedecor F, and sample correlation coefficient distribution functions. Journal of the Royal Statistical Society: Series D (The Statistician), 51, 105-110.

Berti, S., Münzer, S., Schröger, E., & Pechmann, T. (2006). Different interference effects in musicians and a control group. Experimental Psychology, 53, 111-116.

Bradley, D. R., Russell, R. L., & Reeve, C. P. (1998). The accuracy of four approximations to noncentral F. Behavior Research Methods, Instruments, & Computers, 30, 478-500.

Bredenkamp, J. (1969). Über die Anwendung von Signifikanztests bei theorie-testenden Experimenten [The application of significance tests in theory-testing experiments]. Psychologische Beiträge, 12, 275-285.

Bredenkamp, J., & Erdfelder, E. (1985). Multivariate Varianzanalyse nach dem V-Kriterium [Multivariate analysis of variance based on the V criterion]. Psychologische Beiträge, 27, 127-154.

Buchner, A., Erdfelder, E., & Faul, F. (1996). Teststärkeanalysen [Power analyses]. In E. Erdfelder, R. Mausfeld, T. Meiser, & G. Rudinger (Eds.), Handbuch Quantitative Methoden [Handbook of quantitative methods] (pp. 123-136). Weinheim, Germany: Psychologie Verlags Union.

Buchner, A., Erdfelder, E., & Faul, F. (1997). How to use G*Power [WWW document]. URL http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/how_to_use_gpower.html

Busbey, A. B. I. (1999). Macintosh shareware/freeware earthscience software. Computers & Geosciences, 25, 335-340.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

D'Agostino, R. B., Chase, W., & Belanger, A. (1988). The appropriateness of some common procedures for testing the equality of two independent binomial populations. The American Statistician, 42, 198-202.

Erdfelder, E. (1984). Zur Bedeutung und Kontrolle des beta-Fehlers bei der inferenzstatistischen Prüfung log-linearer Modelle [Significance and control of the beta error in statistical tests of log-linear models]. Zeitschrift für Sozialpsychologie, 15, 18-32.

Erdfelder, E., Buchner, A., Faul, F., & Brandt, M. (2004). GPOWER: Teststärkeanalysen leicht gemacht [GPOWER: Power analyses made easy]. In E. Erdfelder & J. Funke (Eds.), Allgemeine Psychologie und deduktivistische Methodologie [Experimental psychology and deductive methodology] (pp. 148-166). Göttingen, Germany: Vandenhoeck & Ruprecht.

Erdfelder, E., Faul, F., & Buchner, A. (1996). GPOWER: A general power analysis program. Behavior Research Methods, Instruments, & Computers, 28, 1-11.

Erdfelder, E., Faul, F., & Buchner, A. (2005). Power analysis for categorical methods. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (pp. 1565-1570). Chichester, UK: Wiley.

Farrington, C. P., & Manning, G. (1990). Test statistics and sample size formulae for comparative binomial trials with null hypothesis of non-zero risk difference or non-unity relative risk. Statistics in Medicine, 9, 1447-1454.

Field, A. P. (2005). Discovering statistics with SPSS (2nd ed.). London: Sage.

Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York: Wiley.

Frings, C., & Wentura, D. (2005). Negative priming with masked distractor-only prime trials: Awareness moderates negative priming. Experimental Psychology, 52, 131-139.

Gart, J. J., & Nam, J. (1988). Approximate interval estimation of the ratio of binomial parameters: A review and correction for skewness. Biometrics, 44, 323-338.

Gart, J. J., & Nam, J. (1990). Approximate interval estimation of the difference in binomial parameters: Correction for skewness and extension to multiple tables. Biometrics, 46, 637-643.

Geisser, S., & Greenhouse, S. W. (1958). An extension of Box's results on the use of the F distribution in multivariate analysis. The Annals of Mathematical Statistics, 29, 885-891.

Gerard, P. D., Smith, D. R., & Weerakkody, G. (1998). Limits of retrospective power analysis. Journal of Wildlife Management, 62, 801-807.

Gigerenzer, G., Krauss, S., & Vitouch, O. (2004). The null ritual: What you always wanted to know about significance testing but were afraid to ask. In D. Kaplan (Ed.), The SAGE handbook of quantitative methodology for the social sciences (pp. 391-408). Thousand Oaks, CA: Sage.

Gleissner, U., Clusmann, H., Sassen, R., Elger, C. E., & Helmstaedter, C. (2006). Postsurgical outcome in pediatric patients with epilepsy: A comparison of patients with intellectual disabilities, subaverage intelligence, and average-range intelligence. Epilepsia, 47, 406-414.

Goldstein, R. (1989). Power and sample size via MS/PC-DOS computers. The American Statistician, 43, 253-260.

Hager, W. (2006). Die Fallibilität empirischer Daten und die Notwendigkeit der Kontrolle von falschen Entscheidungen [The fallibility of empirical data and the need for controlling for false decisions]. Zeitschrift für Psychologie, 214, 10-23.

Haseman, J. K. (1978). Exact sample sizes for use with the Fisher-Irwin test for 2 × 2 tables. Biometrics, 34, 106-109.

Hoenig, J. N., & Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55, 19-24.

Hoffmann, J., & Sebald, A. (2005). Local contextual cuing in visual search. Experimental Psychology, 52, 31-38.

Huynh, H., & Feldt, L. S. (1970). Conditions under which mean square ratios in repeated measurements designs have exact F distributions. Journal of the American Statistical Association, 65, 1582-1589.

Keppel, G., & Wickens, T. D. (2004). Design and analysis: A researcher's handbook (4th ed.). Upper Saddle River, NJ: Pearson Education International.

Kornbrot, D. E. (1997). Review of statistical shareware G*Power. British Journal of Mathematical and Statistical Psychology, 50, 369-370.

Kromrey, J., & Hogarty, K. Y. (2000). Problems with probabilistic hindsight: A comparison of methods for retrospective statistical power analysis. Multiple Linear Regression Viewpoints, 26, 7-14.

Lenth, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55, 187-193.

Levin, J. R. (1997). Overcoming feelings of powerlessness in "aging" researchers: A primer on statistical power in analysis of variance designs. Psychology and Aging, 12, 84-106.

McKeon, J. J. (1974). F approximations to the distribution of Hotelling's T0². Biometrika, 61, 381-383.

Mellina, E., Hinch, S. G., Donaldson, E. M., & Pearson, G. (2005). Stream habitat and rainbow trout (Oncorhynchus mykiss) physiological stress responses to streamside clear-cut logging in British Columbia. Canadian Journal of Forest Research, 35, 541-556.

Miettinen, O., & Nurminen, M. (1985). Comparative analysis of two rates. Statistics in Medicine, 4, 213-226.

Müller, J., Manz, R., & Hoyer, J. (2001). Was tun, wenn die Teststärke zu gering ist? Eine praktikable Strategie für Prä-Post-Designs [What to do if the power is low? A useful strategy for pre-post designs]. Psychotherapie, Psychosomatik, Medizinische Psychologie, 51, 408-416.

Muller, K. E., & Barton, C. N. (1989). Approximate power for repeated-measures ANOVA lacking sphericity. Journal of the American Statistical Association, 84, 549-555.

Muller, K. E., LaVange, L. M., Landesman-Ramey, S., & Ramey, C. T. (1992). Power calculations for general linear multivariate models including repeated measures applications. Journal of the American Statistical Association, 87, 1209-1226.

Muller, K. E., & Peterson, B. L. (1984). Practical methods for computing power in testing the multivariate general linear hypothesis. Computational Statistics and Data Analysis, 2, 143-158.

Myers, J. L., & Well, A. D. (2003). Research design and statistical analysis (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

O'Brien, R. G., & Kaiser, M. K. (1985). MANOVA method for analyzing repeated measures designs: An extensive primer. Psychological Bulletin, 97, 316-333.

O'Brien, R. G., & Muller, K. E. (1993). Unified power analysis for t-tests through multivariate hypotheses. In L. K. Edwards (Ed.), Applied analysis of variance in behavioral science (pp. 297-344). New York: Marcel Dekker.

O'Brien, R. G., & Shieh, G. (1999). Pragmatic, unifying algorithm gives power probabilities for common F tests of the multivariate general linear hypothesis. UnifyPow website: www.bio.ri.ccf.org/UnifyPow.

Ortseifen, C., Bruckner, T., Burke, M., & Kieser, M. (1997). An overview of software tools for sample size determination. Informatik, Biometrie und Epidemiologie in Medizin und Biologie, 28, 91-118.

Ostle, B., & Malone, L. C. (1988). Statistics in research: Basic concepts and techniques for research workers (4th ed.). Ames, IA: Iowa State University Press.

Pillai, K. C. S., & Mijares, T. A. (1959). On the moments of the trace of a matrix and approximations to its distribution. Annals of Mathematical Statistics, 30, 1135-1140.

Pillai, K. C. S., & Samson, P., Jr. (1959). On Hotelling's generalization of T². Biometrika, 46, 160-168.

Quednow, B. B., Kuhn, K. U., Stelzenmueller, R., Hoenig, K., Maier, W., & Wagner, M. (2004). Effects of serotonergic and noradrenergic antidepressants on auditory startle response in patients with major depression. Psychopharmacology, 175, 399-406.

Rao, C. R. (1951). An asymptotic expansion of the distribution of Wilks' criterion. Bulletin of the International Statistical Institute, 33, 177-180.

Rasch, B., Friese, M., Hofmann, W. J., & Naumann, E. (2006a). Quantitative Methoden 1. Einführung in die Statistik (2. Auflage) [Quantitative methods 1. Introduction to statistics (2nd ed.)]. Heidelberg: Springer.

Rasch, B., Friese, M., Hofmann, W. J., & Naumann, E. (2006b). Quantitative Methoden 2. Einführung in die Statistik (2. Auflage) [Quantitative methods 2. Introduction to statistics (2nd ed.)]. Heidelberg: Springer.

Rencher, A. C. (1998). Multivariate statistical inference and applications. New York: John Wiley & Sons.

Richardson, J. T. E. (1996). Measures of effect size. Behavior Research Methods, Instruments, & Computers, 28, 12-22.

Scheffé, H. (1959). The analysis of variance. New York: John Wiley & Sons.

Schwarz, W., & Müller, D. (2006). Spatial associations in number-related tasks: A comparison of manual and pedal responses. Experimental Psychology, 53, 4-15.

Sheppard, C. (1999). How large should my sample be? Some quick guides to sample size and the power of tests. Marine Pollution Bulletin, 38, 439-447.

Shieh, G. (2003). A comparative study of power and sample size calculations for multivariate general linear models. Multivariate Behavioral Research, 38, 285-307.

Smith, R. E., & Bayen, U. J. (2005). The effects of working memory resource availability on prospective memory: A formal modeling approach. Experimental Psychology, 52, 243-256.

Steidl, R. J., Hayes, J. P., & Schauber, E. (1997). Statistical power analysis in wildlife research. Journal of Wildlife Management, 61, 270-279.

Suissa, S., & Shuster, J. J. (1985). Exact unconditional sample sizes for the 2 × 2 binomial trial. Journal of the Royal Statistical Society, Series A, 148, 317-327.

Thomas, L., & Krebs, C. J. (1997). A review of statistical power analysis software. Bulletin of the Ecological Society of America, 78, 126-139.

Upton, G. J. G. (1982). A comparison of alternative tests for the 2 × 2 comparative trial. Journal of the Royal Statistical Society, Series A, 145, 86-105.

Westermann, R., & Hager, W. (1986). Error probabilities in educational and psychological research. Journal of Educational Statistics, 11, 117-146.

Zumbo, B. D., & Hubley, A. M. (1998). A note on misconceptions concerning prospective and retrospective power. The Statistician, 47, 385-388.

Footnotes

1 The observed power is reported in many frequently used computer programs (e.g., the MANOVA procedure of SPSS).

2 We recommend checking the dfs reported by G*Power, for example, by comparing them to the dfs reported by the program used to analyze the sample data. If the dfs do not match, the input provided to G*Power is incorrect and the power calculations do not apply.

3 Plots of the central and noncentral distribution are shown only for tests based on the t, F, z, χ², or binomial distribution. No plots are shown for tests that involve an enumeration procedure (e.g., the McNemar test).

4 We would like to thank Dave Kenny for making us aware of the fact that the t test (correlation) power analyses of G*Power 2 are correct only in the point-biserial case (i.e., correlations between a binary variable and a continuous variable, the latter being normally distributed for each value of the binary variable). For correlations between two continuous variables following a bivariate normal distribution, the t test (correlation) procedure overestimates power. For this reason, G*Power 3 offers separate power analyses for point-biserial correlations (in the t family of distributions) and for correlations between two normally distributed variables (in the exact distribution family). However, power values will usually differ only slightly between procedures. To illustrate, assume we are interested in the power of a two-tailed test of H0: ρ = .00 for continuously distributed measures derived from two Implicit Association Tests (IATs) differing in content. Assume further that, due to method-specific variance in both versions of the IAT, the true Pearson correlation is actually ρ = .30 (effect size). Given α = .05 and N = 57 (see Back, Schmukle, & Egloff, 2005, Study 3, p. 73), an exact post hoc power analysis for "Correlations: Difference from constant (one sample case)" reveals the correct power value of (1 − β) = .63. Choosing the incorrect "Correlation: point biserial model" procedure from the t test family would result in (1 − β) = .65.

Author Note

Franz Faul, Institut für Psychologie, Christian-Albrechts-Universität, Kiel, Germany; Edgar Erdfelder, Lehrstuhl für Psychologie III, Universität Mannheim, Mannheim, Germany; Albert-Georg Lang and Axel Buchner, Institut für Experimentelle Psychologie, Heinrich-Heine-Universität, Düsseldorf, Germany. Manuscript preparation was supported by grants from the Deutsche Forschungsgemeinschaft (SFB 504, Project A) and the state of Baden-Württemberg, Germany (Landesforschungsprogramm "Evidenzbasierte Stressprävention"). Correspondence concerning this article should be addressed to Franz Faul, Institut für Psychologie, Christian-Albrechts-Universität, Olshausenstr. 40, D-24098 Kiel, Germany (email: ffaul@psychologie.uni-kiel.de), or Edgar Erdfelder, Lehrstuhl für Psychologie III, Universität Mannheim, Schloss, D-68131 Mannheim, Germany (email: erdfelder@psychologie.uni-mannheim.de).

Figure Captions

Figure 1: The distribution-based approach of test specification in G*Power 3.0.

Figure 2: The design-based approach of test specification in G*Power 3.0 and the effect size drawer.

Figure 3: The Power Plot window of G*Power 3.0.

Figure 4: Sample 3 × 3 repeated measures designs: Three groups are repeatedly measured at three different times. The shaded portion of the table is the postulated matrix M of population means µij. The last column of the table contains the sample size in each group. The symmetric matrices SRi specify two different covariance structures between measurements taken at different times: The main diagonal contains the standard deviations of the measurements at each time, the off-diagonal elements contain the correlations between pairs of measurements taken at different times.

Figure 5: Matched binary response design (McNemar test).

Figure 1

Figure 2

Figure 3

Figure 4

[Figure 4 shows the 3 × 3 repeated measures design described in its caption: a table of postulated population means µij for three groups measured at Times 1-3, with a sample size of n = 30 per group, followed by the two symmetric matrices SR1 and SR2 specifying the covariance structures (standard deviations on the main diagonal, correlations between pairs of measurements off-diagonal).]

Figure 5

[Figure 5 shows the matched binary response design (McNemar test): a 2 × 2 table of proportions of Yes/No responses under the Standard and Treatment conditions, the proportion of discordant pairs πD, and the hypothesis πs = πt or, equivalently, a discordant-pair ratio π12/π21 = 1.]

Table 1: Symbols and their meanings as used in the tables in this article.

Symbol          Meaning
µ, µi           Population mean (in group i)
µ (vector), µi  Vector of population means (in group i)
µy              Population mean of the difference y
N               Total sample size
ni              Sample size in group i
σ               Standard deviation in the population
σm              Standard deviation of the effect
σy              Standard deviation of the difference y
λ               Noncentrality parameter of the noncentral F and χ² distributions
δ               Noncentrality parameter of the noncentral t distribution
df              Degrees of freedom
df1, df2        Numerator and denominator degrees of freedom
ρ, ρi           Population correlation (in group i)
R²Y·A, R²Y·A,B  Squared multiple correlation coefficients, corresponding to the proportion of Y variance that can be accounted for by multiple regression on the set of predictor variables A and A ∪ B, respectively
Σ               Population variance-covariance matrix
M               Matrix of regression parameters (population means)
C               Contrast matrix (contrasts between rows of M)
A               Contrast matrix (contrasts between columns of M)
π, πi           Proportion (probability of success) (in group i)

Table 2: Tests for correlation and regression.

Correlation and regression

Difference from zero: point biserial model
  Test family: t tests. Null hypothesis: ρ = 0. Effect size: ρ.
  Noncentrality parameter and degrees of freedom: δ = sqrt(N) · |ρ|/sqrt(1 - ρ²), df = N - 2.

Difference from constant (bivariate normal)
  Test family: exact. Null hypothesis: ρ = c. Effect size: ρ. Other parameters: constant correlation c.
  Power is computed from the exact sampling distribution of r.

Inequality of two correlation coefficients
  Test family: z tests. Null hypothesis: ρ1 = ρ2. Effect size: q = z1 - z2, with zi = ½ ln[(1 + ρi)/(1 - ρi)].
  Test statistic: z = q/s, with s = sqrt(1/(n1 - 3) + 1/(n2 - 3)).

Multiple regression: deviation of R² from zero
  Test family: F tests. Null hypothesis: R²Y·A = 0. Effect size: f² = R²Y·A/(1 - R²Y·A). Other parameters: number of predictors p (#A).
  λ = f²N, df1 = p, df2 = N - p - 1.

Multiple regression: increase of R²
  Test family: F tests. Null hypothesis: R²Y·A,B = R²Y·A. Effect size: f² = (R²Y·A,B - R²Y·A)/(1 - R²Y·A,B). Other parameters: total number of predictors p (#(A ∪ B)), number of tested predictors q (#B).
  λ = f²N, df1 = q, df2 = N - p - 1.
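The z test for the inequality of two correlations in Table 2 can be checked with a short computation. The sketch below applies the tabled formulas directly (Fisher transforms q = z1 - z2 and standard error s); the example correlations and sample sizes are hypothetical:

```python
import math
from statistics import NormalDist

def power_two_correlations(rho1: float, rho2: float,
                           n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power of the z test for H0: rho1 = rho2:
    effect size q = z1 - z2 on Fisher's z scale, with standard error
    s = sqrt(1/(n1 - 3) + 1/(n2 - 3))."""
    nd = NormalDist()
    q = math.atanh(rho1) - math.atanh(rho2)
    s = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(abs(q) / s - z_crit) + nd.cdf(-abs(q) / s - z_crit)
```

For example, with ρ1 = .50, ρ2 = .20, and 100 observations per group the approximate power is about .68.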

Table 3: Tests for univariate means.

Means (univariate)

Difference from constant (one sample case)
  Test family: t tests. Null hypothesis: µ = c. Effect size: d = |µ - c|/σ.
  δ = d · sqrt(N), df = N - 1.

Inequality of two dependent means (matched pairs)
  Test family: t tests. Null hypothesis: µy = 0, with y = x1 - x2. Effect size: dz = |µy|/σy, σy = sqrt(σ1² + σ2² - 2ρσ1σ2).
  δ = dz · sqrt(N), df = N - 1.

Inequality of two independent means
  Test family: t tests. Null hypothesis: µ1 = µ2. Effect size: d = |µ1 - µ2|/σ.
  δ = d · sqrt(n1n2/(n1 + n2)), df = N - 2.

ANOVA, fixed effects, one-way: inequality of multiple means
  Test family: F tests. Null hypothesis: µi = µ for all i = 1, …, k. Effect size: f = σm/σ, with σm = sqrt[Σi ni(µi - µ)²/N]. Other parameters: number of groups k.
  λ = f²N, df1 = k - 1, df2 = N - k.

ANOVA, fixed effects, multifactor designs and planned comparisons
  Test family: F tests. Null hypothesis: the tested effect is zero. Effect size: f = σm/σ. Other parameters: total number of cells in the design k, degrees of freedom of the tested effect q.
  λ = f²N, df1 = q, df2 = N - k.

ANOVA, repeated measurements: between effects
  Test family: F tests. Null hypothesis: µi = µ for all i = 1, …, k. Effect size: f = σm/σ. Other parameters: levels of the between factor k, levels of the repeated measures factor m, population correlation among repeated measurements ρ.
  λ = f²Nu, with u = m/(1 + (m - 1)ρ); df1 = k - 1, df2 = N - k.

ANOVA, repeated measurements: within effects
  Test family: F tests. Null hypothesis: µj = µ for all j = 1, …, m. Effect size: f = σm/σ.
  λ = f²Nu, with u = mε/(1 - ρ); df1 = (m - 1)ε, df2 = (N - k)(m - 1)ε.

ANOVA, repeated measurements: between-within interactions
  Test family: F tests. Null hypothesis: the interaction of the between and within factors is zero. Effect size: f = σm/σ.
  λ = f²Nu, with u = mε/(1 - ρ); df1 = (k - 1)(m - 1)ε, df2 = (N - k)(m - 1)ε.
  For within and within-between interactions: nonsphericity correction 1/(m - 1) ≤ ε ≤ 1.
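The one-way ANOVA row of Table 3 is easy to verify numerically. The following sketch computes f = σm/σ and λ = f²N for a hypothetical three-group design (the means, group sizes, and σ are invented for illustration):

```python
import math

def anova_effect_size_f(means, ns, sigma):
    """Cohen's f for a one-way fixed-effects ANOVA:
    f = sigma_m / sigma, sigma_m = sqrt(sum_i n_i (mu_i - mu)^2 / N)."""
    N = sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / N  # grand mean
    sigma_m = math.sqrt(sum(n * (m - grand) ** 2
                            for n, m in zip(ns, means)) / N)
    return sigma_m / sigma

# hypothetical design: k = 3 groups of 10, means 10/15/20, sigma = 5
f = anova_effect_size_f([10.0, 15.0, 20.0], [10, 10, 10], 5.0)
lam = f ** 2 * 30          # noncentrality parameter lambda = f^2 N
df1, df2 = 3 - 1, 30 - 3   # k - 1 and N - k
```

Here σm = sqrt(50/3) ≈ 4.082, so f ≈ .816 and λ = 20, which would then be referred to the noncentral F(df1, df2, λ) distribution.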

Table 4: Tests for multivariate means.

Means (multivariate)

Hotelling T²: difference from constant mean vector
  Test family: F tests. Null hypothesis: µ = c (vectors). Effect size: Δ = sqrt[(µ - c)' Σ⁻¹ (µ - c)]. Other parameters: number of response variables k.
  λ = Δ²N, df1 = k, df2 = N - k.

Hotelling T²: difference between two mean vectors
  Test family: F tests. Null hypothesis: µ1 = µ2 (vectors). Effect size: Δ = sqrt[(µ1 - µ2)' Σ⁻¹ (µ1 - µ2)]. Other parameters: number of response variables k.
  λ = Δ² · n1n2/(n1 + n2), df1 = k, df2 = N - k - 1.

MANOVA: global effects; MANOVA: special effects
  Test family: F tests. Null hypothesis: CM = 0, with means matrix M and contrast matrix C (contrasts between groups). Effect size: fmult, which depends on the test statistic (Wilks U, Hotelling-Lawley T1, Hotelling-Lawley T2, Pillai V). Other parameters: number of groups, number of response variables k, number of predictors p.
  Noncentrality parameter and degrees of freedom depend on the test statistic and algorithm used (see the effect size column and Table 5).

MANOVA: repeated measurements (between effects, within effects, between-within interactions)
  Test family: F tests. Null hypothesis: CMA = 0, with means matrix M, between contrast matrix C, and within contrast matrix A. Algorithms: Muller and Peterson (1984); O'Brien and Shieh (1999). Other parameters: number of response variables k, levels of the between factor k, levels of the repeated measures factor m.
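The Hotelling T² rows of Table 4 reduce to a Mahalanobis-type quadratic form. The sketch below evaluates Δ² = d' Σ⁻¹ d for a two-variable case by inverting the 2 × 2 covariance matrix directly; the mean difference, covariance matrix, and group sizes are hypothetical:

```python
def mahalanobis_sq_2d(d, sigma):
    """Delta^2 = d' Sigma^{-1} d for a 2-dimensional mean difference d
    and a 2x2 covariance matrix sigma, using the closed-form inverse."""
    (a, b), (c, e) = sigma
    det = a * e - b * c
    inv = ((e / det, -b / det), (-c / det, a / det))
    return (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
            + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))

# hypothetical example: mean difference (0.5, 0.3), unit variances, r = .5
delta_sq = mahalanobis_sq_2d((0.5, 0.3), ((1.0, 0.5), (0.5, 1.0)))
n1 = n2 = 30
lam = delta_sq * n1 * n2 / (n1 + n2)  # two-sample Hotelling T^2 lambda
```

With these inputs Δ² ≈ .253 and λ = Δ² · n1n2/(n1 + n2) = 3.8, to be referred to the noncentral F(k, N - k - 1, λ) distribution.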

Table 5: Approximating univariate statistics for multivariate hypotheses.

Let φ1, …, φs denote the eigenvalues of HE⁻¹, where H and E are the hypothesis and error matrices and s = min(k, q). For each statistic, both the Muller-Peterson (MP) and the O'Brien-Shieh (OS) F approximations are available; they differ in the definition of the noncentrality parameter λ and of the degrees of freedom.

Wilks U = Π(i=1..s) 1/(1 + φi); df1 = kq
  Effect size: f²(U) = U^(-1/g) - 1, with g = sqrt[(k²q² - 4)/(k² + q² - 5)] (g = 1 if k² + q² - 5 ≤ 0)
  MP: λ = f²(U) · df2; OS: λ = N · g · f²(U)

Pillai V = Σ(i=1..s) φi/(1 + φi)
  Effect size: f²(V) = (V/s)/(1 - V/s)
  MP: λ = f²(V) · df2; OS: λ = N · s · f²(V)

Hotelling-Lawley trace T = Σ(i=1..s) φi (two F approximations, T1 and T2)
  Effect size: f²(T) = T/s
  MP: λ = f²(T) · df2; OS: λ = N · s · f²(T)

The numerator and denominator degrees of freedom df1 and df2 of the approximating F distributions are defined in Muller and Peterson (1984) and O'Brien and Shieh (1999).

Table 6: Tests for proportions.

Proportions

Contingency tables and goodness of fit
  Test family: χ² tests. Null hypothesis: πi = π0i for all i = 1, …, k. Effect size: w = sqrt[Σ(i=1..k) (πi - π0i)²/π0i].
  λ = w²N, df = k - 1.

Difference from constant (one sample case)
  Test family: exact (binomial). Null hypothesis: π = c. Effect size: π. Other parameters: constant proportion c.

Inequality of two dependent proportions (McNemar)
  Test family: exact. Null hypothesis: π12 = π21. Effect size: odds ratio π12/π21. Other parameters: proportion of discordant pairs πD.

Inequality of two independent proportions
  Test family: z tests. Null hypothesis: π1 = π2. Effect size: h = φ1 - φ2, with φi = 2 arcsin(sqrt(πi)). Other parameters: alternate proportion π2, null proportion π1.

Inequality of two independent proportions (Fisher's exact test)
  Test family: exact. Null hypothesis: π1 = π2. Other parameters: alternate proportion π2, null proportion π1.

Inequality of two independent proportions (unconditional exact)
  Test family: exact. Null hypothesis: π1 = π2. The effect size may be specified as (A) the alternate proportion π2, (B) the difference π2 - π1, (C) the risk ratio π2/π1, or (D) the odds ratio [π2/(1 - π2)]/[π1/(1 - π1)]. Other parameters: null proportion π1.

Inequality with offset of two independent proportions (unconditional exact)
  Test family: exact. Null hypothesis: π2 = π1 + c (or the corresponding risk ratio or odds ratio offset). Both the effect size and the H0 offset may be specified as (A) a proportion, (B) a difference, (C) a risk ratio, or (D) an odds ratio. Other parameters: null proportion π1.
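The arcsine effect size h for two independent proportions in Table 6 leads to a simple normal-approximation power calculation. The sketch below implements it with the standard library; the example proportions and sample sizes are hypothetical:

```python
import math
from statistics import NormalDist

def power_two_proportions(p1: float, p2: float,
                          n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power of the z test for two independent
    proportions via Cohen's h = 2 asin(sqrt(p1)) - 2 asin(sqrt(p2));
    the arcsine-transformed proportion has variance approx. 1/(4n)."""
    nd = NormalDist()
    h = 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))
    z = abs(h) * math.sqrt(n1 * n2 / (n1 + n2))
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)
```

For example, π1 = .50 versus π2 = .40 gives h ≈ .20, and with 392 observations per group the approximate power is about .80, in line with the usual benchmark for a small effect.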

Table 7: Test statistics used in tests of the difference between two independent proportions. The z tests in the table are more commonly known as χ² tests (the equivalent z test is used to provide two-sided tests).

Nr  Name  Statistic
1  z test, pooled variance: z = (p̂1 - p̂2)/sqrt[p̂(1 - p̂)(1/n1 + 1/n2)], with p̂ = (x1 + x2)/(n1 + n2)
2  z test, pooled variance, with continuity correction: z = [p̂1 - p̂2 + k(1/n1 + 1/n2)/2]/(denominator as in 1), with k = -1 for the upper tail and k = +1 for the lower tail
3  z test, unpooled variance: z = (p̂1 - p̂2)/sqrt[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2]
4  z test, unpooled variance, with continuity correction: numerator as in (2), denominator as in (3)
5  Mantel-Haenszel test: z = [x1 - E(x1)]/sqrt(V), with E(x1) = n1(x1 + x2)/N and V = n1n2(x1 + x2)(N - x1 - x2)/[N²(N - 1)]
6  Likelihood ratio (Upton, 1982): χ²lr = 2 Σ O ln(O/E), summed over the four cells of the 2 × 2 table, where O and E denote the observed frequencies and the frequencies expected under H0, respectively
7  t test with df = N - 2 (D'Agostino et al., 1988): the two-sample t statistic computed on the 0-1 coded individual responses, referred to a t distribution with df = N - 2

Note. xi = success frequency in group i; ni = sample size in group i; N = n1 + n2 = total sample size; p̂i = xi/ni.
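Statistic 1 of Table 7 can be computed directly from the cell counts of a 2 × 2 table. The sketch below does so for hypothetical counts; squaring z recovers the familiar Pearson χ² statistic:

```python
import math

def z_pooled(x1: int, n1: int, x2: int, n2: int) -> float:
    """Statistic 1 of Table 7: z test with pooled variance for
    H0: pi1 = pi2, computed from success counts x1 and x2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion estimate
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# hypothetical data: 10/20 versus 5/20 successes
z = z_pooled(10, 20, 5, 20)
chi_sq = z * z  # equals the Pearson chi-square of the 2x2 table
```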

Table 8: Test statistics used in tests of the difference with offset between two independent proportions.

Nr  Name  Statistic
1  z test, pooled variance: as in Table 7, statistic 1, with the numerator p̂1 - p̂2 replaced by p̂1 - p̂2 - δ0
2  z test, pooled variance, with continuity correction: as in Table 7, statistic 2, with the same offset
3  z test, unpooled variance: as in Table 7, statistic 3, with the same offset
4  z test, unpooled variance, with continuity correction: as in Table 7, statistic 4, with the same offset
5  t test with df = N - 2 (D'Agostino et al., 1988): as in Table 7, statistic 7, with the same offset
6  Likelihood score test, difference (Miettinen & Nurminen, 1985; Farrington & Manning, 1990; Gart & Nam, 1990): z = (p̂1 - p̂2 - δ0)/sqrt{[p̃1(1 - p̃1)/n1 + p̃2(1 - p̃2)/n2] · K}, where p̃1 and p̃2 = p̃1 - δ0 are the maximum likelihood estimates of π1 and π2 restricted by H0: π1 - π2 = δ0, obtained as the closed-form trigonometric solution of a cubic equation (Farrington & Manning, 1990); K = N/(N - 1) for Miettinen & Nurminen, K = 1 for Farrington & Manning. A skewness-corrected statistic z' (Gart & Nam) is also available.
7  Likelihood score test, risk ratio (Miettinen & Nurminen, 1985; Farrington & Manning, 1990; Gart & Nam, 1988): z = (p̂1 - φ0p̂2)/sqrt{[p̃1(1 - p̃1)/n1 + φ0² p̃2(1 - p̃2)/n2] · K}, with restricted maximum likelihood estimates under H0: π1/π2 = φ0 and K as in statistic 6. A skewness-corrected statistic z' (Gart & Nam) is also available.
8  Likelihood score test, odds ratio (Miettinen & Nurminen, 1985; Farrington & Manning, 1990): score statistic based on the maximum likelihood estimates restricted by H0: [π1/(1 - π1)]/[π2/(1 - π2)] = β0, with K as in statistic 6.

Note. xi = success frequency in group i; ni = sample size in group i; N = n1 + n2 = total sample size; p̂i = xi/ni; δ0 = difference, φ0 = risk ratio, and β0 = odds ratio postulated in H0.
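The key ingredient of the score tests in Table 8 is the restricted maximum likelihood estimate under H0. The sketch below implements the closed-form trigonometric solution for the difference case, following Farrington and Manning (1990), with K = 1 (the Farrington-Manning convention); it assumes valid inputs for which the cubic has three real roots, and the example counts are hypothetical:

```python
import math

def restricted_mle_p1(p1_hat: float, p2_hat: float,
                      n1: int, n2: int, d0: float) -> float:
    """Restricted MLE of pi1 under H0: pi1 - pi2 = d0, via the
    trigonometric solution of the likelihood cubic (Farrington &
    Manning, 1990)."""
    theta = n2 / n1
    a = 1 + theta
    b = -(1 + theta + p1_hat + theta * p2_hat + d0 * (theta + 2))
    c = d0 * d0 + d0 * (2 * p1_hat + theta + 1) + p1_hat + theta * p2_hat
    d = -p1_hat * d0 * (1 + d0)
    v = b ** 3 / (27 * a ** 3) - b * c / (6 * a * a) + d / (2 * a)
    s = math.sqrt(b * b / (9 * a * a) - c / (3 * a))
    u = s if v >= 0 else -s
    w = (math.pi + math.acos(max(-1.0, min(1.0, v / u ** 3)))) / 3
    return 2 * u * math.cos(w) - b / (3 * a)

def fm_score_z(x1: int, n1: int, x2: int, n2: int, d0: float) -> float:
    """Score statistic z = (p1 - p2 - d0)/SE with the restricted MLEs
    in the standard error (K = 1)."""
    p1, p2 = x1 / n1, x2 / n2
    q1 = restricted_mle_p1(p1, p2, n1, n2, d0)
    q2 = q1 - d0
    se = math.sqrt(q1 * (1 - q1) / n1 + q2 * (1 - q2) / n2)
    return (p1 - p2 - d0) / se
```

A useful self-check: with δ0 = 0 the restricted estimate collapses to the pooled proportion, and the score statistic coincides with the pooled z test of Table 7.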

Table 9: Tests for variances.

Variances

Difference from constant (one sample case)
  Test family: χ² tests. Null hypothesis: σ² = c. Effect size: variance ratio r = σ²/c.
  λ = 0 (under H0, a central χ² distribution, scaled with r); df = N - 1.

Inequality of two variances
  Test family: F tests. Null hypothesis: σ1² = σ2². Effect size: variance ratio r = σ1²/σ2².
  λ = 0 (under H0, a central F distribution, scaled with r); df1 = n1 - 1, df2 = n2 - 1.