Sample Size Calculation and Power Analysis for Design of Experiments Using Proc GLMPOWER Chii-Dean Joey Lin, SDSU, San Diego, CA

ABSTRACT

How many samples do I need? When a statistician or a quantitative analyst gets involved in a project, this is always one of the first questions asked. SAS has developed several tools to address this need. For example, PSS (Power and Sample Size), Proc POWER, and Proc GLMPOWER are all available for various designs. Introductions to PSS and Proc POWER have been presented at many SAS-related conferences. However, Proc GLMPOWER is less well known than the other two tools. In this paper, we show how one can easily understand the syntax of Proc GLMPOWER for sample size calculation or power analysis related to design of experiments. In conducting an experimental design, Proc GLM allows one to test pre-planned contrasts. Similarly, Proc GLMPOWER can be used to calculate either the sample size or the power for any contrast one may have. The related syntax is discussed in this paper. In addition, how to take advantage of the powerful ODS features in Proc GLMPOWER is also addressed.

INTRODUCTION

In conducting an experiment, sample size determination or power analysis is always an important topic to be addressed. Calculating a sample size or determining power is getting easier thanks to innovative developments in statistical software. SAS has developed two procedures (Proc POWER and Proc GLMPOWER) in recent versions. SAS also has an application (PSS) that comes with the SAS/STAT product but is installed separately. As introduced in the SAS user manual, Proc POWER covers power and sample size analysis for a variety of more basic statistical analyses such as the t test, equivalence tests, binomial proportions, multiple regression, logistic regression, and some nonparametric tests. Proc GLMPOWER is designed to cover power analyses for designs of experiments that can be analyzed using Proc GLM.
However, tests and contrasts involving random effects are not supported by Proc GLMPOWER. In this paper, we focus our discussion on Proc GLMPOWER.

Statistical methods used for sample size and power analysis include the classical (frequentist) approach and the Bayesian approach. Proc GLMPOWER provides power analyses through the classical approach. To decide the sample size, one has to know the design and the test method used, along with information such as the desired Type I error rate (α), the effect size, the standard deviation of the response variable, and the desired power (which is equal to 1 - β, where β is the Type II error rate). The effect size is the standardized mean difference among treatments that we wish to detect in the experiment. In Proc GLMPOWER, the effect size is entered using conjectured treatment means and a specified standard deviation of the response. The conjectured treatment means represent the scientifically meaningful difference an investigator wishes to detect.

Discussions of sample size and power analysis can be found in Hoenig & Heisey (2001), Lenth (2001), and Littell et al. (2005). In addition, O'Brien and Castelloe (2004, 2007) provided excellent concepts and issues related to sample size analysis. O'Brien and Castelloe (2007) also proposed the concept of crucial Type I and Type II error rates.

In this paper, we introduce the basic syntax of Proc GLMPOWER. Since the GLMPOWER procedure is still an experimental version, the user manual is still relatively concise. Only two examples are presented in the manual. Though readers who are familiar with both Proc POWER and Proc GLM will find the manual adequate, we provide more experimental design examples for readers to follow. Proc GLMPOWER can be seen as a combination of Proc GLM (the first part of the syntax) and Proc POWER (the second part of the syntax). The first part tells Proc GLMPOWER the design for which we want to compute either the power or the sample size. Any pre-planned contrast tests can be specified as well.
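The inputs listed above (α, effect size, standard deviation, and power) are tied together by the noncentral F distribution. As a hedged sketch only, for the simplest case of a balanced one-way layout with k treatments and n units per treatment (the symbols k, n, N, ν and λ are introduced here for illustration and do not appear in the paper), the relationship can be written as:

```latex
% Sketch for a balanced one-way layout; notation is illustrative, not from the paper.
% mu_i are the conjectured treatment means, sigma is the response standard deviation.
\[
\text{power} \;=\; 1-\beta
  \;=\; \Pr\!\left( F_{\nu_1,\nu_2}(\lambda) \,\ge\, F_{1-\alpha;\,\nu_1,\nu_2} \right),
\qquad
\lambda \;=\; \frac{n\sum_{i=1}^{k}\left(\mu_i-\bar{\mu}\right)^2}{\sigma^2},
\]
\[
\nu_1 = k-1, \qquad \nu_2 = N-k, \qquad N = nk .
\]
```

Here F(λ) denotes a noncentral F variable with noncentrality λ, and the right-hand quantile is the usual central F critical value. Larger mean differences, a smaller σ, or a larger n all increase λ and therefore the power.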
For the information needed by Proc GLMPOWER, it is better to enter it through an exemplary data set. The exemplary data set provides the conjectured treatment means that will be used by Proc GLMPOWER. The available statements for Proc GLMPOWER are listed below.

    PROC GLMPOWER < options > ;
    CLASS effects ;
    MODEL dependent-variable(s) = effect(s) ;
    CONTRAST label effect values <... effect values > ;
    WEIGHT variable ;
    POWER < options > ;
    PLOT < plot-options > ;

Note that the CLASS, MODEL, CONTRAST, and WEIGHT statements (colored blue in the original listing) are the same as in Proc GLM, while the POWER and PLOT statements (colored red) are those used in Proc POWER. Also note that more than one CONTRAST statement can be specified in Proc GLMPOWER. However, the CONTRAST statement must appear after the MODEL statement. As in Proc GLM, the CLASS statement has to appear before the MODEL statement. If a continuous (or categorical) variable is stated in the MODEL statement but not in the CLASS statement, it will be treated as a covariate of the model. The effect of the covariate is identical to that of an independent variable in a regression model. A model that includes both a classification effect (an effect stated in the CLASS statement) and a covariate (a variable specified on the right side of the MODEL statement but not listed in the CLASS statement) is called an Analysis of Covariance (ANCOVA) model.

SAMPLE CODES

Assume we have a two-factor factorial design (two-way ANOVA) and there is no interaction between Factor A and Factor B. In this design, we assume Y is the response variable and there are two levels of Factor A and three levels of Factor B (a total of six treatments, given by the combinations of the two levels of A and the three levels of B). We are asked to calculate the necessary sample size to test the factor effects (the Factor A effect and the Factor B effect). In this example, both A and B are classification variables and they should be listed in the CLASS statement. Before we call Proc GLMPOWER, an easier way of entering the means is to create an exemplary data set that includes the treatment means (cell means) for each combination of Factor A levels and Factor B levels. Table 1 presents the response means for the combinations of Factor A levels and Factor B levels.

Table 1. Response means by Factor A and Factor B

                 Factor B
    Factor A     B1    B2    B3
    A1
    A2

Now we enter the response means into a SAS data set.
    /* exemplary data set (one row per A-B combination: A level, B level, cell mean) */
    data aa;
      input A $ B $ Y;
      cards;
    ;

To calculate the necessary sample size for detecting the Factor A effect and the Factor B effect, the following Proc GLMPOWER code is used. In doing so, we need to provide the standard deviation of the response (assume the standard deviation is 5 in this case) from either a pilot study or previous studies, the Type I error rate (we enter α = 0.05 for a two-sided test), and a desired power (we use 0.9). The program can be seen in Code #1 and the output is shown in Figure 1.

    /* Code #1 */
    title 'For Code #1';
    proc glmpower data = aa;
      class A B;
      model y = A B;
      power
        stddev = 5
        alpha = 0.05
        ntotal = .
        power = 0.9;
    run;
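As a self-contained illustration of the steps above, the following sketch pairs a complete exemplary data set with a Code #1 style call. The cell means and the data set name aa_demo are hypothetical, chosen only so that the program runs on its own; they are not the values of Table 1, so the resulting sample sizes should not be expected to match Figure 1.

```sas
/* Hypothetical exemplary data: the six cell means below are      */
/* illustrative only (additive in A and B), not those of Table 1. */
data aa_demo;
  input A $ B $ Y;
  datalines;
A1 B1 20
A1 B2 24
A1 B3 27
A2 B1 23
A2 B2 27
A2 B3 30
;
run;

title 'Sketch: sample size for main effects';
proc glmpower data = aa_demo;
  class A B;          /* both factors are classification variables */
  model Y = A B;      /* main effects only, no interaction         */
  power
    stddev = 5        /* conjectured SD of the response            */
    alpha  = 0.05     /* Type I error rate                         */
    ntotal = .        /* missing value: solve for total N          */
    power  = 0.9;     /* desired power                             */
run;
```

Exactly one of ntotal and power is set to a missing value (.), which tells the procedure which quantity to solve for.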

Figure 1: Summary Table for Code #1

Figure 1 shows the summary output from Proc GLMPOWER. The calculated total sample size is 54 for testing the Factor A effect and 30 for testing the Factor B effect. Since we did not specify a weight for each treatment, a balanced design (all treatments are assigned the same sample size) is assumed. That is, for the Factor A effect test, each treatment will receive 9 (= 54/6) random samples. Similarly, each treatment will receive 5 (= 30/6) random samples for testing the Factor B effect. One can see that the actual power for the Factor B effect test is much larger than our desired power of 0.9. This is caused by the assumption of a balanced design: the N total must be a multiple of 6 (the number of treatments, 2 levels of A × 3 levels of B). If we allow the study to be close to balanced but not exactly balanced, we can add the nfractional option to the POWER statement. Code #2 demonstrates this addition.

    /* Code #2 */
    title3 'For Code #2';
    proc glmpower data = aa;
      class A B;
      model y = A B;
      power
        stddev = 5
        alpha = 0.05
        nfractional
        ntotal = .
        power = 0.9;
    run;

With the nfractional option, the output (Figure 2) displays smaller total sample sizes for both the Factor A and Factor B effect tests. For testing the Factor A effect, a total of 53 experimental units are needed. These can be allocated by assigning 9 units to each of 5 randomly selected treatments and 8 units to the remaining treatment. The reduction in sample size under this design is only 1 and 3 for the Factor A and Factor B effect tests, respectively.
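The link between the two sets of totals can be checked by hand. As a small worked sketch (the notation is introduced here for illustration; the Factor B total of 27 is inferred from the stated reduction of 3 from 30), rounding each near-balanced total up to the next multiple of the 6 treatments recovers the balanced totals:

```latex
% Worked arithmetic for the 2 x 3 layout (6 treatments); notation is illustrative.
\[
N_{\text{balanced}} \;=\; 6\left\lceil \frac{N_{\text{near-balanced}}}{6} \right\rceil :
\qquad
6\left\lceil \tfrac{53}{6} \right\rceil = 6\cdot 9 = 54,
\qquad
6\left\lceil \tfrac{27}{6} \right\rceil = 6\cdot 5 = 30 .
\]
\[
53 \;=\; 5\cdot 9 \;+\; 1\cdot 8
\quad\text{(five treatments with 9 units, one treatment with 8 units).}
\]
```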

Figure 2: Summary Table for Code #2

If we assume that there is an interaction between Factor A and Factor B, we can fit a model that includes an interaction term between A and B. Code #3 reflects this scenario and the results are shown in Figure 3.

    /* Code #3 */
    title3 'For Code #3';
    proc glmpower data = aa;
      class A B;
      model y = A B A*B;
      power
        stddev = 5
        alpha = 0.05
        ntotal = .
        power = 0.9;
    run;

Figure 3: Summary Table for Code #3

If the interaction effect is of interest, a total sample size of 192 is needed (192/6 = 32 units per treatment). Note that the exemplary data set was generated from a simulation with a model in which no interaction effect exists. Thus, a much larger sample size is required to detect the small differences among treatment means attributable to the interaction. To show a power curve, we can add a PLOT statement. See Code #4 and Figure 4 for the power curves.

    /* Code #4 */
    title3 'For Code #4';
    proc glmpower data = aa;
      class A B;
      model y = A B;
      power
        stddev = 5
        alpha = 0.05
        ntotal = .
        power = 0.9;
      plot x = power min = .7 max = .95;
    run;

Figure 4: Power Curve for Code #4

A reference line can be added to the plot so that it is easier to identify the necessary sample size. This can be done by adding the ref and crossref options to the PLOT statement.

    plot x = power min = .7 max = .95 xopts = (ref = .9 crossref = yes);

Note that more than one reference line can be placed. The graph is shown in Figure 5.
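To illustrate placing more than one reference line, the sketch below lists two values in ref. It relies on a hypothetical exemplary data set (aa_demo, with made-up cell means that are not those of Table 1) so that it runs on its own; the choice of 0.8 as a second reference value is likewise only an illustration.

```sas
/* Hypothetical exemplary data: illustrative cell means only */
data aa_demo;
  input A $ B $ Y;
  datalines;
A1 B1 20
A1 B2 24
A1 B3 27
A2 B1 23
A2 B2 27
A2 B3 30
;
run;

title 'Sketch: power curve with two reference lines';
proc glmpower data = aa_demo;
  class A B;
  model Y = A B;
  power
    stddev = 5
    alpha  = 0.05
    ntotal = .       /* solve for total N, as required when x = power */
    power  = 0.9;
  /* X axis is power; reference lines are requested at 0.8 and 0.9, */
  /* and crossref = yes adds the matching lines to the other axis   */
  /* where each reference line meets a curve.                       */
  plot x = power min = 0.7 max = 0.95
       xopts = (ref = 0.8 0.9 crossref = yes);
run;
```

Reading across from each crossing point gives the total sample size associated with that power level for each tested effect.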

Figure 5: Power Curve With Reference Lines

If we want different curves to have different marker symbols, we can add the vary(symbol) option to the PLOT statement (code not shown).

So far we have calculated sample sizes for a known power. If we want to calculate the power for a specified Ntotal, we can use code like Code #5.

    /* Code #5 */
    title3 'For Code #5';
    proc glmpower data = aa;
      class A B;
      model y = A B;
      power
        stddev = 5
        alpha = 0.05
        ntotal = 20, 30 to 40 by 5
        power = .;
      plot x = n min = 18 max = 60;
    run;

A calculated power is reported for each specified sample size. Figure 6 shows the powers under the different Ntotals for testing both the Factor A effect and the Factor B effect. Note that Proc POWER allows one to state either the npergroup or the ntotal option. However, in Proc GLMPOWER, one can use only the ntotal option, not the npergroup option.
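Because only ntotal is accepted, a planned per-treatment sample size has to be converted to a total before it is entered. A minimal sketch for the balanced 2 × 3 layout used in this paper (the symbols a, b and n are introduced here for illustration):

```latex
% Converting a per-treatment size n to a total for a balanced a x b factorial.
\[
N_{\text{total}} \;=\; a \cdot b \cdot n \;=\; 2 \cdot 3 \cdot n \;=\; 6n ,
\qquad\text{e.g.}\quad n = 5 \;\Rightarrow\; N_{\text{total}} = 30,
\quad n = 9 \;\Rightarrow\; N_{\text{total}} = 54 .
\]
```

With unequal cell weights (the WEIGHT statement), this simple multiplication no longer applies.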

Figure 6: Summary Table for Code #5

Figure 7: Power Curves Generated by Code #5

If a pre-planned contrast, B1 vs. B3, is of interest, we can determine the necessary sample size for the contrast by adding the following statement to one of the previous programs.

    contrast 'b1 vs. b3' B 1 0 -1;

Note that this statement should be stated after the MODEL statement. When more than one test is conducted, one has to be aware of the overall Type I error rate. The typical α = 0.05 specified above is for a single test with a two-sided hypothesis. For multiple contrasts, an adjustment of the Type I error rate is necessary so that either the family-wise error rate or the experiment-wise error rate is controlled at the desired level. One easy but conservative adjustment is the Bonferroni adjustment. That is, if there are 4 scientifically meaningful contrasts to be tested, using the Bonferroni-adjusted level of 0.05/4 = 0.0125 for each contrast assures a family-wise error rate of at most 0.05. Other methods, such as the Holm and Tukey methods, can also be used for the adjustment.

Analysis of Covariance (ANCOVA)

In calculating a sample size for the analysis of covariance (ANCOVA), Proc GLMPOWER requires additional information on the correlation between the covariate and the dependent variable. Two options, ncovariates and corrxy, need to be added under the POWER statement. See Code #6.

    /* Code #6 */
    title3 'For ANCOVA';
    proc glmpower data = aa;
      class A B;
      model y = A B;
      power
        stddev = 5
        alpha = 0.05
        ncovariates = 1
        corrxy = .2 .4
        ntotal = .
        power = 0.9;
    run;

Note that only one value for corrxy is needed. In this example, we show the calculated sample sizes when the correlation between Y and the covariate is 0.2 and when it is 0.4. From Figure 8, we can see that the higher the correlation between Y and the covariate, the smaller the needed sample size. In addition, the standard deviation used to calculate the sample size is adjusted for the inclusion of the covariate. In our example, the adjusted standard deviations are 4.9 and 4.58 for correlations of 0.2 and 0.4, respectively (that is, 5 × sqrt(1 - 0.2²) ≈ 4.90 and 5 × sqrt(1 - 0.4²) ≈ 4.58), while the originally specified standard deviation is 5.

Figure 8: Summary Table for ANCOVA

Randomized Complete Block Design (RCBD)

As with other designs, Proc GLMPOWER can be used for the RCBD. Assume an exemplary data set rcbd has been created. We can use the following code to calculate the necessary sample size (B is the block effect and A is the treatment effect). Since the block effect is generally not the main focus, we can use effects = (A) under the POWER statement to show the calculated sample size for the Factor A effect only. The calculated sample size for A is identical to the result obtained without the effects = (A) option.

    /* Code for RCBD */
    title3 'For RCBD';
    proc glmpower data = rcbd;
      class B A;
      model y = B A;
      power
        effects = (A)
        stddev = 5     /* assumed here, as in the earlier examples */
        alpha = 0.05   /* assumed here, as in the earlier examples */
        ntotal = .
        power = 0.9;
    run;

Nested Design

In a nested design, we assume Factor B is nested in Factor A. The sample size calculation can be done by using the model statement:

    model y = A B(A);

Note that there is no interaction term in the model statement. In a nested design, an interaction does not exist.

CONCLUSION

This paper briefly introduces Proc GLMPOWER. We provide sample code for using Proc GLMPOWER in a variety of situations. Note that Proc GLMPOWER can be viewed as a combination of Proc GLM and Proc POWER. Most designs that can be analyzed in Proc GLM, except models involving random effects, can use Proc GLMPOWER for sample size calculation or power analysis. When considering the sample size calculation, one has to be aware of possible dropouts during the experiment. If a dropout rate is available from a similar study, this information should be incorporated to inflate the required sample size. For example, with an anticipated dropout rate of 10%, a calculated total of 54 would be inflated to 54/(1 - 0.10) = 60. Please note that Proc GLMPOWER and Proc POWER are designed to provide sample size calculations and prospective power analyses for use in the planning stages of a study. They are not intended for retrospective power analysis. The current version of Proc GLMPOWER does not support sample size calculation for repeated measures analysis. However, one can use Proc MIXED and a noncentral F function to calculate the necessary sample size manually.

REFERENCES

Hoenig and Heisey (2001).
The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis. The American Statistician, 55:19-24.

Lenth (2001). Some Practical Guidelines for Effective Sample Size Determination. The American Statistician, 55.

Littell, Milliken, Stroup, Wolfinger, and Schabenberger (2005). SAS for Mixed Models, 2nd Ed. Cary, NC: SAS Institute Inc.

O'Brien and Castelloe (2004). Sample-Size Analysis in Study Planning: Concepts and Issues, with Examples Using Proc POWER and Proc GLMPOWER. In Proceedings of the SAS Users Group International (SUGI) Conference. Cary, NC: SAS Institute Inc.

O'Brien, R. G. and Castelloe, J. (2007). Sample-Size Analysis for Traditional Hypothesis Testing: Concepts and Issues. In Pharmaceutical Statistics Using SAS: A Practical Guide, ed. A. Dmitrienko, C. Chuang-Stein and R. D'Agostino. Cary, NC: SAS Institute Inc., Chapter 10.

SAS Institute Inc. Introduction to Power and Sample Size Analysis. Cary, NC: SAS Institute Inc.

SAS Institute Inc. SAS/STAT 9.2 User's Guide. Cary, NC: SAS Institute Inc.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Dr. Joey Lin
Department of Mathematics & Statistics, San Diego State University
San Diego, CA
cdlin@sciences.sdsu.edu

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.