Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance


Winston Richards
Schering-Plough Research Institute
JSM, Aug, 2002

Abstract

Randomization or permutation tests have been studied extensively under the finite model framework in comparative experiments to assess whether one treatment differs significantly from the other under the null hypothesis of no treatment difference. We explore the use of the randomization principle to construct confidence intervals for the treatment differences (considered as shifts) through the conceptual recovery of the null (hypothesis) state from the alternative in the sample. In the linear model framework, they may be constructed by generating a null distribution around the estimated shift (under least squares theory, say) using the finite model rerandomization sample space of estimates derived from residuals. These confidence intervals do not require any assumptions of normality, which are quite questionable for small sample cases. They are obtained under a very simple assumption of additivity under the finite model setting. Illustrative coverage probabilities are presented for some applications of the methodology. We consider replicate (cross over) designs where the confidence intervals are obtained using randomization theory versus the usual normal theory assumptions and the mixed model approach with factor analytic covariance structure from the FDA guidance on average bioequivalence for pharmacokinetics.

1 Introduction

In situations where the distribution of the underlying population of interest is assumed (e.g., normal distribution), the validity of the procedure used, or of the probabilistic functions based on the resulting sampling distribution, will depend, perhaps critically, on the assumptions. Violations of the assumptions may have very serious consequences for the correctness or justification of the inference or decision involved.
In the practical world, however, one's experience or prior information about a data situation may be quite inadequate to justify the assumptions. Besides, in many cases the sample size may be too small and the appeal to general large sample theory may be without basis. In these situations, distribution free or non-parametric methods may provide suitable procedures for analyzing the data. Below we give a brief and coherent description of finite model randomization methodology.

Fisher, in his 1926 paper (1) and in his now classical book "The Design of Experiments" (1935) (2), put forward an idea, the principle of randomization, that has dominated comparative experiment methodology for the past 75 years. The principle of randomization under a finite model setting was illustrated in testing the null hypothesis of no treatment difference in his Lady Tasting Tea experiment. There Fisher used a restricted

randomization or permutation test conditional on fixed sample sizes for the two treatments. In this article the principle of randomization under a finite model setup is used as the framework from which we construct confidence intervals when the response variables are either ordinal or dichotomous (Richards and Gogate [7]).

The principle of randomization was expounded in Fisher (1935) (2) and in greater detail by Kempthorne (1952) (3). Kempthorne and Folks (1971) (4) defined consonance intervals in the context of the inversion of tests of significance. However, only recently has the direct application of this principle in data analysis received widespread attention, owing to advances in computer technology. Lehmann (1997) (5) has discussed the construction of a confidence interval using this concept for a shift in the location parameter of an infinite population, as an approximation; the shift may be considered as the treatment effect for a comparative trial.

Our motivation is, inter alia, to challenge the rather complex forms of assumptions and complicated analyses that are now rampant in the mixed model arena for making an inference about the comparison of treatment effects. We suggest that the inference from the finite model approach is sound. It embraces a wide range of underlying realities under approximate additivity of the treatment effect, even under the infinite model distributional assumptions. We extend this concept to construct a confidence interval for the (additive) treatment difference in the (repeated measures) simple cross over design and the replicate design for the case of two treatments. We investigate the extension of this approach to the analysis of covariance in the linear model framework. We suggest a way of obtaining confidence intervals for the odds in dichotomous contexts where the link function is considered linear in the concomitant variables and additive in the treatment effect, for logistic regression, say.
2 Finite Model Framework

An important feature of the finite model theory is its usefulness in making inference in situations where the probability distributions of the observations are unknown. Incorporation of randomization into the experimental design gives a strong basis for statistical inference, particularly if additivity as defined later holds. The probability statements and associated inferences have definite relations to what is conceptually observable in a situation. The reader is referred to the work of Kempthorne for the derivation of models which may be represented in a form similar to the parametric forms for the factorial analysis of variance, say, with main effects and interactions. However, these definitions are given in terms of the conceptual population defined below. They reflect algebraic partitions (combinations) of the basal responses of the experimental units without appeal to added distributional assumptions about errors. In our discussions below we will use the usual parametric forms of the models in our finite model estimation procedures.

Suppose there are $N$ experimental units $U_1, U_2, \ldots, U_N$. Since each unit can receive any of the $t$ (sequences of) treatments, the conceptual population of responses will consist of $Nt$ possible (vector) responses. In an actual experiment, however, we are restricted by the fact that each unit is assigned to only one sequence of treatments, so that we have a restricted sample from the conceptual population which we must use to make a statistical inference for the difference of treatment means, for example, via a test of the null hypothesis or confidence

interval estimation. We restrict our discussion to the case of two treatments since that is of primary interest to us. Extension to a sequence of treatments follows logically.

Suppose $\gamma_1$ and $\gamma_2$ denote the true treatment effects (e.g., means or proportions) associated with the two treatments, say A (new or active treatment) and B (control or standard of therapy). Let $Z_i$ denote the basal vector response of unit $U_i$. Under additivity, the response $C_{ik}$ (of unit $U_i$ under treatment sequence vector $k$) can be expressed as $C_{ik} = Z_i + \gamma_k$. In the sample, $y_{ik}$, the $i$-th vector response under treatment sequence $k$, is then given by

$$y_{ik} = \sum_j C_{jk}\,\delta_j^{k(i)},$$

where

$$\delta_j^{k(i)} = \begin{cases} 1 & \text{if unit } j \text{ is the } i\text{-th replicate unit of treatment sequence } k, \\ 0 & \text{otherwise.} \end{cases}$$

Note that

$$P\left(\delta_j^{k(i)} = 1\right) = \frac{1}{N}, \qquad P\left(\delta_j^{k(i)} = 1,\ \delta_{j'}^{k'(i')} = 1\right) = \begin{cases} \dfrac{1}{N(N-1)} & \text{for } j \neq j' \text{ and } i \neq i', \\ 0 & \text{otherwise.} \end{cases}$$

Let $\bar{y}_{\cdot k}$ denote the sample mean of the observations $y_{ik}$. Based on the above representation of $y_{ik}$, it follows that $E(\bar{y}_{\cdot k}) = \bar{Z}_{\cdot} + \gamma_k$, and hence the contrast $c_1 \bar{y}_{\cdot 1} - c_2 \bar{y}_{\cdot 2}$ is an unbiased estimate of the true difference $\Delta = c_1\gamma_1 - c_2\gamma_2$. Also notice that the average of any subset of the sample observations on sequence $k$ is also an unbiased estimate of the population mean $\bar{Z}_{\cdot} + \gamma_k$, and consequently linear contrasts (the differences) of the sample averages are unbiased estimates of the true difference. This property suggests a natural way of constructing confidence intervals as quantiles of the empirical distribution of the difference of population means, as described in Section 3.

3 Construction of Confidence Intervals

Conceptually we use an inversion process to test a hypothesis under the finite model setup. The equivalence (or one to one correspondence) between rejection and acceptance regions shows the structure of confidence sets as the totality of parameter values $\Delta$ for which the hypothesis $H(\Delta)$ is accepted when a sample is observed (see, for example, Section 3.5 of Lehmann (1997) for more details for the case where each unit has a single response). This property may be used in constructing the confidence intervals using randomization (permutation) tests. However, a conceptually equivalent approach is to use an empirical distribution of linear contrasts on the mean vector response per sequence, based on various subsamples of the two samples, to obtain a confidence interval for the true difference (of the parameters under consideration). In this, the key property of unbiasedness of the difference in sample means under the finite model setup, as described in the previous section, is used as a basis to obtain the empirical distribution.

We will derive confidence intervals for the difference in treatment effects. The usual least squares (or generalized least squares under compound symmetry) estimates for the parameter models may be shown to yield unbiased estimates of differences in direct treatment effects under randomization for the finite model with additivity. We may use this general approach to obtain the point estimate, and randomization theory to obtain the confidence interval estimates (relative to the point estimate) using the residuals.

Inversion method under the finite model set up

We consider scalar responses to illustrate the theory. The generalization to vector responses under the simple additivity of component treatment effects follows. Suppose $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ are sets of scalar observations under treatment sequences C and T respectively, with true population means $\gamma_1$ and $\gamma_2$. We assume that the active treatment and the control appear in different periods for sequence B and sequence A. Without loss of generality, we assume in what follows that the first $m$ units are assigned to treatment A and the remaining $n$ observations have received treatment B. Our interest is to obtain a confidence interval for the true contrast $\Delta = c_1\gamma_1 - c_2\gamma_2$. Note that, under the finite model set up with additivity, the X's and Y's are a realization of the original basal responses Z's, where the actives are shifted by an amount $\Delta$ as laid out in the following table.

Treatment Unit Responses

$$
\begin{array}{l|ccc|cccc}
\text{Unit} & U_1 & \cdots & U_m & U_{m+1} & U_{m+2} & \cdots & U_{m+n} \\ \hline
\text{Treat. C} & X_1 & \cdots & X_m & & & & \\
\text{Treat. T} & & & & Y_1 & Y_2 & \cdots & Y_n \\
\text{Basal} & Z_1 & \cdots & Z_m & Z_{m+1} + \Delta & Z_{m+2} + \Delta & \cdots & Z_{m+n} + \Delta
\end{array}
$$

If $\Delta$ were known, by subtracting $\Delta$ from the Y's we would be in a null hypothesis situation of no treatment difference, and the resulting observations would be one of the $\binom{m+n}{m}$ possible (treatment group) realizations of the original Z's, restricted only by the observed sample sizes. Therefore, a natural way of finding a confidence interval for $\Delta$ with confidence coefficient $1 - \alpha$ is to find the totality of values $\Delta_0$ such that, by subtracting $\Delta_0$ from the Y's, the null hypothesis of no treatment difference is not rejected at significance level $\alpha$ using a randomization test (based on the modified or adjusted values). The lower and upper bounds of the values $\Delta_0$ then constitute a confidence interval, as noted earlier.

The algorithmic steps can thus be summarized as follows. Compute the modified or adjusted responses from the observed values by choosing an initial value $\Delta_0$ to create a "null" hypothesis situation. Compute the estimate for each of the $\binom{m+n}{m}$ possible assignments.

Compute the significance level by comparing the new values of the test statistic with the value obtained for the original observations from which $\Delta_0$ is subtracted. Retain the value of $\Delta_0$ if the significance level is less than or equal to the value corresponding to the desired confidence coefficient. Otherwise, repeat the above steps until one reaches the desired significance level, perhaps by using the method of bisection or the method of tangents. The procedure may have to be carried out separately to obtain the upper and lower confidence limits. Note that it may not be feasible to attain the exact value of $\alpha$ because the distribution is necessarily discrete.

Remark 1

The above algorithm is given mainly for pedagogic purposes. A logically equivalent and more efficient way of obtaining confidence intervals is to apply the empirical differences method, as described in Lehmann (1997) [5], in the finite model set up. Since each observation is an unbiased estimate of the population mean, so is the average of those observations taken $r$ at a time, where $r \in \{1, 2, \ldots, \min(m, n)\}$. In turn, the difference of such averages is an unbiased estimate of the difference of the population means. Generate the distribution of all such possible differences. The quantiles $Q_{\alpha/2}$ and $Q_{1-\alpha/2}$ of this empirical distribution will constitute a $1 - \alpha$ confidence interval for $\Delta$. Equivalently, generate the null distribution of all such possible differences using the residuals from the least squares fit. The quantiles $Q_{\alpha/2}$ and $Q_{1-\alpha/2}$ of this null empirical distribution, shifted by the point estimate, will produce the $1 - \alpha$ confidence interval for $\Delta$ previously described.

Let us consider the replicate design with sequences 1, TRTR, and 2, RTRT. Under additivity the conceptual vector response for patient $k$ under sequences 1 and 2 may be written $Z_{1k} = (u_{k1}, u_{k2}, u_{k3}, u_{k4}) + \tau(1, 0, 1, 0)$ and $Z_{2k} = (u_{k1}, u_{k2}, u_{k3}, u_{k4}) + \tau(0, 1, 0, 1)$, respectively, where $\tau$ is the additive difference between the test treatment T and the control R.
If $\tau$ were known, subtracting $\tau(1, 0, 1, 0)$ from the vector response of unit $k$ under sequence 1, or correspondingly $\tau(0, 1, 0, 1)$ under sequence 2, would recover the basal null vector response for unit $k$, i.e., $U_k = (u_{k1}, u_{k2}, u_{k3}, u_{k4})$.

Under the assumption of a compound symmetry covariance structure, the best linear unbiased estimate of the direct treatment difference between treatments T and R, adjusting for possible first order carryover effects, is given by the inner product

$$T_0 = C'(\bar{Y}_{1\cdot} - \bar{Y}_{2\cdot}) = (\;\;)\,\big[(\bar{y}_{1\cdot 1}, \bar{y}_{1\cdot 2}, \bar{y}_{1\cdot 3}, \bar{y}_{1\cdot 4}) - (\bar{y}_{2\cdot 1}, \bar{y}_{2\cdot 2}, \bar{y}_{2\cdot 3}, \bar{y}_{2\cdot 4})\big]' / 20,$$

where the $\bar{y}_{i\cdot p}$ are the (scalar) components for period $p$ of the mean vector responses over subjects in sequences $i = 1, 2$.
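The recovery of the basal response, and the randomization test it makes possible, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the within-subject contrast `period_contrast` is a hypothetical, simplified statistic that ignores carryover (it is not the contrast $C$ of the text), the two-sided test and the grid search over candidate values are choices made here for brevity, and all function names are invented for the example.

```python
from itertools import combinations

PATTERN_SEQ1 = (1, 0, 1, 0)  # TRTR: test treatment in periods 1 and 3
PATTERN_SEQ2 = (0, 1, 0, 1)  # RTRT: test treatment in periods 2 and 4


def recover_basal(response, pattern, tau):
    """Subtract tau * pattern from a 4-period response vector."""
    return tuple(y - tau * p for y, p in zip(response, pattern))


def period_contrast(vec):
    """Hypothetical contrast (period 1 - 2 + 3 - 4) / 2; ignores carryover."""
    return (vec[0] - vec[1] + vec[2] - vec[3]) / 2.0


def statistic(scores, seq1_idx):
    """Half the difference in mean contrast between the two sequences."""
    in_seq1 = set(seq1_idx)
    s1 = [scores[i] for i in in_seq1]
    s2 = [s for i, s in enumerate(scores) if i not in in_seq1]
    return (sum(s1) / len(s1) - sum(s2) / len(s2)) / 2.0


def p_value(seq1, seq2, tau0):
    """Two-sided randomization p-value for H0: tau = tau0."""
    # Remove the hypothesized effect to create the "null" situation.
    basal = [recover_basal(y, PATTERN_SEQ1, tau0) for y in seq1]
    basal += [recover_basal(y, PATTERN_SEQ2, tau0) for y in seq2]
    scores = [period_contrast(v) for v in basal]
    n1, n = len(seq1), len(basal)
    observed = abs(statistic(scores, range(n1)))
    # Enumerate every assignment of n1 of the n units to sequence 1.
    null = [abs(statistic(scores, idx)) for idx in combinations(range(n), n1)]
    return sum(t >= observed - 1e-12 for t in null) / len(null)


def randomization_ci(seq1, seq2, alpha, grid):
    """Smallest and largest grid values of tau0 that are not rejected."""
    kept = [t for t in grid if p_value(seq1, seq2, t) > alpha]
    return (min(kept), max(kept)) if kept else None
```

With $n_1 = n_2 = 3$ there are only 20 assignments, so the smallest attainable two-sided significance level is 2/20; this is the discreteness noted in the description of the algorithm. A bisection search could replace the grid when finer limits are wanted.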

A $1 - \alpha$ confidence interval for $\tau$ is given by $(t_1, t_2)$ such that modifying the observed vector responses as described above, for $\tau$ equal to $t_1$ and $t_2$, would result in observed significance levels for the statistic $T_0$ of $\alpha_1$ and $\alpha_2$, respectively, under the randomization distribution from the modified values, with $\alpha = \alpha_1 + \alpha_2$. As before, this is equivalent to generating the distribution of $T_0$ over all subsets taken $r$ at a time from each sequence for $r = 1, 2, \ldots, l$, where $l$ is the minimum of $n_1$ and $n_2$, and selecting the appropriate quantiles. Alternatively, we may use the residuals from the least squares fit instead of the responses and shift the resulting null distribution by the point estimate.

The Analysis of Covariance

In situations where concomitant values are considered to influence the response, the analysis of covariance model for the response of the $j$-th replicate unit on treatment $i$ is usually represented in the form

$$Y_{ij} = \mu + \alpha_i + X_{ij}\beta + \varepsilon_{ij},$$

with ANCOVA table and resulting ANOVA as follows.

Sums of squares and products:

$$
\begin{array}{lcccc}
\text{Source} & \text{df} & xx & xy & yy \\ \hline
\text{Treatments} & t-1 & T_{xx} & T_{xy} & T_{yy} \\
\text{Residual} & N-t & R_{xx} & R_{xy} & R_{yy} \\
\text{Total} & N-1 & G_{xx} & G_{xy} & G_{yy}
\end{array}
$$

ANOVA:

$$
\begin{array}{lcl}
\text{Source} & \text{df} & \text{Sum of squares} \\ \hline
X \mid \text{Treat} & p & R_{xy}' R_{xx}^{-1} R_{xy} \\
\text{Treat} \mid X & t-1 & G_{yy} - R_{yy} - G_{xy}' G_{xx}^{-1} G_{xy} + R_{xy}' R_{xx}^{-1} R_{xy} \\
\text{Error} & N-t-p & R_{yy} - R_{xy}' R_{xx}^{-1} R_{xy}
\end{array}
$$

Our interest here is to obtain a confidence interval estimate of the comparison of the treatment effects after adjusting for the nuisance parameter $\beta$, estimated under least squares by $b = R_{xx}^{-1} R_{xy}$, which for scalar $b$ is equal to

$$b = \frac{\sum_{ij} (y_{ij} - \bar{y}_{i\cdot})(x_{ij} - \bar{x}_{i\cdot})}{\sum_{ij} (x_{ij} - \bar{x}_{i\cdot})^2}.$$

We note that even under a single null grouping the true value of $\beta$ cannot in general be determined via the finite model with additivity. However, we could assign its value by convention to be equal to $G_{xx}^{-1} G_{xz}$, where $Z$ is the conceptual population of responses under basal conditions. The LS estimate of the comparison between two treatments from the ANCOVA model is then given by

$$a_k - a_{k'} = \bar{y}_{k\cdot} - \bar{y}_{k'\cdot} - (\bar{x}_{k\cdot} - \bar{x}_{k'\cdot})\,b.$$
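As a worked illustration of the last formula, the following sketch computes the pooled within-group slope $b$ for a scalar covariate and the covariate-adjusted treatment difference for two groups. The function names and the data layout (one `(x_values, y_values)` pair per treatment) are assumptions made for the example only.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_slope(groups):
    """Scalar b = R_xy / R_xx from within-group sums of squares and products.

    `groups` is a list of (x_values, y_values) pairs, one per treatment.
    """
    r_xy = 0.0
    r_xx = 0.0
    for xs, ys in groups:
        x_bar, y_bar = mean(xs), mean(ys)
        r_xy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        r_xx += sum((x - x_bar) ** 2 for x in xs)
    return r_xy / r_xx


def adjusted_difference(group_k, group_k2):
    """LS estimate a_k - a_k' for two treatments with a scalar covariate."""
    b = pooled_slope([group_k, group_k2])
    (xk, yk), (xk2, yk2) = group_k, group_k2
    return mean(yk) - mean(yk2) - (mean(xk) - mean(xk2)) * b
```

For data generated exactly as $y = 1 + 2x$ plus a shift of 3 in the first group, e.g. `adjusted_difference(([1, 2, 3], [6, 8, 10]), ([2, 4, 6], [5, 9, 13]))`, the raw difference in means is $-1$, while the pooled slope is 2 and the adjusted difference recovers the shift of 3.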

EMS ANOVA Under Randomization theory

For the simple case of equal sample sizes for two treatments, it may be shown under randomization theory that both estimates below are unbiased for the treatment effect:

$$E(a_k - a_{k'}) = \tau_k - \tau_{k'} \quad \text{(unbiased)},$$

$$E(\bar{y}_{k\cdot}) - E(\bar{y}_{k'\cdot}) = \tau_k - \tau_{k'} \quad \text{(unbiased)}.$$

However, for the analysis of variance table the expectations of the mean squares for treatment effects and for error are not equal under null conditions, since

$$E_R\left(R_{xy}' R_{xx}^{-1} R_{xy}\right) \neq G_{xz}' G_{xx}^{-1} G_{xz} \cdot \frac{N-t-p}{N-1-p} \quad \text{and} \quad E_R\left(R_{xx}^{-1} R_{xy}\right) \neq G_{xx}^{-1} G_{xz}.$$

Nevertheless, as the main interest is the comparison of the treatment effects, the analysis of covariance model may be written in the analysis of variance form

$$w_{ij} = Y_{ij} - X_{ij}\beta = \mu + \alpha_i + \varepsilon_{ij}.$$

Conditional on the value of the nuisance parameter $\beta$, estimated by $b$, we can modify the initial value of the response by removing the nuisance effects and proceed, as in the analysis of variance above, to obtain the randomization confidence interval on the adjusted values $\{w_{ki}\} = \{y_{ki} - x_{ki}\,b\}$.

Confidence Intervals for the treatment effect in the Analysis of Covariance

As in the previously described sections, the procedure for the analysis of covariance is equivalent to using the residuals from the least squares fit to generate the null distribution of the treatment difference as described above, and shifting the resulting quantiles of the distribution by the point estimate. The rerandomization approach may be used in the determination of confidence intervals in binary situations where the link function is additive in the treatment effect.

Randomization Confidence Intervals for the treatment effect in binary logistic regression
For the case of two treatments where the response rate is considered approximately as increasing in a binary logistic regression, the model may be approximated as

$$E\,Y_{ij} = \pi(x) = \frac{\exp(\delta_i + X_{ij}\beta)}{1 + \exp(\delta_i + X_{ij}\beta)},$$

where $\beta$ may be considered as nuisance parameters and $\delta_i = 0$ or $\delta$ is the effect for one treatment relative to the other. As in the analysis of covariance, the true value of $\beta$ cannot in general be determined via the finite model with additivity. However, we could proceed to estimate $\delta$ and $\beta$ jointly, and obtain a confidence interval for $\delta$ accounting for $X$ or conditional on $\beta$. Hirji, Mehta and Patel (1987) (6) and Mehta and Patel (1995) (6a) have derived and examined maximum likelihood estimates and exact likelihood ratio estimates based on the sufficient statistics.
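To make the idea of estimating $\delta$ conditional on $\beta$ concrete, here is a minimal sketch for a scalar covariate. It assumes $\beta$ is held fixed at a supplied value and solves the likelihood score equation for $\delta$, $\sum (y - \pi) = 0$ over the units with $\delta_i = \delta$, by bisection. The function names, the search bounds and the tolerance are illustrative assumptions, not part of the source.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def solve_delta(x_treated, y_treated, beta, lo=-20.0, hi=20.0, tol=1e-10):
    """Solve sum(y - expit(delta + x * beta)) = 0 for delta by bisection.

    Only units carrying the treatment effect enter the score for delta.
    The root is assumed to lie inside [lo, hi].
    """
    if not 0 < sum(y_treated) < len(y_treated):
        raise ValueError("need both outcome types among the treated units")

    def score(delta):
        return sum(y - expit(delta + x * beta)
                   for x, y in zip(x_treated, y_treated))

    # The score is strictly decreasing in delta, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:  # fitted rates too low: delta must increase
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

When `beta` is 0 the solution reduces to the logit of the observed response rate; for outcomes `[1, 1, 0, 1]` this is $\log 3$. A group with only one type of outcome has no finite solution, which is why the sketch raises an error in that case.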

A minor modification to the approach of Hirji, Mehta and Patel enables a randomization confidence interval estimate for $\delta$ to be obtained, assuming the nuisance parameters $\beta$ are known and equal to their estimated values, $b$. The limits would be given by the quantiles from the distribution of solutions for $\delta$, using the likelihood function or estimating equations on the subsets of the data of sizes 1 to $m$ as described above, with $\beta$ equated to the estimated values $b$. Treatment subgroups will need both types of outcomes to obtain a solution for the treatment effect. Subgroups that violate this criterion may require an imposed modification in the relevant equations, or may be assigned to null or extreme value solutions for the treatment effect depending on whether the patterns are similar or not. The coverage based on this approach is to be compared with that for the exact CI of Mehta and Patel.

Results and Conclusions

Replicate Designs

a) For the sample sizes investigated, n = 12, 16, and 20, under the finite model (permutations) the coverage of our procedure is equal to the nominal value within the tolerance of a finite discrete distribution. The coverage of confidence intervals obtained by the usual cross over model with first order carryover and normal assumptions under compound symmetry, and from the mixed model procedure recommended in the FDA guidance for bioequivalence, exceeds the nominal values.

b) Sampling from an infinite model framework shows that our procedure produces coverage that is close to the nominal value. The coverage of confidence intervals obtained by the usual normal assumptions under compound symmetry is slightly below the nominal values, while coverage from the mixed model procedure of the FDA guidance for bioequivalence slightly exceeds the nominal values.

c) Preliminary work on coverage for the analysis of covariance under a) the finite model (permutations) construction and b) the infinite model was performed for small sample sizes (12, 16 and 20).
The simulations were done using truncated pseudo normal variables and various values of the slope. The results indicate that the coverage of our procedure for ANOVA is close to the nominal value for the 99%, 95% and 90% confidence intervals, and a little smaller than the nominal values in ANCOVA for the 95% and 90% confidence intervals. The width of the interval in ANCOVA is smaller than for ANOVA. Sampling from the finite model framework, the coverage of confidence intervals obtained using the usual normal estimates is close to the nominal values. The finite model rerandomization procedure gives coverages that are equal to the nominal values for the ANOVA and slightly less than the nominal values in ANCOVA for the 95% and 90% confidence intervals. Situations where the data come from underlying nonnormal distributions are being investigated.

e) Computation issues

As the sample sizes increase, the panoply of outcomes for the sample space becomes computationally prohibitive. In application, given the data, we would estimate the confidence intervals by selecting a weighted random sample of subsets of sizes 1 to $m$ as defined above. The weights are determined by the relative frequency of the subset sizes in the randomization sample space.

Monte Carlo Results for the Replicate Design

The results given below are for illustrative purposes. The infinite model data were generated to reflect various inter subject and intra subject heterogeneity for the treatments. Results are based on approximately 2000 samples each. In the finite model table, results were obtained for underlying actual finite model additive data. The coverage examples for sample size 12:6-6* are based on the complete restricted set of 924 assignments of the 12 subjects to treatment groups of size 6. For sample sizes 16:8-8 and 20:10-10 the results are based on 2000 random selections from the complete restricted set of assignments of the subjects to treatment groups of equal sizes.

Infinite model Results: Sample Size and Coverage (/2000). Columns: sample sizes (N:n1-n2) 12:6-6*, 16:8-8 and 20:10-10, each at confidence levels 99%, 95% and 90%. Rows (methods): finite model, glm w. carry-ovr, mixed-fda.

Finite model Results: Sample Size and Coverage (/2000). Columns: sample sizes (N:n1-n2) 12:6-6, 16:8-8 and 20:10-10, each at confidence levels 99%, 95% and 90%. Rows (methods): finite model, glm w. carry-ovr, mixed-fda.
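The weighted subset sampling described under computation issues can be sketched as follows for the scalar two-sample case. Two points are assumptions of this sketch rather than statements from the text: the weight for subset size $r$ is taken to be $\binom{m}{r}\binom{n}{r}$, the number of subset pairs of that size, and the interval uses simple order statistics of the sampled differences. The function names are illustrative.

```python
import math
import random


def subset_size_weights(m, n):
    """Number of (subset, subset) pairs for each common size r = 1..min(m, n)."""
    return [math.comb(m, r) * math.comb(n, r) for r in range(1, min(m, n) + 1)]


def sampled_differences(x, y, draws, rng):
    """Sorted differences of subset averages, mean(y subset) - mean(x subset)."""
    weights = subset_size_weights(len(x), len(y))
    sizes = range(1, len(weights) + 1)
    diffs = []
    # Draw the subset size first, in proportion to its share of the full
    # sample space, then draw one subset of that size from each group.
    for r in rng.choices(sizes, weights=weights, k=draws):
        xs, ys = rng.sample(x, r), rng.sample(y, r)
        diffs.append(sum(ys) / r - sum(xs) / r)
    return sorted(diffs)


def quantile_interval(sorted_diffs, alpha):
    """Empirical alpha/2 and 1 - alpha/2 quantiles via order statistics."""
    k = len(sorted_diffs)
    lo = sorted_diffs[int(math.floor(alpha / 2 * (k - 1)))]
    hi = sorted_diffs[int(math.ceil((1 - alpha / 2) * (k - 1)))]
    return lo, hi
```

For two groups of size 6 the weights sum to 923, which is $\binom{12}{6} - 1$, so full enumeration is feasible there; sampling becomes useful for the larger designs. A typical call would be `quantile_interval(sampled_differences(x, y, 2000, random.Random(0)), 0.05)`.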

Monte Carlo Results for the ANOVA and ANCOVA

Again, the results given below are for illustrative purposes. The infinite model data were generated using a pseudonormal random generator truncated at 2 standard deviations from the means. In the finite model table, results were obtained for underlying actual finite model additive data. The data were derived using subsets from the samples above and unique rerandomization assignments of the units into treatment groups. The coverage examples for sample size 12:6-6* are based on the complete restricted set of 924 assignments of the 12 subjects to treatment groups of size 6. For sample sizes 16:8-8 and 20:10-10 the results are based on 2000 random selections from the complete restricted set of assignments of the subjects to treatment groups of equal sizes.

Finite model Table: Sample Size and Coverage (/2000). Columns: sample sizes 12:6-6, 16:8-8 and 20:10-10, each at confidence levels 99%, 95% and 90%. Rows (methods): finite ANOVA, finite ANCOVA, GLM ANOVA, GLM ANCOVA.

Infinite model Table: Sample Size and Coverage (/1000). Columns: sample sizes 12:6-6, 16:8-8 and 20:10-10, each at confidence levels 99%, 95% and 90%. Rows (methods): finite ANOVA, finite ANCOVA, GLM ANOVA, GLM ANCOVA.

References

1 Fisher, R. A. (1926) The Arrangement of Field Experiments. J. Min. Agric. Eng. 33:
2 Fisher, R. A. (1935) The Design of Experiments. Oliver and Boyd, Edinburgh, England.
3 Kempthorne, Oscar (1952) Design and Analysis of Experiments, 2nd ed. John Wiley and Sons, Inc., New York.
4 Kempthorne, Oscar and Leroy Folks (1971) Probability, Statistics and Data Analysis. Ames, Iowa: Iowa State Press.
5 Lehmann, E. L. (1997) Testing Statistical Hypotheses, 3rd ed. John Wiley and Sons, Inc., New York.
6 Hirji, Karim F., Mehta, Cyrus R., and Patel, Nitin R. (1987) Computing Distributions for Exact Logistic Regression. JASA, 82.
6a Mehta, C. R. and Patel, N. R. (1995) Exact logistic regression: theory and examples. Statistics in Medicine.
7 Richards, W. and J. Gogate (2000) Finite Model Confidence Intervals for Dichotomous Data, unpublished manuscript.