Bonferroni-Based Size-Correction for Nonstandard Testing Problems

Adam McCloskey
Brown University

October 2011; This Version: October 2012

Abstract

We develop powerful new size-correction procedures for nonstandard hypothesis testing environments in which the asymptotic distribution of a test statistic is discontinuous in a parameter under the null hypothesis. Examples of this form of testing problem are pervasive in econometrics and complicate inference by making size difficult to control. This paper introduces two sets of new size-correction methods that correspond to two different general hypothesis testing frameworks. The new methods are designed to maximize the power of the underlying test while maintaining correct asymptotic size uniformly over the parameter space specified by the null hypothesis. They involve the construction of critical values that make use of reasoning derived from Bonferroni bounds. The first set of new methods provides complementary alternatives to existing size-correction methods, entailing substantially higher power for many testing problems. The second set of new methods provides the first available asymptotically size-correct tests for the general class of testing problems to which it applies. This class includes hypothesis tests on parameters after consistent model selection and tests on super-efficient/hard-thresholding estimators. We detail the construction and performance of the new tests in three specific examples: testing after conservative model selection, testing when a nuisance parameter may be on a boundary and testing after consistent model selection.

Keywords: Hypothesis testing, uniform inference, asymptotic size, exact size, power, size-correction, model selection, boundary problems, local asymptotics

A previous version of this paper was circulated under the title "Powerful Procedures with Correct Size for Tests Statistics with Limit Distributions that are Discontinuous in Some Parameters".
The author thanks Donald Andrews, Federico Bugni, Xu Cheng, Kirill Evdokimov, Iván Fernández-Val, Patrik Guggenberger, Hiroaki Kaido, Frank Kleibergen, Hannes Leeb, Blaise Melly, Ulrich Müller, Serena Ng, Pierre Perron, Benedikt Pötscher, Zhongjun Qu, Eric Renault and Joseph Romano for helpful comments. I am especially grateful to Donald Andrews for suggesting a solution to a mistake in a previous draft. I also wish to thank Bruce Hansen for sharing some preliminary results. Department of Economics, Brown University, Box B, 64 Waterman St., Providence, RI, (adam mccloskey@brown.edu, http:// mccloskey/home.html).

1 Introduction

Nonstandard econometric testing problems have gained substantial attention in recent years. In this paper, we focus on a very broad class of these problems: those for which the null limit distribution of a test statistic is discontinuous in a parameter. The problems falling into this class range from tests in the potential presence of identification failure (e.g., Staiger and Stock, 1997 and Andrews and Cheng, 2012) to tests after pretesting or model selection (e.g., Leeb and Pötscher, 2005 and Guggenberger, 2010) to tests when a parameter may lie on the boundary of its parameter space (e.g., Andrews, 1999, 2001). Though test statistics that do not exhibit this type of discontinuity exist for some problems (e.g., Kleibergen, 2002), they do not for others. Moreover, such test statistics may not necessarily be preferred to parameter-discontinuous ones when good size-correction procedures are available, as they can have low power. However, we sidestep the important issue of choosing a test statistic in this paper, taking it as given.

The usual approximation to the size of a test is the asymptotic probability of rejecting a true null hypothesis (null rejection probability) at a fixed parameter value. For the types of problems studied in this paper, such an approximation is grossly misleading. In fact, the discrepancy between this point-wise null rejection probability (NRP) and the (asymptotic) size of a test can reach unity less the nominal level. This problem does not disappear, and often worsens, as the sample size grows. The inadequacy of point-wise asymptotic approximations and the resulting pitfalls for inference have been studied extensively in the literature.
See, for example, Dufour (1997) in the context of inference on the two-stage least squares estimator, Leeb and Pötscher (2005) in the context of inference after model selection, Stoye (2009) in the context of inference on partially identified parameters and Andrews and Guggenberger (2009a) (AG henceforth) in the context of inference on the autoregressive parameter in a first-order autoregressive model. In the parameter-discontinuous testing framework of this paper, one must examine the maximal NRP uniformly, over the entire parameter space, as the sample size grows in order to determine the asymptotic size of the test (see e.g., the work of Mikusheva, 2007 and Andrews and Guggenberger, 2009b, 2010b). When using the test statistics considered in this paper, one typically takes a conservative approach to control size, leading to highly non-similar tests, i.e., tests for which the pointwise NRP differs substantially across parameter values. This often results in very poor power.

In this paper, we develop novel size-correction methods with the goal of minimizing the degree of conservativeness of the test, and hence maximizing its power, while maintaining correct asymptotic size. We do so under two different frameworks that allow for different null limiting behavior of a given test statistic. For the first, termed the single localized limit distribution (SLLD) framework, we adopt the framework studied by AG as it is quite broad in scope. For the second, termed the multiple localized limit distributions (MLLDs) framework, we generalize the SLLD framework in order to accommodate certain complicated asymptotic behaviors of test statistics. To our knowledge, this latter, more general framework has not yet been studied. It includes examples of testing after consistent model selection and testing on super-efficient/hard-thresholding estimators.

The basic idea behind the size-corrections we introduce is to adaptively learn from the data how far the true parameters are from the point that causes the discontinuity in the asymptotic behavior of the test statistic in order to construct critical values (CVs) that control the size of the test but are not overly conservative. We do this under a drifting sequence framework by embedding the true parameter values in a sequence indexed by the sample size and a localization parameter. Within this framework, we estimate a corresponding localization parameter to find a set of drifting sequences of parameters relevant to the testing problem at hand. We then examine the CVs corresponding to the null limiting quantiles of the test statistic that obtain under the drifting sequences within this set. Though the localization parameter cannot be consistently estimated under these drifting sequences, it is often possible to obtain estimators that are asymptotically centered about their true values and hence to construct asymptotically valid confidence sets for the true localization parameter.

Based upon this estimator and corresponding confidence sets, we examine three different size-correction methods in increasing order of computational complexity.
For the first, we search for the maximal CV over a confidence set, rather than the maximal CV over the entire space of localization parameters, to reduce the degree of conservativeness of a given hypothesis test. Inherent in this construction are two levels of uncertainty: one for the localization parameter and one for the test statistic itself. We use procedures based on Bonferroni bounds to account for both. For the second, we also search for a maximal CV over a confidence set but, instead of using Bonferroni bounds, we account for the two levels of uncertainty by adjusting CV levels according to the asymptotic distributions that arise under drifting parameter sequences. This method compensates for the asymptotic dependence structure between the test statistic and the CV, leading to more powerful tests. For the third, we find the smallest CVs over sets of those justified by the first and second, leading to tests with high power over most of the parameter space.

For testing problems within the scope of the SLLD framework, our new size-correction methods can be constructed to be either uniformly more powerful asymptotically than existing least-favorable (LF) methods or more powerful over most of the relevant parameter space. In the latter case, the portions of the parameter space for which LF methods dominate tend to be very small and such dominance tends to be nearly undetectable even within these portions. The finite-sample power dominance of our new methods can be very pronounced, sometimes reaching nearly 100% over most of the parameter space. Our size-corrections can also be constructed to direct power toward different regions of the parameter space while sacrificing very little in others.

We also develop the first size-correction procedures we are aware of to provide tests with correct asymptotic size for all testing problems falling within the MLLDs framework. Since they are adapted from the size-correction procedures used in the SLLD framework to this generalized framework, they also adopt many desirable power properties.

The scope of problems to which our size-correction methods may be applied is quite wide. For illustration, we provide detailed applications to three nonstandard testing problems. Two of these examples concern testing after model selection/pretesting. The first, taken from AG, involves testing after conservative model selection and falls within the scope of the SLLD framework. The second, taken from Leeb and Pötscher (2005), considers testing after consistent model selection and falls within the scope of the MLLDs framework. The other example we detail concerns testing when a nuisance parameter may be on a boundary of its parameter space, taken from Andrews and Guggenberger (2010b). We also briefly discuss a subset of the numerous other examples for which our size-corrections can be used.
We focus on testing after model selection in the examples because at present, available uniformly valid inference methods tend to be extremely conservative. Inference after model selection is an important issue that is all too frequently ignored, sometimes being referred to as "the quiet scandal of statistics". See, for example, Hansen (2005a) for a discussion of the importance of this issue. Moreover, excluding the results of AG and Kabaila (1998), the literature has been quite negative with regard to solving this inference problem (e.g., Andrews and Guggenberger, 2009b, Leeb and Pötscher, 2005, 2006 and 2008), especially with regard to inference after consistent model selection.

The recently developed methods for uniform inference of AG are closely related to the methods developed here. AG also study a given test statistic and adjust CVs according to a drifting parameter sequence framework. However, our methods tend to be (oftentimes, much) less conservative than theirs. Existing inference procedures that make use of Bonferroni bounds are also related to those developed here. Some of the CVs we employ can also be interpreted as smoothed versions of those based on binary decision rules that use an inconsistent estimator of the localization parameter. Our CVs are also related to those that use a transition function approach to interpolate between LF and standard CVs. In contrast to these approaches, ours do not necessitate an ad hoc choice of transition function. Rather, they use the data and the limiting behavior of the test statistic to adaptively transition between CVs. See Sections 3.1 and 3.3 for details and references on these related procedures. Finally, the recent work of Elliott et al. (2012) takes a somewhat different approach to some of the problems discussed here by attempting to numerically determine tests that approach an asymptotic power bound.

The remainder of this paper is composed as follows. Section 2 describes the general class of nonstandard hypothesis testing problems we study, subsequently detailing the SLLD and MLLDs frameworks and providing examples. Section 3 goes on to specify the size-correction methods of this paper under the two localized limit distribution frameworks and provides the conditions under which some of these size-corrections yield correct asymptotic size. To conserve space, some of the conditions used to show size-correctness of procedures in the MLLDs framework are relegated to Appendix III, which also contains some auxiliary sufficient conditions. Specifics on how to construct some of the size-corrected CVs are provided for three econometric examples in Sections 4, 5 and 6. The finite sample performance of two of the examples, corresponding to testing after model selection, is also analyzed there. Section 7 concludes. Appendix I contains proofs of the main results of this paper while Appendix II is composed of derivations used to show how the example testing problems fit the assumptions of the paper.
All tables and figures can be found at the end of the document.

To simplify notation, we will occasionally abuse it by letting $(a_1, a_2)$ denote the vector $(a_1', a_2')'$. The sample size is denoted by $n$ and all limits are taken to mean as $n \to \infty$. Let $\mathbb{R}_+ = \{x \in \mathbb{R} : x \ge 0\}$, $\mathbb{R}_- = \{x \in \mathbb{R} : x \le 0\}$, $\mathbb{R}_{-,\infty} = \mathbb{R}_- \cup \{-\infty\}$, $\mathbb{R}_{+,\infty} = \mathbb{R}_+ \cup \{\infty\}$ and $\mathbb{R}_\infty = \mathbb{R} \cup \{-\infty, \infty\}$. $1(\cdot)$ denotes the indicator function. $\Phi(\cdot)$ and $\phi(\cdot)$ are the usual notation for the distribution and density functions of the standard normal distribution. $\xrightarrow{d}$ and $\xrightarrow{p}$ denote weak convergence and convergence in probability while $O(\cdot)$, $o(\cdot)$, $O_p(\cdot)$ and $o_p(\cdot)$ denote the usual (stochastic) orders of magnitude.
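The first size-correction method outlined in the introduction (take the largest CV over a confidence set for the localization parameter and split the level between the two sources of uncertainty) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: the quantile function `toy_quantile` is hypothetical and is not one of the paper's localized quantiles, the estimator `h_hat` is assumed to be asymptotically N(h, 1), and the names `bonferroni_cv` and `least_favorable_cv` are invented. The Bonferroni logic is that the test can over-reject only if the confidence interval misses the true h (probability at most alpha - delta) or the statistic exceeds its own (1 - delta) quantile (probability at most delta), so the total is at most alpha.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def toy_quantile(h, p):
    """Hypothetical localized quantile c_h(p): inflated near h = 0, standard normal far away."""
    return STD_NORMAL.inv_cdf(p) * (1.0 + 0.5 * math.exp(-h * h))


def bonferroni_cv(h_hat, alpha, delta, quantile=toy_quantile, grid_points=2001):
    """Largest c_h(1 - delta) over a level 1 - (alpha - delta) confidence interval for h.

    Assumes h_hat is asymptotically N(h, 1), so the interval is h_hat +/- z.
    The supremum is approximated by a maximum over an equally spaced grid.
    """
    if not 0.0 < delta < alpha < 1.0:
        raise ValueError("need 0 < delta < alpha < 1")
    z = STD_NORMAL.inv_cdf(1.0 - (alpha - delta) / 2.0)
    lo, hi = h_hat - z, h_hat + z
    step = (hi - lo) / (grid_points - 1)
    return max(quantile(lo + k * step, 1.0 - delta) for k in range(grid_points))


def least_favorable_cv(alpha, quantile=toy_quantile):
    """Least-favorable CV for the toy quantile, whose maximum over h is attained at h = 0."""
    return quantile(0.0, 1.0 - alpha)


if __name__ == "__main__":
    for h_hat in (0.0, 1.0, 3.0, 6.0):
        print(h_hat, round(bonferroni_cv(h_hat, alpha=0.05, delta=0.04), 3))
    print("LF:", round(least_favorable_cv(0.05), 3))
```

In this toy setup the Bonferroni-type CV is larger than the least-favorable CV when `h_hat` is near the discontinuity point and smaller when `h_hat` is far from it, which is one way to see why combining CVs, as in the third method described above, can be attractive.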

2 Parameter-Discontinuous Asymptotic Distributions

In this paper, we are interested in performing hypothesis tests when the asymptotic distribution of the test statistic is discontinuous in a parameter under the null hypothesis. We take the test statistic as given and examine the tasks of controlling size and maximizing power for the given statistic. The important separate issue of choosing a test statistic depends on the specific testing problem at hand and is not the focus of this paper.

In order to analyze this problem, we adopt the same general testing framework as AG. Consider some generic test statistic $T_n(\theta_0)$ used for testing $H_0: \theta = \theta_0$ for some finite-dimensional parameter $\theta \in \mathbb{R}^d$. Under $H_0$, $T_n(\theta_0)$ and its asymptotic distribution depend on some parameter $\gamma \in \Gamma$. Refer to this limit distribution as $F_\gamma$. We decompose $\gamma$ into three components, viz. $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, depending on how each component affects $F_\gamma$ as follows. The distribution $F_\gamma$ is discontinuous in $\gamma_1$, a parameter in $\Gamma_1 \subset \mathbb{R}^p$, when one or more of the elements of $\gamma_1$ is equal to zero. It also depends on $\gamma_2$, a parameter in $\Gamma_2 \subset \mathbb{R}^q$, but $\gamma_2$ does not affect the distance of $\gamma$ to the point of discontinuity in $F_\gamma$. The third component $\gamma_3$ may be finite- or infinite-dimensional, lying in some general parameter space $\Gamma_3(\gamma_1, \gamma_2)$ that may depend on $\gamma_1$ and $\gamma_2$. The component $\gamma_3$ does not affect the limit distribution $F_\gamma$ but may affect the properties of $T_n(\theta_0)$ in finite samples. Formally, the parameter space for $\gamma$ is given by

$$\Gamma = \{(\gamma_1, \gamma_2, \gamma_3) : \gamma_1 \in \Gamma_1,\ \gamma_2 \in \Gamma_2,\ \gamma_3 \in \Gamma_3(\gamma_1, \gamma_2)\}. \qquad (1)$$

To complete the preliminary setup, we impose the following product space assumption on $\Gamma_1$. This assumption is identical to Assumption A of AG. Let $\lfloor$ ($\rfloor$) denote the left (right) endpoint of an interval that may be open or closed.

Assumption D. (i) $\Gamma$ satisfies (1) and (ii) $\Gamma_1 = \prod_{m=1}^{p} \Gamma_{1,m}$, where $\Gamma_{1,m} = \lfloor \gamma^l_{1,m}, \gamma^u_{1,m} \rfloor$ for some $-\infty \le \gamma^l_{1,m} < \gamma^u_{1,m} \le \infty$ that satisfy $\gamma^l_{1,m} \le 0 \le \gamma^u_{1,m}$ for $m = 1, \ldots, p$.
This paper introduces testing methods that are asymptotically size-controlled. Asymptotic size control requires one to asymptotically bound the NRP uniformly over the parameter space admissible under the null hypothesis. In order to assess the uniform limiting behavior of a test, one must examine its behavior along drifting sequences of parameters (see e.g., Andrews and Guggenberger, 2010b and Andrews et al., 2011). In this vein, we allow $\gamma$ to depend on the sample size, and emphasize this dependence by denoting it $\gamma_{n,h} = (\gamma_{n,h,1}, \gamma_{n,h,2}, \gamma_{n,h,3})$, where $h = (h_1, h_2) \in H \equiv H_1 \times H_2$ is a localization parameter that describes the limiting behavior of the sequence. The sets $H_1$ and $H_2$ depend on $\Gamma_1$ and $\Gamma_2$ as follows:

$$H_1 = \prod_{m=1}^{p} \begin{cases} \mathbb{R}_{+,\infty}, & \text{if } \gamma^l_{1,m} = 0, \\ \mathbb{R}_{-,\infty}, & \text{if } \gamma^u_{1,m} = 0, \\ \mathbb{R}_{\infty}, & \text{if } \gamma^l_{1,m} < 0 \text{ and } \gamma^u_{1,m} > 0, \end{cases} \qquad H_2 = \mathrm{cl}(\Gamma_2),$$

where $\mathrm{cl}(\Gamma_2)$ is the closure of $\Gamma_2$ with respect to $\mathbb{R}^q_\infty$. Given $r > 0$ and $h \in H$, define $\{\gamma_{n,h}\}$ as the sequence of parameters in $\Gamma$ for which $n^r \gamma_{n,h,1} \to h_1$ and $\gamma_{n,h,2} \to h_2$.

In this paper, we consider two broad classes of testing problems: one for which the limiting behavior of the test statistic is fully characterized by $h$ under any drifting sequence of parameters $\{\gamma_{n,h}\}$ and the other for which this limiting behavior depends upon both $h$ and the limiting behavior of $\{\gamma_{n,h,1}\}$ relative to another sequence.

2.1 Single Localized Limit Distribution Framework

We begin the analysis with the simpler of the two cases just described. This class of testing problems can be broadly characterized by Assumption D and the following assumption.

Assumption S-B.1. There exists a single fixed $r > 0$ such that for all $h \in H$ and corresponding sequences $\{\gamma_{n,h}\}$, $T_n(\theta_0) \xrightarrow{d} W_h$ under $H_0$ and $\{\gamma_{n,h}\}$.

Denote the limit distribution function for a given $h$ as $J_h$, i.e., $P(W_h \le x) = J_h(x)$, and the $(1-\alpha)$th quantile of $W_h$ by $c_h(1-\alpha)$. We refer to $J_h$ as a localized limit distribution as it obtains under a drifting sequence of parameters indexed by the localization parameter $h$. Assumption S-B.1 is identical to Assumption B of AG. For every sequence of parameters $\{\gamma_{n,h}\}$ indexed by the same localization parameter $h$, the same limit distribution $J_h$ obtains, hence the term "single" in SLLD.

We now introduce some new assumptions, noting that they are applicable to most of the same econometric applications that satisfy the assumptions imposed by AG.

Assumption S-B.2. Consider some fixed $\delta \in (0, 1)$. (i) As a function in $h$ from $H$ into $\mathbb{R}$, $c_h(1-\delta)$ is continuous. (ii) For any $h \in H$, $J_h(\cdot)$ is continuous at $c_h(1-\delta)$.

Assumption S-B.2 is a mild continuity assumption. To obtain stronger results, we strengthen part (i) as follows.

Assumption S-BM.1. For some fixed $\alpha \in (0, 1)$ and pair $(\underline{\delta}, \bar{\delta}) \in [0, \alpha - \bar{\delta}] \times [0, \alpha - \underline{\delta}]$, as a function of $h$ and $\delta$, $c_h(1-\delta)$ is continuous over $H$ and $[\underline{\delta}, \alpha - \bar{\delta}]$.

The quantity $\underline{\delta}$ serves as a lower bound and $\alpha - \bar{\delta}$ serves as an upper bound on the points $\delta$ for which $c_h(1-\delta)$ must be continuous. In many examples of interest, $W_h$ is a continuous random variable with infinite support so that $c_h(1) = \infty$. For such examples $\underline{\delta}$ can be set arbitrarily close, but not equal, to zero. Assumptions D, S-B.1, S-B.2 and S-BM.1 and the other assumptions for this framework introduced later in Sections 3.1 through 3.3 hold in many nonstandard econometric testing problems of interest. The following are simple, illustrative examples of such problems.

2.1.1 Testing After Conservative Model Selection

Various forms of hypothesis tests after model selection exemplify testing problems with parameter-discontinuous null limit distributions. Conducting a t-test on a parameter of interest after conservative model selection falls within the framework of Section 2.1, having a SLLD. Conservative model selection includes, among others, methods based on the Akaike information criterion (AIC) and standard pre-testing techniques. As an illustrative example, consider the following problem described by AG. We have a model given by

$$y_i = x_{1i}^{*}\theta + x_{2i}^{*}\beta_2 + x_{3i}^{*\prime}\beta_3 + \sigma\varepsilon_i, \qquad (2)$$

for $i = 1, \ldots, n$, where $x_i^{*} \equiv (x_{1i}^{*}, x_{2i}^{*}, x_{3i}^{*\prime})' \in \mathbb{R}^k$, $\beta \equiv (\theta, \beta_2, \beta_3')' \in \mathbb{R}^k$, $x_{1i}^{*}, x_{2i}^{*}, \theta, \beta_2, \sigma, \varepsilon_i \in \mathbb{R}$, $x_{3i}^{*}, \beta_3 \in \mathbb{R}^{k-2}$, the observations $\{(y_i, x_i^{*})\}$ are i.i.d. and $\varepsilon_i$ has mean zero and unit variance conditional on $x_i^{*}$. We are interested in testing $H_0: \theta = \theta_0$ after determining whether to include $x_{2i}^{*}$ in the regression model (2), that is, after determining whether to impose the restriction $\beta_2 = 0$.
This decision is based on whether the absolute value of the pretest t-statistic

$$T_{n,2} \equiv \frac{n^{1/2}\hat{\beta}_2}{\hat{\sigma}\,(n^{-1}X_2^{*\prime}M_{[X_1^{*}:X_3^{*}]}X_2^{*})^{-1/2}}$$

exceeds a pretest CV $c > 0$, where $c$ is fixed (i.e., does not depend on $n$), $\hat{\beta}_2$ is the standard unrestricted OLS estimator of $\beta_2$ in the regression (2) and $\hat{\sigma}^2 \equiv (n-k)^{-1}Y'M_{[X_1^{*}:X_2^{*}:X_3^{*}]}Y$ with $Y \equiv (y_1, \ldots, y_n)'$, $X_j^{*} \equiv [x_{j1}^{*} : \ldots : x_{jn}^{*}]'$ for $j = 1, 2, 3$ and $M_A \equiv I - A(A'A)^{-1}A'$ for some generic full-rank matrix $A$ and conformable identity matrix $I$. The model selection pretest rejects $\beta_2 = 0$ when $|T_{n,2}| > c$ and the subsequent t-statistic for testing $H_0$ is based on the unrestricted version of (2) and is given by

$$\hat{T}_{n,1}(\theta_0) \equiv \frac{n^{1/2}(\hat{\theta} - \theta_0)}{\hat{\sigma}\,(n^{-1}X_1^{*\prime}M_{[X_2^{*}:X_3^{*}]}X_1^{*})^{-1/2}},$$

where $\hat{\theta}$ is the unrestricted OLS estimator from regression (2). Conversely, the model selection pretest selects the model without $x_{2i}^{*}$, or equivalently restricts $\beta_2 = 0$, when $|T_{n,2}| \le c$ and the resulting t-statistic for $H_0$ is given by

$$\tilde{T}_{n,1}(\theta_0) = \frac{n^{1/2}(\tilde{\theta} - \theta_0)}{\hat{\sigma}\,(n^{-1}X_1^{*\prime}M_{X_3^{*}}X_1^{*})^{-1/2}},$$

where $\tilde{\theta}$ is the restricted OLS estimator from regression (2) with $\beta_2$ restricted to equal zero. Hence, for a two-sided test, the post-conservative model selection test statistic for testing $H_0$ is given by

$$T_n(\theta_0) = |\tilde{T}_{n,1}(\theta_0)|\,1(|T_{n,2}| \le c) + |\hat{T}_{n,1}(\theta_0)|\,1(|T_{n,2}| > c).$$

With straightforward modification, the results described below also apply to one-sided testing for this problem. See AG and Andrews and Guggenberger (2009c) for more details.[1]

[1] The testing problem described here also applies to testing a linear combination of regression coefficients after conservative model selection. This involves a reparameterization described in Andrews and Guggenberger (2009c).

Results in AG show how this testing problem satisfies Assumptions D and S-B.1. Specifically, let $G$ denote the distribution of $(\varepsilon_i, x_i^{*})$ and define the following quantities

$$x_i = \begin{pmatrix} x_{1i}^{*} - x_{3i}^{*\prime}(E_G x_{3i}^{*}x_{3i}^{*\prime})^{-1}x_{3i}^{*}x_{1i}^{*} \\ x_{2i}^{*} - x_{3i}^{*\prime}(E_G x_{3i}^{*}x_{3i}^{*\prime})^{-1}x_{3i}^{*}x_{2i}^{*} \end{pmatrix}, \qquad Q = E_G x_i x_i', \qquad \text{and} \qquad Q^{-1} = \begin{pmatrix} Q^{11} & Q^{12} \\ Q^{12} & Q^{22} \end{pmatrix}.$$

Then for this example, $\gamma_1 = \beta_2/\sigma(Q^{22})^{1/2}$, $\gamma_2 = Q^{12}/(Q^{11}Q^{22})^{1/2}$ and $\gamma_3 = (\beta_2, \beta_3, \sigma, G)$ and

$$T_n(\theta_0) \xrightarrow{d} \begin{cases} |\tilde{Z}_1|\,1(|Z_2| \le c) + |\hat{Z}_1|\,1(|Z_2| > c), & \text{if } \gamma_1 = 0, \\ |\hat{Z}_1|, & \text{if } \gamma_1 \ne 0, \end{cases}$$

where $\tilde{Z}_1$, $\hat{Z}_1$ and $Z_2$ are standard normal random variables with $\tilde{Z}_1$ independent of $Z_2$ and $\mathrm{Corr}(\hat{Z}_1, Z_2) = \gamma_2$. The parameter spaces in (1) are given by $\Gamma_1 = \mathbb{R}$, $\Gamma_2 = [-1+\omega, 1-\omega]$ for some $\omega > 0$ and

$$\Gamma_3(\gamma_1, \gamma_2) = \{(\beta_2, \beta_3, \sigma, G) : \beta_2 \in \mathbb{R},\ \beta_3 \in \mathbb{R}^{k-2},\ \sigma \in (0, \infty),\ \gamma_1 = \beta_2/\sigma(Q^{22})^{1/2},\ \gamma_2 = Q^{12}/(Q^{11}Q^{22})^{1/2},$$
$$\lambda_{\min}(Q) \ge \kappa,\ \lambda_{\min}(E_G x_{3i}^{*}x_{3i}^{*\prime}) \ge \kappa,\ E_G\|x_i^{*}\|^{2+\delta} \le M,\ E_G\|\varepsilon_i x_i^{*}\|^{2+\delta} \le M,\ E_G(\varepsilon_i | x_i^{*}) = 0 \text{ a.s.},\ E_G(\varepsilon_i^2 | x_i^{*}) = 1 \text{ a.s.}\}$$

for some $\kappa, \delta > 0$ and $M < \infty$, where $\lambda_{\min}(A)$ is the smallest eigenvalue of generic matrix $A$. From this parameter space, it is clear that Assumption D holds for this example. Moreover, Andrews and Guggenberger (2009c) show that Assumption S-B.1 holds with $r = 1/2$ and

$$J_h(x) = \Delta\!\left(h_1 h_2 (1-h_2^2)^{-1/2}, x\right)\Delta(h_1, c) + \int_{-x}^{x}\left(1 - \Delta\!\left(\frac{h_1 + h_2 t}{(1-h_2^2)^{1/2}}, \frac{c}{(1-h_2^2)^{1/2}}\right)\right)\phi(t)\,dt,$$

where $\Delta(a, b) = \Phi(a+b) - \Phi(a-b)$. As defined, $H_1 = \mathbb{R}_\infty$ and $H_2 = [-1+\omega, 1-\omega]$.

Turning to the new assumptions, the lower bound $\underline{\delta}$ of Assumption S-BM.1 can be set arbitrarily close to zero and $\bar{\delta}$ can be set to any quantity in its admissible range $[0, \alpha - \underline{\delta}]$ since $W_h$ is a continuous random variable with support over the entire real line for any $h \in H$, which can be seen by examining $J_h(\cdot)$. This fact similarly implies Assumption S-B.2(ii) holds. An assumption imposed later will necessitate a restriction on $\bar{\delta}$ that we will discuss in Section 4. Continuity of $c_h(1-\delta)$ in $h$ follows from the facts that $c_h(1-\delta) = J_h^{-1}(1-\delta)$ and $J_h(x)$ is continuous in $h \in H$.

The SLLD framework applies to many more complex examples of testing after conservative model selection. For example, results in Leeb (2006) and Leeb and Pötscher (2008) can be used to verify the assumptions of this paper for a sequential general-to-specific model selection procedure with multiple potential control variables.

2.1.2 Testing when a Nuisance Parameter may be on a Boundary

We now explore an illustrative example of a testing problem in which a nuisance parameter may be on the boundary of its parameter space under the null hypothesis and show how it also falls within the framework of Section 2.1. This problem is considered by Andrews and Guggenberger (2010b) and can be described as follows. We have a sample of size $n$ of an i.i.d. bivariate random vector $X_i = (X_{i1}, X_{i2})'$ with distribution $F$.
Under $F$, the first two moments of $X_i$ exist and are given by

$$E_F(X_i) = \begin{pmatrix} \theta \\ \mu \end{pmatrix} \qquad \text{and} \qquad \mathrm{Var}_F(X_i) = \begin{pmatrix} \sigma_1^2 & \sigma_1\sigma_2\rho \\ \sigma_1\sigma_2\rho & \sigma_2^2 \end{pmatrix}.$$

Say we are interested in the null hypothesis $H_0: \theta = \theta_0$ and we know that $\mu \ge 0$. Now suppose we use the Gaussian quasi-maximum likelihood estimator of $(\theta, \mu, \sigma_1, \sigma_2, \rho)$ under the restriction $\mu \ge 0$, denoted by $(\hat{\theta}_n, \hat{\mu}_n, \hat{\sigma}_{n1}, \hat{\sigma}_{n2}, \hat{\rho}_n)$, to construct a lower one-sided t-test of $H_0$.[2] That is, $T_n(\theta_0) = -n^{1/2}(\hat{\theta}_n - \theta_0)/\hat{\sigma}_{n1}$, where $\hat{\theta}_n = \bar{X}_{n1} - (\hat{\rho}_n\hat{\sigma}_{n1})\min(0, \bar{X}_{n2}/\hat{\sigma}_{n2})$ with $\bar{X}_{nj} = n^{-1}\sum_{i=1}^{n}X_{ij}$ for $j = 1, 2$. As in the previous example, results for upper one-sided and two-sided tests are quite similar.

[2] The results below also allow for different estimators in this construction. See Andrews and Guggenberger (2010b) for details.

Results in Andrews and Guggenberger (2010b) provide that this testing problem also satisfies Assumptions D and S-B.1. Here, $\gamma_1 = \mu/\sigma_2$, $\gamma_2 = \rho$ and $\gamma_3 = (\sigma_1, \sigma_2, F)$ and

$$T_n(\theta_0) \xrightarrow{d} \begin{cases} -Z_1 + \gamma_2\min\{0, Z_2\}, & \text{if } \gamma_1 = 0, \\ -Z_1, & \text{if } \gamma_1 \ne 0, \end{cases}$$

where $Z_1$ and $Z_2$ are standard normal random variables with $\mathrm{Corr}(Z_1, Z_2) = \gamma_2$. The corresponding parameter spaces are $\Gamma_1 = \mathbb{R}_+$, $\Gamma_2 = [-1+\omega, 1-\omega]$ for some $\omega > 0$ and $\Gamma_3(\gamma_1, \gamma_2) = \{(\sigma_1, \sigma_2, F) : \sigma_1, \sigma_2 \in (0, \infty),\ E_F\|X_i\|^{2+\delta} \le M,\ \theta = 0,\ \gamma_1 = \mu/\sigma_2,\ \gamma_2 = \rho\}$ for some $M < \infty$ and $\delta > 0$.[3] From these definitions, it is immediate that Assumption D holds. Assumption S-B.1 holds for this example with $r = 1/2$ and

$$W_h \overset{d}{\sim} -Z_{h_2,1} + h_2\min\{0, Z_{h_2,2} + h_1\}, \quad \text{where} \quad Z_{h_2} = \begin{pmatrix} Z_{h_2,1} \\ Z_{h_2,2} \end{pmatrix} \overset{d}{\sim} N\!\left(0, \begin{pmatrix} 1 & h_2 \\ h_2 & 1 \end{pmatrix}\right). \qquad (3)$$

[3] For the purposes of this paper, we make a small departure from the exact setup used by Andrews and Guggenberger (2010b) in our definition of $\Gamma_2$, which they define as $[-1, 1]$. That is, we bound the possible correlation between $X_{i1}$ and $X_{i2}$ to be less than perfect. We do this in order to employ the size-corrections described later in this paper. Note that the analogous assumption is imposed in the above post-conservative model selection example.

In order to verify the new assumptions, it is instructive to examine the distribution function $J_h(\cdot)$, which is given by

$$J_h(x) = \Phi\!\left(\frac{x - h_2 h_1}{(1-h_2^2)^{1/2}}\right)\Phi(-h_1) + \int_{-x}^{\infty}\left(1 - \Phi\!\left(\frac{-h_1 - h_2 z}{(1-h_2^2)^{1/2}}\right)\right)\phi(z)\,dz \qquad (4)$$

(see Appendix II for its derivation). By definition, $H_1 = \mathbb{R}_{+,\infty}$ and $H_2 = [-1+\omega, 1-\omega]$. Now, looking at the form of $J_h$, we can see that, as in the previous example, $W_h$ is a continuous random variable with support over the entire real line for any $h \in H$ so that $\underline{\delta}$ of Assumption S-BM.1 can again be set arbitrarily close to zero and $\bar{\delta}$ is left unrestricted over its admissible range. We discuss a restriction imposed later on $\bar{\delta}$ in Section 5. For the same reasons given in the previous example, Assumption S-B.2(ii) holds as well.

More complicated testing problems when a nuisance parameter may be on a boundary can also be shown to fit the SLLD framework and later assumptions of this paper. For example, Andrews (1999, 2001) provides results for more complicated boundary examples that fit this framework.

2.1.3 Other Examples

There are many examples in the econometrics literature of testing problems that fit the SLLD framework. Apart from those discussed above, these include, but are not limited to, tests after pretests with fixed CVs (e.g., Guggenberger, 2010), testing when the parameter of interest may lie on the boundary of its parameter space (e.g., Andrews and Guggenberger, 2010a), tests on model-averaged estimators (e.g., Hansen, 2007), tests on certain types of shrinkage estimators (e.g., Hansen, 2012), tests in autoregressive models that may contain a unit root (e.g., AG and Mikusheva, 2007), Vuong tests for nonnested model selection (e.g., Shi, 2011), subvector tests allowing for weak identification (e.g., Guggenberger et al., 2012) and tests on break dates and coefficients in structural change models (e.g., Elliott et al., 2012 and Elliott and Müller, 2012).

The SLLD assumptions technically preclude certain testing problems with parameter-discontinuous null limit distributions such as testing in moment inequality models (e.g., Andrews and Soares, 2010) and certain tests allowing for weak identification (e.g., Staiger and Stock, 1997). Nevertheless, the SLLD framework of this paper can be modified in a problem-specific manner to incorporate some of these problems and apply the testing methods introduced later to them. For example, Assumption D does not allow for testing in the moment inequality context when the condition of one moment binding depends upon whether another moment binds. Yet the results of Andrews and Soares (2010), Andrews and Barwick (2011) and Romano et al.
(2012) suggest that tailoring the assumptions to this context would permit analogous results to those presented later.

2.2 Multiple Localized Limit Distributions Framework

The MLLDs framework generalizes the SLLD framework. The motivation for this generalization comes from an important class of hypothesis testing problems with parameter-discontinuous null limit distributions that do not satisfy Assumption S-B.1 because under $H_0$ and a given $\{\gamma_{n,h}\}$, the asymptotic behavior of $T_n(\theta_0)$ is not fully characterized by $h$.
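Before moving further into the MLLDs framework, the localized limit distributions of the two SLLD examples in Section 2.1 can be made concrete numerically. The following is a minimal sketch, not the paper's implementation: it evaluates $J_h(x) = \Delta(h_1h_2(1-h_2^2)^{-1/2}, x)\Delta(h_1, c) + \int_{-x}^{x}(1 - \Delta((h_1+h_2t)/(1-h_2^2)^{1/2}, c/(1-h_2^2)^{1/2}))\phi(t)dt$ for the post-conservative model selection example and the distribution function in (4) for the boundary example, inverts them by bisection to obtain $c_h(1-\delta)$, and checks (4) against a simulation of $W_h = -Z_{h_2,1} + h_2\min\{0, Z_{h_2,2} + h_1\}$. The helper names, the Simpson rule with truncation at 9, the bisection bracket and the pretest CV of 1.96 in the usage lines are all assumptions chosen for illustration, and finite $h_1$ is assumed throughout.

```python
import math
import random
from statistics import NormalDist

N01 = NormalDist()
Phi, phi = N01.cdf, N01.pdf


def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0


def delta_fn(a, b):
    """Delta(a, b) = Phi(a + b) - Phi(a - b)."""
    return Phi(a + b) - Phi(a - b)


def J_model_selection(x, h1, h2, c):
    """Localized limit cdf for the two-sided test after conservative model selection."""
    if x <= 0:
        return 0.0
    s = math.sqrt(1.0 - h2 * h2)
    lim = min(x, 9.0)  # the normal density is negligible beyond +/- 9
    integrand = lambda t: (1.0 - delta_fn((h1 + h2 * t) / s, c / s)) * phi(t)
    return delta_fn(h1 * h2 / s, x) * delta_fn(h1, c) + simpson(integrand, -lim, lim)


def J_boundary(x, h1, h2):
    """Localized limit cdf in (4) for the lower one-sided boundary example."""
    s = math.sqrt(1.0 - h2 * h2)
    integrand = lambda z: (1.0 - Phi((-h1 - h2 * z) / s)) * phi(z)
    lower = max(-x, -9.0)
    return Phi((x - h2 * h1) / s) * Phi(-h1) + simpson(integrand, lower, 9.0)


def quantile(cdf, p, lo=-40.0, hi=40.0, iters=80):
    """Bisection for c with cdf(c) = p, assuming cdf is nondecreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_boundary_cdf(x, h1, h2, draws=100_000, seed=0):
    """Monte Carlo estimate of P(W_h <= x) for W_h = -Z1 + h2 * min(0, Z2 + h1)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - h2 * h2)
    hits = 0
    for _ in range(draws):
        z2 = rng.gauss(0.0, 1.0)
        z1 = h2 * z2 + s * rng.gauss(0.0, 1.0)  # gives Corr(z1, z2) = h2
        hits += (-z1 + h2 * min(0.0, z2 + h1)) <= x
    return hits / draws


if __name__ == "__main__":
    # c_h(0.95) after conservative model selection with pretest CV 1.96 (illustrative values)
    print(quantile(lambda x: J_model_selection(x, 2.0, 0.9, 1.96), 0.95))
    # closed form (4) versus simulation for the boundary example
    print(J_boundary(0.5, 1.0, 0.5), simulate_boundary_cdf(0.5, 1.0, 0.5))
```

When $h_2 = 0$, or when $h_1$ is far from zero, both functions reduce to standard normal-based distributions, which gives a quick sanity check on any implementation of this kind.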

In this more general framework, we retain the description of the parameter space given in Assumption D as well as the description of $H$ following it but weaken Assumption S-B.1 to the following.

Assumption M-B.1. There is a sequence $\{k_n\}$, a set $K \subset H_1$ and a single fixed $r > 0$ such that for all $h \in H$ and corresponding sequences $\{\gamma_{n,h}\}$, under $H_0$:
(i) if $\lim_{n\to\infty}\gamma_{n,h,1}/k_n \in K$, $T_n(\theta_0) \xrightarrow{d} W_h^{(1)}$;
(ii) if $\lim_{n\to\infty}\gamma_{n,h,1}/k_n \in L \subset K^c$, $T_n(\theta_0) \xrightarrow{d} W_h^{(2)}$;
(iii) if $\lim_{n\to\infty}\gamma_{n,h,1}/k_n \in L^c \cap K^c$, the asymptotic distribution of $T_n(\theta_0)$ is stochastically dominated by $W_h^{(1)}$ or $W_h^{(2)}$.

Assumption M-B.1 allows for different $h$-dependent localized limit distributions that are relevant to the different possible limiting behaviors of $\gamma_{n,h,1}/k_n$. It collapses to Assumption S-B.1 when $k_n = n^{-r}$ and $K = H_1$ or $L = H_1$. The auxiliary sequence $\{k_n\}$ may depend upon the elements of $\{\gamma_{n,h}\}$ though this potential dependence is suppressed in the notation. Denote the limit distributions corresponding to (i) and (ii) as $J_h^{(1)}$ and $J_h^{(2)}$, which are the two localized limit distributions that obtain under the corresponding sequences of $\gamma_{n,h,1}/k_n$. Similarly, $c_h^{(1)}$ and $c_h^{(2)}$ denote the corresponding quantile functions. Finally, we denote the limit random variable under (iii) as $W_h^{(3)}$. This is a slight abuse of notation because there may be MLLDs that obtain under (iii) alone. Distinction between these distributions is not necessary here because of the imposed stochastic dominance. We will also make use of the following definition: $\zeta(\{\gamma_{n,h}\}) \equiv \lim_{n\to\infty}\gamma_{n,h,1}/k_n$.

The form that $\{k_n\}$, $K$ and $L$ take are specific to the testing problem at hand. However, we make a few general remarks in order to provide some intuition. The MLLDs framework incorporates testing problems in which a decision rule that compares some statistic to a sample-size-dependent quantity, say $c_n$, decides the form $T_n(\theta_0)$ takes.
When using such a decision rule, under the drifting sequence of parameters $\{\gamma_{n,h}\}$, the null limit distribution of $T_n(\theta_0)$ not only depends on the limit of $n^r\gamma_{n,h,1}$ (and $\gamma_{n,h,2}$), but it also depends on how fast $n^r\gamma_{n,h,1}$ grows relative to $c_n$. The sequence $\{k_n\}$ is thus some (scaled) ratio of $c_n$ to $n^r$ and the sets $K$ and $L$ describe the limiting behavior of $n^r\gamma_{n,h,1}$ relative to $c_n$.

This setting can clearly be further generalized to allow for other sequences like $\{\gamma_{n,h,1}/k_n\}$ to also determine the limiting behavior of $T_n(\theta_0)$ under $H_0$. For example, one additional sequence of this sort could allow for two additional localized limit distributions that are not necessarily stochastically dominated by any of the others. In this case, both the space containing the limit of $\gamma_{n,h,1}/k_n$ and the limit of this additional sequence could determine the null limiting behavior of $T_n(\theta_0)$ under any given $\{\gamma_{n,h}\}$. We conjecture that a more general case like this obtains for hypothesis tests after more complicated consistent model selection procedures than those for which asymptotic results under drifting sequences of parameters are presently available. (The intuition for this is given below.)

In this framework, we will also accommodate certain types of discontinuities in the localized limit distribution $J_h$. One form of these discontinuities occurs when $h_1$ is on the boundary of its parameter space $H_1$, entailing infinite values. In order to accommodate this type of discontinuity, define the following subset of $H$: $\bar{H} \equiv \mathrm{int}(H_1) \times H_2$. Then the set of $h$ corresponding to $h_1$ on the boundary of $H_1$ is equal to $\bar{H}^c$.

Assumption M-B.2. Consider some fixed $\delta \in (0, 1)$. (i) As a function from $\bar{H}$ into $\mathbb{R}$, $c_h^{(i)}(1-\delta)$ is continuous for $i = 1, 2$. (ii) For $i = 1, 2$ and any $h \in H$, there is some finite $\varepsilon_i \ge 0$ for which $J_h^{(i)}(\cdot)$ is continuous at $c_h^{(i)}(1-\delta) + \varepsilon_i$.

Assumption M-B.2 is a continuity assumption that is a relaxed version of a direct adaptation of Assumption S-B.2 to the MLLDs framework. For the problems in this class, the localized quantiles can be infinite for $h \in \bar{H}^c$. This is why (i) is only required as a function from $\bar{H}$. Similarly, the localized limit distribution functions can be discontinuous at their localized quantiles but continuous in a neighborhood near them. Part (ii) provides the flexibility to allow for this feature. A sufficient condition for Assumption M-B.2 to hold for $i = 1$ or 2 is that $W_h^{(i)}$ is a continuous random variable with infinite support and a distribution function that is continuous in $h$. In this case $\varepsilon_i = 0$. As in the previous framework, we strengthen part (i) of this assumption to obtain stronger results.

Assumption M-BM.1. For some fixed $\alpha \in (0, 1)$ and pairs $(\underline{\delta}^{(i)}, \bar{\delta}^{(i)}) \in [0, \alpha - \bar{\delta}^{(i)}] \times [0, \alpha - \underline{\delta}^{(i)}]$ for $i = 1, 2$, as a function of $h$ and $\delta$, $c_h^{(i)}(1-\delta)$ is continuous over $\bar{H}$ and $[\underline{\delta}^{(i)}, \alpha - \bar{\delta}^{(i)}]$.
Analogous remarks to those on Assumption S-BM.1 can be made here, with the exception that continuity in $h$ is no longer required at $h \in \bar{H}^c$.

Testing After Consistent Model Selection

Unlike hypothesis testing after conservative model selection, testing after consistent model selection entails substantially more complicated limiting behavior of a test statistic under the null hypothesis. The essential difference between conservative and consistent model selection in the context of our examples is that in consistent model selection, the comparison/critical

value used in the model selection criterion grows with the sample size. This is the case, for example, with the popular Bayesian information criterion (BIC) and the Hannan-Quinn information criterion. The simple post-consistent model selection testing framework provided by Leeb and Pötscher (2005) provides an illustrative example of a testing problem that fits the MLLDs framework. Hence, we shall consider it here. We now consider the regression model

$$y_i = \theta x_{1i} + \beta_2 x_{2i} + \epsilon_i \qquad (5)$$

for $i = 1, \ldots, n$, where $\epsilon_i \overset{d}{\sim} \text{i.i.d. } N(0, \sigma^2)$ with $\sigma^2 > 0$, and $X \equiv (x_1, \ldots, x_n)'$ with $x_i \equiv (x_{1i}, x_{2i})'$ is nonstochastic, full-rank and satisfies $X'X/n \to Q > 0$. For simplicity, assume that $\sigma^2$ is known, though the unknown $\sigma^2$ case can be handled similarly. We are interested in testing $H_0: \theta = \theta_0$ after deciding whether or not to include $x_{2i}$ in the regression model (5) via a consistent model selection rule. As in the conservative model selection framework of Section 2.1.1, this decision is based on comparing the t-statistic for $\beta_2$ with some CV, except now this CV $c_n$ grows in the sample size such that $c_n \to \infty$ but $c_n/\sqrt{n} \to 0$. Formally, let

$$\sigma^2 (X'X/n)^{-1} = \begin{pmatrix} \sigma^2_{\theta,n} & \sigma_{\theta,\beta_2,n} \\ \sigma_{\theta,\beta_2,n} & \sigma^2_{\beta_2,n} \end{pmatrix}$$

and $\rho_n = \sigma_{\theta,\beta_2,n}/(\sigma_{\theta,n}\sigma_{\beta_2,n})$. Then the model selection procedure chooses to include $x_{2i}$ in the regression if $|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2,n}| > c_n$, where $\hat{\beta}_2$ is the (unrestricted) OLS estimator of $\beta_2$ in the regression (5), and chooses to restrict $\beta_2 = 0$ otherwise. For this example, we will test $H_0$ by examining the non-studentized, $\sqrt{n}$-scaled deviation of the post-model-selection estimator of $\theta$ from $\theta_0$, where this estimator is equal to the unrestricted OLS estimator of $\theta$ in regression (5) when $|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2,n}| > c_n$ and to the restricted OLS estimator of $\theta$ in (5), with $\beta_2$ restricted to equal zero, when $|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2,n}| \leq c_n$.⁴ That is, for an upper one-sided test, the post-consistent model selection test statistic for testing $H_0$ is given by

$$T_n(\theta_0) = \sqrt{n}(\tilde{\theta} - \theta_0)\,1(|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2,n}| \leq c_n) + \sqrt{n}(\hat{\theta} - \theta_0)\,1(|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2,n}| > c_n),$$

where $\tilde{\theta}$ and $\hat{\theta}$ are the restricted and unrestricted estimators.
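The selection rule and test statistic just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's code: the function name `post_selection_stat` and its argument names are hypothetical, the data are passed as plain lists, and $\sigma^2$ is treated as known, as in the text.

```python
import math

def post_selection_stat(y, x1, x2, theta0, sigma2, c_n):
    """Return (T_n, kept_x2) for the upper one-sided post-selection test.

    Hypothetical helper: y, x1, x2 are equal-length lists of floats,
    sigma2 is the known error variance and c_n the selection threshold.
    """
    n = len(y)
    # Cross-products that make up X'X and X'y for the two-regressor model.
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12  # nonzero when X has full rank

    # Unrestricted OLS estimators of theta and beta_2.
    theta_hat = (s22 * s1y - s12 * s2y) / det
    beta2_hat = (s11 * s2y - s12 * s1y) / det
    # Restricted OLS estimator of theta (beta_2 set to zero).
    theta_tilde = s1y / s11

    # Square root of the lower-right element of sigma^2 (X'X/n)^{-1}.
    sigma_beta2_n = math.sqrt(sigma2 * n * s11 / det)
    t_beta2 = math.sqrt(n) * beta2_hat / sigma_beta2_n

    # Keep x2 only if its absolute t-statistic exceeds the threshold c_n.
    kept_x2 = abs(t_beta2) > c_n
    theta_used = theta_hat if kept_x2 else theta_tilde
    return math.sqrt(n) * (theta_used - theta0), kept_x2
```

One threshold satisfying $c_n \to \infty$ and $c_n/\sqrt{n} \to 0$ is `c_n = math.sqrt(math.log(n))`; this BIC-type choice is an assumption made for the example rather than something the text prescribes.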
Examining $-T_n(\theta_0)$ and $|T_n(\theta_0)|$ and their corresponding localized null limit distributions would allow us to perform the same analysis for lower one-sided and two-sided tests of $H_0$.

⁴ Following Leeb and Pötscher (2005), we examine the non-studentized quantity rather than the t-statistic because use of the latter will not satisfy Assumption D and is therefore not amenable to the procedures put forth in this paper. Note that although the studentized quantity does not display a parameter-discontinuous null limit distribution, it suffers the same size-distortion problem when standard CVs are used.

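Let the limits of all finite sample quantities be denoted by a $\infty$ subscript, e.g., $\sigma^2_{\theta,\infty}$. Then for this example, $\gamma_1 = \beta_2 \rho_\infty/\sigma_{\beta_2,\infty}$, $\gamma_2 = (\gamma_{2,1}, \gamma_{2,2}) = (\sigma_{\theta,\infty}, \rho_\infty)$ and $\gamma_3 = (\beta_2, \sigma^2, \sigma^2_{\theta,n}, \sigma^2_{\beta_2,n}, \rho_n)$, and

$$T_n(\theta_0) \xrightarrow{d} \begin{cases} N(0, (1-\gamma_{2,2}^2)\gamma_{2,1}^2), & \text{if } \gamma_1 = 0, \\ N(0, \gamma_{2,1}^2), & \text{if } \gamma_1 \neq 0. \end{cases}$$

The parameter spaces in (1) are given by $\Gamma_1 = \mathbb{R}$, $\Gamma_2 = [\eta, M] \times [-1+\omega, 1-\omega]$ for some $\eta \in (0, M]$, $\omega \in (0, 1]$ and $M \in (0, \infty)$, and

$$\Gamma_3(\gamma_1, \gamma_2) = \Big\{ (\beta_2, \sigma^2, \sigma^2_{\theta,n}, \sigma^2_{\beta_2,n}, \rho_n) : \beta_2 \in \mathbb{R},\ \sigma^2 \in (0, \infty),\ \gamma_1 = \beta_2 \rho_\infty/\sigma_{\beta_2,\infty},\ \gamma_2 = (\sigma_{\theta,\infty}, \rho_\infty),\ \lim_{n\to\infty} \sigma^2_{\theta,n} = \gamma_{2,1}^2,\ \lim_{n\to\infty} \sigma^2_{\beta_2,n} = (\beta_2 \gamma_{2,2}/\gamma_1)^2,\ \lim_{n\to\infty} \rho_n = \gamma_{2,2} \Big\}.$$

Clearly, $\Gamma$ satisfies Assumption D. Using the results of Proposition A.2 in Leeb and Pötscher (2005), we can establish that Assumption M-B.1 is satisfied with $k_n = c_n \rho_n/\sqrt{n}$, $r = 1/2$, $K = (-1, 1)$, $J_h^{(1)}(x) = \Phi((1-h_{2,2}^2)^{-1/2}(x/h_{2,1} + h_1))$, $L = [-\infty, -1) \cup (1, \infty]$ and $J_h^{(2)}(x) = \Phi(x/h_{2,1})$. See Appendix II for details.

The intuition for why different null limit distributions for $T_n(\theta_0)$ obtain under $\{\gamma_{n,h}\}$, depending on how $\gamma_{n,h,1} = \beta_{2,n}\rho_n/\sigma_{\beta_2,n}$ behaves relative to $k_n = c_n\rho_n/\sqrt{n}$ as the sample size grows, lies in the fact that when $|\gamma_{n,h,1}/k_n| = |\sqrt{n}\beta_{2,n}/(\sigma_{\beta_2,n} c_n)| < 1$, $T_n(\theta_0)$ equals $\sqrt{n}(\tilde{\theta} - \theta_0)$, the statistic based on the restricted estimator, asymptotically. Conversely, when $|\gamma_{n,h,1}/k_n| = |\sqrt{n}\beta_{2,n}/(\sigma_{\beta_2,n} c_n)| > 1$, $T_n(\theta_0)$ equals $\sqrt{n}(\hat{\theta} - \theta_0)$, the statistic based on the unrestricted estimator, asymptotically. Under $H_0$ and the drifting sequence $\{\gamma_{n,h}\}$, the statistics $\sqrt{n}(\tilde{\theta} - \theta_0)$ and $\sqrt{n}(\hat{\theta} - \theta_0)$ have different limit distributions, corresponding to $J_h^{(1)}$ and $J_h^{(2)}$, respectively. In the knife-edge case for which $\lim_{n\to\infty} |\gamma_{n,h,1}/k_n| = 1$, the limit of $T_n(\theta_0)$ then also depends on the limiting behavior of $c_n \pm \sqrt{n}\beta_{2,n}/\sigma_{\beta_2,n}$ (see Leeb and Pötscher, 2005). However, no matter the limit of this latter quantity, the limit of $T_n(\theta_0)$ in this case is always stochastically dominated by the limit that pertains under either $\lim_{n\to\infty} |\gamma_{n,h,1}/k_n| < 1$ or $\lim_{n\to\infty} |\gamma_{n,h,1}/k_n| > 1$ (see Appendix II).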
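Because both localized limit distributions in this example are normal, their quantiles are available in closed form by inverting $J_h^{(1)}(x) = \Phi((1-h_{2,2}^2)^{-1/2}(x/h_{2,1} + h_1))$ and $J_h^{(2)}(x) = \Phi(x/h_{2,1})$. The following Python sketch does this with the standard library; the function names `J1`, `J2`, `c1` and `c2` are hypothetical labels chosen for the illustration, and the code assumes finite $h_1$, $h_{2,1} > 0$ and $|h_{2,2}| < 1$.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: supplies Phi and its inverse

def J1(x, h1, h21, h22):
    """Localized limit cdf when the restricted model is selected."""
    return STD_NORMAL.cdf((x / h21 + h1) / sqrt(1.0 - h22 ** 2))

def J2(x, h21):
    """Localized limit cdf when the unrestricted model is selected."""
    return STD_NORMAL.cdf(x / h21)

def c1(p, h1, h21, h22):
    """p-th quantile of J1, i.e. the solution c of J1(c) = p."""
    return h21 * (sqrt(1.0 - h22 ** 2) * STD_NORMAL.inv_cdf(p) - h1)

def c2(p, h21):
    """p-th quantile of J2, i.e. the solution c of J2(c) = p."""
    return h21 * STD_NORMAL.inv_cdf(p)
```

Written this way, `c1` makes explicit that the first quantile shifts linearly in $h_1$ while `c2` does not depend on $h_1$ at all, which is why the two regimes call for different critical values.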
Hence, under a given drifting sequence $\{\gamma_{n,h}\}$, the limiting behavior of $T_n(\theta_0)$ is not fully characterized by $h$, in violation of Assumption S-B.1. By definition, $H_1 = \mathbb{R}_\infty$ and $H_2 = [\eta, M] \times [-1+\omega, 1-\omega]$. Turning now to Assumption M-BM.1, $\underline{\delta}^{(1)}$ can be set arbitrarily close to, but strictly greater than, zero. Since $c_h^{(1)}(1-\delta)$ is the $(1-\delta)$th quantile of a normal distribution with mean $-h_1 h_{2,1}$ and variance $h_{2,1}^2(1-h_{2,2}^2)$, and $\bar{H} = \mathbb{R} \times [\eta, M] \times [-1+\omega, 1-\omega]$, $c_h^{(1)}(1-\delta)$ is continuous in $h$ over $\bar{H}$ for any $\delta \in (0, 1)$. Continuity in $\delta$ over $[\underline{\delta}^{(1)}, \alpha - \bar{\delta}^{(1)}]$ also follows for any $\bar{\delta}^{(1)} \in [0, \alpha - \underline{\delta}^{(1)}]$. Similar reasoning shows that $\underline{\delta}^{(2)}$ can be set arbitrarily close to zero and $\bar{\delta}^{(2)}$ can be anywhere in its admissible range for Assumption M-BM.1 to hold (we discuss restrictions imposed on $\bar{\delta}^{(i)}$ for $i = 1, 2$

by later assumptions in Section 6). For $h \in \bar{H}$ and $i = 1, 2$, $J_h^{(i)}$ is the distribution function of a continuous random variable so that Assumption M-B.2(ii) holds with $\varepsilon_i = 0$.

As with the other examples studied in this paper, this simple illustrative example is not the most general of its kind to fall into this framework. Many of the assumptions in the above example can be relaxed. However, uniform asymptotic distributional results for more complex consistent model selection procedures are not readily available in the literature. This may be in part due to the very negative results put forth regarding attempts to conduct inference after even the simplest of such procedures (e.g., Leeb and Pötscher, 2005 and Andrews and Guggenberger, 2009b). As alluded to above, more complicated procedures, such as a consistent version of the sequential general-to-specific model selection approach of Leeb (2006) and Leeb and Pötscher (2008) or standard BIC approaches to more complicated models, likely require a straightforward extension of the MLLDs framework and the corresponding CVs introduced below. The intuition for this essentially follows from the same intuition as that used for the simple post-consistent model selection example provided above. As a simplification, suppose now that another regressor enters the potential model (5) with associated coefficient $\beta_3$. Using the obvious notation, we may also wish to determine whether $\beta_3$ should enter the model prior to testing $H_0$ by comparing $|\sqrt{n}\hat{\beta}_3/\sigma_{\beta_3}|$ to $c_n$. In this case $T_n(\theta_0)$ would take one of four, rather than two, possible values depending on the values of both $|\sqrt{n}\hat{\beta}_3/\sigma_{\beta_3}|$ and $|\sqrt{n}\hat{\beta}_2/\sigma_{\beta_2}|$ relative to $c_n$ (ignoring knife-edge cases).

Other Examples

The class of super-efficient/hard-thresholding estimators studied by Andrews and Guggenberger (2009b) and Pötscher and Leeb (2009), including Hodges' estimator, also fits the MLLDs framework.
A certain subclass of these estimators in fact requires $\varepsilon_i > 0$ in Assumption M-B.2(ii) for $i = 1$ or $2$, unlike the problem considered immediately above. Related problems of testing after pretests using pretest CVs that grow in the sample size fit the MLLDs framework as well. Though some very recent work has explored the properties of uniformly valid confidence intervals for some of the problems that fall into this framework (e.g., Pötscher and Schneider, 2010), to the author's knowledge, this is the first time such a framework and corresponding uniformly valid inference procedure has been presented at this level of generality.
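For readers unfamiliar with hard-thresholding estimators, the textbook Hodges estimator of a mean gives a concrete picture: the sample mean is kept when it is large in absolute value and replaced by zero otherwise. The Python sketch below is illustrative only. The function name `hodges_estimator` is hypothetical, and the threshold $n^{-1/4}$ is the usual textbook choice rather than one taken from the papers cited above; it corresponds to a pretest CV $c_n = n^{1/4}$ applied to $\sqrt{n}$ times the sample mean, so that $c_n \to \infty$ and $c_n/\sqrt{n} \to 0$.

```python
def hodges_estimator(sample):
    """Hodges-type hard-thresholding estimator of a mean (illustrative)."""
    n = len(sample)
    mean = sum(sample) / n
    # Threshold n^(-1/4), i.e. c_n / sqrt(n) with c_n = n^(1/4).
    threshold = n ** (-0.25)
    # Keep the sample mean only if it clears the threshold; else report 0.
    return mean if abs(mean) > threshold else 0.0
```

The growing pretest CV is what places this estimator in the same family as the consistent model selection example: which branch is taken depends on how fast the true mean drifts to zero relative to the threshold.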

3 Bonferroni-Based Critical Values

For test statistics with parameter-discontinuous null limit distributions, the asymptotic NRP of the test, evaluated at a given parameter value permissible under $H_0$, can provide a very poor approximation to the true NRP and size of the test, even for large samples. In order to be more precise about this terminology, we introduce the following definitions for a test of $H_0: \theta = \theta_0$, working under the framework described in Section 2. Let $\kappa_n$ be the (possibly random or sample-size-dependent) CV being used. The NRP evaluated at $\gamma \in \Gamma$ is given by $P_{\theta_0,\gamma}(T_n(\theta_0) > \kappa_n)$, where $P_{\theta_0,\gamma}(E)$ denotes the probability of event $E$ given that $(\theta_0, \gamma)$ are the true parameters describing the data-generating process (DGP). The asymptotic NRP of a test statistic $T_n(\theta_0)$ and CV $\kappa_n$ evaluated at $\gamma \in \Gamma$ is given by $\limsup_{n\to\infty} P_{\theta_0,\gamma}(T_n(\theta_0) > \kappa_n)$. The exact and asymptotic sizes are defined as

$$\mathrm{ExSZ}_n(\theta_0, \kappa_n) \equiv \sup_{\gamma \in \Gamma} P_{\theta_0,\gamma}(T_n(\theta_0) > \kappa_n),$$

$$\mathrm{AsySz}(\theta_0, \kappa_n) \equiv \limsup_{n\to\infty} \mathrm{ExSZ}_n(\theta_0, \kappa_n).$$

Note that the exact and asymptotic sizes of a test have the concept of uniformity built into their definitions in that $\mathrm{ExSZ}_n(\theta_0, \kappa_n)$ is the largest NRP uniformly over the parameter space $\Gamma$ and $\mathrm{AsySz}(\theta_0, \kappa_n)$ is its limit. In order to have a test with approximately controlled exact size, and therefore controlled NRP at any $\gamma \in \Gamma$, we must control $\mathrm{AsySz}(\theta_0, \kappa_n)$. Under the frameworks of this paper, the primary theoretical step in controlling the asymptotic size of a test is to control the asymptotic NRP under all drifting sequences of parameters $\{\gamma_{n,h}\}$. That is, if we can find a (sequence of) CV(s) $\{\kappa_n\}$ such that $\limsup_{n\to\infty} P_{\theta_0,\gamma_{n,h}}(T_n(\theta_0) > \kappa_n) \leq \alpha$ for all $\{\gamma_{n,h}\}$ described in Section 2, we can construct a hypothesis test whose asymptotic size is bounded by $\alpha$ (see Andrews and Guggenberger, 2010b, Andrews et al., 2011 or the subsequencing arguments used in Appendix I for details).
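To see how far the worst-case limiting NRP can sit from the nominal level, the post-consistent model selection example is convenient because its localized limit distribution under the restricted regime, $J_h^{(1)}(x) = \Phi((1-h_{2,2}^2)^{-1/2}(x/h_{2,1} + h_1))$, is available in closed form. The Python sketch below evaluates $1 - J_h^{(1)}(\kappa)$ over a finite grid of $h_1$ values. It is an illustration under assumptions that are not part of the text: the "naive" CV is taken to be the $(1-\alpha)$ quantile of $N(0, h_{2,1}^2)$, the grid and the values of $h_{2,1}$ and $h_{2,2}$ are arbitrary, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def limit_nrp_restricted(kappa, h1, h21, h22):
    """1 - J_h^(1)(kappa): limiting NRP when the restricted model is chosen."""
    return 1.0 - PHI.cdf((kappa / h21 + h1) / sqrt(1.0 - h22 ** 2))

def worst_case_nrp(kappa, h1_grid, h21, h22):
    """Largest limiting NRP over a finite grid of localization parameters."""
    return max(limit_nrp_restricted(kappa, h1, h21, h22) for h1 in h1_grid)

alpha, h21, h22 = 0.05, 1.0, 0.7
# Assumed naive CV: the (1 - alpha) quantile of N(0, h21^2).
naive_cv = h21 * PHI.inv_cdf(1.0 - alpha)
# Arbitrary illustrative grid: h1 from -4 to 4 in steps of 0.5.
grid = [-4.0 + 0.5 * j for j in range(17)]
worst = worst_case_nrp(naive_cv, grid, h21, h22)
```

On this grid the maximum is attained at the most negative $h_1$ and lies far above $\alpha$, which is the kind of gap between pointwise and uniform behavior that the definitions of exact and asymptotic size are designed to capture.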
Since $c_h(1-\alpha)$ (or $c_h^{(i)}(1-\alpha)$, $i = 1, 2$) is the $(1-\alpha)$th quantile of the limit distribution of $T_n(\theta_0)$ under $H_0$ and the drifting sequence of parameters $\{\gamma_{n,h}\}$, we would ideally like to use a CV that is equal to $c_h(1-\alpha)$ whenever $\{\gamma_{n,h}\}$ characterizes the true DGP in order to maximize the power of the resulting test while controlling its asymptotic size. Unfortunately, $h$ cannot be consistently estimated under all drifting sequence DGPs. This has led to the construction of the so-called LF CV $\sup_{h \in H} c_h(1-\alpha)$ and variants thereof (e.g., AG). Guarding against the worst-case drifting sequence DGP, this CV is often quite large, substantially reducing the power of the resulting test. Though $h$ cannot be consistently estimated under $\{\gamma_{n,h}\}$, in typical applications one can

find an estimator of $h$ that converges in distribution to a random variable centered around the true value under $H_0$ and this drifting sequence DGP. This allows one to form asymptotically valid confidence sets for $h$ and subsequently restrict attention to data-dependent regions inside of $H$ relevant to the testing problem at hand, rather than guarding against the worst-case scenario, leading to smaller CVs and resulting tests with higher power. However, the additional uncertainty associated with the estimation of $h$ must be taken into account for one to control the asymptotic NRP under all drifting sequences. This is where Bonferroni bounds become useful. We now introduce two sets of three types of robust CVs based upon Bonferroni approaches. Each set corresponds to CVs to be used within either the SLLD or MLLDs framework. Within each set, the CVs are presented in increasing order of computational complexity. As the types of CV grow from least to most computationally complex, appropriately constructed tests using them tend to gain in power.

3.1 S-Bonf Robust Critical Values

We begin by examining Bonferroni-based size-corrected CVs for problems that are characterized by a SLLD, as described in Section 2.1. The first, most conservative but most computationally simple Bonferroni-based CV is defined as follows:

$$c_B^S(\alpha, \delta, \hat{h}_n) \equiv \sup_{h \in I_{\alpha-\delta}(\hat{h}_n)} c_h(1-\delta),$$

where $\delta \in [0, \alpha]$, $\hat{h}_n$ is some random vector taking values in an auxiliary space $\mathcal{H}$ and $I_{\alpha-\delta}(\cdot)$ is a correspondence from $\mathcal{H}$ into $H$. In applications, the space $\mathcal{H}$ will typically be equal to $H$ or a space containing $H$, but this is not necessary for the ensuing assumptions to hold. The random vector $\hat{h}_n$ is an estimator of $h$ under $H_0$ and the DGP characterized by $\{\gamma_{n,h}\}$, and $I_{\alpha-\delta}(\hat{h}_n)$ serves as an $(\alpha-\delta)$-level confidence set for $h$. Construction of an estimator of $h$ is typically apparent from the context of the testing problem, given that $h_1 = n^r \gamma_{n,h,1} + o(1)$ and $h_2 = \gamma_{n,h,2} + o(1)$. The S-Bonf CV generalizes the LF CV: when $I_0(x) = H$ for all $x \in \mathcal{H}$, $c_B^S(\alpha, \alpha, \hat{h}_n) = c_{LF}(\alpha) \equiv \sup_{h \in H} c_h(1-\alpha)$.
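The S-Bonf CV, $\sup_{h \in I_{\alpha-\delta}(\hat{h}_n)} c_h(1-\delta)$, can be approximated numerically by maximizing the localized quantile over a grid on the confidence set. The Python sketch below does this for a scalar $h$. Everything specific in it is an assumption made for illustration: the names `s_bonf_cv`, `normal_interval` and `toy_quantile` are hypothetical, the confidence set is a two-sided interval that would be appropriate if $\hat{h}_n$ were approximately $N(h, 1)$, and the quantile function is that of a toy $N(-h, 1)$ limit rather than any of the paper's examples. The sketch requires $0 < \delta < \alpha$ so that all normal quantiles are finite.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def s_bonf_cv(alpha, delta, h_hat, quantile, conf_set, grid_size=2001):
    """Grid approximation to sup of c_h(1-delta) over h in I_{alpha-delta}(h_hat)."""
    lo, hi = conf_set(h_hat, alpha - delta)
    step = (hi - lo) / (grid_size - 1)
    return max(quantile(lo + j * step, 1.0 - delta) for j in range(grid_size))

def normal_interval(h_hat, beta):
    """Hypothetical I_beta: interval with coverage 1 - beta if h_hat ~ N(h, 1)."""
    z = PHI.inv_cdf(1.0 - beta / 2.0)
    return h_hat - z, h_hat + z

def toy_quantile(h, p):
    """Hypothetical c_h(p): the p-th quantile of a N(-h, 1) limit distribution."""
    return -h + PHI.inv_cdf(p)

# Illustrative inputs: alpha = 5%, with 4% spent on the quantile (delta)
# and the remaining 1% on the confidence set for h.
cv = s_bonf_cv(alpha=0.05, delta=0.04, h_hat=1.2,
               quantile=toy_quantile, conf_set=normal_interval)
```

Varying `delta` between 0 and `alpha` trades a wider confidence set against a less extreme quantile, which is how the tuning parameter shifts the resulting CV.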
The tuning parameter $\delta$ can be used to direct the power of the test towards different regions of the parameter space $H$. Procedures using Bonferroni bounds in inference problems involving nuisance parameters and/or composite hypotheses have appeared in various contexts throughout the econometrics and statistics literature. Examples include Loh (1985), Stock (1991), Berger and Boos (1994), Silvapulle (1996), Staiger and Stock (1997), Romano and Wolf (2000), Hansen (2005b),

Moon and Schorfheide (2009), Chaudhuri and Zivot (2011) and Romano et al. (2012).⁵ Romano et al. (2012), which applies a specific form of the S-Bonf CV to hypothesis testing in partially identified moment inequality models, was developed concurrently with this work. The methods of Loh (1985) and Hansen (2005b) are analogous to letting $\delta \to \alpha$ as $n \to \infty$ in the confidence set $I_{\alpha-\delta}(\hat{h}_n)$, but simply using the maximand $c_h(1-\alpha)$ in the construction of the CV. Though this approach is asymptotically size-correct since it leads to the use of $c_{LF}(\alpha)$ in the limit, it can have poor finite sample size control since it fails to fully account for the additional uncertainty involved in estimating $h$. We now impose further assumptions to ensure that tests utilizing $c_B^S$ in the current context exhibit asymptotic size control.

Assumption S-B.3. Consider some fixed $\beta \in [0, 1]$. Under $H_0$ and when the drifting sequence of parameters $\{\gamma_{n,h}\}$ characterizes the true DGP for any fixed $h \in H$, there exists an estimator $\hat{h}_n$ taking values in some space $\mathcal{H}$ and a (nonrandom) continuous, compact-valued correspondence $I_\beta: \mathcal{H} \rightrightarrows H$ such that $\hat{h}_n \xrightarrow{d} \tilde{h}$, a random vector taking values in $\mathcal{H}$ for which $P(h \in I_\beta(\tilde{h})) \geq 1 - \beta$.

Assumption S-B.3 assures that $\hat{h}_n$ is a well-behaved estimator of $h$ under $H_0$ and $\{\gamma_{n,h}\}$ and imposes basic continuity assumptions on the correspondence $I_\beta(\cdot)$ used to construct the confidence set for $h$. It allows $I_\beta(\cdot)$ to take a variety of forms depending upon the context of the testing problem. For a given $\beta = \alpha - \delta$, this flexibility can be used to direct the power of the test towards different regions of $H$ or to increase the computational tractability of constructing the S-Bonf-Min CVs (see, e.g., Romano et al., 2012). In a typical testing problem, under $H_0$ and $\{\gamma_{n,h}\}$, $\hat{h}_{n,1}$ converges weakly to a normally distributed random variable with mean $h_1$, the true localization parameter, making the construction of $I_\beta$ very straightforward. Similarly, depending upon the testing problem, different choices of $\hat{h}_n$ may lead to tests with different power properties.
For example, it may be advantageous to use $\hat{h}_n$ that imposes $H_0$ if this leads to smaller CVs.

Assumption S-B.4. Under $H_0$ and when the drifting sequence of parameters $\{\gamma_{n,h}\}$ characterizes the true DGP, $(T_n(\theta_0), \hat{h}_n) \xrightarrow{d} (W_h, \tilde{h})$ for all $h \in H$.

⁵ I thank Hannes Leeb and Benedikt Pötscher for alerting me to some of the early references in the statistics literature through the note Leeb and Pötscher (2012).

Assumptions S-B.1 and S-B.3 already provide that $T_n(\theta_0) \xrightarrow{d} W_h$ and $\hat{h}_n \xrightarrow{d} \tilde{h}$ under $H_0$ and $\{\gamma_{n,h}\}$ so that Assumption S-B.4 only ensures that this weak convergence occurs jointly. Assumptions S-B.3 and S-B.4 also allow for much flexibility in the estimation of $h$. Since