Optimal bandwidth selection for robust generalized method of moments estimation


Daniel Wilhelm
The Institute for Fiscal Studies, Department of Economics, UCL
cemmap working paper CWP15/14

Daniel Wilhelm
UCL and CeMMAP
March 22, 2014

Abstract

A two-step generalized method of moments estimation procedure can be made robust to heteroskedasticity and autocorrelation in the data by using a nonparametric estimator of the optimal weighting matrix. This paper addresses the issue of choosing the corresponding smoothing parameter (or bandwidth) so that the resulting point estimate is optimal in a certain sense. We derive an asymptotically optimal bandwidth that minimizes a higher-order approximation to the asymptotic mean-squared error of the estimator of interest. We show that the optimal bandwidth is of the same order as the one minimizing the mean-squared error of the nonparametric plugin estimator, but the constants of proportionality are significantly different. Finally, we develop a data-driven bandwidth selection rule and show, in a simulation experiment, that it may substantially reduce the estimator's mean-squared error relative to existing bandwidth choices, especially when the number of moment conditions is large.

JEL classification: C12; C13; C14; C22; C51

Keywords: GMM; higher-order expansion; optimal bandwidth; mean-squared error; long-run variance.

Department of Economics, University College London, 30 Gordon St, London WC1H 0AX, United Kingdom; e-mail address: d.wilhelm@ucl.ac.uk. I thank Christian Hansen, Alan Bester, the co-editor and two referees for helpful comments. The author gratefully acknowledges financial support from the ESRC Centre for Microdata Methods and Practice at IFS (RES

1 Introduction

Since the seminal paper by Hansen (1982), the generalized method of moments (GMM) has become a popular method for the estimation of partially specified models based on moment conditions. In time series applications, two-step GMM estimators can be made robust to heteroskedasticity and autocorrelation (HAC) by using a nonparametric plugin estimator of the optimal weighting matrix. The goal of this paper is to develop a selection rule for the corresponding smoothing parameter of the nonparametric estimator such that the resulting point estimator minimizes a suitably defined mean-squared error (MSE) criterion.

Many instances of poor finite-sample performance of GMM estimators have been reported in the literature; see, for example, Hansen, Heaton, and Yaron (1996) and references therein. In an attempt to improve these properties, different extensions and new estimators have been proposed, e.g. the empirical likelihood estimator introduced by Owen (1988) and Qin and Lawless (1994), the exponential tilting estimator of Kitamura and Stutzer (1997) and Imbens, Spady, and Johnson (1998), and the continuous updating estimator by Hansen, Heaton, and Yaron (1996). Newey and Smith (2004) show that all these estimators are members of a larger class of generalized empirical likelihood (GEL) estimators.

A different approach, termed fixed-b asymptotics, is based on deriving more accurate approximations of estimators and test statistics from an asymptotic sequence in which the HAC smoothing parameter tends to infinity at the same rate as the sample size. See, for example, Kiefer and Vogelsang (2002a,b, 2005). Instead of treating the smoothing parameter as proportional to the sample size, Sun and Phillips (2008) and Sun, Phillips, and Jin (2008) develop a higher-order asymptotic theory based on which they find the optimal rate at which the smoothing parameter (here a bandwidth) minimizes the coverage probability error or length of confidence intervals.

Similar in spirit, the present paper derives the optimal growth rate of the bandwidth to minimize an asymptotic mean-squared error (AMSE) criterion. We approximate the MSE of the second-step GMM estimator by the MSE of the first few terms in a stochastic expansion. Since the proposed semiparametric estimator is first-order equivalent to ordinary GMM estimators in the iid case, the optimal bandwidth derived in this paper will minimize its second-order effects on the estimator and lead to second-order efficiency gains. In an unpublished dissertation, Jun (2007) independently develops a similar expansion and arrives at the same MSE-optimal bandwidth as derived in this paper, however under a slightly different set of assumptions (see Note 1).

Other bandwidth choices for HAC-robust estimation have been suggested by Andrews

(1991), Newey and West (1994) and Andrews and Monahan (1992), for example, and are very popular in applied research. In this paper, we show that these are suboptimal choices if MSE-optimal point estimation is of main interest. In finite samples, the existing methods can select bandwidths that are significantly different from the MSE-optimal bandwidth even though they share the same asymptotic order relative to the sample size. The difference is due to the other methods minimizing the AMSE of the weighting matrix estimator instead of the AMSE of the GMM estimator itself; they guarantee accurate estimates of the optimal weighting matrix, but not necessarily of the parameter of interest.

In the linear regression framework with potential autocorrelation and heteroskedasticity, there are several papers (e.g. Robinson (1991), Xiao and Phillips (1998) and Tamaki (2007)) that derive higher-order expansions of the MSE of semiparametric frequency domain estimators to determine an optimal bandwidth that minimizes higher-order terms of such expansions. In the present paper, however, we allow for nonlinear models and over-identification, which significantly complicate the problem and require a different set of tools to derive such expansions.

To approximate higher-order moments of the GMM estimator we develop a stochastic expansion of the estimator similar to the approach in Nagar (1959). See Rothenberg (1984) for an introduction to Nagar-type expansions and for further references. Several other authors have analyzed higher-order properties of GMM and GEL estimators using similar tools. Rilstone, Srivastava, and Ullah (1996) and Newey and Smith (2004) provide expressions for the higher-order bias and variance of GMM and GEL estimators when the data are iid. Anatolyev (2005) derives the higher-order bias in the presence of serial correlation.

Finally, Goldstein and Messer (1992) present general conditions under which functionals of nonparametric plug-in estimators achieve the optimal rate of convergence. Depending on the functional, under-smoothing the plugin estimator relative to the smoothing parameter used to optimally estimate the nonparametric quantity itself may be necessary.

The paper is organized as follows. The next section introduces the econometric setup and derives a higher-order expansion of the two-step GMM estimator as well as the optimal bandwidth that minimizes an approximate MSE based on that expansion. The third section describes an approach to estimate the infeasible optimal bandwidths, followed by a simulation experiment that demonstrates the procedure's performance in finite samples. The paper concludes with an appendix containing all mathematical proofs.

Let $\mathrm{vec}(\cdot)$ denote the column-by-column stacking operation and $\mathrm{vech}(\cdot)$ the column-by-column stacking operation of entries on and above the diagonal of a symmetric matrix.

By $K_{m,n}$ denote the $mn \times mn$ commutation matrix so that, for any $m \times n$ matrix $M$, $K_{m,n}\,\mathrm{vec}(M) = \mathrm{vec}(M')$. Let $\otimes$ be the Kronecker product and let $\nabla^r F(\beta)$, with $r \in \mathbb{Z}$ and $F(\beta)$ a matrix that is $r$ times differentiable in $\beta$, denote the matrix of $r$-th order partial derivatives with respect to $\beta$, recursively defined as in Rilstone, Srivastava, and Ullah (1996), with $\nabla^0 F(\beta) := F(\beta)$. This notation for derivatives will sometimes be used to save space. $\|\cdot\|$ denotes the Euclidean (matrix) norm and $M'$ the transpose of a matrix $M$; "with probability approaching one" is abbreviated w.p.a. 1 and "with probability one" w.p. 1. The notation $x_T = O_p(1)$ means that the sequence $\{x_T\}_{T=1}^{\infty}$ is uniformly tight.

2 Optimal Bandwidth

In this section, we introduce the basic framework, define an appropriate MSE criterion and find the optimal bandwidth that minimizes it. The idea is to derive a higher-order approximation of the second-step estimator and to compute the MSE from the moments of this approximation. A higher-order analysis is required in this setup because first-order asymptotics do not depend on the smoothing parameter.

Consider estimation of a parameter $\beta_0 \in B$ from the moment equation $E g(X_t, \beta_0) = 0$ given a data sample $\{x_t\}_{t=1}^{T}$. If the dimension of the range of $g$ is at least as large as the dimension of the parameter $\beta_0$, then a popular estimator of $\beta_0$ is the two-step GMM estimator defined as follows. First, estimate $\beta_0$ by some $\sqrt{T}$-consistent estimator, say $\tilde\beta$, which is then used to construct a consistent estimator $\hat\Omega_T(\tilde\beta)$ of the long-run variance

$$\Omega_0 := \sum_{s=-\infty}^{\infty} \Gamma(s), \qquad \Gamma(s) := E[g(X_{t+s}, \beta_0)\, g(X_t, \beta_0)'].$$

In a second step, compute the GMM estimator $\hat\beta$ of $\beta_0$ with weighting matrix $\hat\Omega_T(\tilde\beta)^{-1}$, viz.

$$\hat\beta := \arg\min_{\beta \in B}\ \hat g_T(\beta)'\, \hat\Omega_T(\tilde\beta)^{-1}\, \hat g_T(\beta), \tag{2.1}$$

where $\hat g_T(\beta) := T^{-1} \sum_{t=1}^{T} g(x_t, \beta)$. The second step improves on the first-step estimator in terms of efficiency.
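The two steps behind (2.1) can be illustrated with a minimal sketch. It assumes a scalar-parameter linear instrumental variable moment $g_t(\beta) = z_t (y_t - \beta w_t)$, an identity first-step weighting matrix and the simple sample second-moment matrix as weighting-matrix estimate; the function names `gmm_linear_iv` and `two_step_gmm` are hypothetical and none of these choices is prescribed by the text.

```python
import numpy as np

def gmm_linear_iv(y, w, Z, weight):
    """Minimize g_bar(b)' weight g_bar(b) for g_t(b) = z_t (y_t - b w_t), scalar b.

    `weight` is assumed symmetric, so the minimizer has the closed form below.
    """
    T = len(y)
    zy = Z.T @ y / T          # sample mean of z_t y_t, shape (l,)
    zw = Z.T @ w / T          # sample mean of z_t w_t, shape (l,)
    # g_bar(b) = zy - b * zw is linear in b: solve the first-order condition
    return float(zw @ weight @ zy) / float(zw @ weight @ zw)

def two_step_gmm(y, w, Z):
    """Return (first-step estimate, second-step estimate)."""
    T, l = Z.shape
    # Step 1: consistent but inefficient estimate (identity weighting, an assumption)
    beta_tilde = gmm_linear_iv(y, w, Z, np.eye(l))
    # Moment contributions evaluated at the first-step estimate, shape (T, l)
    g = Z * (y - beta_tilde * w)[:, None]
    # Weighting-matrix estimate: plain sample second moment (no lag terms)
    omega_hat = g.T @ g / T
    # Step 2: re-estimate with the inverse of omega_hat as weighting matrix
    beta_hat = gmm_linear_iv(y, w, Z, np.linalg.inv(omega_hat))
    return beta_tilde, beta_hat
```

Replacing `omega_hat` by a kernel estimate that includes lagged cross-products is what makes the second step robust to autocorrelation.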
In fact, $\hat\beta$ is optimal in the sense that it achieves the lowest asymptotic variance among all estimators of the form $\hat\beta_W := \arg\min_{\beta \in B} \hat g_T(\beta)' W \hat g_T(\beta)$ for some positive definite weighting matrix $W$ (see Hansen (1982)). In the special case of an iid process $\{X_t\}$, $\Omega_0$ collapses to $\Omega_0 = E[g(X_t, \beta_0) g(X_t, \beta_0)']$ and can simply be estimated by its sample analog $\hat\Omega_T(\tilde\beta) := T^{-1} \sum_{t=1}^{T} g(x_t, \tilde\beta) g(x_t, \tilde\beta)'$.

When the iid assumption is not justified, one can perform inference robust to autocorrelated and/or heteroskedastic $X_t$ processes. Robustness here means that potential dependence and heteroskedasticity are treated nonparametrically and one does not have to

be explicit about the data generating process of the $X_t$'s. To that end, one needs to smooth the observations $g(x_{t+s}, \tilde\beta) g(x_t, \tilde\beta)'$ over $s$ to ensure that $\hat\Omega_T(\tilde\beta)$ is consistent. In this paper, we use a nonparametric kernel estimator of the form

$$\hat\Omega_T(\tilde\beta) := \frac{1}{T} \sum_{s=1-T}^{T-1}\ \sum_{t=\max\{1,\,1-s\}}^{\min\{T,\,T-s\}} k\!\left(\frac{s}{S_T}\right) g_{t+s}(\tilde\beta)\, g_t(\tilde\beta)'$$

with $g_t(\beta) := g(x_t, \beta)$ and $k$ a kernel function. $S_T$, with $S_T \to \infty$ as $T \to \infty$, is a so-called bandwidth parameter that governs the degree of smoothing. Andrews (1991) derives a range of rates (in terms of $T$) at which $S_T$ is allowed to diverge in order to guarantee consistency of $\hat\Omega_T(\tilde\beta)$. These conditions, however, do not suggest rules for choosing $S_T$ for a fixed sample size $T$. Small values of $S_T$ imply averaging over only few observations, which decreases the variability of the estimator $\hat\Omega_T(\tilde\beta)$ but increases its bias. On the other hand, large bandwidths yield inclusion of more distant lags in the above sum, thereby increasing the variance but decreasing the bias of the estimator. Below we show that the choice of $S_T$ affects the bias and variance of the second-step estimator $\hat\beta$ in a similar way.

This trade-off can be used to derive decision rules on how to pick $S_T$ in finite samples. For example, Andrews (1991) derives the optimal bandwidth minimizing a truncated asymptotic mean-squared error (AMSE) criterion that balances bias and variance of $\hat\Omega_T(\tilde\beta)$, thereby guaranteeing good properties of the estimator of the optimal weighting matrix. However, in the GMM estimation framework, the second-step estimator $\hat\beta$ is the quantity of interest and, thus, the bandwidth should be chosen so as to take into account the bias and variance trade-off of $\hat\beta$, rather than that of $\hat\Omega_T(\tilde\beta)$. To that end, the subsequent analysis develops a higher-order expansion of the MSE of the second-step estimator and then minimizes the leading terms with respect to the bandwidth.

Assumption 2.1.
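(a) The process $\{X_t\}_{t=-\infty}^{\infty}$ taking values in $\mathcal{X} \subset \mathbb{R}^m$ is fourth-order stationary and $\alpha$-mixing with mixing coefficients $\alpha(j)$ satisfying $\sum_{j=1}^{\infty} j^2 \alpha(j)^{(\nu-1)/\nu} < \infty$ for some $\nu > 1$.
(b) $\{x_t\}_{t=1}^{T}$ is an observed sample of $\{X_t\}_{t=-\infty}^{\infty}$.
(c) $h(\cdot, \beta) := \big(g(\cdot, \beta)',\ \mathrm{vec}(\nabla g(\cdot, \beta) - E\nabla g(\cdot, \beta))',\ \mathrm{vec}(\nabla^2 g(\cdot, \beta) - E\nabla^2 g(\cdot, \beta))'\big)'$ is a measurable function for every $\beta \in B$.
(d) $\sup_{t \ge 1} E[\|h(x_t, \beta_0)\|^{4\nu}] < \infty$.
(e) $\sup_{t \ge 1} E[\sup_{\beta \in B} \|\nabla^k g(x_t, \beta)\|^2] < \infty$ for $k = 1, 2, 3$.
(f) There is a first-step estimator $\tilde\beta$ satisfying $\tilde\beta - \beta_0 = O_p(T^{-1/2})$.

As a slight abuse of notation, in the remainder, $x_t$ represents the random variable $X_t$ as well as the observation $x_t$, but the distinction should be clear from the context. Furthermore, dropping $\beta$ as an argument of a function means that the function is evaluated at $\beta_0$, e.g. $\hat\Omega_T := \hat\Omega_T(\beta_0)$ or $g_t := g_t(\beta_0)$. Let $G_0 := E G(x_t)$, $G(x_t, \beta) := \partial g_t(\beta)/\partial\beta'$, $G_t(\beta) := G(x_t, \beta)$, and the sample counterpart $G_T(\beta) := T^{-1} \sum_{t=1}^{T} \partial g_t(\beta)/\partial\beta'$.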
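The kernel estimator of the long-run variance defined above (a kernel-weighted sum of sample autocovariances of $g_t$) can be sketched as follows. This is a minimal illustration with the Bartlett kernel as default; the function names `bartlett` and `hac_estimator` are hypothetical, and the rows of `g` are assumed to be the moment functions already evaluated at the first-step estimate.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel: k(x) = 1 - |x| for |x| <= 1 and 0 otherwise."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def hac_estimator(g, bandwidth, kernel=bartlett):
    """Kernel HAC estimate (1/T) * sum_s k(s / S_T) * sum_t g_{t+s} g_t'.

    g         : array of shape (T, l); row t holds the moment vector g_t
    bandwidth : the smoothing parameter S_T (> 0)
    kernel    : symmetric kernel function with kernel(0) = 1
    """
    g = np.asarray(g, dtype=float)
    T = g.shape[0]
    omega = g.T @ g / T                      # lag s = 0, weight k(0) = 1
    for s in range(1, T):
        weight = kernel(s / bandwidth)
        if weight == 0.0:
            continue                         # this lag receives no weight
        gamma_s = g[s:].T @ g[:-s] / T       # (1/T) * sum_t g_{t+s} g_t'
        # the lag -s term is the transpose of the lag s term; k is symmetric
        omega += weight * (gamma_s + gamma_s.T)
    return omega
```

With the Bartlett kernel, a bandwidth of at most one keeps only the lag-zero term, which reproduces the sample analog used in the iid case; larger bandwidths bring in more distant lags.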

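Following Parzen (1957), let $q$ be the characteristic exponent that characterizes the smoothness of the kernel $k$ at zero:

$$q := \max\{\alpha \in [0, \infty) : g_\alpha \text{ exists and } 0 < g_\alpha < \infty\} \quad \text{with} \quad g_\alpha := \lim_{z \to 0} \frac{1 - k(z)}{|z|^{\alpha}}.$$

For example, for the Bartlett, Parzen and Tukey-Hanning kernels the values of $q$ are 1, 2 and 2, respectively.

Assumption 2.2. Let the kernel $k$ satisfy the following conditions:
(a) $k : \mathbb{R} \to [-1, 1]$ satisfies $k(0) = 1$, $k(x) = k(-x)$ for all $x \in \mathbb{R}$, $\int k^2(x)\,dx < \infty$, $\int |k(x)|\,dx < \infty$, $k(\cdot)$ is continuous at 0 and at all but a finite number of other points, and $S_T \to \infty$, $S_T^2/T \to 0$, $S_T^q/T \to 0$ for some $q \in [0, \infty)$ for which $g_q, \|f^{(q)}\| \in [0, \infty)$, where $f^{(q)} := \frac{1}{2\pi} \sum_{j=-\infty}^{\infty} |j|^q\, \Gamma(j)$.
(b) $\int \bar k(x)\,dx < \infty$ with
$$\bar k(x) := \begin{cases} \sup_{y \ge x} |k(y)|, & x \ge 0 \\ \sup_{y \le x} |k(y)|, & x < 0. \end{cases}$$

Assumptions 2.1 and 2.2(a) imply Assumptions A, B and C in Andrews (1991) applied to $\{g_t\}$ and $\{G_t\}$, allowing us to use his consistency and rate of convergence results for HAC estimators. The necessity of Assumption 2.2(b) is explained in Jansson (2002).

Assumption 2.3.
(a) $g : \mathcal{X} \times B \to \mathbb{R}^l$, $l \ge p$, and $\beta_0 \in \mathrm{int}(B)$ is the unique solution to $E g(x_t, \beta_0) = 0$; $B \subset \mathbb{R}^p$ is compact.
(b) $\mathrm{rank}(G_0) = p$.
(c) For any $x \in \mathcal{X}$, $g(x, \cdot)$ is twice continuously differentiable in a neighborhood $N$ of $\beta_0$.
(d) There exists a function $d : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ with $\sum_{s=-\infty}^{\infty} E\,d(x_{t+s}, x_t) < \infty$ so that $g$ satisfies the condition $\|(g_{t+s}(\beta) g_t(\beta)') - (g_{t+s}(\beta_0) g_t(\beta_0)')\| \le d(x_{t+s}, x_t)$ w.p. 1 for $\beta \in N$.
(e) There exists a function $b : \mathcal{X} \to \mathbb{R}$ with $E\,b(x_t) < \infty$ such that $\|\nabla^k g(x, \beta) - \nabla^k g(x, \beta_0)\| \le b(x)\, \|\beta - \beta_0\|$ for $k = 2, 3$.
(f) $\Omega_0$ is positive definite.

The following proposition is the first main result of the paper, presenting an expansion of the second-step GMM estimator $\hat\beta$ up to the lowest orders involving the bandwidth $S_T$. This approximation constitutes a crucial ingredient for the computation of the optimal bandwidth.

Proposition 2.1. Under Assumptions 2.1-2.3, $\hat\beta$ satisfies the stochastic expansion

$$\hat\beta = \beta_0 + \kappa_{1,T}\, T^{-1/2} + \kappa_{2,T}\, S_T^{1/2} T^{-1} + \kappa_{3,T}\, S_T^{-q} T^{-1/2} + o_p(\eta_T T^{-1/2}) \tag{2.2}$$

with $\eta_T := S_T^{1/2} T^{-1/2} + S_T^{-q}$, $\kappa_{i,T} = O_p(1)$ for $i = 1, 2, 3$,

$$\kappa_{1,T} := -H_0 F_T, \qquad \kappa_{2,T} := H_0 \sqrt{T/S_T}\, (\hat\Omega_T - \bar\Omega_T) P_0 F_T, \qquad \kappa_{3,T} := H_0\, S_T^{q}\, (\bar\Omega_T - \Omega_0) P_0 F_T$$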

and $F_T(\beta) := \sqrt{T}\, \hat g_T(\beta)$, $F_T := F_T(\beta_0)$, $\bar\Omega_T := E\hat\Omega_T$, $\Sigma_0 := (G_0' \Omega_0^{-1} G_0)^{-1}$, $H_0 := \Sigma_0 G_0' \Omega_0^{-1}$ and $P_0 := \Omega_0^{-1} - \Omega_0^{-1} G_0 H_0$.

Since the lowest-order term, $-H_0 F_T T^{-1/2}$, does not depend on the bandwidth, the expansion (2.2) illustrates the well-known fact that nonparametric estimation of the GMM weighting matrix does not affect first-order asymptotics as long as that nonparametric estimator is consistent. The other two terms in the expansion involve the bandwidth and arise from the bias ($S_T^q (\bar\Omega_T - \Omega_0)$) and variance ($\sqrt{T/S_T}\,(\hat\Omega_T - \bar\Omega_T)$) of the nonparametric estimator of the weighting matrix. In a similar expansion to (2.2) for the iid case, these two components do not appear and the next higher-order term after $\kappa_{1,T} T^{-1/2}$ is of order $T^{-1}$ (see Newey and Smith (2004)), which here is part of the remainder and plays no role in determining the optimal bandwidth derived below. Anatolyev (2005) does not explicitly present a stochastic expansion such as (2.2), but computes the higher-order bias $B$ of $\hat\beta$, which turns out to be of order $T^{-1}$, i.e. $E[\hat\beta - \beta_0] = B\, T^{-1} + o(T^{-1})$, and therefore does not depend on the bandwidth. Interestingly, one can show that the two higher-order terms in (2.2) do not contribute to that bias (Jun (2007)).

In the class of GMM estimators defined by (2.1) and indexed by the bandwidth $S_T$, we now characterize the most efficient one under quadratic loss. Specifically, we rank estimators according to the MSE of the approximation $\zeta_T := \beta_0 + \kappa_{1,T} T^{-1/2} + \kappa_{2,T} S_T^{1/2} T^{-1} + \kappa_{3,T} S_T^{-q} T^{-1/2}$.

Theorem 2.1. Suppose Assumptions 2.1-2.3 hold. Let $W \in \mathbb{R}^{p \times p}$ be a weighting matrix. Define the weighted MSE $MSE_T := E[(\zeta_T - \beta_0)' W (\zeta_T - \beta_0)]$. Then

$$MSE_T = \nu_1 T^{-1} + \nu_2\, S_T T^{-2} + \nu_3\, S_T^{-2q} T^{-1} + o(\eta_T^2 T^{-1})$$

with

$$\nu_1 := \lim_{T \to \infty} \sum_{s=-(T-1)}^{T-1} E[g_t' H_0' W H_0\, g_{t+s}],$$

$$\nu_2 := \lim_{T \to \infty} \frac{T}{S_T} E\big[F_T' P_0 (\hat\Omega_T - \bar\Omega_T) H_0' W H_0 (\hat\Omega_T - \bar\Omega_T) P_0 F_T\big] - 2 \lim_{T \to \infty} \frac{T}{S_T} E\big[F_T' P_0 (\hat\Omega_T - \bar\Omega_T) H_0' W H_0 F_T\big],$$

$$\nu_3 := \lim_{T \to \infty} S_T^{2q}\, E\big[F_T' P_0 (\bar\Omega_T - \Omega_0) H_0' W H_0 (\bar\Omega_T - \Omega_0) P_0 F_T\big],$$

where all the limits exist and are finite.

As explained above, the bias of the approximation $\zeta_T$ is zero so that the MSE expansion in Theorem 2.1 represents only the variance of $\zeta_T$. Despite the lack of a bias component of $\zeta_T$,

the expansion displays the first-order tradeoff that is relevant for choosing a bandwidth: $\nu_2 S_T T^{-2}$ increases in $S_T$ and $\nu_3 S_T^{-2q} T^{-1}$ decreases in $S_T$. The terms have the standard order of squared bias and variance of HAC estimators as derived in Andrews (1991).

Under additional conditions, the moments of the approximation $\zeta_T$ correspond to the moments of a formal Edgeworth expansion of the cdf of the second-step estimator $\hat\beta$. The (finite) moments of such an Edgeworth expansion can be used to approximate the distribution of $\hat\beta$ up to the specified order even when the corresponding moments of $\hat\beta$ do not exist (Götze and Hipp (1978), Rothenberg (1984), Magdalinos (1992)). In this sense, we can regard the expansion in Theorem 2.1 as an approximation of the MSE of $\hat\beta$ when $\hat\beta$ possesses second moments and as the MSE of an approximate estimator that shares the same Edgeworth expansion up to a specified order.

Remark 2.1. For linear instrumental variable models with iid variables and normal errors, Kinal (1980) shows that $\hat\beta$ has finite moments up to order $l - p$. Similar results have been conjectured for GMM and generalized empirical likelihood estimators (e.g. Kunitomo and Matsushita (2003), Guggenberger (2008)). Therefore, one should be careful in interpreting the MSE approximation in cases when the degree of over-identification is less than two. Other loss functions may then be more appropriate (see Zaman (1981), for example).

Having established Theorem 2.1, the calculation of an MSE-optimal bandwidth, $S_T^*$, becomes straightforward: for the second-order term to attain its fastest possible rate of convergence, the terms of order $S_T T^{-2}$ and $S_T^{-2q} T^{-1}$ have to be balanced, which is the case for $S_T^* = c^*(q)\, T^{1/(1+2q)}$ and some constant $c^*(q)$. We refer to this bandwidth as the MSE($\hat\beta$)-optimal bandwidth. The bandwidth minimizing the MSE of $\hat\Omega_T(\tilde\beta)$ as derived in Andrews (1991) we call the MSE($\hat\Omega$)-optimal bandwidth.

Corollary 2.1. Under the assumptions of Theorem 2.1 and if $l > p$, minimizing the lowest order of $MSE_T$ involving the bandwidth yields the optimal bandwidth growth rate $T^{1/(1+2q)}$. Moreover, $S_T^* = c^*(q)\, T^{1/(1+2q)}$ minimizes the higher-order AMSE defined as the limit of $HMSE_T := T^{(1+4q)/(1+2q)} \{MSE_T - \nu_1 T^{-1}\}$ with

$$c^*(q) := \left(\frac{c_0\, \nu_3}{\nu_2}\right)^{1/(1+2q)}, \qquad c_0 := \begin{cases} 2q, & \mathrm{sign}(\nu_2) = \mathrm{sign}(\nu_3) \\ -1, & \mathrm{sign}(\nu_2) \ne \mathrm{sign}(\nu_3). \end{cases} \tag{2.3}$$

The expressions for $\nu_2$ and $\nu_3$ show that the optimal bandwidth growth rate is governed by the convergence rate of the covariances between the moment functions and the HAC estimator. The bias and variance of the HAC estimator itself only play an indirect role; in particular, the AMSE of $\hat\beta$ is not an increasing function of the AMSE of the HAC

estimator. In consequence, none of the existing procedures minimizing the AMSE of the HAC estimator (Andrews (1991), Newey and West (1994) and Andrews and Monahan (1992), among others) are optimal in the above sense.

The convergence rate of the MSE of $\hat\beta$ is not affected by the bandwidth choice because it is of order $O(T^{-1})$; only the second-order terms of the MSE, converging at an optimal rate of $O(T^{-(1+4q)/(1+2q)})$, are. By choosing kernels of very high order $q$, this rate can be made arbitrarily close to $O(T^{-2})$. Nevertheless, kernels of order smaller than or equal to 2 are popular because, unlike kernels of higher order, they can produce positive definite covariance matrix estimates.

Remark 2.2. Interestingly, the semiparametric estimator $\hat\beta$ converges at rate $T^{-1/2}$, but the optimal bandwidth minimizing MSE($\hat\beta$) is of the same order as the optimal bandwidth minimizing MSE($\hat\Omega$). This result contrasts with the findings in other semiparametric settings such as those studied by Powell and Stoker (1996) and Goldstein and Messer (1992), for example, in which under-smoothing the nonparametric plugin estimator leads to $\sqrt{T}$-convergence rates of smooth functionals of that nonparametric plugin estimator.

To gain more insight into which features of the data generating process determine the value of the optimal bandwidth and to be able to directly estimate the quantities involved, the following proposition derives more explicit expressions for the constants $\nu_i$.

Proposition 2.2. Assume that $\{g_t\}$ follows a linear Gaussian process, viz.

$$g_t = \sum_{s=0}^{\infty} \Psi_s\, e_{t-s}$$

for $t = 1, \ldots, T$, $e_t \sim$ iid $N(0, \Sigma_e)$ and $\Psi_s$ satisfies $\sum_{s=0}^{\infty} s^4 \|\Psi_s\| < \infty$. Define $\mu_i := \int k^i(x)\,dx$ for $i = 1, 2$ and $\Omega_0^{(q)} := 2\pi f^{(q)}$. Then

$$\nu_1 = \mathrm{tr}(\Omega_0 H_0' W H_0), \qquad \nu_2 = (2\mu_1 + \mu_2)(l - p)\, \mathrm{tr}(\Sigma_0 W), \qquad \nu_3 = g_q^2\, \mathrm{tr}\big(\Omega_0^{(q)} H_0' W H_0\, \Omega_0^{(q)} P_0\big).$$

2.1 Linear IV Model
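Before the formulas are specialized, it may help to see how the constants of Proposition 2.2 and the bandwidth of Corollary 2.1 combine numerically. The following is a minimal sketch, assuming the matrices $\Omega_0$, $\Omega_0^{(q)}$, $G_0$ and $W$ are given; the function name `mse_optimal_bandwidth` is hypothetical, and the default kernel constants are those of the Bartlett kernel ($q = 1$, $\mu_1 = 1$, $\mu_2 = 2/3$, $g_1 = 1$).

```python
import numpy as np

def mse_optimal_bandwidth(omega0, omega_q, G0, W, T, q=1,
                          mu1=1.0, mu2=2.0 / 3.0, g_q=1.0):
    """Evaluate nu_2, nu_3 (Proposition 2.2) and S* = c*(q) T^(1/(1+2q)) (Corollary 2.1).

    omega0  : (l, l) long-run variance Omega_0
    omega_q : (l, l) matrix Omega_0^(q)
    G0      : (l, p) expected Jacobian of the moment functions
    W       : (p, p) weighting matrix of the MSE criterion
    Returns (bandwidth, nu2, nu3).
    """
    l, p = G0.shape
    if l <= p:
        raise ValueError("Corollary 2.1 requires over-identification (l > p)")
    omega_inv = np.linalg.inv(omega0)
    sigma0 = np.linalg.inv(G0.T @ omega_inv @ G0)      # Sigma_0
    H0 = sigma0 @ G0.T @ omega_inv                     # H_0
    P0 = omega_inv - omega_inv @ G0 @ H0               # P_0
    nu2 = (2.0 * mu1 + mu2) * (l - p) * np.trace(sigma0 @ W)
    nu3 = g_q ** 2 * np.trace(omega_q @ H0.T @ W @ H0 @ omega_q @ P0)
    # c_0 as in (2.3): 2q if the signs agree, -1 otherwise
    c0 = 2.0 * q if np.sign(nu2) == np.sign(nu3) else -1.0
    c_star = (c0 * nu3 / nu2) ** (1.0 / (1 + 2 * q))
    return c_star * T ** (1.0 / (1 + 2 * q)), nu2, nu3
```

When both constants are positive, the value $c_0 = 2q$ is what the first-order condition of $\nu_2 S_T T^{-2} + \nu_3 S_T^{-2q} T^{-1}$ with respect to $S_T$ delivers.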
In this sub-section, we specialize the expressions in Proposition 2.2 to a stylized instrumental variable model that allows us to analyze the difference between the MSE($\hat\beta$)-optimal and the MSE($\hat\Omega$)-optimal bandwidths and subsequently serves as the data generating process for the Monte Carlo simulations. Let $y_t$ and $w_t$ be random variables satisfying

$$y_t = \beta_0 w_t + \varepsilon_t$$

and $z_t$ an $l$-dimensional random vector of instruments such that

$$w_t = \gamma\, \iota' z_t + v_t,$$

where $\iota := (1, \ldots, 1)' \in \mathbb{R}^l$ and $\gamma \in \mathbb{R}$. Define $x_t = (y_t, w_t, z_t')'$ so that $g(x_t, \beta) = (y_t - \beta w_t)\, z_t$. Further let $\{\varepsilon_t\}$ and $\{v_t\}$ be AR(1) processes with autocorrelation coefficient $\rho \in (0, 1)$, viz. $\varepsilon_t = \rho\, \varepsilon_{t-1} + \eta_t$ and $v_t = \rho\, v_{t-1} + u_t$, where

$$\begin{pmatrix} \eta_t \\ u_t \end{pmatrix} \sim \text{iid } N\!\left(0, \begin{pmatrix} 1 & \sigma_{12} \\ \sigma_{12} & 1 \end{pmatrix}\right).$$

The instruments follow a VAR(1) process, $z_t = \rho_z z_{t-1} + \epsilon_t$ with $\rho_z := \mathrm{diag}(0, \tilde\rho, \ldots, \tilde\rho)$, $\tilde\rho \in (0, 1)$, and $\epsilon_t \sim$ iid $N(0, I_l)$ independent of $\{(\eta_t, u_t)\}$. Then, one can show that

( c 4c 0 g 2 (1 = 1ρ 2 ρ 2 (1 ρ 2 (2µ 1 + µ 2 (1 ρ ρ(1 + ρ ρ( l + (l 2ρ ρ + ρ 2 + ρ ρ 3 2 1/3

( c 4c 0 g 2 (2 = 2ρ 2 ρ 2 (1 + ρ ρ(1 ρ 2 1/5 (2µ 1 + µ 2 (1 ρ ρ 3 ( l + (l 2ρ ρ + ρ 2 + ρ ρ 3 2

In this specific example, we can easily compare the MSE($\hat\beta$)-optimal to the MSE($\hat\Omega$)-optimal bandwidth derived in Andrews (1991),

$$S_T = \left(\frac{q\, g_q^2\, \alpha(q)\, T}{\mu_2}\right)^{1/(1+2q)} \quad \text{with} \quad \alpha(q) = \frac{2\, \mathrm{vec}(\Omega_0^{(q)})'\, W_A\, \mathrm{vec}(\Omega_0^{(q)})}{\mathrm{tr}\big(W_A (I_{l^2} + K_{l,l})(\Omega_0 \otimes \Omega_0)\big)}.$$

For $W_A = I_{l^2}$, the constants of proportionality become

α(1 = 8(l 1ρρ 2 2/f(ρ, ρ 2, l and α(2 = 8(l 1ρρ 2 2(1 + ρρ 2 2 /((1 ρρ 2 2 f(ρ, ρ 2, l with f(ρ, ρ 2, l := (1 ρρ 2 2 [l(l (l 2 l 2ρρ 2 + (l 2 ρ 2 l(2 + 3ρ 2 + 4ρ 2 2ρ ρρ (1 + (l 3ρ 2 ρ 4 2 4ρρ ρ 2 ρ 6 2].

Notice that the MSE($\hat\beta$)-optimal and the MSE($\hat\Omega$)-optimal bandwidth both adapt to the persistence of the error processes ($\rho$), the persistence of the instruments ($\tilde\rho$), and the number of instruments ($l$), but through very different functional forms. Therefore, we expect there to be scenarios in which the MSE($\hat\Omega$)-optimal bandwidth is clearly not MSE($\hat\beta$)-optimal and the two bandwidths may differ significantly. The simulation evidence in Section 4 confirms these findings.

3 Data-driven Bandwidth Choice

The optimal bandwidth $S_T^*$ is infeasible because it depends on several unknown quantities. In the case in which $\{g_t\}$ is a linear Gaussian process, we require knowledge of $\Omega_0$, $\Omega_0^{(q)}$, and $G_0$.

In this section, we describe a data-driven approach to select the optimal bandwidth by estimating the required quantities based on parametric approximating models for $\{g_t\}$,

similarly as proposed in Andrews (1991). The idea is to first construct the first-step estimator $\tilde\beta$, then fit a parsimonious autoregressive (AR) model to $\{g_t(\tilde\beta)\}_{t=1}^{T}$ and, finally, to substitute its parameter estimates into analytical formulae for $\Omega_0$ and $\Omega_0^{(q)}$ that hold when the AR model is the true model. Together with the usual sample average estimator for $G_0$, these estimates are then substituted into the expressions for $\nu_2$ and $\nu_3$ in Proposition 2.2 to yield estimates of the optimal bandwidths.

We focus on estimating a univariate AR(1) model for each component of $\{g_t(\tilde\beta)\}$, although other approximating models like vector autoregressions or models with more lags could be considered. Let $\hat\rho_i$ and $\hat\sigma_i^2$ be the estimated coefficient and the residual variance of the $i$-th estimated AR(1) process. We can construct estimators of $\Omega_0$ and $\Omega_0^{(q)}$ as $\hat\Omega_0 := \mathrm{diag}(\hat\omega_1, \ldots, \hat\omega_l)$ and $\hat\Omega_0^{(q)} := \mathrm{diag}(\hat\omega_1^{(q)}, \ldots, \hat\omega_l^{(q)})$ with

$$\hat\omega_i := \frac{\hat\sigma_i^2}{(1 - \hat\rho_i)^2}, \qquad \hat\omega_i^{(1)} := \frac{2 \hat\sigma_i^2 \hat\rho_i}{(1 - \hat\rho_i)^3 (1 + \hat\rho_i)}, \qquad \hat\omega_i^{(2)} := \frac{2 \hat\sigma_i^2 \hat\rho_i}{(1 - \hat\rho_i)^4}.$$

Then, estimate $H_0$, $\Sigma_0$ and $P_0$ by

$$\hat H_0 := \hat\Sigma_0\, G_T(\tilde\beta)'\, \hat\Omega_0^{-1}, \qquad \hat\Sigma_0 := \big(G_T(\tilde\beta)'\, \hat\Omega_0^{-1}\, G_T(\tilde\beta)\big)^{-1}, \qquad \hat P_0 := \hat\Omega_0^{-1} - \hat\Omega_0^{-1} G_T(\tilde\beta)\, \hat H_0.$$

Finally, substitute all these expressions into the formulae of Proposition 2.2 to get estimates $\hat\nu_2$ and $\hat\nu_3$ of $\nu_2$ and $\nu_3$, and the estimator of the optimal bandwidth,

$$\hat S_T := \left(\frac{c_0\, \hat\nu_3}{\hat\nu_2}\right)^{1/(1+2q)} T^{1/(1+2q)}.$$

The difference in performance one incurs by using $\hat S_T$ instead of the infeasible bandwidth minimizing the finite-sample MSE of $\hat\beta$ has four sources: the error made by replacing the MSE of $\hat\beta$ by the MSE of the first terms in the higher-order expansion ($\zeta_T$), the error due to the large-sample approximation of the MSE, the estimation error in $\hat\Omega_0$ and $\hat\Omega_0^{(q)}$, and the error made by potential misspecification of the approximating parametric model for $\{g_t\}$. In practice, one hopes that these errors are small.
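The AR(1) plug-in step described above can be sketched as follows. This is a minimal illustration, assuming least-squares AR(1) fits without intercept for each component of the moment series; the function names `ar1_fit` and `ar1_plugin` are hypothetical.

```python
import numpy as np

def ar1_fit(x):
    """Least-squares fit of x_t = rho * x_{t-1} + e_t (no intercept, an assumption).

    Returns (rho_hat, sigma2_hat), with sigma2_hat the residual variance.
    """
    x = np.asarray(x, dtype=float)
    lagged, current = x[:-1], x[1:]
    rho = float(lagged @ current) / float(lagged @ lagged)
    resid = current - rho * lagged
    sigma2 = float(resid @ resid) / len(resid)
    return rho, sigma2

def ar1_plugin(g, q=1):
    """Diagonal plug-in estimates of Omega_0 and Omega_0^(q) from componentwise AR(1) fits.

    g : array of shape (T, l), rows are the moment functions at the first-step estimate
    q : characteristic exponent of the kernel (1 or 2)
    """
    g = np.asarray(g, dtype=float)
    omega, omega_q = [], []
    for i in range(g.shape[1]):
        rho, sigma2 = ar1_fit(g[:, i])
        omega.append(sigma2 / (1.0 - rho) ** 2)
        if q == 1:
            omega_q.append(2.0 * sigma2 * rho / ((1.0 - rho) ** 3 * (1.0 + rho)))
        elif q == 2:
            omega_q.append(2.0 * sigma2 * rho / (1.0 - rho) ** 4)
        else:
            raise ValueError("only q = 1 and q = 2 are covered by the formulas")
    return np.diag(omega), np.diag(omega_q)
```

The two returned matrices, together with a sample-average estimate of $G_0$, are the inputs needed for $\hat\nu_2$, $\hat\nu_3$ and the bandwidth estimate.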
As mentioned in the discussion after Theorem 2.1, the first type of error vanishes with the sample size under additional assumptions. The second and third types also disappear as $T \to \infty$. The fourth type of error can typically be conjectured to be negligible because the MSE of $\hat\beta$ tends to be relatively flat around its minimum (as is the case in the Monte Carlo simulations, for example), so that misspecification in the approximating model is not expected to have a large impact on the properties of the resulting GMM estimator. Nevertheless, the applied researcher should bear in mind that the plugin procedure is not automatic: some thought has to go into selecting an appropriate approximating model, and the potential impact of the aforementioned types of errors has to be considered.

Remark 3.1. As in Andrews and Monahan (1992), one may want to consider pre-whitening the series $\{g_t(\tilde\beta)\}_{t=1}^{T}$ before fitting the AR process. The reported increases in accuracy of

test statistics in Andrews and Monahan (1992) and Newey and West (1994) are expected to occur with the procedure presented here as well.

4 Simulations

In this section, we discuss a small simulation experiment that illustrates the theoretical findings from the previous sections and, in particular, shows that the MSE($\hat\beta$)-optimal bandwidth may lead to a substantially lower finite-sample MSE of $\hat\beta$ relative to choosing the MSE($\hat\Omega$)-optimal bandwidth.

We simulate the model from Section 2.1, denoted by "AR(1)-HOM", for different degrees of serial correlation ($\rho \in \{0.01, 0.1, 0.5, 0.9, 0.99\}$), weak and strong instruments ($\gamma \in \{0.1, 2\}$) and an increasing number of instruments ($l \in \{2, 3, 4, 5, 10, 15, 25\}$). We also consider two variants of the model, one in which the outcome equation is replaced by a model with heteroskedastic errors, $y_t = \beta_0 w_t + w_t \varepsilon_t$ (referred to as "AR(1)-HE"), and one in which the AR(1) error process is replaced by an MA(1), i.e. $\varepsilon_t = \rho\, \eta_{t-1} + \eta_t$ and $v_t = \rho\, u_{t-1} + u_t$. We simulate 1,000 samples and, to save space, present only results for sample size $T = 64$, $\tilde\rho = 0.9$, $\sigma_{12} = 0.9$, $\beta_0 = 1$ and the Bartlett kernel. Other parameter combinations yield similar results.

For each of the three different data generating processes, Tables 1, 3, and 5 report four different bandwidths ("bw") averaged over the simulation samples: "optimal", "Andrews", "naive" and "sim", referring to the MSE($\hat\beta$)-optimal bandwidth, the MSE($\hat\Omega$)-optimal bandwidth, the naive choice $S_T = T^{1/(1+2q)}$ and the (infeasible) bandwidth that minimizes the simulated MSE($\hat\beta$) over a grid of bandwidths, respectively. The optimal bandwidth is estimated based on the procedure in Section 3. The tables also show the bias, standard deviation ("SD") and MSE of $\hat\beta$. In almost all cases considered here, the MSE($\hat\beta$)-optimal bandwidth is closer to the one minimizing the simulated MSE than the MSE($\hat\Omega$)-optimal bandwidth. In all scenarios, the MSE($\hat\beta$)-optimal bandwidth is smaller than the MSE($\hat\Omega$)-optimal bandwidth, in some cases substantially smaller.
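For readers who want to reproduce the flavor of this experiment, the AR(1)-HOM design of Section 2.1 can be simulated along the following lines. This is a minimal sketch under stated assumptions: the function name `simulate_ar1_hom`, the zero starting values and the burn-in length are choices of this illustration, and `rho_z` stands for the common persistence of all instruments except the first.

```python
import numpy as np

def simulate_ar1_hom(T, l, rho, rho_z, gamma, sigma12, beta0=1.0, burn=200, seed=None):
    """Draw (y, w, z) from the linear IV design with AR(1) errors and VAR(1) instruments.

    rho     : AR(1) coefficient of the errors eps_t and v_t
    rho_z   : AR coefficient of instruments 2..l (the first instrument is iid)
    gamma   : common first-stage coefficient on every instrument
    sigma12 : covariance between the innovations eta_t and u_t
    burn    : discarded start-up observations (an assumption of this sketch;
              values of rho close to one may call for a longer burn-in)
    """
    rng = np.random.default_rng(seed)
    n = T + burn
    cov = np.array([[1.0, sigma12], [sigma12, 1.0]])
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=n)  # rows (eta_t, u_t)
    innov_z = rng.standard_normal((n, l))
    phi = np.full(l, rho_z)
    phi[0] = 0.0                      # diag(0, rho_z, ..., rho_z)
    eps, v, z = np.zeros(n), np.zeros(n), np.zeros((n, l))
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + shocks[t, 0]
        v[t] = rho * v[t - 1] + shocks[t, 1]
        z[t] = phi * z[t - 1] + innov_z[t]
    w = gamma * z.sum(axis=1) + v     # first stage: w_t = gamma * iota' z_t + v_t
    y = beta0 * w + eps               # outcome equation
    return y[burn:], w[burn:], z[burn:]
```

A call such as `simulate_ar1_hom(T=64, l=10, rho=0.5, rho_z=0.9, gamma=2.0, sigma12=0.9, seed=0)` would generate one sample resembling the designs reported here; the moment series is then $z_t (y_t - \beta w_t)$.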
Tables 2, 4, and 6 show the ratios of MSE ("MSE ratio") and higher-order MSE ("HMSE ratio"), as defined in Corollary 2.1, based on the MSE($\hat\beta$)-optimal bandwidth divided by those based on the MSE($\hat\Omega$)-optimal bandwidth. The number of instruments is fixed at $l = 10$, but other numbers yield qualitatively the same results. $\mu^2/l$ denotes the standardized concentration parameter measuring the strength of the instruments (Stock, Wright, and Yogo (2002)). The MSE ratios demonstrate that the MSE($\hat\beta$)-optimal bandwidth may lead to substantial MSE gains relative to the MSE($\hat\Omega$)-optimal bandwidth. The gains are particularly large, up to more than 20%, when the number of instruments is large

and the estimator of the optimal weighting matrix becomes less precise. As predicted by the theoretical results in the previous sections, the MSE($\hat\beta$)-optimal bandwidth may also lead to dramatic higher-order MSE gains relative to the MSE($\hat\Omega$)-optimal bandwidth of up to more than 90%.

Unlike the MSE($\hat\Omega$)-optimal bandwidth defined by Andrews (1991), the MSE($\hat\beta$)-optimal bandwidth is formally not defined for the case $l = p$, in which the estimator $\hat\beta$ is independent of the weighting matrix. To study scenarios in which $l/p$ is close to this boundary, we conclude this section by considering a robustness check in which $l/p$ approaches one. Table 7 reports the same MSE and HMSE ratios as Tables 2, 4, and 6, but for a sequence $l/p \in \{5/1, 4/2, 3/2, 4/3, 5/4, 8/7, 10/9\}$ that approaches one. $T$, $\rho$, $\tilde\rho$, and $\gamma$ are fixed at values 128, 0.5, 0.9, and 2, respectively, but other values yield similar results. The MSE based on the MSE($\hat\beta$)-optimal bandwidth stays close to or slightly smaller than the one based on the MSE($\hat\Omega$)-optimal bandwidth for all values of $l/p$. Similarly, the HMSE is significantly smaller for the optimal bandwidth. In the case of the AR(1)-HE model, the HMSE gains are even up to 67%.

5 Conclusion

This paper develops a selection procedure for the bandwidth of a HAC estimator of the optimal GMM weighting matrix which minimizes the asymptotic MSE of the resulting two-step GMM estimator. We show that it is of the same order as the bandwidth minimizing the MSE of the nonparametric plugin estimator, but the constants of proportionality differ significantly. The simulation study suggests that the data-driven version of the selection procedure works well in finite samples and may substantially reduce the first- and second-order MSE of the GMM estimator relative to existing, sub-optimal choices, especially when the number of moment conditions is large.

Notes

1 I thank Michael Jansson for making me aware of this work.

References

Anatolyev, S. (2005): "GMM, GEL, Serial Correlation, and Asymptotic Bias," Econometrica, 73(3).

Andrews, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59(3).
Andrews, D. W. K., and J. C. Monahan (1992): "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator," Econometrica, 60(4).
Goldstein, L., and K. Messer (1992): "Optimal Plug-in Estimators for Nonparametric Functional Estimation," The Annals of Statistics, 20(3).
Götze, F., and C. Hipp (1978): "Asymptotic Expansions in the Central Limit Theorem under Moment Conditions," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 42.
Guggenberger, P. (2008): "Finite Sample Evidence Suggesting a Heavy Tail Problem of the Generalized Empirical Likelihood Estimator," Econometric Reviews, 27(4).
Hall, P., and C. C. Heyde (1980): Martingale Limit Theory and Its Applications. Academic Press, New York.
Hansen, L. P. (1982): "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50(4).
Hansen, L. P., J. Heaton, and A. Yaron (1996): "Finite-Sample Properties of Some Alternative GMM Estimators," Journal of Business and Economic Statistics, 14(3).
Imbens, G. W., R. H. Spady, and P. Johnson (1998): "Information Theoretic Approaches to Inference in Moment Condition Models," Econometrica, 66(2).
Jansson, M. (2002): "Consistent Covariance Matrix Estimation for Linear Processes," Econometric Theory, 18.
Jun, B. H. (2007): "Essays in Econometrics," Ph.D. thesis, University of California, Berkeley.
Kiefer, N., and T. Vogelsang (2002a): "Heteroskedasticity-Autocorrelation Robust Standard Errors Using The Bartlett Kernel Without Truncation," Econometrica, 70(5).

Kiefer, N., and T. Vogelsang (2002b): "Heteroskedasticity-Autocorrelation Robust Testing Using Bandwidth Equal to Sample Size," Econometric Theory, 18.

Kiefer, N., and T. Vogelsang (2005): "A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests," Econometric Theory, 21.

Kinal, T. W. (1980): "The Existence of Moments of k-class Estimators," Econometrica, 48(1).

Kitamura, Y., and M. Stutzer (1997): "An Information-Theoretic Alternative to Generalized Method of Moments Estimation," Econometrica, 65(4).

Kunitomo, N., and Y. Matsushita (2003): "Finite Sample Distributions of the Empirical Likelihood Estimator and the GMM Estimator," Discussion Paper F-200, CIRJE.

Magdalinos, M. A. (1992): "Stochastic Expansions and Asymptotic Approximations," Econometric Theory, 8(3).

Nagar, A. L. (1959): "The Bias and Moment Matrix of the General k-class Estimators of the Parameters in Simultaneous Equations," Econometrica, 27(4).

Newey, W. K., and D. McFadden (1994): "Large Sample Estimation and Hypothesis Testing," in Handbook of Econometrics, ed. by R. F. Engle, and D. L. McFadden, vol. IV. Elsevier Science B.V.

Newey, W. K., and R. J. Smith (2004): "Higher Order Properties of GMM and Generalized Empirical Likelihood Estimators," Econometrica, 72(1).

Newey, W. K., and K. West (1994): "Automatic Lag Selection in Covariance Matrix Estimation," The Review of Economic Studies, 61(4).

Owen, A. B. (1988): "Empirical Likelihood Ratio Confidence Intervals for a Single Functional," Biometrika, 75(2).

Parzen, E. (1957): "On Consistent Estimates of the Spectrum of a Stationary Time Series," The Annals of Mathematical Statistics, 28(2).

Powell, J. L., and T. M. Stoker (1996): "Optimal Bandwidth Choice for Density-weighted Averages," Journal of Econometrics, 75.

Qin, J., and J. Lawless (1994): "Empirical Likelihood and General Estimating Equations," The Annals of Statistics, 22(1).

Rilstone, P., V. K. Srivastava, and A. Ullah (1996): "The Second-order Bias and Mean Squared Error of Nonlinear Estimators," Journal of Econometrics, 75(2).

Robinson, P. M. (1991): "Automatic Frequency Domain Inference on Semiparametric and Nonparametric Models," Econometrica, 59(5).

Rothenberg, T. (1984): "Approximating the Distributions of Econometric Estimators and Test Statistics," in Handbook of Econometrics, ed. by Z. Griliches, and M. D. Intriligator, vol. II. Elsevier Science Publishers B.V.

Stock, J. H., J. H. Wright, and M. Yogo (2002): "A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments," Journal of Business and Economic Statistics, 20(4).

Sun, Y., and P. C. B. Phillips (2008): "Optimal Bandwidth Choice for Interval Estimation in GMM Regression," Discussion Paper 1661, Cowles Foundation, Yale University.

Sun, Y., P. C. B. Phillips, and S. Jin (2008): "Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing," Econometrica, 76(1).

Tamaki, K. (2007): "Second Order Optimality for Estimators in Time Series Regression Models," Journal of Multivariate Analysis, 98.

White, H., and I. Domowitz (1984): "Nonlinear Regression with Dependent Observations," Econometrica, 52(1).

Xiao, Z., and P. C. B. Phillips (1998): "Higher-order Approximations for Frequency Domain Time Series Regression," Journal of Econometrics, 86.

Zaman, A. (1981): "Estimators Without Moments: The Case of the Reciprocal of a Normal Mean," Journal of Econometrics, 15(2).

A Proofs

Lemma A.1. Under Assumptions and 2.3(a)-(c), $\hat\beta - \beta_0 = \psi_T T^{-1/2} + o_p(T^{-1/2})$ with $\psi_T$ = G 1 0 ĝ = $O_p(1)$.

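Proof. We need to check the assumptions of Newey and McFadden (1994, Theorem 3.2) with $\hat W$ replaced by $\hat\Omega_T(\tilde\beta)$. First of all, by Assumptions and Andrews (1991, Theorem 1(b)), $\hat\Omega_T(\tilde\beta) \to_p \Omega_0$. (i), (ii) and (v) hold by assumption. (iii) and (iv) hold by Assumption 2.1 and White and Domowitz (1984, Theorems 2.3, 2.4). Q.E.D.

Proof of Proposition 2.1. Step I: Expansion of the optimal weighting matrix. A Taylor expansion of $\hat\Omega_T(\tilde\beta)$ around $\beta_0$ yields

$$
\begin{aligned}
\operatorname{vec}\big(\hat\Omega_T(\tilde\beta)\big) &= \operatorname{vec}\big(\hat\Omega_T\big) + \nabla\hat\Omega_T(\bar\beta)(\tilde\beta - \beta_0) \\
&= \operatorname{vec}(\Omega_0) + \operatorname{vec}\big(\hat\Omega_T - \Omega_0\big) + \nabla\Omega_0(\tilde\beta - \beta_0) \\
&\quad + \big(\nabla\hat\Omega_T - \nabla\Omega_0\big)(\tilde\beta - \beta_0) + \big(\nabla\hat\Omega_T(\bar\beta) - \nabla\hat\Omega_T\big)(\tilde\beta - \beta_0)
\end{aligned}
\tag{A.1}
$$

where $\bar\beta$ lies on the line segment joining $\tilde\beta$ and $\beta_0$. By Assumptions and Andrews (1991, Proposition 1(a),(b)), $\hat\Omega_T - \Omega_0 = \omega_{1,T} S_T^{1/2} T^{-1/2} + \omega_{2,T} S_T^{-q}$ with $\omega_{i,T} = O_p(1)$, $i = 1, 2$, and $\nabla\hat\Omega_T - \nabla\Omega_0 = O_p(\eta_T)$.

Next, we show that $\nabla\hat\Omega_T(\bar\beta) - \nabla\hat\Omega_T = O_p(\|\tilde\beta - \beta_0\|)$. To this end, let $\bar g_t := g_t(\bar\beta)$. Notice that, by Assumption 2.1(f), $\tilde\beta \in \mathcal{N}$ w.p.a. 1 and, thus, $\bar\beta \in \mathcal{N}$ w.p.a. 1. From Assumption 2.3(d), we get

$$
\begin{aligned}
\big\|\nabla\hat\Omega_T(\bar\beta) - \nabla\hat\Omega_T\big\| &\le \frac{1}{T}\sum_{s=-(T-1)}^{T-1}\ \sum_{t=\max\{1,1-s\}}^{\min\{T,T-s\}} \Big|k\Big(\frac{s}{S_T}\Big)\Big|\, \big\|\nabla\big(\bar g_{t+s}\bar g_t'\big) - \nabla\big(g_{t+s} g_t'\big)\big\| \\
&\le C\,\frac{1}{T}\sum_{s=-(T-1)}^{T-1}\ \sum_{t=\max\{1,1-s\}}^{\min\{T,T-s\}} d(x_{t+s}, x_t)\,\|\bar\beta - \beta_0\| \\
&= C\,\big(E\,d(x_{t+s}, x_t) + o_p(1)\big)\,\|\bar\beta - \beta_0\| = O_p\big(\|\tilde\beta - \beta_0\|\big)
\end{aligned}
\tag{A.2}
$$

which holds w.p.a. 1 and for some constant $C$. (A.1) together with (A.2) and the first-order asymptotics in Assumption 2.1(f) then imply

$$
\hat\Omega_T(\tilde\beta) = \Omega_0 + \omega_{1,T} S_T^{1/2} T^{-1/2} + \omega_{2,T} S_T^{-q} + R_T \tag{A.3}
$$

with $R_T = O_p(\eta_T T^{-1/2})$, $\omega_{1,T} := \sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T) = O_p(1)$ and $\omega_{2,T} := S_T^{q}(\Omega_T - \Omega_0) = O(1)$.

Step II: Expansion of the second-step GMM estimator. Write the second-stage estimator $\hat\theta := (\hat\beta', \hat\lambda')'$ of $\theta_0 := (\beta_0', 0')' \in B \times \Lambda$, $\Lambda$ := [0, l, as the solution to

$$
\begin{pmatrix} G_T(\hat\beta)'\hat\lambda \\ \hat g(\hat\beta) + \hat\Omega_T(\tilde\beta)\hat\lambda \end{pmatrix} = 0. \tag{A.4}
$$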

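Further, define $\hat m(\theta) := T^{-1}\sum_{t=1}^{T} m_t(\theta)$ with

$$
m_t(\theta) := \begin{pmatrix} G(x_t, \beta)'\lambda \\ g_t(\beta) + \Omega_0\lambda \end{pmatrix}
$$

for some $\theta := (\beta', \lambda')' \in B \times \Lambda$. Then use the expansion in (A.3) to rewrite (A.4) as

$$
0 = \hat m(\hat\theta) + \begin{pmatrix} 0 \\ \big(\omega_{1,T} S_T^{1/2} T^{-1/2} + \omega_{2,T} S_T^{-q} + R_T\big)\hat\lambda \end{pmatrix}. \tag{A.5}
$$

Next, consider $\hat\lambda = -\hat\Omega_T(\tilde\beta)^{-1}\hat g(\hat\beta)$. By Assumptions and Andrews (1991, Proposition 1(a),(b), Theorem 1(b)), $\hat\Omega_T(\tilde\beta) - \Omega_0 = O_p(\eta_T)$. Also, by an expansion of $\hat g(\hat\beta)$ around $\beta_0$, Lemma A.1, Assumption 2.1 and the CLT, $\hat g(\hat\beta) = (I_l - G_0 H_0)\hat g + o_p(T^{-1/2})$, and thus

$$
\hat\lambda = -\big[\Omega_0 + O_p(\eta_T)\big]^{-1}\big((I_l - G_0 H_0) F_T T^{-1/2} + o_p(T^{-1/2})\big) = -P_0 F_T T^{-1/2} + O_p(\eta_T T^{-1/2}) + o_p(T^{-1/2}). \tag{A.6}
$$

Consider the following expansion of $\hat m(\hat\theta)$ around $\theta_0$:

$$
\hat m(\hat\theta) = \hat m(\theta_0) + \nabla\hat m(\theta_0)(\hat\theta - \theta_0) + \tfrac{1}{2}\big[\nabla^2\hat m(\bar\theta)(\hat\theta - \theta_0)\big](\hat\theta - \theta_0),
$$

where $\bar\theta$ lies on the line segment joining $\hat\theta$ and $\theta_0$. By Assumptions 2.1(e) and 2.3(c)-(d), $\nabla^2\hat m(\bar\theta)$ is $O_p(1)$. Lemma A.1, Assumption 2.1 and the CLT for mixing sequences then imply

$$
\hat m(\hat\theta) = \hat m(\theta_0) + M_1(\hat\theta - \theta_0) + O_p(T^{-1}). \tag{A.7}
$$

Substituting (A.6) and (A.7) into (A.5) and solving for $\hat\theta - \theta_0$ yields

$$
\hat\theta - \theta_0 = -M_1^{-1}\hat m(\theta_0) + M_1^{-1}\begin{pmatrix} 0 \\ \big(\omega_{1,T} S_T^{1/2} T^{-1/2} + \omega_{2,T} S_T^{-q}\big) P_0 F_T T^{-1/2} \end{pmatrix} + O_p(T^{-1}) + O_p(\eta_T^2 T^{-1/2}) + o_p(\eta_T T^{-1/2})
$$

where

$$
M_1 = \begin{pmatrix} 0 & G_0' \\ G_0 & \Omega_0 \end{pmatrix}, \qquad M_1^{-1} = \begin{pmatrix} -\Sigma_0 & H_0 \\ H_0' & P_0 \end{pmatrix}.
$$

Therefore,

$$
\sqrt{T}\,(\hat\beta - \beta_0) = -H_0 F_T + H_0\,\omega_{1,T} P_0 F_T\, S_T^{1/2} T^{-1/2} + H_0\,\omega_{2,T} P_0 F_T\, S_T^{-q} + o_p(\eta_T).
$$

Since $F_T$, $\omega_{1,T}$ and $\omega_{2,T}$ are $O_p(1)$, we also have that $\kappa_{i,T} = O_p(1)$ for $i = 1, 2, 3$. Q.E.D.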
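The block-inverse formula for $M_1$ used in Step II can be checked numerically. The following is a minimal sketch in exact rational arithmetic, assuming the usual definitions $\Sigma_0 = (G_0'\Omega_0^{-1}G_0)^{-1}$, $H_0 = \Sigma_0 G_0'\Omega_0^{-1}$ and $P_0 = \Omega_0^{-1} - \Omega_0^{-1}G_0\Sigma_0G_0'\Omega_0^{-1}$; the matrices `G0` and `Omega0` and all helper names are made up for illustration and are not taken from the paper.

```python
from fractions import Fraction


def mat(rows):
    """Convert nested lists of ints into a matrix of exact Fractions."""
    return [[Fraction(x) for x in row] for row in rows]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse(a):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(a)
    aug = [list(row) + eye for row, eye in zip(a, identity(n))]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def block(a, b, c, d):
    """Assemble the 2x2 block matrix [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


# Hypothetical example with l = 3 moment conditions and p = 2 parameters.
l, p = 3, 2
G0 = mat([[1, 0], [1, 1], [0, 2]])                # l x p, full column rank
Omega0 = mat([[2, 1, 0], [1, 2, 1], [0, 1, 2]])   # l x l, positive definite

Omega_inv = inverse(Omega0)
Sigma0 = inverse(matmul(matmul(transpose(G0), Omega_inv), G0))
H0 = matmul(matmul(Sigma0, transpose(G0)), Omega_inv)
OiGH = matmul(matmul(Omega_inv, G0), H0)
P0 = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(Omega_inv, OiGH)]

M1 = block([[Fraction(0)] * p for _ in range(p)], transpose(G0), G0, Omega0)
M1_inv = block([[-x for x in row] for row in Sigma0], H0, transpose(H0), P0)

product = matmul(M1, M1_inv)
print(product == identity(l + p))           # block formula reproduces the inverse
print(matmul(H0, G0) == identity(p))        # H0 G0 = I_p
```

Exact fractions are used so that the comparison with the identity matrix does not depend on a floating-point tolerance.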

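Proof of Theorem 2.1. We need to derive the order of $E[\kappa_{i,T}' W \kappa_{j,T}]$, $i, j \in \{1, 2, 3\}$. Consider the case $i = j = 1$:

$$
\begin{aligned}
E[\kappa_{1,T}' W \kappa_{1,T}] &= E[F_T' H_0' W H_0 F_T] = \frac{1}{T}\sum_{s,t=1}^{T} E[g_t' H_0' W H_0 g_s] \\
&= \sum_{s=-(T-1)}^{T-1}\Big(1 - \frac{|s|}{T}\Big) E[g_t' H_0' W H_0 g_{t-s}] \to \sum_{s=-\infty}^{\infty} E[g_t' H_0' W H_0 g_{t-s}].
\end{aligned}
$$

The limiting sum can be shown to be finite using Assumption 2.1, the Hölder inequality and the mixing inequality of Hall and Heyde (1980, Corollary A.2). Similarly, we can show that $E[\kappa_{2,T}' W \kappa_{2,T}]$ and $E[\kappa_{3,T}' W \kappa_{3,T}]$ are $O(1)$, $E[\kappa_{1,T}' W \kappa_{2,T}] = O(S_T^{1/2} T^{-1/2})$, but $E[\kappa_{1,T}' W \kappa_{3,T}] = o(\eta_T)$ and $E[\kappa_{2,T}' W \kappa_{3,T}] = o(1)$. Q.E.D.

Proof of Proposition 2.2. From the proof of Theorem 2.1,

$$
E[\kappa_{1,T}' W \kappa_{1,T}] \to \sum_{s=-\infty}^{\infty} E[g_t' H_0' W H_0 g_{t-s}] = \operatorname{vec}(\Omega_0)' \operatorname{vec}(H_0' W H_0).
$$

By Andrews (1991, Proposition 1(b)) we have $S_T^{q}(\Omega_T - \Omega_0) \to -g_q \Omega_0^{(q)}$ and thus

$$
\begin{aligned}
E[\kappa_{3,T}' W \kappa_{3,T}] &= E\big[F_T' P_0\, S_T^{q}(\Omega_T - \Omega_0)\, H_0' W H_0\, S_T^{q}(\Omega_T - \Omega_0)\, P_0 F_T\big] \\
&\to g_q^2 \sum_{s=-\infty}^{\infty} E\big[g_t' P_0 \Omega_0^{(q)} H_0' W H_0 \Omega_0^{(q)} P_0 g_{t-s}\big] \\
&= g_q^2 \operatorname{vec}(\Omega_0)' \operatorname{vec}\big(P_0 \Omega_0^{(q)} H_0' W H_0 \Omega_0^{(q)} P_0\big) \\
&= g_q^2 \operatorname{tr}\big(\Omega_0^{(q)} H_0' W H_0 \Omega_0^{(q)} P_0\big)
\end{aligned}
$$

because, for conformable matrices $A$, $B$, $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, and $P_0 G_0 H_0 = 0$. Next, consider the terms

$$
\begin{aligned}
E[\kappa_{1,T}' W \kappa_{2,T}] &= E\big[F_T' H_0' W H_0 \sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T) P_0 F_T\big] \\
&= E\big[(F_T' \otimes F_T') \operatorname{vec}\big(H_0' W H_0 \sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T) P_0\big)\big] \\
&= E\big[(F_T' \otimes F_T')(P_0 \otimes H_0' W H_0) \operatorname{vec}\big(\sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T)\big)\big] \\
&= E\big[F_T \otimes F_T \otimes \operatorname{vec}\big(\sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T)\big)\big]' \operatorname{vec}(P_0 \otimes H_0' W H_0)
\end{aligned}
$$

and, similarly,

$$
\begin{aligned}
E[\kappa_{2,T}' W \kappa_{2,T}] &= E\big[F_T' P_0 \sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T) H_0' W H_0 \sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T) P_0 F_T\big] \\
&= \operatorname{vec}(P_0 \otimes P_0)'\, \frac{T}{S_T}\, E\big[F_T \otimes F_T \otimes (\hat\Omega_T - \Omega_T) \otimes (\hat\Omega_T - \Omega_T)\big] \operatorname{vec}(H_0' W H_0).
\end{aligned}
$$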

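In order to find expressions for these two cross-products involving $\sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T)$, we make use of the BN-decomposition of the linear process $\{g_t\}$, viz.

$$
g_t = \Psi e_t + \tilde e_{t-1} - \tilde e_t
$$

where $\Psi := \sum_{j \ge 0} \Psi_j$, $\tilde e_t := \sum_{j \ge 0} \tilde\Psi_j e_{t-j}$ and the tail sums $\tilde\Psi_j := \sum_{k \ge j+1} \Psi_k$. With this representation of $\{g_t\}$ we can calculate limiting variances and covariances of $g_t$ based only on $\Psi e_t$ and disregard the transient part of the process, $\tilde e_{t-1} - \tilde e_t$. Since $e_t \sim N(0, \Sigma_e)$, third and fourth cumulants are zero. Therefore,

$$
\begin{aligned}
& S_T^{1/2} T^{3/2}\, E\big[F_T \otimes F_T \otimes \operatorname{vec}\big(\sqrt{T/S_T}\,(\hat\Omega_T - \Omega_T)\big)\big] \\
&\quad = E\Big[\sum_{s,t=1}^{T} g_t \otimes g_s \otimes \sum_{v=-(T-1)}^{T-1} k\Big(\frac{v}{S_T}\Big) \sum_{u=\max\{1,1-v\}}^{\min\{T,T-v\}} \operatorname{vec}\big(g_{u+v} g_u' - E[g_{u+v} g_u']\big)\Big] \\
&\quad = E\Big[\sum_{t=1}^{T} \Psi e_t \otimes \Psi e_t \otimes \sum_{u=1,\,u \ne t}^{T} \operatorname{vec}\big(\Psi e_u (\Psi e_u)' - E[\Psi e_u (\Psi e_u)']\big)\Big] \\
&\qquad + E\Big[\sum_{s,t=1}^{T} \Psi e_t \otimes \Psi e_s \otimes \sum_{v=-(T-1),\,v \ne 0}^{T-1} k\Big(\frac{v}{S_T}\Big) \sum_{u=\max\{1,1-v\}}^{\min\{T,T-v\}} \operatorname{vec}\big(\Psi e_{u+v} (\Psi e_u)'\big)\Big] + o(1) \\
&\quad = \sum_{s,t=1}^{T}\ \sum_{v=-(T-1),\,v \ne 0}^{T-1} k\Big(\frac{v}{S_T}\Big) \sum_{u=\max\{1,1-v\}}^{\min\{T,T-v\}} E\big[\Psi e_t \otimes \Psi e_s \otimes \Psi e_u \otimes \Psi e_{u+v}\big] + o(1) \\
&\quad = \sum_{s,t=1}^{T} k\Big(\frac{t-s}{S_T}\Big) E\big[\Psi e_t \otimes \Psi e_s \otimes \Psi e_s \otimes \Psi e_t\big] + \sum_{s,t=1}^{T} k\Big(\frac{t-s}{S_T}\Big) E\big[\Psi e_t \otimes \Psi e_s \otimes \Psi e_t \otimes \Psi e_s\big] + o(1).
\end{aligned}
$$

One can show that $E[\Psi e_t \otimes \Psi e_s \otimes \Psi e_s \otimes \Psi e_t] = E[\Psi e_t \otimes \Psi e_s \otimes \Psi e_t \otimes \Psi e_s] = \operatorname{vec}(\Omega_0 \otimes \Omega_0)$ when $s \ne t$ and 0 otherwise. Noticing $\frac{1}{T S_T}\sum_{s,t=1}^{T} k((t-s)/S_T) \to \mu_1$, we then have

$$
\begin{aligned}
\sqrt{T/S_T}\; E[\kappa_{1,T}' W \kappa_{2,T}] &\to 2\mu_1 \operatorname{vec}(\Omega_0 \otimes \Omega_0)' \operatorname{vec}(P_0 \otimes H_0' W H_0) \\
&= 2\mu_1 \operatorname{tr}\big((\Omega_0 \otimes \Omega_0)(P_0 \otimes H_0' W H_0)\big) \\
&= 2\mu_1 \operatorname{tr}\big(\Omega_0 P_0 \otimes \Omega_0 H_0' W H_0\big) \\
&= 2\mu_1 \operatorname{tr}(I_l - G_0 H_0)\, \operatorname{tr}\big(\Omega_0 \Omega_0^{-1} G_0 \Sigma_0 W \Sigma_0 G_0' \Omega_0^{-1}\big) \\
&= 2\mu_1 \big(l - \operatorname{tr}(H_0 G_0)\big) \operatorname{tr}\big(\Sigma_0 W \Sigma_0 G_0' \Omega_0^{-1} G_0\big) \\
&= 2\mu_1 (l - p) \operatorname{tr}(\Sigma_0 W)
\end{aligned}
$$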

which uses the fact that $H_0 G_0 = I_p$. By a similar derivation, $E[\kappa_{2,T}' W \kappa_{2,T}] \to \mu_2 (l - p) \operatorname{tr}(\Sigma_0 W)$ so that $\nu_2 = (2\mu_1 + \mu_2)(l - p) \operatorname{tr}(\Sigma_0 W)$. Q.E.D.

AR(1)-HOM, $T = 64$, $l = 10$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "optimal", "Andrews", "naive", "sim"; for each value of $\rho$, rows "bw", "bias", "sd", "MSE".]
Table 1: Bandwidths ("bw"), bias, standard deviation ("SD") and MSE of $\hat\beta$ when computed based on the MSE($\hat\beta$)-optimal ("optimal"), the MSE($\hat\Omega$)-optimal ("Andrews"), $S_T = T^{1/(1+2q)}$ ("naive") or the simulated MSE-minimizing ("sim") bandwidth.
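The trace identities that close the proof of Proposition 2.2, namely $\operatorname{tr}(\Omega_0 P_0) = l - p$ and $\operatorname{tr}(\Omega_0 P_0 \otimes \Omega_0 H_0' W H_0) = (l - p)\operatorname{tr}(\Sigma_0 W)$, can be verified on a small example. This is only an illustrative sketch: the matrices `G0`, `Omega0` and `W` are hypothetical, and $\Sigma_0$, $H_0$, $P_0$ are assumed to have their usual two-step GMM definitions.

```python
from fractions import Fraction


def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(a)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


# Hypothetical example: l = 3 moments, p = 2 parameters, symmetric weight W.
l, p = 3, 2
G0 = mat([[1, 0], [1, 1], [0, 2]])
Omega0 = mat([[2, 1, 0], [1, 2, 1], [0, 1, 2]])
W = mat([[2, 1], [1, 3]])

Omega_inv = inverse(Omega0)
Sigma0 = inverse(matmul(matmul(transpose(G0), Omega_inv), G0))
H0 = matmul(matmul(Sigma0, transpose(G0)), Omega_inv)
OiGH = matmul(matmul(Omega_inv, G0), H0)
P0 = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(Omega_inv, OiGH)]
A = matmul(matmul(transpose(H0), W), H0)          # H0' W H0

lhs = trace(kron(matmul(Omega0, P0), matmul(Omega0, A)))
rhs = (l - p) * trace(matmul(Sigma0, W))
print(lhs == rhs)
print(trace(matmul(Omega0, P0)) == l - p)
```

Both comparisons are exact because the computation uses rational numbers. The factor $l - p$ also makes visible why the term vanishes when $l = p$.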

AR(1)-HOM, $T = 64$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "MSE ratio", "HMSE ratio", $\mu^2/l$; rows indexed by $\rho$ and $l$.]
Table 2: Ratios of MSE ("MSE ratio") and higher-order MSE ("HMSE ratio") based on the MSE($\hat\beta$)-optimal bandwidth divided by those based on the MSE($\hat\Omega$)-optimal bandwidth. $\mu^2/l$ is the standardized concentration parameter measuring the strength of the instruments.

AR(1)-HET, $T = 64$, $l = 10$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "optimal", "Andrews", "naive", "sim"; for each value of $\rho$, rows "bw", "bias", "sd", "MSE".]
Table 3: Bandwidths ("bw"), bias, standard deviation ("SD") and MSE of $\hat\beta$ when computed based on the MSE($\hat\beta$)-optimal ("optimal"), the MSE($\hat\Omega$)-optimal ("Andrews"), $S_T = T^{1/(1+2q)}$ ("naive") or the simulated MSE-minimizing ("sim") bandwidth.

AR(1)-HET, $T = 64$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "MSE ratio", "HMSE ratio", $\mu^2/l$; rows indexed by $\rho$ and $l$.]
Table 4: Ratios of MSE ("MSE ratio") and higher-order MSE ("HMSE ratio") based on the MSE($\hat\beta$)-optimal bandwidth divided by those based on the MSE($\hat\Omega$)-optimal bandwidth. $\mu^2/l$ is the standardized concentration parameter measuring the strength of the instruments.

MA(1), $T = 64$, $l = 10$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "optimal", "Andrews", "naive", "sim"; for each value of $\rho$, rows "bw", "bias", "sd", "MSE".]
Table 5: Bandwidths ("bw"), bias, standard deviation ("SD") and MSE of $\hat\beta$ when computed based on the MSE($\hat\beta$)-optimal ("optimal"), the MSE($\hat\Omega$)-optimal ("Andrews"), $S_T = T^{1/(1+2q)}$ ("naive") or the simulated MSE-minimizing ("sim") bandwidth.

MA(1), $T = 64$
[Table layout: for each of $\gamma = 0.1$ and $\gamma = 2$, columns "MSE ratio", "HMSE ratio", $\mu^2/l$; rows indexed by $\rho$ and $l$.]
Table 6: Ratios of MSE ("MSE ratio") and higher-order MSE ("HMSE ratio") based on the MSE($\hat\beta$)-optimal bandwidth divided by those based on the MSE($\hat\Omega$)-optimal bandwidth. $\mu^2/l$ is the standardized concentration parameter measuring the strength of the instruments.
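The bandwidths compared in the tables all enter a kernel HAC estimator of the long-run variance. As a rough illustration, the sketch below computes such an estimator together with the "naive" rule $S_T = T^{1/(1+2q)}$. It assumes the Bartlett kernel with $q = 1$ and uncentered moment products; the function names and the toy data are hypothetical, and the paper's data-driven MSE($\hat\beta$)-optimal rule is not reproduced here.

```python
def bartlett(x):
    """Bartlett kernel: k(x) = 1 - |x| for |x| <= 1 and 0 otherwise."""
    return max(0.0, 1.0 - abs(x))


def naive_bandwidth(T, q=1):
    """Rule-of-thumb bandwidth S_T = T^(1/(1+2q)); q = 1 is assumed for Bartlett."""
    return T ** (1.0 / (1 + 2 * q))


def hac(g, S, kernel=bartlett):
    """Kernel HAC estimate: sum over lags v of k(v/S) * Gamma_hat(v), where
    Gamma_hat(v) = (1/T) * sum_t g[t+v] g[t]' over all t with both indices valid."""
    T, l = len(g), len(g[0])
    omega = [[0.0] * l for _ in range(l)]
    for v in range(-(T - 1), T):
        w = kernel(v / S)
        if w == 0.0:
            continue  # lags outside the kernel support contribute nothing
        for t in range(max(0, -v), min(T, T - v)):
            for i in range(l):
                for j in range(l):
                    omega[i][j] += w * g[t + v][i] * g[t][j] / T
    return omega


# Toy moment series with T = 4 observations and l = 2 moment conditions.
g = [[1.0, 0.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, 0.0]]

omega_lag0 = hac(g, 1.0)   # S = 1: only lag 0 receives positive weight
omega_s2 = hac(g, 2.0)     # S = 2: lags -1, 0, 1 receive positive weight
S_naive = naive_bandwidth(64)

print(omega_lag0)
print(omega_s2)
print(S_naive)
```

With $S = 1$ the estimate reduces to the sample second-moment matrix, which is a convenient sanity check before trying larger bandwidths.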


More information

Integrating Financial Statement Modeling and Sales Forecasting

Integrating Financial Statement Modeling and Sales Forecasting Integrating Financial Statement Modeling and Sales Forecasting John T. Cuddington, Colorado School of Mines Irina Khindanova, University of Denver ABSTRACT This paper shows how to integrate financial statement

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

Forecasting methods applied to engineering management

Forecasting methods applied to engineering management Forecasting methods applied to engineering management Áron Szász-Gábor Abstract. This paper presents arguments for the usefulness of a simple forecasting application package for sustaining operational

More information

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

More information

Statistical Tests for Multiple Forecast Comparison

Statistical Tests for Multiple Forecast Comparison Statistical Tests for Multiple Forecast Comparison Roberto S. Mariano (Singapore Management University & University of Pennsylvania) Daniel Preve (Uppsala University) June 6-7, 2008 T.W. Anderson Conference,

More information

On the long run relationship between gold and silver prices A note

On the long run relationship between gold and silver prices A note Global Finance Journal 12 (2001) 299 303 On the long run relationship between gold and silver prices A note C. Ciner* Northeastern University College of Business Administration, Boston, MA 02115-5000,

More information

Time Series Analysis

Time Series Analysis Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:

More information

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of

More information

Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization

Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Archis Ghate a and Robert L. Smith b a Industrial Engineering, University of Washington, Box 352650, Seattle, Washington,

More information

Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables. Generation of random variables (r.v.) Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

More information

171:290 Model Selection Lecture II: The Akaike Information Criterion

171:290 Model Selection Lecture II: The Akaike Information Criterion 171:290 Model Selection Lecture II: The Akaike Information Criterion Department of Biostatistics Department of Statistics and Actuarial Science August 28, 2012 Introduction AIC, the Akaike Information

More information

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C. CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

More information

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document

More information

Extreme Movements of the Major Currencies traded in Australia

Extreme Movements of the Major Currencies traded in Australia 0th International Congress on Modelling and Simulation, Adelaide, Australia, 1 6 December 013 www.mssanz.org.au/modsim013 Extreme Movements of the Major Currencies traded in Australia Chow-Siing Siaa,

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

Contents. List of Figures. List of Tables. List of Examples. Preface to Volume IV

Contents. List of Figures. List of Tables. List of Examples. Preface to Volume IV Contents List of Figures List of Tables List of Examples Foreword Preface to Volume IV xiii xvi xxi xxv xxix IV.1 Value at Risk and Other Risk Metrics 1 IV.1.1 Introduction 1 IV.1.2 An Overview of Market

More information

Vilnius University. Faculty of Mathematics and Informatics. Gintautas Bareikis

Vilnius University. Faculty of Mathematics and Informatics. Gintautas Bareikis Vilnius University Faculty of Mathematics and Informatics Gintautas Bareikis CONTENT Chapter 1. SIMPLE AND COMPOUND INTEREST 1.1 Simple interest......................................................................

More information

Appendix 1: Time series analysis of peak-rate years and synchrony testing.

Appendix 1: Time series analysis of peak-rate years and synchrony testing. Appendix 1: Time series analysis of peak-rate years and synchrony testing. Overview The raw data are accessible at Figshare ( Time series of global resources, DOI 10.6084/m9.figshare.929619), sources are

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5.

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5. PUTNAM TRAINING POLYNOMIALS (Last updated: November 17, 2015) Remark. This is a list of exercises on polynomials. Miguel A. Lerma Exercises 1. Find a polynomial with integral coefficients whose zeros include

More information

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

More information

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston

More information

On the mathematical theory of splitting and Russian roulette

On the mathematical theory of splitting and Russian roulette On the mathematical theory of splitting and Russian roulette techniques St.Petersburg State University, Russia 1. Introduction Splitting is an universal and potentially very powerful technique for increasing

More information

the points are called control points approximating curve

the points are called control points approximating curve Chapter 4 Spline Curves A spline curve is a mathematical representation for which it is easy to build an interface that will allow a user to design and control the shape of complex curves and surfaces.

More information

Examples. David Ruppert. April 25, 2009. Cornell University. Statistics for Financial Engineering: Some R. Examples. David Ruppert.

Examples. David Ruppert. April 25, 2009. Cornell University. Statistics for Financial Engineering: Some R. Examples. David Ruppert. Cornell University April 25, 2009 Outline 1 2 3 4 A little about myself BA and MA in mathematics PhD in statistics in 1977 taught in the statistics department at North Carolina for 10 years have been in

More information

Threshold Autoregressive Models in Finance: A Comparative Approach

Threshold Autoregressive Models in Finance: A Comparative Approach University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Informatics 2011 Threshold Autoregressive Models in Finance: A Comparative

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Quantitative Methods for Finance

Quantitative Methods for Finance Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

More information

Time Series Analysis

Time Series Analysis Time Series Analysis Identifying possible ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos

More information

Inequality, Mobility and Income Distribution Comparisons

Inequality, Mobility and Income Distribution Comparisons Fiscal Studies (1997) vol. 18, no. 3, pp. 93 30 Inequality, Mobility and Income Distribution Comparisons JOHN CREEDY * Abstract his paper examines the relationship between the cross-sectional and lifetime

More information

Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization

Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization Jean- Damien Villiers ESSEC Business School Master of Sciences in Management Grande Ecole September 2013 1 Non Linear

More information