Monte Carlo, Bootstrap, and Jackknife Estimation
- Ellen Hodges
- 7 years ago
Assume that your true model is

$$y = X\beta + u, \qquad (1.1)$$

where $u$ is i.i.d. with 1) $E(u \mid X) = 0$ and 2) $E(uu' \mid X) = \sigma^2 I$; that is, the conditional mean of the error is zero and there is no autocorrelation or heteroskedasticity conditional on $X$. Then, using 1), the ordinary least squares (OLS) estimator of $\beta$, $\hat\beta = (X'X)^{-1}X'y$, is unbiased. You will want to estimate the variance of $\hat\beta$. Using 2), an estimator of $\mathrm{var}(\hat\beta) = \sigma^2 (X'X)^{-1}$ is

$$\hat\sigma^2 (X'X)^{-1}, \qquad (1.2)$$

where $\hat\sigma^2 = \hat u'\hat u/(N-K)$, $\hat u = y - X\hat\beta$, $N$ = the number of observations, and $K$ = the number of regressors. To understand what is meant by $\mathrm{var}(\hat\beta)$ and its estimator, consider the following Monte Carlo procedure. Keep in mind that you would never want to apply this procedure to the classical linear model in (1.1) for actual data, because you can easily evaluate (1.2).

1. MONTE CARLO ESTIMATION OF STANDARD ERRORS OF $\hat\beta$ (= positive square root of the estimated variance).

a. Assume a value for $\beta$, which is otherwise unobservable. You also select a matrix of values for $X$, which you hold constant over repeated trials.

b. Draw $u$ randomly from some distribution you assume to be correct, using a random number generator. This use of a random number generator gives the method its name: Monte Carlo is famed for its roulette wheels and games of chance.

c. Compute $y$ from equation (1.1).

d. Estimate $\hat\beta$ by regressing your generated $y$ on $X$, obtaining

$$\hat\beta = (X'X)^{-1}X'y. \qquad (1.3)$$

e. Repeat steps (b)-(d) many times (say 10,000), holding $\beta$ and $X$ constant. You have then generated 10,000 drawings of the random variables $u$ and $y$ through equation (1.1), from which you compute 10,000 estimates $\hat\beta$ using (1.3). The sample variance of $\hat\beta$ over these 10,000 outcomes is the sample measure of the population variance of $\hat\beta$; its positive square root is the Monte Carlo estimate of the standard error.
f. The main use of the Monte Carlo method is to compute the bias and mean-square error of your estimator when it is difficult to do so analytically. However, it is also useful for demonstrating omitted variable bias and related results to econometrics students. Keep in mind that the assumptions made in the Monte Carlo method may make your results specific to your exact model.

2. BOOTSTRAP ESTIMATION OF THE STANDARD ERRORS OF $\hat\beta$.

The term bootstrap implies that you are going to pull yourself up by your bootstraps. The wide range of bootstrap methods falls into two categories: 1) methods that allow computation of standard errors when the analytical formulas are hard to derive and 2) methods that lead to better small-sample approximations. Here you are confined to one actual data set and wish to resample from the empirical distribution of the residuals or the original $\{X, y\}$ data, rather than assume some distribution of the true error as with Monte Carlo analysis.

Given (1.1) as your true model, you could easily evaluate (1.2) and not do any bootstrapping. With more complex models containing non-normal error terms and non-linearities in $\beta$, or in two-step models where you need to correct the estimated standard errors of the second-step estimators, the derivation of analytical formulas for the variance of $\hat\beta$ is complex. Examples are two-step M estimators, two-step panel data estimators, and two-step logit or probit estimators. In these cases $\hat\theta$, a second-step estimator, is a function of parameters that are estimated in the first step. The bootstrap will adjust for this. For software that does not compute heteroskedasticity-consistent (HC) standard errors, if there is heteroskedasticity in the model, the wild and pairs bootstrap estimators make the HC correction of the estimated standard errors. With clustered data, cluster-robust standard errors can be obtained by resampling the clusters via bootstrapping.
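For contrast with the bootstrap methods discussed in this section, the Monte Carlo procedure of steps (a)-(f) can be sketched in a few lines. This is only an illustration under stated assumptions: a simple regression with an intercept and one regressor, normal errors, and hypothetical values for the parameters, sample size, and seed (none of these come from the notes, and the helper names `ols_slope` and `monte_carlo_slopes` are made up for the example).

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope for the simple regression y = a + b*x + u."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


def monte_carlo_slopes(x, alpha, beta, sigma, reps, rng):
    """Redraw u, rebuild y from the assumed model, and re-estimate the slope."""
    slopes = []
    for _ in range(reps):
        u = [rng.gauss(0.0, sigma) for _ in x]                 # step b
        y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]   # step c
        slopes.append(ols_slope(x, y))                         # step d
    return slopes


rng = random.Random(12345)                       # fixed seed (illustrative)
x = [rng.uniform(0.0, 10.0) for _ in range(50)]  # step a: X held fixed
alpha, beta, sigma = 1.0, 2.0, 3.0               # step a: assumed true values

slopes = monte_carlo_slopes(x, alpha, beta, sigma, 10_000, rng)  # step e
mc_se = statistics.stdev(slopes)                 # Monte Carlo standard error
mc_bias = statistics.fmean(slopes) - beta        # Monte Carlo bias estimate

# Analytic benchmark: square root of the slope element of sigma^2 (X'X)^{-1}.
xbar = sum(x) / len(x)
true_se = sigma / sum((xi - xbar) ** 2 for xi in x) ** 0.5

print(f"Monte Carlo SE {mc_se:.4f}, analytic SE {true_se:.4f}, bias {mc_bias:.5f}")
```

Because the classical model holds by construction here, the Monte Carlo standard error should land close to the analytic value and the estimated bias close to zero, which is exactly why the procedure is unnecessary for (1.1) and useful mainly when no analytic formula is available.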
Theory and Monte Carlo evidence indicate that the bootstrap estimates are more accurate (measured by the size and power of a t-test based on the estimated standard error) in small samples than the asymptotic formula, when an asymptotically pivotal statistic is employed (one whose asymptotic normal distribution does not depend on unknown parameters). Otherwise, there is no guarantee of a gain in accuracy. However, usually there is a gain in accuracy even if an asymptotically pivotal statistic is not employed. A nice summary is found in "Bootstrap Inference in Econometrics" by James MacKinnon, Dept. of Economics Working Paper, Queen's Univ., June. Also, see J. L. Horowitz, "The Bootstrap," Ch. 52, Handbook of Econometrics, J. J. Heckman and E. Leamer, editors, Vol. 5, 2001, for technical derivations.

There are three basic non-parametric bootstrap methods we will focus on: the naive (residual) bootstrap, the pairs bootstrap, and the wild bootstrap. These are in contrast to the less
popular parametric bootstrap. The three methods are most easily explained for the simple model (1.1). As with the Monte Carlo method, you assume that the model generating your data is the same as in (1.1). However, now you do not assume knowledge of $\beta$ or $u$ and do not generate random data from (1.1). Instead you use the estimator $\hat\beta$ and the original data $\{X, y\}$.

1. Bootstrap Methods

1.1 Non-Parametric: Residual Bootstrap

a. Estimate $\hat\beta = (X'X)^{-1}X'y$.

b. Compute $\hat u = y - X\hat\beta$. You work with $\hat u$ instead of assuming the distribution of $u$ as in Monte Carlo estimation.

c. Draw with replacement a sample of size $N$ using a discrete uniform random number generator $U[1, N]$, where $N$ is your sample size. Let these random numbers be represented by $z_1, \ldots, z_N$. Generate element $u^*_n$ as element $z_n$ of $\hat u$, $n = 1, \ldots, N$. This means that each element of $\hat u$ has probability $1/N$ of being drawn. See the Residual Bootstrap example in the Stata do file called monte carlo.do.

d. Treating $y^* = X\hat\beta + u^*$ as your true model, compute $y^*$.

e. Compute $\beta^* = (X'X)^{-1}X'y^*$.

f. Repeat (c)-(e) $B$ times. See MacKinnon for details.

g. Compute the sample variance of these $\beta^*$ estimates. With $B$ bootstrap replications $\beta^*_1, \ldots, \beta^*_B$, compute

$$s^2_{\hat\beta,Boot} = \frac{1}{B-1}\sum_{b=1}^{B}(\beta^*_b - \bar\beta^*)^2, \qquad \text{where } \bar\beta^* = B^{-1}\sum_{b=1}^{B}\beta^*_b.$$

h. Take the square root of $s^2_{\hat\beta,Boot}$ to get the bootstrap estimate of the standard error of $\hat\beta$.

i. This bootstrap provides no asymptotic refinement (an improved approximation to the finite-sample distribution of an asymptotically pivotal statistic), since its distribution depends on the unknown parameters defining the mean and variance of $\hat\beta$. That is, there is no guarantee of an improvement in finite-sample performance. However, such an improvement usually obtains anyway. This method can be very useful in computing adjusted standard
errors with two-step models or in computing cluster-robust standard errors by resampling clusters.

1.2 Non-Parametric: Pairs Bootstrap

a. Follow step a of section 1.1.

b. Then draw pairs randomly with replacement from $\{X, y\}$, where the probability of any pair being drawn is equal to $1/N$, to obtain $\{X^*, y^*\}$.

c. Then use the $\{X^*, y^*\}$ data to obtain the pairs estimator $\beta^*_p = (X^{*\prime}X^*)^{-1}X^{*\prime}y^*$.

d. Note that the pairs bootstrap produces an HC covariance matrix. See Lancaster (2003) for a proof of this. See the Pairs Estimator in the Stata do file called monte carlo.do.

1.3 Non-Parametric: Wild Bootstrap

a. The wild bootstrap also produces an HC covariance matrix; see MacKinnon (2002) for details.

b. The wild bootstrap first generates

$$y^*_n = X_n\hat\beta + f(\hat u_n)\,v^*_n, \qquad (1.4)$$

where

$$f(\hat u_n) = \frac{\hat u_n}{(1 - h_n)^{1/2}} \qquad (1.5)$$

and $h_n$ is the $n$th diagonal element of $X(X'X)^{-1}X'$. We do this normalization so that, if $u_n$ is homoskedastic, then the normalized residual in (1.5) is homoskedastic. To see this, remember that $\hat u = Mu$ with $M = I - X(X'X)^{-1}X'$, so that under homoskedasticity $\mathrm{var}(\hat u_n \mid X) = \sigma^2(1 - h_n)$, where $(1 - h_n)$, the $n$th diagonal element of $M$, is sometimes called $m_n$.

c. The best approach to specifying $v^*_n$ is to use the Rademacher distribution (see Davidson and Flachaire (2001)):

$$v^*_n = \begin{cases} 1 & \text{with probability } 1/2, \\ -1 & \text{with probability } 1/2. \end{cases} \qquad (1.6)$$

d. Now $v^*_n$ has $E(v^*_n) = 0$, $E(v^{*2}_n) = 1$, $E(v^{*3}_n) = 0$, and $E(v^{*4}_n) = 1$. Since $v^*_n$ and $\hat u_n$ are independent, the mean of the composite residual is zero, which preserves $E(\hat u_n) = 0$. This is a nice property, and if we take $X_n$ as given, it implies unbiasedness of $\beta^*$.
e. One can prove that $\mathrm{var}(wz) = \mathrm{var}(w)\,\mathrm{var}(z)$ assuming independence of $w$ and $z$ and $E(w) = E(z) = 0$. Then the variance of the composite residual is one times the variance of $\hat u_n$, which preserves the variance of $\hat u_n$; the skewness of $\hat u_n$ is eliminated, but the kurtosis of $\hat u_n$ is preserved. Further, Wu and Mammen (1993) show that the asymptotic distribution of their version of the wild bootstrap is the same as the asymptotic distribution of various statistics. These asymptotic refinements are due to their wild bootstrap's taking account of the skewness of $\hat u_n$. However, their version of the wild bootstrap ignores kurtosis.

f. Now follow steps (e)-(h) of section 1.1 using the wild data for $y^*$ generated in step (b) of this section. See the Wild Estimator in the Stata do file called monte carlo.do.

1.4 Pairs vs. Wild

Based on Atkinson and Cornwell, "Inference in Two-Step Panel Data Models with Instruments and Time-Invariant Regressors: Bootstrap versus Analytic Estimators," for models with endogeneity, the wild bootstrap has more accurate size and virtually the same power as the pairs estimator in estimation of t-values for the second-step estimators. Both generally outperform the asymptotic formula in terms of size and power. In a linear model context without panel data, Davidson and Flachaire (2001) find that the wild bootstrap often outperforms the pairs bootstrap when the error is heteroskedastic.

1.5 Parametric Bootstrap

If it is known that $y_n \sim \mathrm{Normal}[\mu, \sigma^2]$, then we could obtain $B$ bootstrap samples of size $N$ by drawing from the $\mathrm{Normal}[\hat\mu, s^2]$ distribution. This is an example of a parametric bootstrap.

2. Number of Bootstrap Draws

The bootstrap asymptotics rely on big $N$, even if $B$ is small. However, the bootstrap is more accurate with big $B$. How large $B$ should be depends on the simulation error you can accept in your work. Davidson and MacKinnon recommend $B = 399$ for tests at a level of .05 and $B = 1{,}499$ for tests at a level of .01.
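The residual, pairs, and wild schemes of sections 1.1-1.3 differ only in how each bootstrap sample is built, which a short sketch can make concrete. The code below is an illustration, not a translation of the Stata do file: it assumes a simple regression with an intercept and one regressor (so the leverage $h_n$ has the closed form $1/N + (x_n - \bar x)^2/\sum_m (x_m - \bar x)^2$), uses hypothetical heteroskedastic data, and the function names are invented for the example.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) for the simple regression y = a + b*x + u."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b


def bootstrap_se(x, y, scheme, B, rng):
    """Bootstrap standard error of the slope under the chosen scheme."""
    n = len(x)
    a_hat, b_hat = ols(x, y)
    fitted = [a_hat + b_hat * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    h = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]     # leverages h_n
    scaled = [ri / (1.0 - hi) ** 0.5 for ri, hi in zip(resid, h)]  # eq. (1.5)
    slopes = []
    for _ in range(B):
        if scheme == "residual":      # 1.1: resample residuals, keep X fixed
            u_star = [resid[rng.randrange(n)] for _ in range(n)]
            xs, ys = x, [fi + ui for fi, ui in zip(fitted, u_star)]
        elif scheme == "pairs":       # 1.2: resample (x_n, y_n) pairs
            idx = [rng.randrange(n) for _ in range(n)]
            xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        elif scheme == "wild":        # 1.3: Rademacher sign flips, eq. (1.4)
            v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
            xs, ys = x, [fi + si * vi for fi, si, vi in zip(fitted, scaled, v)]
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        slopes.append(ols(xs, ys)[1])
    return statistics.stdev(slopes)   # divisor B - 1, as in s^2_{beta,Boot}


rng = random.Random(2024)             # fixed seed (illustrative)
n = 100
x = [rng.uniform(1.0, 10.0) for _ in range(n)]
# Hypothetical heteroskedastic data: the error s.d. grows with x.
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

se = {s: bootstrap_se(x, y, s, 399, rng) for s in ("residual", "pairs", "wild")}
for scheme, value in se.items():
    print(f"{scheme:>8}: {value:.4f}")
```

With heteroskedastic errors the pairs and wild estimates target the same HC standard error, while the residual bootstrap, which reshuffles residuals across observations, reproduces the homoskedastic formula.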
If you are performing bootstrapping within a Monte Carlo analysis, then $B = 399$ is adequate. You need to have $\alpha(B + 1)$ be an integer. Note: if you assume a two-sided confidence interval with $\alpha = .05$, then for the upper tail $399 \times .025 = 9.975$ (about 9.98) is the theoretical number of significant t-values you would
expect if the size were correct. You would array the t-values from high to low. With 400 bootstrap draws, $400 \times .025 = 10$, which says that you should have 10 t-values equal to 1.96 or greater to have correct size. However, if the 10th-ranked t-value is the last t-value greater than or equal to 1.96, it is unclear whether the 10th-ranked t-value belongs to one set or the other; it sits on the cusp. Since .025 of 399 is 9.975 (about 9.98), which is not an integer, you eliminate this ambiguity. This is not a major issue in my opinion.

3. Bias Adjustment Using the Bootstrap or Jackknife

In small samples many sandwich estimators may be biased. Weak instruments may also cause bias. We can correct for these biases using the bootstrap or the jackknife via the following:

a. Since the bootstrap estimator of $\hat\beta$ is $\frac{1}{B}\sum_{b=1}^{B}\beta^*_b$, we can compute the bias-corrected estimator of $\beta$ as

$$\hat\beta - \left(\frac{1}{B}\sum_{b=1}^{B}\beta^*_b - \hat\beta\right) = 2\hat\beta - \frac{1}{B}\sum_{b=1}^{B}\beta^*_b.$$

The intuition is that since we do not know $\beta$, we treat $\hat\beta$ as the true value and determine the bias of the bootstrap estimator relative to this value. We then adjust $\hat\beta$ by this computed bias, assuming that the bias of the bootstrap estimator relative to $\hat\beta$ is the same as the bias of $\hat\beta$ relative to $\beta$.

b. We can compute the jackknife estimator of the standard deviation of $\hat\beta$ for a sample of size $N$ by computing $N$ jackknife estimates of $\beta$, obtained by successively dropping observation $n$, $n = 1, \ldots, N$, and recomputing $\beta_{J,n}$, where $J$ stands for jackknife. Then compute the variance of the $N$ estimates (using divisor $N$) and multiply by $N - 1$ to get the estimated variance of $\hat\beta$, that is, $\frac{N-1}{N}\sum_{n=1}^{N}(\beta_{J,n} - \bar\beta_J)^2$, where $\bar\beta_J$ is the average of the $N$ estimates. Take the square root to get the estimated standard error. We can employ the jackknife two-stage-least-squares (JK2SLS) estimator of Hahn, J., and J. Hausman (2003), "Weak Instruments: Diagnosis and Cures in Empirical Econometrics," American Economic Review Papers and Proceedings 93, to correct for the bias caused by weak instruments.
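The bootstrap bias correction of item (a) and the jackknife standard error of item (b) can be sketched as follows. This is a minimal illustration under assumptions that are not in the notes: a simple regression slope as the estimator, pairs resampling for the bootstrap draws (the text does not say which scheme to use), hypothetical data, and invented function names.

```python
import random


def ols_slope(x, y):
    """OLS slope for the simple regression y = a + b*x + u."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx


def bootstrap_bias_corrected(x, y, B, rng):
    """Item (a): 2*beta_hat minus the mean of the bootstrap estimates."""
    n = len(x)
    draws = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]   # pairs resampling (assumed)
        draws.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return 2.0 * ols_slope(x, y) - sum(draws) / B


def jackknife_se(x, y):
    """Item (b): leave-one-out estimates, divisor-N variance times (N - 1)."""
    n = len(x)
    loo = [ols_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    var_n = sum((b - mean_loo) ** 2 for b in loo) / n
    return ((n - 1) * var_n) ** 0.5


rng = random.Random(31)                       # fixed seed (illustrative)
x = [rng.uniform(0.0, 10.0) for _ in range(60)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 3.0) for xi in x]   # hypothetical data

b_hat = ols_slope(x, y)
beta_bc = bootstrap_bias_corrected(x, y, 399, rng)
jk_se = jackknife_se(x, y)
print(f"OLS {b_hat:.4f}, bias-corrected {beta_bc:.4f}, jackknife SE {jk_se:.4f}")
```

Since OLS is unbiased in this setup, the correction should be small relative to the standard error; the adjustment matters in the biased cases the section mentions, such as weak instruments.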
The formula for the jackknife bias correction is given in Shao and Tu (1995). To compute the jackknife bias correction for the estimated coefficients, let $\hat\beta$ be the estimator of $\beta$ for a sample of size $N$. First compute $N$ jackknife estimates of $\hat\beta$ obtained by successively dropping one observation and recomputing $\hat\beta$. Call each of these $N$ estimates $\beta_{J,n}$, $n = 1, \ldots, N$, and their average $\bar\beta_J = N^{-1}\sum_{n=1}^{N}\beta_{J,n}$. Define the jackknife bias estimator as

$$BIAS_J = (N-1)(\bar\beta_J - \hat\beta). \qquad (1.7)$$
Then the jackknife bias-adjusted (BA) estimator of $\beta$ is

$$\hat\beta_{BA} = \hat\beta - BIAS_J = N\hat\beta - (N-1)\bar\beta_J. \qquad (1.8)$$

Again, the intuition is that since we do not know $\beta$, we treat $\hat\beta$ as the true value and determine the bias of the jackknife estimator relative to this value. We then adjust $\hat\beta$ by this computed bias, assuming that the bias of the jackknife estimator relative to $\hat\beta$ is the same as the bias of $\hat\beta$ relative to $\beta$.

c. The jackknife uses fewer computations ($N < B$) than the bootstrap, but is outperformed by the bootstrap as $B \to \infty$.

4. Hypothesis Testing

Assume a model $y = \alpha + x\beta + u$. You can compute $t = (\hat\beta - \beta)/s_{\hat\beta,Boot}$, using the bootstrap estimator of the standard deviation. For the specific null hypothesis that $\beta = 0$ you would compute $t = (\hat\beta - 0)/s_{\hat\beta,Boot}$. While this is asymptotically valid so long as $\beta^*$ and $\hat\beta$ approach the true $\beta$, this will not give you asymptotic refinements for any $N$. To obtain asymptotic refinement, we need to compute asymptotically pivotal test statistics, whose asymptotic normal distribution does not depend on unknown parameters. This requires a studentized test statistic based on the asymptotic standard error of each bootstrap estimate, $s_{\hat\beta^*_b}$. We fashion this after the usual test statistic $t = (\hat\beta - \beta)/s_{\hat\beta} \sim N[0, 1]$, which provides asymptotic refinement since it is asymptotically pivotal: its asymptotic distribution does not depend on unknown parameters. To achieve asymptotic refinement, you have to compute

$$t^*_b = (\beta^*_b - \hat\beta)/s_{\hat\beta^*_b},$$

where $s_{\hat\beta^*_b}$ is the analytic or asymptotic estimator evaluated using the bootstrap data for each draw $b$, and then find $t^*_{(1-\alpha/2)}$ and $t^*_{(\alpha/2)}$ for the bootstrap after rank ordering the $B$ bootstrap draws. Use these to test the null hypothesis. For $\alpha = .05$, take the $(1 - \alpha/2) = .975$
percentile and the $(\alpha/2) = .025$ percentile. These studentized bootstrap values can then be compared with the $t$ value from the original sample. If $t > t^*_{(1-\alpha/2)}$ or $t < t^*_{(\alpha/2)}$, then the null hypothesis is rejected. We are comparing one standardized statistic with another. However, computing the analytic formula may be very difficult, and one may have to use the bootstrap estimator based on the standard deviation ($s_{\hat\beta,Boot}$) computed over the $B$ bootstrap trials. This will not yield asymptotic refinements but will probably still be better than using the asymptotic formula.

5. Bootstrapping Time Series Data

The bootstrap does not generally work well with time series data. The reason is that the bootstrap relies on resampling from an i.i.d. distribution. With standard bootstrapping you are randomly selecting among a set of residuals that follow some autocorrelation process, thereby destroying that process. Two alternatives that can be employed are block bootstrapping and the sieve bootstrap. With block bootstrapping, time-series blocks that capture the autoregressive process are randomly selected and the entire block is resampled. The sieve bootstrap works by fitting an autoregressive process of order $p$ to the original data and then generating bootstrap samples by randomly resampling the rescaled residuals, which are assumed to be i.i.d. Since the sieve imposes more structure on the DGP, it should have better performance than the block bootstrap.

As an example of the sieve with $p = 1$, consider the model

$$y_t = \beta x_t + u_t, \qquad (1.9)$$

where

$$u_t = \rho u_{t-1} + \epsilon_t, \qquad (1.10)$$

and $\epsilon_t$ is white noise. Now estimate $\beta$ and $\rho$ and obtain $\hat\epsilon_t = \hat u_t - \hat\rho\,\hat u_{t-1}$. Bootstrap these residuals to get $\hat\epsilon^*_t$, $t = 1, \ldots, T$. Then recursively compute $\hat u^*_t = \hat\rho\,\hat u^*_{t-1} + \hat\epsilon^*_t$ and hence $y^*_t = \hat\beta x_t + \hat u^*_t$. Then regress $y^*_t$ on $x_t$.

The moving block bootstrap constructs overlapping moving blocks. For the moving block bootstrap with block length $b$, there are $n - b + 1$ blocks. The first contains obs. 1 through $b$, the second contains obs.
2 through $b + 1$, and the last contains obs. $n - b + 1$ through $n$. The choice of $b$ is critical. In theory, it must increase as $n$ increases. If the blocks are too short, the bootstrap samples cannot mimic the original sample, because dependence is broken whenever we start a new block. If the blocks are too long, the bootstrap samples are not random enough.
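Both time-series schemes in this section can be sketched compactly. The code below is an illustration under several assumptions that the notes do not specify: the sieve residuals are recentred rather than rescaled in any particular way, the recursion starts at the first OLS residual, the moving-block sample is truncated to $n$ observations, and the block length of 10, the data, and the function names are all hypothetical.

```python
import random
import statistics


def slope(x, y):
    """OLS slope through the origin, as in y_t = beta * x_t + u_t."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def sieve_sample(x, y, rng):
    """One sieve bootstrap sample y* for the AR(1) example (p = 1)."""
    b_hat = slope(x, y)
    u_hat = [yi - b_hat * xi for xi, yi in zip(x, y)]
    rho_hat = slope(u_hat[:-1], u_hat[1:])          # regress u_t on u_{t-1}
    eps = [u_hat[t] - rho_hat * u_hat[t - 1] for t in range(1, len(x))]
    eps_bar = sum(eps) / len(eps)
    eps = [e - eps_bar for e in eps]                # recentring: an assumption
    u_star = [u_hat[0]]                             # start value: an assumption
    for _ in range(1, len(x)):
        u_star.append(rho_hat * u_star[-1] + rng.choice(eps))
    return [b_hat * xi + ui for xi, ui in zip(x, u_star)]


def moving_block_indices(n, b, rng):
    """Draw overlapping blocks of length b (n - b + 1 candidates) until n obs."""
    idx = []
    while len(idx) < n:
        start = rng.randrange(n - b + 1)            # 0-based block start
        idx.extend(range(start, start + b))
    return idx[:n]                                  # truncation: an assumption


rng = random.Random(7)                              # fixed seed (illustrative)
T, beta, rho = 200, 1.5, 0.6                        # hypothetical DGP values
x = [rng.gauss(0.0, 1.0) for _ in range(T)]
u = [rng.gauss(0.0, 1.0)]
for _ in range(1, T):
    u.append(rho * u[-1] + rng.gauss(0.0, 1.0))
y = [beta * xi + ui for xi, ui in zip(x, u)]

B = 399
sieve_draws = [slope(x, sieve_sample(x, y, rng)) for _ in range(B)]
block_draws = []
for _ in range(B):
    idx = moving_block_indices(T, 10, rng)          # resample (x_t, y_t) blocks
    block_draws.append(slope([x[i] for i in idx], [y[i] for i in idx]))

print("sieve SE:", statistics.stdev(sieve_draws))
print("moving-block SE:", statistics.stdev(block_draws))
```

The sieve keeps $x_t$ fixed and rebuilds the error process from the fitted AR(1), while the moving block bootstrap resamples whole stretches of the observed data, so dependence is preserved inside each block and broken only at the joins.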
For a nice discussion of the moving block bootstrap and a comparison of this and the other methods for time series, see "Bootstrap Methods in Econometrics" by James G. MacKinnon, Department of Economics, Queen's University, Kingston, Ontario, Canada K7L 3N6, jgm@econ.queensu.ca, September.

6. Bootstrapping Panel Data

With both panel-data bootstrap methods (pairs and wild), three resampling schemes are available. These are cross-sectional resampling (also called panel bootstrap resampling), temporal resampling (also called block bootstrap resampling), and cross-sectional/temporal resampling. With panel bootstrap resampling, one randomly selects among the $N$ cross-sectional units and uses all $T$ observations for each. If cross-sectional dependence exists, one can select the relevant blocks of cross-sectional units. With temporal resampling, one randomly selects temporal units and uses all $N$ observations for each. If temporal dependence exists, one can select the relevant blocks of temporal units. Of course, this choice is critical to the accuracy of the bootstrap. With cross-sectional/temporal resampling, both methods are utilized.

Following Cameron and Trivedi (2005), in the fixed-$T$ case consistent (as $N \to \infty$) standard errors can be obtained using the cross-sectional bootstrap method. Hence, we employ this method for both the pairs and wild methods, where we assume no cross-sectional or temporal dependence. Also, see Kapetanios (2008), "A Bootstrap Procedure for Panel Data Sets with Many Cross-Sectional Units," The Econometrics Journal 11, who shows that if the data do not exhibit cross-sectional dependence but exhibit temporal dependence, then cross-sectional resampling is superior to block bootstrap resampling. Further, he shows that cross-sectional resampling provides asymptotic refinements. Monte Carlo results using these assumptions indicate the superiority of the cross-sectional method.
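Cross-sectional (panel bootstrap) resampling can be sketched as a pairs bootstrap in which the unit, with all $T$ of its observations, is what gets drawn. The example below is a sketch under assumptions not taken from the notes: a pooled slope through the origin as the estimator, a hypothetical data-generating process with a unit-specific error component, and invented function names.

```python
import random
import statistics


def pooled_slope(panel):
    """Pooled OLS slope through the origin over every (x, y) in every unit."""
    sxy = sum(x * y for unit in panel for x, y in unit)
    sxx = sum(x * x for unit in panel for x, y in unit)
    return sxy / sxx


def panel_bootstrap_se(panel, B, rng):
    """Resample whole cross-sectional units, keeping all T obs. of each."""
    n_units = len(panel)
    draws = []
    for _ in range(B):
        sample = [panel[rng.randrange(n_units)] for _ in range(n_units)]
        draws.append(pooled_slope(sample))
    return statistics.stdev(draws)


rng = random.Random(99)                    # fixed seed (illustrative)
n_units, T, beta = 80, 5, 2.0              # hypothetical panel dimensions
panel = []
for _ in range(n_units):
    c = rng.gauss(0.0, 1.0)                # unit effect left in the error term
    unit = []
    for _ in range(T):
        x = rng.gauss(0.0, 1.0)
        unit.append((x, beta * x + c + rng.gauss(0.0, 1.0)))
    panel.append(unit)

se = panel_bootstrap_se(panel, 399, rng)
print(f"pooled slope {pooled_slope(panel):.4f}, panel-bootstrap SE {se:.4f}")
```

Because each draw keeps a unit's time series intact, any temporal dependence within a unit is carried into every bootstrap sample, which is the property that makes this scheme attractive when $T$ is fixed and $N$ is large.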
More informationFrom the help desk: Swamy s random-coefficients model
The Stata Journal (2003) 3, Number 3, pp. 302 308 From the help desk: Swamy s random-coefficients model Brian P. Poi Stata Corporation Abstract. This article discusses the Swamy (1970) random-coefficients
More informationA COMPARISON OF STATISTICAL METHODS FOR COST-EFFECTIVENESS ANALYSES THAT USE DATA FROM CLUSTER RANDOMIZED TRIALS
A COMPARISON OF STATISTICAL METHODS FOR COST-EFFECTIVENESS ANALYS THAT U DATA FROM CLUSTER RANDOMIZED TRIALS M Gomes, E Ng, R Grieve, R Nixon, J Carpenter and S Thompson Health Economists Study Group meeting
More informationFIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS
FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS Jeffrey M. Wooldridge Department of Economics Michigan State University East Lansing, MI 48824-1038
More informationUniversity of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014.
University of Ljubljana Doctoral Programme in Statistics ethodology of Statistical Research Written examination February 14 th, 2014 Name and surname: ID number: Instructions Read carefully the wording
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationAppendix 1: Time series analysis of peak-rate years and synchrony testing.
Appendix 1: Time series analysis of peak-rate years and synchrony testing. Overview The raw data are accessible at Figshare ( Time series of global resources, DOI 10.6084/m9.figshare.929619), sources are
More informationImplementations of tests on the exogeneity of selected. variables and their Performance in practice ACADEMISCH PROEFSCHRIFT
Implementations of tests on the exogeneity of selected variables and their Performance in practice ACADEMISCH PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Universiteit van Amsterdam op gezag
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationNCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the
More informationUsing Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses G. Gordon Brown, Celia R. Eicheldinger, and James R. Chromy RTI International, Research Triangle Park, NC 27709 Abstract
More informationThe Best of Both Worlds:
The Best of Both Worlds: A Hybrid Approach to Calculating Value at Risk Jacob Boudoukh 1, Matthew Richardson and Robert F. Whitelaw Stern School of Business, NYU The hybrid approach combines the two most
More informationStatistics 2014 Scoring Guidelines
AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home
More informationSample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
More informationQuantile Regression under misspecification, with an application to the U.S. wage structure
Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and Fernandez-Val Reading Group Econometrics November 2, 2010 Intro: initial problem The
More informationKeep It Simple: Easy Ways To Estimate Choice Models For Single Consumers
Keep It Simple: Easy Ways To Estimate Choice Models For Single Consumers Christine Ebling, University of Technology Sydney, christine.ebling@uts.edu.au Bart Frischknecht, University of Technology Sydney,
More informationMultiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
More informationMultiple Imputation for Missing Data: A Cautionary Tale
Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust
More informationA spreadsheet Approach to Business Quantitative Methods
A spreadsheet Approach to Business Quantitative Methods by John Flaherty Ric Lombardo Paul Morgan Basil desilva David Wilson with contributions by: William McCluskey Richard Borst Lloyd Williams Hugh Williams
More informationChapter 2. Dynamic panel data models
Chapter 2. Dynamic panel data models Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans Université d Orléans April 2010 Introduction De nition We now consider
More informationCONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationA THEORETICAL COMPARISON OF DATA MASKING TECHNIQUES FOR NUMERICAL MICRODATA
A THEORETICAL COMPARISON OF DATA MASKING TECHNIQUES FOR NUMERICAL MICRODATA Krish Muralidhar University of Kentucky Rathindra Sarathy Oklahoma State University Agency Internal User Unmasked Result Subjects
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationA Subset-Continuous-Updating Transformation on GMM Estimators for Dynamic Panel Data Models
Article A Subset-Continuous-Updating Transformation on GMM Estimators for Dynamic Panel Data Models Richard A. Ashley 1, and Xiaojin Sun 2,, 1 Department of Economics, Virginia Tech, Blacksburg, VA 24060;
More informationComparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors
Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston
More informationVector Time Series Model Representations and Analysis with XploRe
0-1 Vector Time Series Model Representations and Analysis with plore Julius Mungo CASE - Center for Applied Statistics and Economics Humboldt-Universität zu Berlin mungo@wiwi.hu-berlin.de plore MulTi Motivation
More informationStatistics 104: Section 6!
Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC
More informationMaster s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
More informationEstimating Industry Multiples
Estimating Industry Multiples Malcolm Baker * Harvard University Richard S. Ruback Harvard University First Draft: May 1999 Rev. June 11, 1999 Abstract We analyze industry multiples for the S&P 500 in
More informationNCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the
More informationEconometric analysis of the Belgian car market
Econometric analysis of the Belgian car market By: Prof. dr. D. Czarnitzki/ Ms. Céline Arts Tim Verheyden Introduction In contrast to typical examples from microeconomics textbooks on homogeneous goods
More informationThe Assumption(s) of Normality
The Assumption(s) of Normality Copyright 2000, 2011, J. Toby Mordkoff This is very complicated, so I ll provide two versions. At a minimum, you should know the short one. It would be great if you knew
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationMarginal Person. Average Person. (Average Return of College Goers) Return, Cost. (Average Return in the Population) (Marginal Return)
1 2 3 Marginal Person Average Person (Average Return of College Goers) Return, Cost (Average Return in the Population) 4 (Marginal Return) 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
More informationCurriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
More informationHow To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More information2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)
2DI36 Statistics 2DI36 Part II (Chapter 7 of MR) What Have we Done so Far? Last time we introduced the concept of a dataset and seen how we can represent it in various ways But, how did this dataset came
More informationAn Investigation of the Statistical Modelling Approaches for MelC
An Investigation of the Statistical Modelling Approaches for MelC Literature review and recommendations By Jessica Thomas, 30 May 2011 Contents 1. Overview... 1 2. The LDV... 2 2.1 LDV Specifically in
More informationSENSITIVITY ANALYSIS AND INFERENCE. Lecture 12
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationChapter 3: The Multiple Linear Regression Model
Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationMODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST
MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST Zahayu Md Yusof, Nurul Hanis Harun, Sharipah Sooad Syed Yahaya & Suhaida Abdullah School of Quantitative Sciences College of Arts and
More informationEigenvalues, Eigenvectors, Matrix Factoring, and Principal Components
Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
More information