Structural Equation Modeling
|
|
- Marcus Summers
- 7 years ago
- Views:
Transcription
1 Structural Equation Modeling Kosuke Imai Princeton University POL572 Quantitative Analysis II Spring 2016 Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
2 Causal Mediation Analysis Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
3 Quantitative Research and Causal Mechanisms Causal inference is a central goal of scientific research Scientists care about causal mechanisms, not just causal effects Randomized experiments often only determine whether the treatment causes changes in the outcome Not how and why the treatment affects the outcome Common criticism of experiments and statistics: black box view of causality Qualitative research uses process tracing Question: How can quantitative research be used to identify causal mechanisms? Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
4 Direct and Indirect Effects Causal mediation analysis: Mediator, M Indrect (Mediation) Effect Treatment, T Outcome, Y Direct Effect Direct and indirect effects; intermediate and intervening variables READING: Imai, Keele, Tingley, and Yamamoto (2012). APSR Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
5 Causal Mediation Analysis in American Politics Media framing experiment in Nelson et al. (APSR, 1998) Path analysis, structural equation modeling Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
6 Causal Mediation Analysis in Comparative Politics Resource curse thesis Authoritarian government civil war Natural resources Slow growth Causes of civil war: Fearon and Laitin (APSR, 2003) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
7 Causal Mediation Analysis in International Relations The literature on international regimes and institutions Krasner (International Organization, 1982) Power and interests are mediated by regimes Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
8 Potential Outcomes Framework for Mediation Binary treatment: T i Pre-treatment covariates: X i Potential mediators: M i (t) Observed mediator: M i = M i (T i ) Potential outcomes: Y i (t, m) Observed outcome: Y i = Y i (T i, M i (T i )) Again, only one potential outcome can be observed per unit Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
9 Causal Mediation Effects Total causal effect: τ i Y i (1, M i (1)) Y i (0, M i (0)) Causal mediation (Indirect) effects: δ i (t) Y i (t, M i (1)) Y i (t, M i (0)) Causal effect of the treatment-induced change in M i on Y i Change the mediator from M i (0) to M i (1) while holding the treatment constant at t Represents the mechanism through M i Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
10 Total Effect = Indirect Effect + Direct Effect Direct effects: ζ i (t) Y i (1, M i (t)) Y i (0, M i (t)) Causal effect of T i on Y i, holding mediator constant at its potential value that would be realized when T i = t Change the treatment from 0 to 1 while holding the mediator constant at M i (t) Represents all mechanisms other than through M i Total effect = mediation (indirect) effect + direct effect: τ i = δ i (t) + ζ i (1 t) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
11 What Does the Observed Data Tell Us? Quantity of Interest: Average causal mediation effects (ACME) δ(t) E(δ i (t)) = E{Y i (t, M i (1)) Y i (t, M i (0))} Average direct effects ( ζ(t)) are defined similarly Y i (t, M i (t)) is observed but Y i (t, M i (t )) can never be observed We have an identification problem = Need additional assumptions to make progress Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
12 Identification under Sequential Ignorability Proposed identification assumption: Sequential Ignorability (SI) {Y i (t, m), M i (t)} T i X i = x, (1) Y i (t, m) M i (t) T i = t, X i = x (2) (1) is guaranteed to hold in a standard experiment (2) does not hold unless X i includes all confounders Like any ignorability assumption, the assumption is very strong 1 No unmeasured pre-treatment confounder 2 No measured and unmeasured post-treatment confounder Under SI, ACME is nonparametrically identified: E(Y i M i, T i = t, X i ) {dp(m i T i = 1, X i ) dp(m i T i = 0, X i )} dp(x i ) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
13 The Linear Structural Equation Model Y i = α 1 + β 1 T i + Xi ξ 1 + ɛ 1i, (1) M i = α 2 + β 2 T i + Xi ξ 2 + ɛ 2i, (2) Y i = α 3 + β 3 T i + γm i + Xi ξ 3 + ɛ 3i. (3) Equation (1) is redundant given equations (2) and (3): α 1 = α 3 + α 2 γ, β 1 = β 2 γ + β 3, ξ 1 = ξ 2 γ + ξ 3, ɛ 1i = γɛ 2i + ɛ 3i Potential outcomes notation: M i (T i ) = α 2 + β 2 T i + Xi ξ 2 + ɛ 2i, Y i (T i, M i (T i )) = α 3 + β 3 T i + γm i (T i ) + Xi ξ 3 + ɛ 3i. Under SI and linearity: 1 E(ɛ 2i T i, X i ) = E(ɛ 2i X i ) = f 2 (X i ) = X i ξ 2 2 E(ɛ 3i T i, M i, X i ) = E(ɛ 3i X i ) = f 3 (X i ) = X i ξ 3 Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
14 Estimation of Average Causal Mediation Effects Redefine: 1 ɛ 2i = Xi ξ2 + ɛ 2i where E(ɛ 2i X i ) = 0 2 ɛ 3i = Xi ξ3 + ɛ 3i where E(ɛ 3i X i ) = 0 Then, SI implies Cor(ɛ 2i, ɛ 3i ) = 0 Non-zero correlation = M i is correlated with ɛ 3i given T i and X i Under the LSEM and SI, 1 δ(1) = δ(0) = β2 γ 2 ζ(1) = ζ(0) = β3 3 τ = β 2 γ + β 3 Exact variance: V( ˆβ 2ˆγ) = β 2 2 V(ˆγ) + γ2 V( ˆβ 2 ) + V(ˆγ)V( ˆβ 2 ) Asymptotic variance: V( ˆβ 2ˆγ) β 2 2 V(ˆγ) + γ2 V( ˆβ 2 ) No interaction assumption: δ(1) = δ(0) Can be relaxed via an interaction term in equation (3): T i M i Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
15 The Delta Method Taylor s Theorem: If f (x) is n + 1 times differentiable and f (n) is continuous, then for each x there exists c (x 0, x) such that, f (x) = f (x 0 ) + n f (k) (x 0 ) (x x 0 ) k + f (n+1) (c) k! (n + 1)! (x x 0) (n+1) k=1 }{{}}{{} Lagrange remainder Taylor series Univariate: Suppose that d n(x n θ) N (0, σ 2 ). Then, d n(f (Xn ) f (θ)) N (0, σ 2 [f (1) (θ)] 2 ) Multivariate: Suppose that n(x n θ) n(f (Xn ) f (θ)) d N (0, [f (1) (θ)] Σf (1) (θ)) d N (0, Σ). Then, Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
16 General Estimation Algorithm Based on the nonparametric identification result under SI: 1 Model outcome and mediator Outcome model: p(y i T i, M i, X i ) Mediator model: p(m i T i, X i ) These models can be of any form (linear or nonlinear, semi- or nonparametric, with or without interactions) 2 Predict mediator for both treatment values (M i (1), M i (0)) 3 Predict outcome by first setting T i = 1 and M i = M i (0), and then T i = 1 and M i = M i (1) 4 Compute the average difference between two outcomes to obtain a consistent estimate of ACME 5 Monte Carlo simulation or bootstrap to estimate uncertainty (more on these methods later) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
17 Need for Sensitivity Analysis The sequential ignorability assumption is often too strong Need to assess the robustness of findings via sensitivity analysis Question: How large a departure from the key assumption must occur for the conclusions to no longer hold? Sensitivity analysis by assuming {Y i (t, m), M i (t)} T i X i = x but not Y i (t, m) M i T i = t, X i = x Possible existence of unobserved pre-treatment confounder Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
18 Sensitivity Analysis Sensitivity parameter: ρ Corr(ɛ 2i, ɛ 3i ) Sequential ignorability implies ρ = 0 Set ρ to different values and see how mediation effects change Identification with a given error correlation δ(0) = δ(1) = β 2 σ 12 σ2 2 ρ ( ) 1 σ 2 1 ρ 2 σ1 2 σ2 12 σ2 2, where σ 2 j V(ɛ ij ) for j = 1, 2 and σ 12 Cov(ɛ 1i, ɛ 2i ). When do my results go away completely? δ(t) = 0 if and only if ρ = Corr(ɛ 1i, ɛ 2i ) (easy to compute!) Interpretation of ρ can be based on R 2 statistics Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
19 An Example: Nelson et al. (1998) APSR Average Mediation Effect: δ Sensitivity Parameter: ρ Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
20 Beyond Sequential Ignorability Sensitivity analysis may be unsatisfactory Existence of post-treatment confounders: multiple mediators What if we get rid of the assumption altogether? Under a standard design, even the sign of ACME is unidentified Can we do any better? Develop alternative designs for more credible and powerful inference Designs feasible when the mediator can be directly or indirectly manipulated Statistical methods for handling multiple mediators Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
21 Instrumental Variables Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
22 Instrumental Variables The Model: Y = Xβ + ɛ with E(ɛ i ) = 0 and V(ɛ i ) = σ 2 Endogeneity: X ɛ where X is n K Instruments: Z ɛ where Z is n L Rank condition: Z X and Z Z have full rank Identification 1 K = L: just-identified 2 K < L: over-identified 3 K > L: under-identified The IV estimator: ˆβ IV (X Z(Z Z) 1 Z X) 1 X Z(Z Z) 1 Z Y Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
23 Geometry of Instrumental Variables Projection matrix (onto S(Z)): P Z = Z(Z Z) 1 Z Purge endogeneity: X = P Z X Since P Z = P Z and P ZP Z = P Z, we have ˆβ IV = ( X X) 1 X Y Two stage least squares: 1 Regress X on Z and obtain the fitted values X 2 Regress Y on X We do not assume the linearity of X in Z Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
24 Asymptotic Inference A familiar (by now!) trick: ˆβ IV β = (X Z(Z Z) 1 Z X) 1 X Z(Z Z) 1 Z ɛ ( ) ( ) 1 ( 1 n = X i Zi 1 n Z i Zi 1 n n n ( 1 n i=1 n i=1 X i Z i ) ( 1 n i=1 n i=1 Z i Z i p Thus, ˆβ IV β Under the homoskedasticity, V(ɛ Z) = σ 2 I n : ) 1 ( 1 n n i=1 Z i X i ) n Z i ɛ i i=1 ) d n( ˆβ IV β) N (0, σ 2 [E(X i Zi ){E(Z i Zi )} 1 E(Z i Xi )] 1 ) 1 Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
25 Residuals and Robust Standard Error ˆɛ = Y X ˆβ IV and not ˆɛ Y X ˆβ IV Under homoskedasticity: ˆσ 2 ˆɛ 2 n K p σ 2 V( ˆβ IV ) = ˆσ 2 {X Z(Z Z) 1 Z X} 1 = ˆσ 2 ( X X) 1 Sandwich heteroskedasticity consistent estimator: bread = {X Z(Z Z i ) 1 Z X} 1 X Z(Z Z) 1 ( ) n meat = Z diag(ˆɛ 2 i )Z = ˆɛ 2 i Z izi i=1 bread meat bread = ( X X) 1 X diag(ˆɛ 2 i ) X( X X) 1 Robust standard error for clustering and auto-correlation Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
26 Relationship with Linear Structural Equation Modeling The model: T i = α 2 + β 2 Z i + Xi ξ 2 + ɛ i2, Y i = α 3 + β }{{} 3 Z i + γt i + Xi ξ 3 + ɛ i3. =0 Exclusion restriction (no direct effect): β 3 = 0 ɛ i2 ɛ i3 X i = x for some x Ignorability is assumed for Z i Sequential ignorability is not assumed for T i What is the causal interpretation of the IV method? Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
27 Partial Compliance in Randomized Experiments Unable to force all experimental subjects to take the (randomly) assigned treatment/control Intention-to-Treat (ITT) effect treatment effect Selection bias: self-selection into the treatment/control groups Political information bias: effects of campaign on voting behavior Ability bias: effects of education on wages Healthy-user bias: effects of exercises on blood pressure Encouragement design: randomize the encouragement to receive the treatment rather than the receipt of the treatment itself Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
28 Potential Outcomes Notation Randomized encouragement: Z i {0, 1} Potential treatment variables: (T i (1), T i (0)) 1 T i (z) = 1: would receive the treatment if Z i = z 2 T i (z) = 0: would not receive the treatment if Z i = z Observed treatment receipt indicator: T i = T i (Z i ) Observed and potential outcomes: Y i = Y i (Z i, T i (Z i )) Can be written as Y i = Y i (Z i ) No interference assumption for T i (Z i ) and Y i (Z i, T i ) Randomization of encouragement: But (Y i (1), Y i (0)) T i Z i = z (Y i (1), Y i (0), T i (1), T i (0)) Z i Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
29 Principal Stratification Framework Four principal strata (latent types): compliers (T i (1), T i (0)) = (1, 0), always takers (T i (1), T i (0)) = (1, 1), non-compliers never takers (T i (1), T i (0)) = (0, 0), defiers (T i (1), T i (0)) = (0, 1) Observed and principal strata: Z i = 1 Z i = 0 T i = 1 Complier/Always-taker Defier/Always-taker T i = 0 Defier/Never-taker Complier/Never-taker Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
30 Instrumental Variables Randomized encouragement as an instrument for the treatment Two additional assumptions 1 Monotonicity: No defiers T i (1) T i (0) for all i. 2 Exclusion restriction: Instrument (encouragement) affects outcome only through treatment Y i (1, t) = Y i (0, t) for t = 0, 1 Zero ITT effect for always-takers and never-takers ITT effect decomposition: ITT = ITT c Pr(compliers) + ITT a Pr(always takers) +ITT n Pr(never takers) = ITT c Pr(compliers) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
31 IV Estimand and Interpretation IV estimand: ITT c = ITT Pr(compliers) = E(Y i Z i = 1) E(Y i Z i = 0) E(T i Z i = 1) E(T i Z i = 0) = Cov(Y i, Z i ) Cov(T i, Z i ) ITT c = Complier Average Treatment Effect (CATE) CATE ATE unless ATE for noncompliers equals CATE Different encouragement (instrument) yields different compliers Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
32 Asymptotic Inference Wald estimator: ÎV W Ĉov(Y i,z i ) = ÎTT Y Ĉov(T i,z i ) ÎTT T p Consistency: ÎV W CATE = ITT c Asymptotic variance: V(ÎV W ) 1 { ITT 4 ITT 2 T V(ÎTT Y ) + ITT 2 Y V(ÎTT T ) T } 2 ITT Y ITT T Cov(ÎTT Y, ÎTT T ). Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
33 Violations of IV Assumptions Violation of exclusion restriction: Large sample bias = ITT noncomplier Pr(noncomplier) Pr(complier) Weak encouragement (instruments) Direct effects of encouragement; failure of randomization, alternative causal paths Violation of monotonicity: Large sample bias = {CATE + ITT defier} Pr(defier) Pr(complier) Pr(defier) Proportion of defiers Heterogeneity of causal effects Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
34 An Example: Testing Habitual Voting Randomized encouragement to vote in an election Treatment: turnout in the election Outcome: turnout in the next election Monotonicity: Being contacted by a canvasser would never discourage anyone from voting Exclusion restriction: being contacted by a canvasser in this election has no effect on turnout in the next election other than through turnout in this election CATE: Habitual voting for those who would vote if and only if they are contacted by a canvasser in this election Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
35 Multi-valued Treatment Two stage least squares regression: T i = α 2 + β 2 Z i + η i, Y i = α 3 + γt i + ɛ i. Binary encouragement and binary treatment, ˆγ = ĈATE (no covariate) ˆγ p CATE (with covariates) Binary encouragement multi-valued treatment Monotonicity: T i (1) T i (0) Exclusion restriction: Y i (1, t) = Y i (0, t) for each t = 0, 1,..., K Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
36 Estimator ˆγ TSLS p Cov(Y i, Z i ) Cov(T i, Z i ) = E(Y i(1) Y i (0)) E(T i (1) T i (0)) K K ( Yi (1) Y i (0) ) = w jk E T i (1) = j, T i (0) = k j k k=0 j=k+1 where w jk is the weight, which sums up to one, defined as, w jk = (j k) Pr(T i (1) = j, T i (0) = k) K k =0 K j =k +1 (j k ) Pr(T i (1) = j, T i (0) = k ). Easy interpretation under the constant additive effect assumption for every complier type Assume encouragement induces at most only one additional dose Then, w k = Pr(T i (1) = k, T i (0) = k 1) Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
37 Fuzzy Regression Discontinuity Design Sharp regression discontinuity design: T i = 1{X i c} What happens if we have noncompliance? Forcing variable as an instrument: Z i = 1{X i c} Potential outcomes: T i (z) and Y i (z, t) Monotonicity: T i (1) T i (0) Exclusion restriction: Y i (0, t) = Y i (1, t) E(T i (z) X i = x) and E(Y i (z, T i (z)) X i = x) are continuous in x Estimand: E(Y i (1, T i (1)) Y i (0, T i (0)) Complier, X i = c) Estimator: lim x c E(Y i X i = x) lim x c E(Y i X i = x) lim x c E(T i X i = x) lim x c E(T i X i = x) Disadvantage: external validity Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
38 An Example: Class Size Effect (Angrist and Lavy) Effect of class-size on student test scores Maimonides Rule: Maximum class size = 40 Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
39 Concluding Remarks Causal mediation analysis: exploring causal mechanisms Even in randomized experiments, an additional (untestable) assumption is required Importance of sensitivity analysis Instrumental variables in randomized experiments: dealing with partial compliance Additional (untestable) assumptions are required ITT vs. CATE Instrumental variables in observational studies: dealing with selection bias Validity of instrumental variables requires rigorous justification Tradeoff between internal and external validity Kosuke Imai (Princeton) Structural Equation Modeling POL572 Spring / 39
Imbens/Wooldridge, Lecture Notes 5, Summer 07 1
Imbens/Wooldridge, Lecture Notes 5, Summer 07 1 What s New in Econometrics NBER, Summer 2007 Lecture 5, Monday, July 30th, 4.30-5.30pm Instrumental Variables with Treatment Effect Heterogeneity: Local
More informationExplaining Causal Findings Without Bias: Detecting and Assessing Direct Effects *
Explaining Causal Findings Without Bias: Detecting and Assessing Direct Effects * Avidit Acharya, Matthew Blackwell, and Maya Sen April 9, 2015 Abstract Scholars trying to establish causal relationships
More informationCausal Infraction and Network Marketing - Trends in Data Science
Causality and Treatment Effects Prof. Jacob M. Montgomery Quantitative Political Methodology (L32 363) October 21, 2013 Lecture 13 (QPM 2013) Causality and Treatment Effects October 21, 2013 1 / 19 Overview
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationPS 271B: Quantitative Methods II. Lecture Notes
PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationIdentification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments
Advance Access publication January 12, 2013 Political Analysis (2013) 21:141 171 doi:10.1093/pan/mps040 Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationproblem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a non-random
More informationMultiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationMachine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University
Machine Learning Methods for Causal Effects Susan Athey, Stanford University Guido Imbens, Stanford University Introduction Supervised Machine Learning v. Econometrics/Statistics Lit. on Causality Supervised
More information6/15/2005 7:54 PM. Affirmative Action s Affirmative Actions: A Reply to Sander
Reply Affirmative Action s Affirmative Actions: A Reply to Sander Daniel E. Ho I am grateful to Professor Sander for his interest in my work and his willingness to pursue a valid answer to the critical
More informationDegrees of Freedom and Model Search
Degrees of Freedom and Model Search Ryan J. Tibshirani Abstract Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationChapter 10: Basic Linear Unobserved Effects Panel Data. Models:
Chapter 10: Basic Linear Unobserved Effects Panel Data Models: Microeconomic Econometrics I Spring 2010 10.1 Motivation: The Omitted Variables Problem We are interested in the partial effects of the observable
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationPOL 451: Statistical Methods in Political Science
POL 451: Statistical Methods in Political Science Fall 2007 Kosuke Imai Department of Politics, Princeton University 1 Contact Information Office: Corwin Hall 041 Phone: 609 258 6601 Fax: 973 556 1929
More informationAn Introduction to Path Analysis. nach 3
An Introduction to Path Analysis Developed by Sewall Wright, path analysis is a method employed to determine whether or not a multivariate set of nonexperimental data fits well with a particular (a priori)
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationIntroduction to Path Analysis
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationFIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS
FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS Jeffrey M. Wooldridge Department of Economics Michigan State University East Lansing, MI 48824-1038
More information[This document contains corrections to a few typos that were found on the version available through the journal s web page]
Online supplement to Hayes, A. F., & Preacher, K. J. (2014). Statistical mediation analysis with a multicategorical independent variable. British Journal of Mathematical and Statistical Psychology, 67,
More informationChapter 1 Introduction. 1.1 Introduction
Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationSENSITIVITY ANALYSIS AND INFERENCE. Lecture 12
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationPARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA
PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More informationFactor Analysis. Factor Analysis
Factor Analysis Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we
More informationRegression Modeling Strategies
Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions
More informationECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More informationLeast-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationSAS Certificate Applied Statistics and SAS Programming
SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and
More informationProblem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;
More informationHandling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza
Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and
More informationImputing Values to Missing Data
Imputing Values to Missing Data In federated data, between 30%-70% of the data points will have at least one missing attribute - data wastage if we ignore all records with a missing value Remaining data
More informationIntroduction to Fixed Effects Methods
Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationCHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationCAPM, Arbitrage, and Linear Factor Models
CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors
More informationAn extension of the factoring likelihood approach for non-monotone missing data
An extension of the factoring likelihood approach for non-monotone missing data Jae Kwang Kim Dong Wan Shin January 14, 2010 ABSTRACT We address the problem of parameter estimation in multivariate distributions
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More informationHow To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationSYSTEMS OF REGRESSION EQUATIONS
SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations
More informationSales forecasting # 2
Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting
More informationWhy High-Order Polynomials Should Not be Used in Regression Discontinuity Designs
Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs Andrew Gelman Guido Imbens 2 Aug 2014 Abstract It is common in regression discontinuity analysis to control for high order
More informationMagne Mogstad and Matthew Wiswall
Discussion Papers No. 586, May 2009 Statistics Norway, Research Department Magne Mogstad and Matthew Wiswall How Linear Models Can Mask Non-Linear Causal Relationships An Application to Family Size and
More informationMATH. ALGEBRA I HONORS 9 th Grade 12003200 ALGEBRA I HONORS
* Students who scored a Level 3 or above on the Florida Assessment Test Math Florida Standards (FSA-MAFS) are strongly encouraged to make Advanced Placement and/or dual enrollment courses their first choices
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationMaster programme in Statistics
Master programme in Statistics Björn Holmquist 1 1 Department of Statistics Lund University Cramérsällskapets årskonferens, 2010-03-25 Master programme Vad är ett Master programme? Breddmaster vs Djupmaster
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS
ON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS Jeffrey M. Wooldridge THE INSTITUTE FOR FISCAL STUDIES DEPARTMENT OF ECONOMICS, UCL cemmap working
More informationCHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA
Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or
More informationFinancial TIme Series Analysis: Part II
Department of Mathematics and Statistics, University of Vaasa, Finland January 29 February 13, 2015 Feb 14, 2015 1 Univariate linear stochastic models: further topics Unobserved component model Signal
More informationMissing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13
Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional
More informationChapter 2. Dynamic panel data models
Chapter 2. Dynamic panel data models Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans Université d Orléans April 2010 Introduction De nition We now consider
More informationReview of the Methods for Handling Missing Data in. Longitudinal Data Analysis
Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics
More informationPlease follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software
STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used
More informationStatistical Tests for Multiple Forecast Comparison
Statistical Tests for Multiple Forecast Comparison Roberto S. Mariano (Singapore Management University & University of Pennsylvania) Daniel Preve (Uppsala University) June 6-7, 2008 T.W. Anderson Conference,
More informationClustering in the Linear Model
Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationReview Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
More informationMultiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
More informationMarket Risk Analysis. Quantitative Methods in Finance. Volume I. The Wiley Finance Series
Brochure More information from http://www.researchandmarkets.com/reports/2220051/ Market Risk Analysis. Quantitative Methods in Finance. Volume I. The Wiley Finance Series Description: Written by leading
More informationHow Would One Extra Year of High School Affect Academic Performance in University? Evidence from a Unique Policy Change
How Would One Extra Year of High School Affect Academic Performance in University? Evidence from a Unique Policy Change Harry Krashinsky University of Toronto Abstract This paper uses a unique policy change
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions
Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially
More informationAn introduction to Value-at-Risk Learning Curve September 2003
An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk
More informationCovariance and Correlation
Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such
More informationCHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES
Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical
More informationSilvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
More informationSampling. COUN 695 Experimental Design
Sampling COUN 695 Experimental Design Principles of Sampling Procedures are different for quantitative and qualitative research Sampling in quantitative research focuses on representativeness Sampling
More informationThe Identification Zoo - Meanings of Identification in Econometrics: PART 2
The Identification Zoo - Meanings of Identification in Econometrics: PART 2 Arthur Lewbel Boston College 2016 Lewbel (Boston College) Identification Zoo 2016 1 / 75 The Identification Zoo - Part 2 - sections
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationIntroduction to nonparametric regression: Least squares vs. Nearest neighbors
Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More informationBusiness Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
More informationDo Supplemental Online Recorded Lectures Help Students Learn Microeconomics?*
Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?* Jennjou Chen and Tsui-Fang Lin Abstract With the increasing popularity of information technology in higher education, it has
More informationSection A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
More informationGoodness of fit assessment of item response theory models
Goodness of fit assessment of item response theory models Alberto Maydeu Olivares University of Barcelona Madrid November 1, 014 Outline Introduction Overall goodness of fit testing Two examples Assessing
More informationHow To Understand Multivariate Models
Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models
More information1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationMisunderstandings between experimentalists and observationalists about causal inference
J. R. Statist. Soc. A (2008) 171, Part 2, pp. 481 502 Misunderstandings between experimentalists and observationalists about causal inference Kosuke Imai, Princeton University, USA Gary King Harvard University,
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More information1 Short Introduction to Time Series
ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The
More informationEstimating the random coefficients logit model of demand using aggregate data
Estimating the random coefficients logit model of demand using aggregate data David Vincent Deloitte Economic Consulting London, UK davivincent@deloitte.co.uk September 14, 2012 Introduction Estimation
More information