Panel Data: Linear Models

1 Panel Data: Linear Models
Laura Magazzini (@univr.it), University of Verona

2 Introduction: Outline
- What is panel data?
- Motivation: the omitted variable problem
- An example: production function
- Model specification
- Estimation

3 Introduction: What is panel (or longitudinal) data?
- It is a time series of cross-sections, in which the same units are observed over a number of periods
- Units can be individuals, firms, households, industries, markets, regions, countries, ...
- Micro- vs. macro-panels: different techniques are required for estimation
  - Bank of Italy, European panel: large N and small T
  - OECD: large N and small/medium/large T
- We work on micro-panels (large N and small T), with random sampling over the cross-sectional dimension
- Micro- and macro-panels: one of the most active bodies of literature in econometrics

4 Introduction: Basic model and notation
We will consider the linear model
$$y_{it} = x_{it}'\beta + v_{it}$$
with $i = 1, \dots, N$ (sample units) and $t = 1, \dots, T$ (time periods).
For each sample unit, we have the following $T$ equations:
$$y_{i1} = x_{i1}'\beta + v_{i1}, \quad y_{i2} = x_{i2}'\beta + v_{i2}, \quad \dots, \quad y_{iT} = x_{iT}'\beta + v_{iT}$$

5 Introduction: Advantages of panel data
- Greater flexibility in the study of dynamics than cross-section (CS) or time-series (TS) data
  - (ex. 1) Repeated CS: at two points in time you observe that 50% of women are working. Does each woman work in one-half of the periods, or does the same one-half of the women work in all time periods? [Ben-Porath (1973)]
  - (ex. 2) Production function: economies of scale (ES) versus technical change (TC). CS only provides information about ES; TS muddles the two effects.
- Greater precision in estimation (greater number of observations due to pooling)
- Heterogeneity across units: it is possible to disentangle different sources of variance of the units of interest (permanent versus transitory factors)
- Can solve the omitted variable bias (fixed effects): consistent estimates can be obtained in the presence of omitted variables if the omitted variables vary across sample units but are constant over time, e.g. preferences, individual ability, propensity to patent, ...

6 Introduction: Example 1: Production function
- Maximum output given the value of the inputs
- Consider the case of agricultural production: $Q = \phi(L, V)$
  - $Q$: output
  - $L$: input that varies over time (labor)
  - $V$: input that remains constant over time (soil quality)
- You can also think of a firm production function where $V$ represents managerial capability
- Typically, $V$ is known to the farmer/manager, but unknown to the econometrician

7 Introduction: Example 1: Econometric specification
- Let us consider a Cobb-Douglas production function: $\phi(L, V) = A L^{\alpha} V^{\beta}$
- Taking logs (and adding an error term summarizing all inputs outside the farmer's control, e.g. rainfall):
$$q = a + \alpha l + \beta v + u$$
- Parameter of interest: $\alpha$, i.e. the percentage increase in $Q$ driven by a 1 percent increase in $L$, holding $V$ constant

8 Introduction: Example 1: Data availability
- Ideal world: $q = a + \alpha l + \beta v + u$
  - You measure $Q$, $L$, and $V$ on a sample of $N$ farmers
  - If standard hypotheses hold, the relationship can be estimated by OLS
- Real world: $V$ is not observable, and you measure only $Q$ and $L$ on a sample of $N$ farmers
$$q = a + \alpha l + (\beta v + u) = a + \alpha l + \varepsilon$$
- Omitted variable bias?

9 Introduction: Example 1: Estimation by OLS?
$$E[q \mid l] = a + \alpha l + (\beta E[v \mid l] + E[u \mid l]) = a + \alpha l + E[\varepsilon \mid l]$$
- OLS regression of $q$ on $l$ allows the identification of the parameter of interest $\alpha$ if and only if $E[\varepsilon \mid l] = 0$
- We assume $E[u \mid l] = 0$, therefore we need the omitted variable $v$ to be
  (1) without effect on $q$ once $l$ is controlled for, i.e. $\beta = 0$, or
  (2) uncorrelated with $l$: $E[v \mid l] = 0$
- We do not believe (1): soil quality affects harvest (managerial capabilities affect firm output)
- What does economic theory tell us about hypothesis (2)?

10 Introduction: Example 1: Relationship between L and V
- According to economic theory, a farmer/firm chooses the $L$ that maximizes expected profit
- Let $p_l$ be the cost of a unit of $L$, and $p$ the price of the output $Q$:
$$\pi = p A L^{\alpha} V^{\beta} - p_l L$$
- Taking the first derivative and solving the first-order condition, the optimal $L$ depends on $V$
- As a consequence, $L$ is correlated with $V$: firms choose the optimal $L$ on the basis of characteristics that are unobservable for the researcher but known to the farmer/firm!
- $cov(v, l) \neq 0 \Rightarrow E[v \mid l] \neq 0$ and, therefore, $E[\varepsilon \mid l] \neq 0$: OLS is inconsistent
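
The first-order condition referred to in this slide can be written out explicitly. The following is a sketch, assuming an interior optimum with $0 < \alpha < 1$ (a restriction added here for illustration, not stated in the slides) and treating $V$ as known to the producer:

```latex
\begin{align*}
\pi &= p A L^{\alpha} V^{\beta} - p_l L \\
\frac{\partial \pi}{\partial L} &= \alpha p A L^{\alpha-1} V^{\beta} - p_l = 0 \\
L^{*} &= \left( \frac{\alpha p A V^{\beta}}{p_l} \right)^{1/(1-\alpha)} \\
l^{*} = \ln L^{*} &= \frac{1}{1-\alpha}\left[ \ln\!\left( \frac{\alpha p A}{p_l} \right) + \beta v \right]
\end{align*}
```

Under these assumptions $l^{*}$ is linear in $v$ with slope $\beta/(1-\alpha)$, so $cov(v, l) \neq 0$ whenever $\beta \neq 0$.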

11 Introduction: Example 1: The panel solution (1)
- The omitted variable bias is linked to the problem of endogeneity
- Instrumental variables can be applied for estimation (need to search for external instruments)
- What if...?
  - The soil quality/managerial ability $V$ is constant over time
  - $Q$ and $L$ are observed for (at least) $T = 2$ time periods

12 Introduction: Example 1: The panel solution (2)
- When $t = 1$: $q_{i1} = a + \alpha l_{i1} + \beta v_{i1} + u_{i1}$
- When $t = 2$: $q_{i2} = a + \alpha l_{i2} + \beta v_{i2} + u_{i2}$
- Taking the difference (we assume $V$ constant over time, so $v_{i1} = v_{i2}$):
$$\underbrace{q_{i2} - q_{i1}}_{\Delta q_i} = \alpha \underbrace{(l_{i2} - l_{i1})}_{\Delta l_i} + \underbrace{u_{i2} - u_{i1}}_{\Delta u_i}$$
- The equation $\Delta q_i = \alpha \Delta l_i + \Delta u_i$ does not depend on the unobserved variable $v$
- If $\Delta u_i$ satisfies the classical assumptions, the regression of $\Delta q_i$ on $\Delta l_i$ provides an estimate of the parameter of interest $\alpha$
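
The differencing argument can be illustrated with a small simulation. This is a minimal sketch in plain Python: the parameter values, the rule generating labour from $v$, and the helper name `slope` are all hypothetical choices made for the illustration.

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept (single regressor)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xv - mx) * (yv - my) for xv, yv in zip(x, y))
    sxx = sum((xv - mx) ** 2 for xv in x)
    return sxy / sxx

rng = random.Random(0)
N, a, alpha, beta = 5000, 1.0, 0.6, 0.3    # hypothetical parameter values

l1, l2, q1, q2 = [], [], [], []
for _ in range(N):
    v = rng.gauss(0, 1)                     # unobserved input, constant over time
    for l_list, q_list in ((l1, q1), (l2, q2)):
        l = 0.8 * v + rng.gauss(0, 1)       # labour is correlated with v
        u = rng.gauss(0, 0.5)               # idiosyncratic shock
        l_list.append(l)
        q_list.append(a + alpha * l + beta * v + u)

# Cross-section OLS at t = 1 with v omitted: picks up part of the effect of v.
alpha_ols = slope(l1, q1)

# First-difference regression: v drops out of the differenced equation.
dl = [b - c for c, b in zip(l1, l2)]
dq = [b - c for c, b in zip(q1, q2)]
alpha_fd = slope(dl, dq)
```

In this design `alpha_ols` overstates `alpha`, while `alpha_fd` is close to it, because the differenced regressor no longer contains $v$.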

13 Introduction: Example 1: The panel solution (3)
- Advantages: repeated observations over time on the same unit allow the use of estimation methods that are robust to the presence of omitted variables in the model, if these variables are constant over time
- Any transformation of the initial model that eliminates the unobservable variable $v$ is a good starting point
- The linearity and additivity of the model are necessary in this context

14 Introduction: Example 2: Return to schooling
- Aim: study the variation in income associated with a change in the years of schooling
- The model of interest is:
$$w_i = \alpha + \rho s_i + a_i + \varepsilon_i$$
  where $w_i$ is income, $s_i$ is the number of years of schooling, and $a_i$ represents individual ability ($i = 1, \dots, N$)
- Likely, individual ability affects income ($cov(w, a) > 0$) and is correlated with the years of schooling ($cov(s, a) > 0$)
- Unfortunately, $a_i$ is typically unobservable!

15 Introduction: Example 2: Identification and estimation
- Let us suppose we observe $(w, s)$ for the same unit at two points in time
- Typically, $s_i$ does not vary over time, i.e. we look at the relationship between $w$ and $s$ when choices about $s$ have already been made
- At time 1: $w_{i1} = \alpha + \rho s_{i1} + a_i + \varepsilon_{i1}$
- At time 2: $w_{i2} = \alpha + \rho s_{i2} + a_i + \varepsilon_{i2}$
- Taking differences (since $s_{i1} = s_{i2}$):
$$w_{i2} - w_{i1} = \varepsilon_{i2} - \varepsilon_{i1}$$
- The availability of repeated observations does not improve the identification of $\rho$

16 The Omitted Variables Problem: Motivation
- Panel data can be used to obtain consistent estimators in the presence of omitted variables
- Let $y$ and $x = (x_1, \dots, x_K)$ be observable random variables
- Let $c$ be an unobservable random variable
- We are interested in the partial effect of the observable explanatory variables $x_j$ in the population regression function $E[y \mid x_1, \dots, x_K, c]$
- Assuming a linear model: $E[y \mid x_1, \dots, x_K, c] = \beta_0 + x\beta + c$, i.e.
$$y = \beta_0 + x\beta + c + \varepsilon$$
  - Interest lies in the $(K \times 1)$ vector $\beta$
  - $c$ is called the unobserved effect

17 The Omitted Variables Problem: What if $cov(x, c) \neq 0$?
$$y = \beta_0 + x\beta + c + \varepsilon$$
(1) Find a proxy for $c$ and estimate $\beta$ using OLS
(2) Find an external instrument for $x$ and apply 2SLS
(3) If we can observe the same units at different points in time (i.e. we can collect a panel data set), we can get consistent estimates of $\beta$ as long as we can assume $c$ to be constant over time
  - Accomplished by transforming the original data ("internal" instruments)

18 The Omitted Variables Problem: The panel solution to omitted variable bias (T = 2)
- Assume we can observe $(y, x)$ at two different points in time: $t = 1$: $(y_1, x_1)$ and $t = 2$: $(y_2, x_2)$
- The population regression function is $E[y_t \mid x_t, c] = \beta_0 + x_t\beta + c$, or
$$y_t = \beta_0 + x_t\beta + c + u_t$$
  where by definition $E[u_t \mid x_t, c] = 0$ ($t = 1, 2$)
- What about $E[c \mid x_t]$?
  - If $E[x_t'c] = 0$, we can apply OLS
  - If $E[x_t'c] \neq 0$, pooled OLS is biased and inconsistent
- But we can take first differences and eliminate $c$:
$$\underbrace{y_2 - y_1}_{\Delta y} = \underbrace{(x_2 - x_1)}_{\Delta x}\beta + \underbrace{u_2 - u_1}_{\Delta u}$$

19 The Omitted Variables Problem: Can we apply OLS for estimation? (T = 2)
$$\Delta y = \Delta x \beta + \Delta u$$
- Exogeneity: $E[\Delta x' \Delta u] = 0$, i.e.
$$E(x_2'u_2) + E(x_1'u_1) - E(x_1'u_2) - E(x_2'u_1) = 0$$
  - Stronger than $E(x_t'u_t) = 0$ ($t = 1, 2$)
  - Strict exogeneity: $cov(x_t, u_s) = 0$ for all $t$ and $s$
  - No restrictions on the correlation between $x_t$ and $c$
- Rank condition: $\operatorname{rank} E(\Delta x' \Delta x) = K$
  - If $x_t$ contains a variable that is constant across time for every member of the population, then $\Delta x$ contains an entry that is identically zero, and the rank condition fails

20 Linear Model, Notation: The basic linear panel data model (1)
- For a randomly drawn cross-section unit $i$, we assume ($i = 1, \dots, N$, $t = 1, \dots, T$):
$$y_{it} = x_{it}'\beta + c_i + u_{it}$$
  - $c_i$: individual effect or individual heterogeneity
  - $u_{it}$: idiosyncratic errors/disturbances
- Assume $c_i$ uncorrelated with $u_{it}$
- Assume $u_{it}$ homoskedastic and serially uncorrelated
- We consider a balanced panel: each cross-section unit $i$ is observed $T$ times (total of $N \times T$ observations)

21 Linear Model, Notation: The basic linear panel data model (2)
- In compact form we can write:
$$y_i = X_i\beta + c_i\iota_T + u_i$$
  where the vectors have dimension $T \times 1$ and $X_i$ stacks the $T$ rows $x_{it}'$:
  $y_i = (y_{i1}, \dots, y_{iT})'$, $X_i = (x_{i1}, \dots, x_{iT})'$, $u_i = (u_{i1}, \dots, u_{iT})'$, $\iota_T = (1, \dots, 1)'$
- Different estimators are available on the basis of the underlying assumptions on the correlation structure of $c_i$
- Asymptotics rely on $N \to \infty$, for fixed $T$

22 Linear Model, OLS estimation: When pooled OLS?
$$y_{it} = x_{it}'\beta + c_i + u_{it} = x_{it}'\beta + v_{it}$$
- $v_{it}$: composite error, sum of the unobserved effect and the idiosyncratic error
- OLS is consistent if $E[x_{it}'v_{it}] = 0$, that is, $E[x_{it}'u_{it}] = 0$ and $E[x_{it}'c_i] = 0$, $t = 1, 2, \dots, T$
- Robust standard errors: the presence of $c_i$ induces correlation over time for the same individual
- OLS is not efficient
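
The need for robust standard errors can be seen in a small simulation. The sketch below uses a single regressor without intercept and a cluster-by-unit variance formula; the design (a persistent regressor that is independent of $c_i$) and all parameter values are hypothetical choices for the illustration.

```python
import math
import random

rng = random.Random(1)
N, T, beta = 2000, 5, 1.0               # hypothetical design

x, y = [], []                            # x[i][t], y[i][t]
for _ in range(N):
    m = rng.gauss(0, 1)                  # persistent part of the regressor
    c = rng.gauss(0, 1)                  # unobserved effect, independent of x
    xi = [m + rng.gauss(0, 1) for _ in range(T)]
    yi = [beta * xt + c + rng.gauss(0, 1) for xt in xi]
    x.append(xi)
    y.append(yi)

# Pooled OLS without intercept (all variables have mean zero by construction).
sxx = sum(xt * xt for xi in x for xt in xi)
sxy = sum(xt * yt for xi, yi in zip(x, y) for xt, yt in zip(xi, yi))
b_pols = sxy / sxx

# Residuals estimate the composite error v_it = c_i + u_it.
v_hat = [[yt - b_pols * xt for xt, yt in zip(xi, yi)] for xi, yi in zip(x, y)]

# Conventional standard error: treats all N*T errors as independent.
s2 = sum(v * v for vi in v_hat for v in vi) / (N * T - 1)
se_naive = math.sqrt(s2 / sxx)

# Cluster-robust standard error: sum the scores within each unit before squaring.
meat = sum(sum(xt * vt for xt, vt in zip(xi, vi)) ** 2 for xi, vi in zip(x, v_hat))
se_cluster = math.sqrt(meat) / sxx
```

Here pooled OLS is consistent because $c_i$ is independent of the regressor, but `se_naive` understates the sampling variability relative to `se_cluster`.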

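23 Linear Model, Random effect estimation: Random effects structure
$$y_{it} = x_{it}'\beta + c_i + u_{it} = x_{it}'\beta + v_{it}$$
- $u_{it}$ homoskedastic and serially uncorrelated: $E[u_i u_i' \mid x_i, c_i] = \sigma_u^2 I_T$
- $c_i$ homoskedastic: $E[c_i^2 \mid x_i] = \sigma_c^2$
- As a result, the error structure has the following form:
$$\Omega_i = E[v_i v_i'] = \begin{pmatrix} \sigma_c^2 + \sigma_u^2 & \sigma_c^2 & \cdots & \sigma_c^2 \\ \sigma_c^2 & \sigma_c^2 + \sigma_u^2 & \cdots & \sigma_c^2 \\ \vdots & & \ddots & \vdots \\ \sigma_c^2 & \sigma_c^2 & \cdots & \sigma_c^2 + \sigma_u^2 \end{pmatrix}_{(T \times T)} = \sigma_c^2 \iota_T \iota_T' + \sigma_u^2 I_T$$
$$E[vv'] = I_N \otimes \Omega_i = \Omega$$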

24 Linear Model, Random effect estimation: GLS estimation (unfeasible)
$$\hat\beta_{RE(GLS)} = \left( \sum_{i=1}^N X_i' \Omega_i^{-1} X_i \right)^{-1} \left( \sum_{i=1}^N X_i' \Omega_i^{-1} y_i \right)$$
- The estimator can be obtained by the OLS regression of $\Omega^{-1/2} y$ on $\Omega^{-1/2} X$, where
$$\Omega^{-1/2} = [I_N \otimes \Omega_i]^{-1/2} = I_N \otimes \Omega_i^{-1/2}, \qquad \Omega_i^{-1/2} = \frac{1}{\sigma_u}\left[ I_T - \frac{\theta}{T}\iota_T \iota_T' \right], \qquad \theta = 1 - \frac{\sigma_u}{\sqrt{\sigma_u^2 + T\sigma_c^2}}$$
- The GLS estimator can be obtained by the OLS regression of $(y_{it} - \theta\bar{y}_i)$ on $(x_{it} - \theta\bar{x}_i)$
- If $\sigma_c^2 = 0$, then $\theta = 0$: RE = OLS (no unobserved heterogeneity; Breusch-Pagan LM statistic)
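
The expression for $\Omega_i^{-1/2}$ can be checked numerically. The sketch below builds $\Omega_i$ for hypothetical variance components and verifies that pre- and post-multiplying it by the proposed matrix gives the identity; `matmul` is a small helper written for this illustration.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

T, sigma_u2, sigma_c2 = 3, 0.5, 2.0      # hypothetical variance components

# Omega_i = sigma_c^2 * iota iota' + sigma_u^2 * I_T
omega = [[sigma_c2 + (sigma_u2 if t == s else 0.0) for s in range(T)]
         for t in range(T)]

theta = 1.0 - math.sqrt(sigma_u2) / math.sqrt(sigma_u2 + T * sigma_c2)

# Candidate Omega_i^{-1/2} = (1 / sigma_u) * (I_T - (theta / T) * iota iota')
p = [[((1.0 if t == s else 0.0) - theta / T) / math.sqrt(sigma_u2) for s in range(T)]
     for t in range(T)]

# If p is the inverse square root of omega, p * omega * p is the identity matrix.
check = matmul(matmul(p, omega), p)
```

Applying `p` to a vector $y_i$ returns $(y_{it} - \theta\bar{y}_i)/\sigma_u$ in each row, which is the quasi-demeaning transformation named on the slide up to the scale factor $1/\sigma_u$.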

25 Linear Model, Random effect estimation: GLS estimation (feasible)
- In order to implement the RE procedure, we need to obtain $\hat\sigma_c^2$ and $\hat\sigma_u^2$:
$$\hat\beta_{RE(FGLS)} = \left( \sum_{i=1}^N X_i' \hat\Omega_i^{-1} X_i \right)^{-1} \left( \sum_{i=1}^N X_i' \hat\Omega_i^{-1} y_i \right)$$
- To get $\hat\Omega$ (i.e. $\hat\sigma_c^2$ and $\hat\sigma_u^2$), Wooldridge suggests:
  - estimate $\sigma_v^2 = \sigma_c^2 + \sigma_u^2$ from pooled OLS residuals
  - as $\sigma_c^2 = E[v_{it}v_{is}]$ for $t \neq s$, the autocorrelation in the OLS residuals can be exploited to obtain an estimate of $\sigma_c^2$
  - $\hat\sigma_u^2$ can be recovered by taking the difference $\hat\sigma_v^2 - \hat\sigma_c^2$
- Alternative procedure described in Greene (Maddala and Mount, 1973)
- In small samples you can have $\hat\sigma_c^2 < 0$!
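
The residual-based steps listed above can be sketched as follows. The function name `variance_components` and the degrees-of-freedom argument `k` are choices made for this illustration; the averaging over pairs $t < s$ is one common way to exploit $\sigma_c^2 = E[v_{it}v_{is}]$ and is shown here under the assumption of a balanced panel.

```python
import random

def variance_components(v_hat, k=0):
    """Estimate (sigma_c^2, sigma_u^2) from residuals v_hat[i][t], balanced panel.

    k is the number of estimated slope parameters, used as a degrees-of-freedom
    correction (k = 0 when the true composite errors are supplied).
    """
    n, t = len(v_hat), len(v_hat[0])
    # sigma_v^2 = sigma_c^2 + sigma_u^2: average squared residual.
    s_v2 = sum(v * v for vi in v_hat for v in vi) / (n * t - k)
    # sigma_c^2 = E[v_it v_is], t != s: average cross-product over pairs (a, b), a < b.
    cross = sum(vi[a] * vi[b] for vi in v_hat
                for a in range(t) for b in range(a + 1, t))
    s_c2 = cross / (n * t * (t - 1) / 2 - k)
    # sigma_u^2 recovered as the difference; s_c2 itself can be negative in small samples.
    return s_c2, s_v2 - s_c2

# Illustration on simulated composite errors with sigma_c^2 = 1, sigma_u^2 = 0.25.
rng = random.Random(2)
errors = []
for _ in range(4000):
    c = rng.gauss(0, 1.0)
    errors.append([c + rng.gauss(0, 0.5) for _ in range(4)])

s_c2, s_u2 = variance_components(errors)
```

Nothing in the function prevents a negative `s_c2`, which mirrors the small-sample caveat on the slide.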

26 Linear Model, Random effect estimation: Random effect estimation
$$y_{it} = x_{it}'\beta + c_i + u_{it}$$
- Obtained from the OLS regression of $(y_{it} - \theta\bar{y}_i)$ on $(x_{it} - \theta\bar{x}_i)$ (in the more general case: OLS regression of $\Omega^{-1/2}y$ on $\Omega^{-1/2}X$)
- Assumptions (stronger than OLS):
  (1) Strict exogeneity: $E[x_{is}'u_{it}] = 0$ for each $s, t = 1, \dots, T$
  (2) Orthogonality between $c_i$ and each $x_{it}$: $E[c_i \mid x_i] = E[c_i] = 0$
  (3) Rank condition: $\operatorname{rank} E[X_i'\Omega^{-1}X_i] = K$, where $\Omega = E[v_i v_i']$
- Why RE? It exploits the serial correlation of the error term in a GLS framework: efficient

27 Linear Model, Random effect estimation: The strict exogeneity assumption
$$y_{it} = x_{it}'\beta + c_i + u_{it}$$
$$E[y_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}, c_i] = E[y_{it} \mid x_{it}, c_i] = x_{it}'\beta + c_i$$
- Once $x_{it}$ and $c_i$ are controlled for, $x_{is}$ has no partial effect on $y_{it}$ for $s \neq t$
- $\{x_{it}, t = 1, \dots, T\}$ are strictly exogenous conditional on the unobserved effect $c_i$
- The strict exogeneity assumption can be stated in terms of the idiosyncratic error term: $E[u_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}, c_i] = 0$
- This implies that the explanatory variables in each time period are uncorrelated with the idiosyncratic error in each time period: $E[x_{is}'u_{it}] = 0$ for each $s, t = 1, \dots, T$
- Stronger than zero contemporaneous correlation: $E[x_{it}'u_{it}] = 0$

28 Linear Model, Fixed effect estimation: Fixed effect framework
- We maintain the strict exogeneity assumption: $E[u_{it} \mid x_i, c_i] = 0$
- We allow $c_i$ to be arbitrarily correlated with $x_i$
- FE is more robust than RE
  - We can consistently estimate partial effects in the presence of time-constant omitted variables that can be related to the observables $x_i$
  - BUT we cannot include time-constant factors in $x_i$ (e.g. gender, race in the analysis of individuals; foundation year for firms; ...)
- To get estimates we transform the equation to remove $c_i$ and apply OLS:
  - Dummy variable regression
  - Within transformation
  - First difference

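29 Linear Model, Fixed effect estimation: Dummy variable regression, Least Squares Dummy Variables (LSDV)
$$y_i = X_i\beta + c_i\iota_T + u_i$$
- Collecting the terms over the $N$ units gives:
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}\beta + \begin{pmatrix} \iota_T & 0 & \cdots & 0 \\ 0 & \iota_T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \iota_T \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} + \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}$$
- Or, letting $d_i$ be a dummy variable indicating unit $i$:
$$y = [X \; d_1 \; d_2 \; \dots \; d_N]\begin{pmatrix} \beta \\ c \end{pmatrix} + u = X\beta + Dc + u$$
- Classical regression model with $K + N$ parameters
- What if $N$ is in the thousands?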

30 Linear Model, Fixed effect estimation: Dummy variable regression, Discussion
- The parameter of interest is $\beta$
- $c_i$: nuisance parameters that only increase the computational complexity of estimation
- Incidental parameter problem: increasing $N$ also increases the number of $c_i$ to be estimated
- Solution: use the within-group (WG) transformation
- Numerically, LSDV and the WG transformation lead to the same estimate of $\beta$ (a result of partitioned regression, just algebra)
- The estimate of $\beta$ is easier to compute with WG (an important issue some years ago...)

31 Linear Model, Fixed effect estimation: Within-group (WG) transformation
- We transform the model in order to remove the term $c_i$
- For individual $i$ at time $t$: $y_{it} = x_{it}'\beta + c_i + u_{it}$
- For individual $i$, the average over the $T$ periods is: $\bar{y}_i = \bar{x}_i'\beta + c_i + \bar{u}_i$
- Therefore, by taking deviations from group means, we get:
$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (u_{it} - \bar{u}_i)$$
- Under the assumption of strict exogeneity, we can apply OLS to the transformed data to get a consistent estimate of $\beta$
- Estimates of $c_i$ can be computed as $\hat{c}_i = \bar{y}_i - \bar{x}_i'\hat\beta$ (unbiased; not consistent for fixed $T$ and $N \to \infty$)
- The F test can be applied for the joint significance of the $c_i$
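
The numerical equivalence between LSDV and the WG transformation, and the recovery of $\hat{c}_i$ from the group means, can be checked on a tiny simulated panel. This is a sketch in plain Python with a single regressor; the data-generating values and the helper `solve` are written for this illustration only.

```python
import random

def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z

rng = random.Random(3)
N, T = 6, 4                                   # hypothetical small panel
x = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(N)]
c = [rng.gauss(0, 2) for _ in range(N)]
y = [[0.7 * x[i][t] + c[i] + rng.gauss(0, 1) for t in range(T)] for i in range(N)]

# LSDV: regress y on x and N unit dummies (no common intercept).
rows = [[x[i][t]] + [1.0 if j == i else 0.0 for j in range(N)]
        for i in range(N) for t in range(T)]
yy = [y[i][t] for i in range(N) for t in range(T)]
xtx = [[sum(r[p] * r[q] for r in rows) for q in range(N + 1)] for p in range(N + 1)]
xty = [sum(r[p] * v for r, v in zip(rows, yy)) for p in range(N + 1)]
coef = solve(xtx, xty)
b_lsdv, c_lsdv = coef[0], coef[1:]

# WG: regress deviations from unit means, then recover c_i from the means.
xbar = [sum(xi) / T for xi in x]
ybar = [sum(yi) / T for yi in y]
num = sum((x[i][t] - xbar[i]) * (y[i][t] - ybar[i]) for i in range(N) for t in range(T))
den = sum((x[i][t] - xbar[i]) ** 2 for i in range(N) for t in range(T))
b_wg = num / den
c_wg = [ybar[i] - b_wg * xbar[i] for i in range(N)]
```

The LSDV system has $K + N$ unknowns, whereas the WG computation only involves sums over the demeaned data, which is the computational advantage noted in the discussion of LSDV.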

32 Linear Model, Fixed effect estimation: Fixed effect estimation
$$y_{it} = x_{it}'\beta + c_i + u_{it}$$
- WG: OLS regression of $y_{it} - \bar{y}_i$ on $x_{it} - \bar{x}_i$ (removes $c_i$)
- Assumptions:
  (1) Strict exogeneity: $E[x_{is}'u_{it}] = 0$ for each $s, t = 1, \dots, T$
  (2) Rank condition: $\operatorname{rank}\left( \sum_{t=1}^T E[\ddot{x}_{it}'\ddot{x}_{it}] \right) = \operatorname{rank} E[\ddot{X}_i'\ddot{X}_i] = K$, where $\ddot{x}_{it} = x_{it} - \bar{x}_i$
- No assumption about the correlation of $c_i$ and each $x_{it}$: consistent even if $E[c_i \mid x_i] \neq 0$
- More robust than RE, but the effect of time-invariant variables cannot be identified
- Efficient if $u_{it}$ is homoskedastic and uncorrelated over time

33 Linear Model, Fixed effect estimation: First difference (FD)
- Another way to remove the term $c_i$ from the equation is to take first differences:
$$y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (u_{it} - u_{i,t-1})$$
- OLS can be applied for estimation if $\Delta x_{it}$ is uncorrelated with $\Delta u_{it}$ (satisfied under strict exogeneity)
- However, it is not efficient, due to the correlation introduced among the error terms $\Delta u_{it}$ and $\Delta u_{i,t-1}$ (if $u_{it}$ is uncorrelated over time)
- For example, for $T = 3$:
$$\Delta y_{i2} = \Delta x_{i2}'\beta + (u_{i2} - u_{i1}), \qquad \Delta y_{i3} = \Delta x_{i3}'\beta + (u_{i3} - u_{i2})$$
- GLS estimation could be employed to solve the problem: you get the within-group estimator
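
The correlation induced by differencing can be made explicit for $T = 3$. A short derivation, assuming $u_{it}$ is homoskedastic with variance $\sigma_u^2$ and serially uncorrelated:

```latex
\begin{align*}
Var(\Delta u_{i2}) &= Var(u_{i2} - u_{i1}) = 2\sigma_u^2 \\
Var(\Delta u_{i3}) &= Var(u_{i3} - u_{i2}) = 2\sigma_u^2 \\
Cov(\Delta u_{i2}, \Delta u_{i3}) &= Cov(u_{i2} - u_{i1},\, u_{i3} - u_{i2}) = -\sigma_u^2 \\
E[\Delta u_i \Delta u_i'] &= \sigma_u^2 \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}
\end{align*}
```

The non-zero off-diagonal term is why OLS on the differenced equations is not efficient in this case.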

34 Linear Model, Fixed effect estimation: First difference estimation
$$y_{it} = x_{it}'\beta + c_i + u_{it}$$
- FD: OLS regression of $\Delta y_{it}$ on $\Delta x_{it}$ (removes $c_i$)
- Assumptions:
  (1) $E[\Delta x_{it}'\Delta u_{it}] = 0$, that is, $E[x_{is}'u_{it}] = 0$ for each $t = 1, \dots, T$ and $s = t-1, t, t+1$; satisfied under strict exogeneity
  (2) Rank condition: $\operatorname{rank} E[\Delta X_i'\Delta X_i] = K$
- No assumption about the correlation of $c_i$ and each $x_{it}$: consistent even if $E[c_i \mid x_i] \neq 0$
- More robust than RE, but the effect of time-invariant variables cannot be identified

35 Linear Model, Fixed effect estimation: Non-spherical $u_{it}$
- What if $\Omega_i \neq \sigma_c^2\iota_T\iota_T' + \sigma_u^2 I_T$? That is, $u_{it}$ heteroskedastic and/or correlated over time
- If $E(c_i \mid x_i) \neq 0$, then the FE estimator is still consistent (under strict exogeneity); it is no longer efficient
  - Robust formulas should be employed for the computation of the standard errors!
  - $\hat\beta_{FD}$ is efficient if $u_{it}$ is a random walk ($\Delta u_{it}$ serially uncorrelated)
- If $E(c_i \mid x_i) = 0$ (the orthogonality condition holds), then the RE estimator remains consistent (under strict exogeneity); it is no longer efficient
  - A more general estimator of $\Omega_i$ can be obtained as $\hat\Omega = N^{-1}\sum_{i=1}^N \hat{v}_i\hat{v}_i'$, with $\hat{v}_i$ the pooled OLS residuals (efficient in the more general case)
  - Assume alternative specifications: parametric assumptions about the correlation structure in $u_{it}$, e.g. AR(1), and perform GLS estimation

36 Linear Model, Which one to choose?: WG vs. FD
- WG: OLS regression of $(y_{it} - \bar{y}_i)$ on $(x_{it} - \bar{x}_i)$
- FD: OLS regression of $\Delta y_{it}$ on $\Delta x_{it}$
- Both WG and FD produce unbiased and consistent estimates of the parameter of interest $\beta$, as $c_i$ is removed from the regression
  - The estimate of $\beta$ is not affected by the correlation (if any) between $c_i$ and $x_i$
  - Generally, if the two estimators are different, this can be interpreted as evidence against the assumption of strict exogeneity
- When $T = 2$, $\hat\beta_{WG} = \hat\beta_{FD}$
- If $T \geq 3$, under homoskedasticity and no serial correlation of $u$, $\hat\beta_{WG}$ is to be preferred because it is efficient
- If $u$ is not homoskedastic and serially uncorrelated, the choice depends on the assumptions about $u_{it}$:
  - If $u_{it}$ is a random walk, then $\Delta u_{it}$ is serially uncorrelated: $\hat\beta_{FD}$ is efficient
  - In the more general setup, use FD or WG with robust standard errors!
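
The statement that the two estimators coincide when $T = 2$ can be verified numerically. The sketch below uses a single regressor and runs both regressions without an intercept (an assumption of this illustration: adding an intercept to the FD regression would correspond to adding a period dummy to the WG regression). All simulated values are hypothetical.

```python
import random

rng = random.Random(4)
N = 200

x, y = [], []                               # x[i] = [x_i1, x_i2], y[i] = [y_i1, y_i2]
for _ in range(N):
    xi = [rng.gauss(0, 1), rng.gauss(0, 1)]
    ci = 2.0 * sum(xi)                      # unit effect correlated with x
    x.append(xi)
    y.append([1.5 * xt + ci + rng.gauss(0, 1) for xt in xi])

# FD: regress (y_i2 - y_i1) on (x_i2 - x_i1).
dx = [xi[1] - xi[0] for xi in x]
dy = [yi[1] - yi[0] for yi in y]
b_fd = sum(p * q for p, q in zip(dx, dy)) / sum(p * p for p in dx)

# WG: regress deviations from unit means on each other.
num = den = 0.0
for xi, yi in zip(x, y):
    xm, ym = sum(xi) / 2, sum(yi) / 2
    for xt, yt in zip(xi, yi):
        num += (xt - xm) * (yt - ym)
        den += (xt - xm) ** 2
b_wg = num / den
```

With two periods the within deviations are $\pm\Delta x_i/2$ and $\pm\Delta y_i/2$, so the two ratios are algebraically identical.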

37 Linear Model, Which one to choose?: FE vs. RE (1)
- Traditional approach: $c_i$ treated either as a parameter to be estimated or as a random disturbance
  - Philosophical issue
  - Wrongheaded in microeconometric applications
- Modern terminology: fixed effects estimation vs. random effects estimation
  - The difference is in the assumptions about $E[c_i \mid x_i]$
  - FE allows consistent estimation of $\beta$ even in cases where $c_i$ is correlated with $x_i$
  - RE requires $c_i$ to be uncorrelated with $x_i$

38 Linear Model, Which one to choose?: FE vs. RE (2)
- FE: OLS regression of $(y_{it} - \bar{y}_i)$ on $(x_{it} - \bar{x}_i)$
  - Only within variation is considered
- RE: OLS regression of $(y_{it} - \theta\bar{y}_i)$ on $(x_{it} - \theta\bar{x}_i)$
  - Both within and between variation are employed for estimation
  - It is possible to show that $\hat\beta_{RE} = \Lambda\hat\beta_B + (I_K - \Lambda)\hat\beta_{FE}$, with $\hat\beta_B$ obtained from the OLS regression of $\bar{y}_i$ on $\bar{x}_i$
  - $\theta = 1 - \dfrac{\sigma_u}{\sqrt{\sigma_u^2 + T\sigma_c^2}}$: if $T \to \infty$, RE = FE; you need a different framework!
- Key: correlation between $c_i$ and $x_{it}$
  - If $E[c_i \mid x_{it}] = E[c_i]$ ($= 0$): RE is consistent and efficient, FE is consistent
  - If $E[c_i \mid x_{it}] \neq E[c_i]$: FE is consistent, but RE is not
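
The role of $\theta$ in moving between pooled OLS and the within estimator can be illustrated with the quasi-demeaned regression. This is a sketch with a single regressor; the function names, the simulated design, and the intermediate value $\theta = 0.6$ are hypothetical choices for the illustration.

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept (single regressor)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((p - mx) * (q - my) for p, q in zip(x, y))
            / sum((p - mx) ** 2 for p in x))

def quasi_demeaned_slope(x, y, theta):
    """Regress (y_it - theta*ybar_i) on (x_it - theta*xbar_i); x, y hold one list per unit."""
    xs, ys = [], []
    for xi, yi in zip(x, y):
        xm, ym = sum(xi) / len(xi), sum(yi) / len(yi)
        xs += [xt - theta * xm for xt in xi]
        ys += [yt - theta * ym for yt in yi]
    return slope(xs, ys)

rng = random.Random(5)
x, y = [], []
for _ in range(300):
    c = rng.gauss(0, 1)
    xi = [0.5 * c + rng.gauss(0, 1) for _ in range(4)]    # x correlated with c
    x.append(xi)
    y.append([1.0 * xt + c + rng.gauss(0, 1) for xt in xi])

b_pooled = quasi_demeaned_slope(x, y, 0.0)    # theta = 0: pooled OLS
b_within = quasi_demeaned_slope(x, y, 1.0)    # theta = 1: within (FE) estimator
b_between_them = quasi_demeaned_slope(x, y, 0.6)   # an intermediate theta
```

Because $c_i$ is correlated with the regressor in this design, only the $\theta = 1$ regression recovers the true slope of 1; intermediate values of $\theta$ give estimates between the pooled and the within results.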

39 Linear Model, Which one to choose?: FE vs. RE, The Hausman test
- Both FE and RE assume strict exogeneity
- If $E[c_i \mid x] = E[c_i]$ ($= 0$):
  - Both $\hat\beta_{FE}$ and $\hat\beta_{RE}$ are consistent for $\beta$: $\hat\beta_{FE} - \hat\beta_{RE} \to 0$
  - $\hat\beta_{RE}$ is efficient: $Var(\hat\beta_{FE})$ is greater than $Var(\hat\beta_{RE})$
- If $E[c_i \mid x] \neq E[c_i]$:
  - $\hat\beta_{FE}$ is consistent, but $\hat\beta_{RE}$ is biased: $\hat\beta_{FE} - \hat\beta_{RE} \not\to 0$
- We can apply the Hausman test:
$$(\hat\beta_{FE} - \hat\beta_{RE})'\left[ Var(\hat\beta_{FE}) - Var(\hat\beta_{RE}) \right]^{-1}(\hat\beta_{FE} - \hat\beta_{RE}) \sim \chi^2_K$$
- Remark: two maintained hypotheses (not tested!): (i) strict exogeneity; (ii) random effect structure of the covariance (under the null, RE has to be efficient: valid under spherical $u_{it}$)
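
For a single coefficient ($K = 1$) the Hausman statistic reduces to a scalar ratio, which the sketch below computes. The function name `hausman_scalar` and the input numbers are hypothetical; they are not estimates from any data set.

```python
def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic for a single coefficient (K = 1), chi-squared with 1 df."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # Can occur in finite samples; the statistic is then not defined.
        raise ValueError("Var(b_FE) - Var(b_RE) must be positive")
    return (b_fe - b_re) ** 2 / diff_var

# Hypothetical estimates, for illustration only.
h = hausman_scalar(b_fe=0.50, b_re=0.40, var_fe=0.010, var_re=0.006)
reject_at_5pct = h > 3.841    # 5% critical value of the chi-squared with 1 df
```

The explicit check on the variance difference reflects the maintained hypothesis that RE is efficient under the null: when that fails, the difference need not be positive, which is one motivation for the robust version of the test.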

40 Linear Model, Which one to choose?: Between FE and RE: Correlated random effects (Mundlak, 1978; Chamberlain, 1982, 1984)
- RE assumes no correlation between $c_i$ and $x_{it}$
- Richer models can be specified that relax this assumption
- Mundlak (1978): $c_i = \bar{x}_i'\pi + w_i$ with $w_i$ i.i.d.
  - GLS estimation of the regression of $y_{it}$ on $x_{it}$ and $\bar{x}_i$ produces the fixed effect estimator
- Chamberlain (1982, 1984): $c_i = x_{i1}'\pi_1 + \dots + x_{iT}'\pi_T + w_i$
  - Estimation of the extended model by the minimum distance method produces the fixed effect estimator
- In nonlinear models, fixed effect models are not always estimable and richer RE models provide an alternative approach

41 Linear Model, Which one to choose?: FE vs. RE, A robust version of the Hausman test
- Starting from the Mundlak (1978) definition (linear projection) $c_i = \bar{x}_i'\pi + w_i$ with $w_i$ i.i.d., we can write:
$$y_{it} = x_{it}'\beta + c_i + u_{it} = x_{it}'\beta + \bar{x}_i'\pi + (w_i + u_{it})$$
- GLS estimation produces $\hat\beta_{GLS} = \hat\beta_{FE}$ and $\hat\pi_{GLS} = \hat\beta_{BET} - \hat\beta_{FE}$ ($\hat\beta_{BET}$: OLS estimate in the regression of $\bar{y}_i$ on $\bar{x}_i$)
- The Hausman test can be carried out by testing $H_0: \pi = 0$ in the extended regression
- Robust version of the Hausman test: use a robust Wald statistic in the context of pooled OLS (strict exogeneity is still needed, but we can relax the efficiency of RE under the null)
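
Why $\pi$ captures the gap between the between and the fixed-effects coefficients can be seen by averaging the extended regression over $t$ and then subtracting. A sketch in the slide's notation:

```latex
\begin{align*}
y_{it} &= x_{it}'\beta + \bar{x}_i'\pi + (w_i + u_{it}) \\
\bar{y}_i &= \bar{x}_i'(\beta + \pi) + (w_i + \bar{u}_i) \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + (u_{it} - \bar{u}_i)
\end{align*}
```

The regression in unit means identifies $\beta + \pi$ and the regression in deviations identifies $\beta$, so $\pi = 0$ is the statement that the between and within coefficients coincide.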

42 Goodness of fit: The $R^2$ with panel data
- $R^2$ as the square of the correlation coefficient between observed and fitted values
- Total variability can be decomposed into within and between variability:
$$\frac{1}{NT}\sum_{i,t}(y_{it} - \bar{y})^2 = \frac{1}{NT}\sum_{i,t}(y_{it} - \bar{y}_i)^2 + \frac{1}{NT}\sum_{i,t}(\bar{y}_i - \bar{y})^2$$
- Stata provides three $R^2$ statistics:
  - $R^2_{within} = corr^2\left( (x_{it} - \bar{x}_i)'\hat\beta_{FE},\; y_{it} - \bar{y}_i \right)$
  - $R^2_{between} = corr^2\left( \bar{x}_i'\hat\beta_B,\; \bar{y}_i \right)$
  - $R^2_{overall} = corr^2\left( x_{it}'\hat\beta_{OLS},\; y_{it} \right)$
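
The within/between decomposition of total variability is an algebraic identity for a balanced panel and can be checked directly. A minimal sketch on simulated data (the design values are hypothetical):

```python
import random

rng = random.Random(6)
N, T = 50, 4
y = []                                        # y[i][t]
for _ in range(N):
    c = rng.gauss(0, 1)
    y.append([c + rng.gauss(0, 1) for _ in range(T)])

grand = sum(v for yi in y for v in yi) / (N * T)     # overall mean
means = [sum(yi) / T for yi in y]                    # unit means

total = sum((v - grand) ** 2 for yi in y for v in yi) / (N * T)
within = sum((v - m) ** 2 for yi, m in zip(y, means) for v in yi) / (N * T)
# Each unit mean appears T times in the sum over (i, t), so dividing by N is
# the same as (1 / NT) times the sum over i and t.
between = sum((m - grand) ** 2 for m in means) / N
```

The cross term vanishes because deviations from the unit mean sum to zero within each unit, which is why `total` equals `within + between` up to rounding.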

43 Discussion
- Source of the examples: Wooldridge
- Two questions:
  - Is the unobserved effect $c_i$ uncorrelated with $x_{it}$ for all $t$?
  - Is the strict exogeneity assumption (conditional on $c_i$) reasonable?
- Examples:
  (a) Program evaluation:
$$\log(wage_{it}) = \theta_t + z_{it}\gamma + \delta_1 prog_{it} + c_i + u_{it}$$
  (b) Distributed lag model (Hausman, Hall, Griliches, 1984):
$$patents_{it} = \theta_t + z_{it}\gamma + \delta_0 RD_{it} + \delta_1 RD_{i,t-1} + \dots + \delta_5 RD_{i,t-5} + c_i + u_{it}$$
  (c) Lagged dependent variable:
$$\log(wage_{it}) = \beta_1 \log(wage_{i,t-1}) + c_i + u_{it}$$

44 Main References
- Baltagi BH (2001): Econometric Analysis of Panel Data, John Wiley & Sons Ltd.
- Chamberlain G (1984): Panel Data, in Griliches and Intriligator (eds.), Handbook of Econometrics, Vol. 2, Elsevier Science, Amsterdam
- Greene WH (2003): Econometric Analysis, Prentice Hall, ch. 13
- Hsiao C (2003): Analysis of Panel Data, Cambridge University Press
- Mundlak Y (1978): On the Pooling of Time Series and Cross Section Data, Econometrica, 46(1)
- Verbeek M (2006): A Guide to Modern Econometrics, ch. 10
- Wooldridge JM (2002): Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, ch. 10


Poor identification and estimation problems in panel data models with random effects and autocorrelated errors

Poor identification and estimation problems in panel data models with random effects and autocorrelated errors Poor identification and estimation problems in panel data models with random effects and autocorrelated errors Giorgio Calzolari Laura Magazzini January 7, 009 Submitted for presentation at the 15th Conference

More information

Longitudinal (Panel and Time Series Cross-Section) Data

Longitudinal (Panel and Time Series Cross-Section) Data Longitudinal (Panel and Time Series Cross-Section) Data Nathaniel Beck Department of Politics NYU New York, NY 10012 nathaniel.beck@nyu.edu http://www.nyu.edu/gsas/dept/politics/faculty/beck/beck home.html

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

Fixed Effects Bias in Panel Data Estimators

Fixed Effects Bias in Panel Data Estimators DISCUSSION PAPER SERIES IZA DP No. 3487 Fixed Effects Bias in Panel Data Estimators Hielke Buddelmeyer Paul H. Jensen Umut Oguzoglu Elizabeth Webster May 2008 Forschungsinstitut zur Zukunft der Arbeit

More information

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

More information

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College.

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College. The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables Kathleen M. Lang* Boston College and Peter Gottschalk Boston College Abstract We derive the efficiency loss

More information

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

More information

Multiple Choice Models II

Multiple Choice Models II Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical

More information

Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

More information

UNIVERSITY OF WAIKATO. Hamilton New Zealand

UNIVERSITY OF WAIKATO. Hamilton New Zealand UNIVERSITY OF WAIKATO Hamilton New Zealand Can We Trust Cluster-Corrected Standard Errors? An Application of Spatial Autocorrelation with Exact Locations Known John Gibson University of Waikato Bonggeun

More information

Employer-Provided Health Insurance and Labor Supply of Married Women

Employer-Provided Health Insurance and Labor Supply of Married Women Upjohn Institute Working Papers Upjohn Research home page 2011 Employer-Provided Health Insurance and Labor Supply of Married Women Merve Cebi University of Massachusetts - Dartmouth and W.E. Upjohn Institute

More information

Introduction to Regression Models for Panel Data Analysis. Indiana University Workshop in Methods October 7, 2011. Professor Patricia A.

Introduction to Regression Models for Panel Data Analysis. Indiana University Workshop in Methods October 7, 2011. Professor Patricia A. Introduction to Regression Models for Panel Data Analysis Indiana University Workshop in Methods October 7, 2011 Professor Patricia A. McManus Panel Data Analysis October 2011 What are Panel Data? Panel

More information

Sample Size Calculation for Longitudinal Studies

Sample Size Calculation for Longitudinal Studies Sample Size Calculation for Longitudinal Studies Phil Schumm Department of Health Studies University of Chicago August 23, 2004 (Supported by National Institute on Aging grant P01 AG18911-01A1) Introduction

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

Using instrumental variables techniques in economics and finance

Using instrumental variables techniques in economics and finance Using instrumental variables techniques in economics and finance Christopher F Baum 1 Boston College and DIW Berlin German Stata Users Group Meeting, Berlin, June 2008 1 Thanks to Mark Schaffer for a number

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

The Effect of R&D Expenditures on Stock Returns, Price and Volatility

The Effect of R&D Expenditures on Stock Returns, Price and Volatility Master Degree Project in Finance The Effect of R&D Expenditures on Stock Returns, Price and Volatility A study on biotechnological and pharmaceutical industry in the US market Aleksandra Titi Supervisor:

More information

1 Teaching notes on GMM 1.

1 Teaching notes on GMM 1. Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in

More information

Solución del Examen Tipo: 1

Solución del Examen Tipo: 1 Solución del Examen Tipo: 1 Universidad Carlos III de Madrid ECONOMETRICS Academic year 2009/10 FINAL EXAM May 17, 2010 DURATION: 2 HOURS 1. Assume that model (III) verifies the assumptions of the classical

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

Introduction to Path Analysis

Introduction to Path Analysis This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Introduction to Fixed Effects Methods

Introduction to Fixed Effects Methods Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed

More information

Regression Analysis. Regression Analysis MIT 18.S096. Dr. Kempthorne. Fall 2013

Regression Analysis. Regression Analysis MIT 18.S096. Dr. Kempthorne. Fall 2013 Lecture 6: Regression Analysis MIT 18.S096 Dr. Kempthorne Fall 2013 MIT 18.S096 Regression Analysis 1 Outline Regression Analysis 1 Regression Analysis MIT 18.S096 Regression Analysis 2 Multiple Linear

More information

COURSES: 1. Short Course in Econometrics for the Practitioner (P000500) 2. Short Course in Econometric Analysis of Cointegration (P000537)

COURSES: 1. Short Course in Econometrics for the Practitioner (P000500) 2. Short Course in Econometric Analysis of Cointegration (P000537) Get the latest knowledge from leading global experts. Financial Science Economics Economics Short Courses Presented by the Department of Economics, University of Pretoria WITH 2015 DATES www.ce.up.ac.za

More information

Non-Stationary Time Series andunitroottests

Non-Stationary Time Series andunitroottests Econometrics 2 Fall 2005 Non-Stationary Time Series andunitroottests Heino Bohn Nielsen 1of25 Introduction Many economic time series are trending. Important to distinguish between two important cases:

More information

Testing for Granger causality between stock prices and economic growth

Testing for Granger causality between stock prices and economic growth MPRA Munich Personal RePEc Archive Testing for Granger causality between stock prices and economic growth Pasquale Foresti 2006 Online at http://mpra.ub.uni-muenchen.de/2962/ MPRA Paper No. 2962, posted

More information

Models for Longitudinal and Clustered Data

Models for Longitudinal and Clustered Data Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations

More information

Note 2 to Computer class: Standard mis-specification tests

Note 2 to Computer class: Standard mis-specification tests Note 2 to Computer class: Standard mis-specification tests Ragnar Nymoen September 2, 2013 1 Why mis-specification testing of econometric models? As econometricians we must relate to the fact that the

More information

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052)

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052) Department of Economics Session 2012/2013 University of Essex Spring Term Dr Gordon Kemp EC352 Econometric Methods Solutions to Exercises from Week 10 1 Problem 13.7 This exercise refers back to Equation

More information

Spatial panel models

Spatial panel models Spatial panel models J Paul Elhorst University of Groningen, Department of Economics, Econometrics and Finance PO Box 800, 9700 AV Groningen, the Netherlands Phone: +31 50 3633893, Fax: +31 50 3637337,

More information

From the help desk: Bootstrapped standard errors

From the help desk: Bootstrapped standard errors The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution

More information

Sales forecasting # 1

Sales forecasting # 1 Sales forecasting # 1 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

More information

Lecture 3: Differences-in-Differences

Lecture 3: Differences-in-Differences Lecture 3: Differences-in-Differences Fabian Waldinger Waldinger () 1 / 55 Topics Covered in Lecture 1 Review of fixed effects regression models. 2 Differences-in-Differences Basics: Card & Krueger (1994).

More information

Financial Risk Management Exam Sample Questions/Answers

Financial Risk Management Exam Sample Questions/Answers Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period

More information

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS * Hun Myoung Park,

More information

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

More information

Identifying Non-linearities In Fixed Effects Models

Identifying Non-linearities In Fixed Effects Models Identifying Non-linearities In Fixed Effects Models Craig T. McIntosh and Wolfram Schlenker October 2006 Abstract We discuss the use of quadratic terms in models which include fixed effects or dummy variables.

More information

Longitudinal Meta-analysis

Longitudinal Meta-analysis Quality & Quantity 38: 381 389, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 381 Longitudinal Meta-analysis CORA J. M. MAAS, JOOP J. HOX and GERTY J. L. M. LENSVELT-MULDERS Department

More information

Linear Models for Continuous Data

Linear Models for Continuous Data Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear

More information

Panel Data Analysis Fixed and Random Effects using Stata (v. 4.2)

Panel Data Analysis Fixed and Random Effects using Stata (v. 4.2) Panel Data Analysis Fixed and Random Effects using Stata (v. 4.2) Oscar Torres-Reyna otorres@princeton.edu December 2007 http://dss.princeton.edu/training/ Intro Panel data (also known as longitudinal

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format: Lab 5 Linear Regression with Within-subject Correlation Goals: Data: Fit linear regression models that account for within-subject correlation using Stata. Compare weighted least square, GEE, and random

More information

The following postestimation commands for time series are available for regress:

The following postestimation commands for time series are available for regress: Title stata.com regress postestimation time series Postestimation tools for regress with time series Description Syntax for estat archlm Options for estat archlm Syntax for estat bgodfrey Options for estat

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston

More information

APPROXIMATING THE BIAS OF THE LSDV ESTIMATOR FOR DYNAMIC UNBALANCED PANEL DATA MODELS. Giovanni S.F. Bruno EEA 2004-1

APPROXIMATING THE BIAS OF THE LSDV ESTIMATOR FOR DYNAMIC UNBALANCED PANEL DATA MODELS. Giovanni S.F. Bruno EEA 2004-1 ISTITUTO DI ECONOMIA POLITICA Studi e quaderni APPROXIMATING THE BIAS OF THE LSDV ESTIMATOR FOR DYNAMIC UNBALANCED PANEL DATA MODELS Giovanni S.F. Bruno EEA 2004-1 Serie di econometria ed economia applicata

More information

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

An Investigation of the Statistical Modelling Approaches for MelC

An Investigation of the Statistical Modelling Approaches for MelC An Investigation of the Statistical Modelling Approaches for MelC Literature review and recommendations By Jessica Thomas, 30 May 2011 Contents 1. Overview... 1 2. The LDV... 2 2.1 LDV Specifically in

More information

Implementing Panel-Corrected Standard Errors in R: The pcse Package

Implementing Panel-Corrected Standard Errors in R: The pcse Package Implementing Panel-Corrected Standard Errors in R: The pcse Package Delia Bailey YouGov Polimetrix Jonathan N. Katz California Institute of Technology Abstract This introduction to the R package pcse is

More information

TURUN YLIOPISTO UNIVERSITY OF TURKU TALOUSTIEDE DEPARTMENT OF ECONOMICS RESEARCH REPORTS. A nonlinear moving average test as a robust test for ARCH

TURUN YLIOPISTO UNIVERSITY OF TURKU TALOUSTIEDE DEPARTMENT OF ECONOMICS RESEARCH REPORTS. A nonlinear moving average test as a robust test for ARCH TURUN YLIOPISTO UNIVERSITY OF TURKU TALOUSTIEDE DEPARTMENT OF ECONOMICS RESEARCH REPORTS ISSN 0786 656 ISBN 951 9 1450 6 A nonlinear moving average test as a robust test for ARCH Jussi Tolvi No 81 May

More information

Is Infrastructure Capital Productive? A Dynamic Heterogeneous Approach.

Is Infrastructure Capital Productive? A Dynamic Heterogeneous Approach. Is Infrastructure Capital Productive? A Dynamic Heterogeneous Approach. César Calderón a, Enrique Moral-Benito b, Luis Servén a a The World Bank b CEMFI International conference on Infrastructure Economics

More information

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector Journal of Modern Accounting and Auditing, ISSN 1548-6583 November 2013, Vol. 9, No. 11, 1519-1525 D DAVID PUBLISHING A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing

More information

APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING

APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING Sulaimon Mutiu O. Department of Statistics & Mathematics Moshood Abiola Polytechnic, Abeokuta, Ogun State, Nigeria. Abstract

More information

Redistributional impact of the National Health Insurance System

Redistributional impact of the National Health Insurance System Redistributional impact of the National Health Insurance System in France: A microsimulation approach Valrie Albouy (INSEE) Laurent Davezies (INSEE-CREST-IRDES) Thierry Debrand (IRDES) Brussels, 4-5 March

More information

Sovereign Defaults. Iskander Karibzhanov. October 14, 2014

Sovereign Defaults. Iskander Karibzhanov. October 14, 2014 Sovereign Defaults Iskander Karibzhanov October 14, 214 1 Motivation Two recent papers advance frontiers of sovereign default modeling. First, Aguiar and Gopinath (26) highlight the importance of fluctuations

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

1. THE LINEAR MODEL WITH CLUSTER EFFECTS

1. THE LINEAR MODEL WITH CLUSTER EFFECTS What s New in Econometrics? NBER, Summer 2007 Lecture 8, Tuesday, July 31st, 2.00-3.00 pm Cluster and Stratified Sampling These notes consider estimation and inference with cluster samples and samples

More information

Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

More information

The marginal cost of rail infrastructure maintenance; does more data make a difference?

The marginal cost of rail infrastructure maintenance; does more data make a difference? The marginal cost of rail infrastructure maintenance; does more data make a difference? Kristofer Odolinski, Jan-Eric Nilsson, Åsa Wikberg Swedish National Road and Transport Research Institute, Department

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Department of Economics

Department of Economics Department of Economics On Testing for Diagonality of Large Dimensional Covariance Matrices George Kapetanios Working Paper No. 526 October 2004 ISSN 1473-0278 On Testing for Diagonality of Large Dimensional

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

The Method of Least Squares

The Method of Least Squares Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

Estimating the random coefficients logit model of demand using aggregate data

Estimating the random coefficients logit model of demand using aggregate data Estimating the random coefficients logit model of demand using aggregate data David Vincent Deloitte Economic Consulting London, UK davivincent@deloitte.co.uk September 14, 2012 Introduction Estimation

More information

Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring. Jie-Men Mok Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

More information

Estimating the marginal cost of rail infrastructure maintenance. using static and dynamic models; does more data make a

Estimating the marginal cost of rail infrastructure maintenance. using static and dynamic models; does more data make a Estimating the marginal cost of rail infrastructure maintenance using static and dynamic models; does more data make a difference? Kristofer Odolinski and Jan-Eric Nilsson Address for correspondence: The

More information

Econometric analysis of the Belgian car market

Econometric analysis of the Belgian car market Econometric analysis of the Belgian car market By: Prof. dr. D. Czarnitzki/ Ms. Céline Arts Tim Verheyden Introduction In contrast to typical examples from microeconomics textbooks on homogeneous goods

More information

Statistical Tests for Multiple Forecast Comparison

Statistical Tests for Multiple Forecast Comparison Statistical Tests for Multiple Forecast Comparison Roberto S. Mariano (Singapore Management University & University of Pennsylvania) Daniel Preve (Uppsala University) June 6-7, 2008 T.W. Anderson Conference,

More information