The Identification Zoo - Meanings of Identification in Econometrics: PART 2


1 The Identification Zoo - Meanings of Identification in Econometrics: PART 2 Arthur Lewbel Boston College 2016 Lewbel (Boston College) Identification Zoo / 75

2 The Identification Zoo - Part 2 - sections 5, 6, 7, 8. (These are notes to accompany a survey article on econometric identification commissioned by the Journal of Economic Literature). More than two dozen types of identification appear in the econometrics literature, including (in alphabetical order): Bayesian identification, causal identification, eventual identification, exact identification, first order identification, frequentist identification, generic identification, global identification, identification arrangement, identification at infinity, identification of bounds, ill-posed identification, irregular identification, local identification, nearly-weak identification, nonparametric identification, non-robust identification, nonstandard weak identification, over identification, parametric identification, partial identification, point identification, sampling identification, semiparametric identification, semi-strong identification, set identification, strong identification, structural identification, thin-set identification, and weak identification.

3 1. Introduction Goals: Summarize the different types of identification and their meanings. Show how they relate to each other. Look at issues closely related to identification, such as coherence. Study some examples of identification.

4 Table of Contents:
1. Introduction
2. Identifying Ordinary (Global, Point) Identification
2.1 Historical Roots of Identification
2.2 Unifying Notions of Ordinary Identification
2.3 Demonstrating Ordinary Identification
2.4 Reasons for Ordinary Nonidentification
2.5 Identification by Functional Form
2.6 Over, Under, and Exact Identification, Rank and Order Conditions
3. Coherence and Completeness
4. Functions and Sets
4.1 Partial, Point, and Set Identification
4.2 Nonparametric and Semiparametric Identification
4.3 The Meaning and Role of Normalizations in Identification
4.4 Examples: Some Special Regressor Models
Continued next slide

5 Table of Contents - continued:
5. Causal Inference vs Structural Model Identification
5.1 Causal vs Structural Identification: An Example
5.2 Causal vs Structural Simultaneous Systems
5.3 Causal vs Structural Identification: Summary
6. Limited Forms of Identification
6.1 Local and Global Identification
6.2 Generic Identification
7. Identification Concepts that Affect Inference
7.1 Weak vs Strong Identification
7.2 Identification at Infinity or Zero; Irregular and Thin-set Identification
7.3 Ill-Posed Identification
7.4 Bayesian Identification
8. Conclusions

6 Part 1 had sections: 2. Identifying Ordinary Identification 3. Coherence and Completeness 4. Identification of Functions and Sets

7 5. Causal Inference vs Structural Model Identification Most identification theory was developed for structural models, and came before the rise of randomized/causal modeling in econometrics. Most surveys of identification don't touch causal identification. Much structural identification (like Cowles) is for simultaneous systems. In contrast, consider Rubin (1974) counterfactual notation: Y(t), the outcome Y that would have occurred if treatment T had equaled t. This notation generally assumes T affects Y, not the reverse. Causal inference doesn't care about identifying structural parameters. Causal estimands to be identified are summary measures. Example: Average treatment effect (ATE): θ = E(Y(1) − Y(0)).

8 Other common causal estimands: average treatment effects on the treated (ATT), local average treatment effects (LATE), marginal treatment effects (MTE), quantile treatment effects (QTE). Typical structural model obstacles to identification are simultaneity, nonuniqueness, multiple solutions. The main obstacle to identification of causal parameters is that they are defined in terms of unobservable counterfactuals. Instead of seeing whether a structural parameter can be identified, Angrist and Imbens (1994) LATE starts with a standard estimator, and asks whether one can give the identifiable estimand a causal interpretation.

9 5.1 Causal vs Structural Identification: An Example Here we compare identification of the standard causal LATE estimator (Angrist and Imbens 1994) to standard structural ATE estimation. One difficulty in comparing is that causal and structural methods use different notations. To facilitate the comparison, I'll rewrite the LATE assumptions completely in terms of the corresponding restrictions on the standard structural model. Let Y_i be an observed outcome for individual i. Let T_i be a binary endogenous regressor (treatment indicator). Let Z_i be a binary covariate (potential instrument), correlated with T.

10 Assume a DGP for which the second moments of Y, T, Z exist and can be identified. Let c = cov(Z, Y)/cov(Z, T), which is therefore identified (by construction). No model has yet been specified, but this c would be the estimated coefficient of T in a linear instrumental variables regression of Y on a constant and on T, using a constant and Z as instruments.
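A minimal numerical sketch (not from the slides) of this claim: the covariance ratio c equals the 2SLS/IV slope coefficient. The DGP below is hypothetical, chosen only so that T is binary, endogenous, and correlated with a binary Z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical DGP: binary instrument Z, binary endogenous treatment T.
Z = rng.integers(0, 2, n)
V = rng.normal(size=n)
T = (0.8 * Z + V > 0.5).astype(float)
Y = 1.0 + 2.0 * T + 0.7 * V + rng.normal(size=n)  # T endogenous through V

# Sample analog of c = cov(Z, Y) / cov(Z, T).
c = np.cov(Z, Y)[0, 1] / np.cov(Z, T)[0, 1]

# Linear IV regression of Y on (1, T) with instruments (1, Z): solve (W'X) b = W'Y.
X = np.column_stack([np.ones(n), T])
W = np.column_stack([np.ones(n), Z])
beta_iv = np.linalg.solve(W.T @ X, W.T @ Y)

print(c, beta_iv[1])  # identical up to floating point
```

The equality is algebraic (the (n − 1) denominators in both sample covariances cancel), not a large-sample approximation.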

11 Start with the most general possible model for Y and T: Y = G(T, Z, Ũ) and T = R(Y, Z, Ṽ). Ũ and Ṽ are vectors of unobservable errors. G and R are arbitrary, unknown functions. Assume G does not depend directly on Z, and R does not depend directly on Y. Model becomes Y = G(T, Ũ) and T = R(Z, Ṽ). These are exclusion restrictions. Makes the model triangular.

12 Have Y = G(T, Ũ) and T = R(Z, Ṽ). Let U_0 = G(0, Ũ), U_1 = G(1, Ũ) − G(0, Ũ), V_0 = R(0, Ṽ), V_1 = R(1, Ṽ) − R(0, Ṽ). Since both T and Z are binary, this lets us without loss of generality rewrite the model as Y = U_0 + U_1 T and T = V_0 + V_1 Z. This is a linear random coefficients model. Also, since T and Z are binary, V_0 and V_1 + V_0 are binary.
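A quick sketch (my own, not from the slides) verifying that this rewrite is exact, observation by observation, for an arbitrary G and R. The particular G and R below are hypothetical; any functions of a binary first argument would do.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical nonlinear functions; R keeps T binary.
G = lambda t, u: np.sin(u) + t * np.exp(-u**2) + 3 * t
R = lambda z, v: ((v + 0.9 * z) > 0.4).astype(float)

U_t = rng.normal(size=n)                 # Ũ
V_t = rng.normal(size=n)                 # Ṽ
Z = rng.integers(0, 2, n).astype(float)  # binary instrument

T = R(Z, V_t)
Y = G(T, U_t)

# Random-coefficient construction from the slides.
U0, U1 = G(0.0, U_t), G(1.0, U_t) - G(0.0, U_t)
V0, V1 = R(0.0, V_t), R(1.0, V_t) - R(0.0, V_t)

# Because T and Z are binary, the rewrite holds exactly for every observation.
print(np.allclose(Y, U0 + U1 * T), np.allclose(T, V0 + V1 * Z))
```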

13 Now start over. Take the same model, but write it causally, using unobserved counterfactuals instead of unobserved errors. Rubin (1974) causal notation: Y(t) is the random variable denoting outcome Y if T = t. T(z) is the random variable denoting treatment T if Z = z. How do the notations relate? Y(t) = G(t, Ũ) and T(z) = R(z, Ṽ). As in the structural model, this assumes the exclusion that Z does not directly affect Y (more precisely, Z does not affect either Y(0) or Y(1)), and that Y does not directly affect T. We could have started with Y(t, z); the exclusion is then Y(t, 1) = Y(t, 0).

14 So far, structural and causal are identical; both assumed nothing except exclusions/triangular structure. A required causal inference assumption: SUTVA (Stable Unit Treatment Value Assumption): Any one person's outcome is unaffected by the treatment that other people receive. The term SUTVA was coined by Rubin (1980); the concept goes back at least to Cox (1958), and is perhaps implicit in both Fisher (1935) and Neyman (1935).

15 SUTVA (Stable Unit Treatment Value Assumption) formal definition: For any two different people i and j, Y_i(0) and Y_i(1) are independent of T_j. SUTVA essentially rules out social interactions, peer effects, most general equilibrium effects. Structural analogs to SUTVA are restrictions on correlations between {U_0i, U_1i} and {V_0j, V_1j, Z_j}. Sufficient (much stronger than necessary) for SUTVA: {U_0i, U_1i, V_0i, V_1i, Z_i} independent across i. Assume this for simplicity.

16 Y = U_0 + U_1 T and T = V_0 + V_1 Z. Same model as Y(T) and T(Z). Have assumed exclusions, and independence across people. What else? By construction T depends on V_0 and V_1. Endogenous T could also correlate with U_0 and U_1. Want Z to be an instrument, so add the causal assumption: {Y(1), Y(0), T(1), T(0)} is independent of Z. This is called unconfoundedness; it would be implied by randomly assigned Z. Structurally, unconfoundedness is the same as: {U_1, U_0, V_0, V_1} is independent of Z.

17 Y = U_0 + U_1 T and T = V_0 + V_1 Z. Same model as Y(T) and T(Z). Unconfoundedness is: {Y(1), Y(0), T(1), T(0)} is independent of Z; equivalently, {U_1, U_0, V_0, V_1} is independent of Z. Typically justified by randomly assigned Z. Somewhat stronger than usual structural assumptions. Assumptions so far: exclusions/triangular, SUTVA/independence, unconfoundedness. Assume also structurally that E(V_1) ≠ 0, or equivalently in the causal notation, that E(T(1) − T(0)) ≠ 0. This assumption ensures that the instrument Z is relevant.

18 Given just these assumptions, what does the instrumental variables estimand c equal?
c = cov(Z, Y)/cov(Z, T)
= cov(Z, U_0 + U_1 T)/cov(Z, V_0 + V_1 Z)
= cov(Z, U_0 + U_1 (V_0 + V_1 Z))/cov(Z, V_0 + V_1 Z)
= cov(Z, U_1 V_1 Z)/cov(Z, V_1 Z)
= E(U_1 V_1) var(Z)/(E(V_1) var(Z))
= E(U_1 V_1)/E(V_1)
For this problem, the only difference between structural and causal approaches is that different additional assumptions are made to interpret the equation c = E(U_1 V_1)/E(V_1).
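A Monte Carlo sketch (mine, under a hypothetical DGP satisfying the assumptions so far) checking that the IV estimand matches E(U_1 V_1)/E(V_1). Types, probabilities, and the dependence of U_1 on V_1 are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical mix: compliers (V1=1), always takers, never takers, defiers (V1=-1).
types = rng.choice(4, size=n, p=[0.5, 0.2, 0.2, 0.1])  # 0=c, 1=a, 2=n, 3=d
V0 = np.where((types == 1) | (types == 3), 1.0, 0.0)
V1 = np.where(types == 0, 1.0, np.where(types == 3, -1.0, 0.0))

# Treatment effect U1 depends on type, so cov(U1, V1) need not be zero.
U1 = 1.0 + 0.5 * V1 + rng.normal(size=n)
U0 = rng.normal(size=n)
Z = rng.integers(0, 2, n).astype(float)  # independent of all of the above

T = V0 + V1 * Z
Y = U0 + U1 * T

c = np.cov(Z, Y)[0, 1] / np.cov(Z, T)[0, 1]
print(c, np.mean(U1 * V1) / np.mean(V1))  # the two should be close
```

With these illustrative probabilities, E(V_1) = 0.4 and E(U_1 V_1) = 0.7, so both quantities are near 1.75.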

19 Consider structural identification first: Model is still: Y = U_0 + U_1 T and T = V_0 + V_1 Z. Can rewrite this as: Y = a + bT + e, where b = E(U_1), e = (U_1 − b)T + U_0 − a. Structural identification question: When does the IV estimand c = cov(Z, Y)/cov(Z, T) equal the structural coefficient b? Standard answer: Z being a valid structural model instrument requires cov(e, Z) = 0. And what does b mean? Recall the definition ATE = E[Y(1) − Y(0)]. Under our assumptions, E[Y(t)] = E(U_0 + U_1 t) = E(U_0) + E(U_1) t for t = 0 and for t = 1. Therefore E[Y(1) − Y(0)] = E(U_1) = b. So the structural coefficient b is precisely the causal ATE.

20 With Y = a + bT + e, have Z is a valid structural instrument if cov(e, Z) = 0, and then the IV estimand c = b. But, given our previous, causal type assumptions, what does cov(e, Z) = 0 mean? What does this structural restriction imply? Since Z is independent of {U_1, U_0, V_0, V_1}, we have that cov(e, Z) = cov(U_1, V_1) var(Z), so cov(e, Z) = 0 if cov(U_1, V_1) = 0. Note from above that
c = E(U_1 V_1)/E(V_1) = (E(U_1) E(V_1) + cov(U_1, V_1))/E(V_1) = E(U_1) + cov(U_1, V_1)/E(V_1) = b + cov(U_1, V_1)/E(V_1)
So again c = b if cov(U_1, V_1) = 0. Summary: the structural identifying assumption is cov(U_1, V_1) = 0. Under this assumption, c = b = ATE for the population.

21 Now look at causal identification, and then compare. Define a complier: a person i for whom Z_i and T_i are the same random variable. If a complier has Z = 0, then he has T = 0, and if he has Z = 1, then he has T = 1. Define a defier: a person i for whom Z_i and 1 − T_i are the same random variable. We can't know who the compliers and defiers are, because they are defined in terms of counterfactuals. If someone has Z = 0, we can see if they have T = 0 or T = 1, but we can't know what their T would have been if they had been assigned Z = 1.

22 Recall T = V_0 + V_1 Z. All the possibilities relating T to Z are: Compliers: T = Z: V_0 = 0 and V_1 = 1. Defiers: T = 1 − Z: V_0 = 1 and V_1 = −1. Always takers: T = 1 for any Z: V_0 = 1 and V_1 = 0. Never takers: T = 0 for any Z: V_0 = 0 and V_1 = 0. Compliers are the only type that have V_1 = 1. Angrist and Imbens (1994) define the LATE (local average treatment effect) as the ATE just among compliers. (Note: their use of the word local is not the same as in local identification.) In our notation here, LATE = E[Y(1) − Y(0) | V_1 = 1].

23 Now revisit the IV coefficient c. Let P_v be the probability that V_1 = v. Then, by definition of expectations,
c = E(U_1 V_1)/E(V_1)
= [E(U_1 V_1 | V_1 = 1) P_1 + E(U_1 V_1 | V_1 = 0) P_0 + E(U_1 V_1 | V_1 = −1) P_{−1}] / [E(V_1 | V_1 = 1) P_1 + E(V_1 | V_1 = 0) P_0 + E(V_1 | V_1 = −1) P_{−1}]
= [E(U_1 | V_1 = 1) P_1 − E(U_1 | V_1 = −1) P_{−1}] / (P_1 − P_{−1})
Angrist and Imbens (1994): Assume that there are no defiers in the population. This rules out the V_1 = −1 case, making P_{−1} = 0. Then the above simplifies to: c = E(U_1 | V_1 = 1) = LATE. So causal assumes V_1 ≠ −1 (no defiers) and identifies LATE for just compliers.
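A companion sketch (again a hypothetical DGP of my choosing) with no defiers, showing that c recovers the LATE E(U_1 | V_1 = 1) rather than the population ATE when the treatment effect varies by type.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000

# Hypothetical population with no defiers: 0=complier, 1=always taker, 2=never taker.
types = rng.choice(3, size=n, p=[0.4, 0.3, 0.3])
V0 = (types == 1).astype(float)
V1 = (types == 0).astype(float)

U1 = np.where(types == 0, 2.0, 0.5) + rng.normal(size=n)  # effect differs by type
U0 = rng.normal(size=n)
Z = rng.integers(0, 2, n).astype(float)

T = V0 + V1 * Z
Y = U0 + U1 * T

c = np.cov(Z, Y)[0, 1] / np.cov(Z, T)[0, 1]
late = U1[V1 == 1].mean()  # E(U1 | V1 = 1): ATE among compliers only
ate = U1.mean()            # population ATE, which c does NOT recover here
print(c, late, ate)        # c tracks late (about 2), not ate (about 1.1)
```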

24 Comparing Structural and Causal: Model is Y = U_0 + U_1 T and T = V_0 + V_1 Z, where V_0 and V_1 + V_0 are binary. Given our assumptions: exclusions/triangular, SUTVA/independence, unconfoundedness; Structural makes the additional assumption cov(U_1, V_1) = 0. This restricts heterogeneity of the treatment effect U_1. Identifies ATE for the population. Causal instead makes the additional assumption V_1 ≠ −1 (no defiers). This restricts heterogeneity of types of individuals. Identifies LATE for compliers, says nothing about the treatment effect on non-compliers.

25 Comparing Structural and Causal - continued: Limitations of LATE: Says nothing about the treatment effect on noncompliers (ok if Z is a policy variable?). We don't know who the compliers are (though we can identify some observable characteristics of their distribution). Neither the structural nor causal assumptions above are a priori more plausible. Both require assumptions on unobservables, not just cov(U_1, V_1) = 0 or V_1 ≠ −1, but also assumptions like exclusion restrictions and SUTVA. Structural cov(U_1, V_1) = 0 restricts assumed heterogeneity of the treatment effect U_1. Assumes an individual's type V_1 is on average unrelated to the size of the personal treatment effect U_1. But delivers b, the population ATE. Causal V_1 ≠ −1 (no defiers) restricts heterogeneity of types of individuals. Compared to cov(U_1, V_1) = 0, has the advantage of not imposing restrictions on the Y equation. But only delivers LATE.

26 Another property of LATE: the definition of a complier depends on Z. Suppose we saw a different instrument Z̃ instead of Z. Let c = cov(Z, Y)/cov(Z, T) and c̃ = cov(Z̃, Y)/cov(Z̃, T). If Z̃ is a valid instrument in the structural sense, then c = c̃ = b = ATE in the population. Also could then test instrument validity: if we reject c = c̃, then either Z or Z̃ is not a (structurally) valid instrument. But even if both are causally valid, will usually have c ≠ c̃. Both are LATEs, but for different (unknown) compliers. Are there any testable implications of causal instrument validity? Yes, there exist a few weak inequalities one can test, e.g., P_1 can't be negative. See, e.g., Kitagawa (2015).

27 An Aside: We started with a nonparametric model Y = G(T, Ũ) with any G function and any error vector Ũ, and showed it was equivalent to a linear model Y = θT + e (possibly with T endogenous). Nonparametric model = linear model? Yes! But it's a very special property of binary regressors. Also, if we have Y = g(X) + e for a discrete vector X (with a finite number of mass points), we can also write this as a linear regression. Let D be a vector of dummies. For each value x that X can take on, let an element of D equal one when X = x and zero otherwise. Then Y = g(X) + e = c′D + e. This linear regression is called a saturated regression model. Since g(X) = c′D, identifying the vector c is the same as identifying g over the support of X.
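A small sketch of the saturated regression point (the support points and g values below are hypothetical): regressing Y on one dummy per support point recovers g at each point.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

# Hypothetical discrete X with 4 support points, and an arbitrary g at each point.
support = np.array([0.0, 1.0, 2.0, 5.0])
g_vals = np.array([1.0, -2.0, 0.5, 3.0])
X = rng.choice(support, size=n)
g_X = g_vals[np.searchsorted(support, X)]  # g(X) observation by observation
Y = g_X + rng.normal(size=n)               # exogenous error e

# Saturated regression: one dummy per support point, no intercept.
D = (X[:, None] == support[None, :]).astype(float)
c_hat, *_ = np.linalg.lstsq(D, Y, rcond=None)

print(c_hat)  # close to g_vals: identifying c identifies g on the support of X
```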

28 5.2 Causal vs Structural Simultaneous Systems Assume now a simultaneous system of equations, say Y = U_0 + U_1 X and X = H(Y, Z, V). (U_0, U_1) and V are unobserved error vectors. Observe Y, X, and instrument Z. Using X not T because it might not be a treatment (need not be binary). As before, analyze the meaning of c = cov(Z, Y)/cov(Z, X). Structural analysis is exactly the same as before: Have Y = a + bX + e where b = E(U_1) and e = U_0 − a + (U_1 − b)X. If cov(e, Z) = 0 then c = b = average marginal effect of X on Y. In contrast, causal analysis is much more complicated. Angrist, Graddy and Imbens (2000) show under LATE type assumptions, c equals a complicated weighted average of U_1, with weights that depend on Z, and some weights are zero.

29 Why does causal become complicated? In part because the definition of potential outcomes imposes restrictions on the simultaneous system. Structurally we can write Y = G(X, U) and X = H(Y, Z, V). The corresponding counterfactuals would be Y(x) and X(y, z). Assuming the existence of these counterfactual random variables imposes restrictions on the structure. For example, if the structural model is incomplete (see section 3), then the counterfactuals are not well defined. Example: G and H describe actions of players X and Y in a game. If the game has multiple equilibria, then unique random variables Y(x) and X(y, z) do not exist. Any given Y(x) and X(y, z) variables correspond to assuming players follow a specific equilibrium selection rule. More generally, causal methods cannot be used to analyze potentially incomplete models.

30 Assuming coherence and completeness, consider constructing a semi reduced form equation X = R(Z, U, V), defined by solving the equation X = H(G(X, U), Z, V) for X. Instead of counterfactuals Y(x) and X(y, z), in causal analyses it is far more common to define (and place assumptions directly on) counterfactuals Y(x) and X(z), corresponding to the semi reduced form, triangular model Y = G(X, U) and X = R(Z, U, V). Indeed, causal inference methods are often called reduced form modeling.

31 One more limitation in applying causal analyses to simultaneous systems: SUTVA. Many structural models exist where treatment of one person affects outcomes of others. Examples: peer effect models, social interactions models, network models, general equilibrium models. In causal frameworks such effects are generally ruled out by SUTVA. These difficulties may be why causal methods have become popular in traditionally partial equilibrium analyses (e.g., labor economics and micro development), but not where general equilibrium models are the norm (e.g., industrial organization and macroeconomics).

32 5.3 Causal vs Structural Identification: Summary Structural models summary: Provides information about underlying structure. Can incorporate behavioral restrictions implied by economic theory. Can gain identification from such restrictions, even without randomization. Structure, if correctly specified, provides external validity. But can be misspecified. Box (1979): "All models are wrong, but some are useful".

33 Structural models summary - continued: Common objection to the structural modeling approach: Where are the deep parameters that structural models have uncovered? What are their values? One answer: calibrated parameters in macro models. Calibrated parameters are deep parameters with values that users of these models largely agree upon. They are generally based on structural model estimates. Though a few, like rates of time preference and risk aversion, have been additionally reinforced by lab and field experiments. Also, experience from many structural models has led to consensus regarding the ranges of values for many parameters (such as price and income elasticities) that are widely recognized as reasonable.

34 Causal models summary: Can also be misspecified (e.g., there could be defiers). But requires far fewer behavioral restrictions, particularly on outcomes. External validity question: what is the relevance of LATE to other data sets? How representative are compliers? Without structure, can't identify effects on unobservables of economic interest, e.g., utility, risk aversion, bargaining power, or expectations. Even with random assignment, treatment of large numbers of people or treatment spread over time may violate SUTVA. Example: Progresa in Mexico, people move to communities that have the program, or change behavior from interacting with treated individuals, or in expectation that the program will expand to their own community (see, e.g., Behrman and Todd 1999).

35 Causal vs Structural Identification Conclusions Researchers do NOT need to choose between either searching for randomized instruments or building structural models. For identification, the best strategy is to exploit both approaches. 1. Causal inference can be augmented with structural econometric methods to cope with attrition, sample selection, measurement error, and contamination bias. 2. Structural identification often depends on independence assumptions, which are more likely to hold under randomization. 3. Causal effects can provide useful benchmarks for structure. For example, estimate a structural model from a large survey, check if treatment effects implied by estimated structural parameters equal causal treatment effects from small randomized trials drawn from the same underlying population.

36 Causal vs Structural Identification Conclusions - continued 4. Can use structure to link estimated causal effects of a treatment (say, access to credit) on observable characteristics (like a household's expenditures) to effects on less observable characteristics (like women's utility and bargaining power in a household). 5. Economic theory and structure can provide guidance regarding the external validity of causally identified parameters. E.g., Dong and Lewbel (2015): with a mild structural assumption called local policy invariance, can identify how treatment effects estimated by regression discontinuity would change if the cutoff point changed. 6. Big data analyses on large data sets can uncover promising correlations. Structural analyses of such data might then be used to uncover possible economic and behavioral mechanisms that underlie these correlations, and randomization can be used to investigate the causal direction of these correlations.

37 Causal vs Structural Identification Conclusions - continued Various attempts have been made over time to formally unify structural and causal methods. See, e.g., Heckman, Urzua and Vytlacil (2006), Pearl (2009), and Heckman (2010). Other notions of causality exist that are relevant to the economics and econometrics literature, going back to Hume (1739) and Mill (1851), and including temporal based definitions as in Granger (1969). See Hoover (2006) for a survey of alternative notions of causality in economics and econometrics.

38 6. Limited Forms of Identification It's often hard to prove ordinary identification. Possible responses: 1. Set identification (estimation and inference is hard). 2. Ignore the problem and assume identification. A third way: Prove some more limited form of identification holds. Then the leap of faith in assuming ordinary identification is not as large. Limited forms of identification include local identification and generic identification.

39 6.1 Local vs. Global Identification Another term for standard identification is global identification. Let Θ be the set of all possible values of θ. Global emphasizes that for each s ∈ S, we can identify a unique value of θ over the entire range of possible values Θ (see, e.g., Komunjer 2012). Local identification means θ is identified if we restrict Θ to be a small open neighborhood of the true θ_0. Local identification differs from (and predates) the term local as used in LATE (local average treatment effect). In LATE, local means the mean parameter value for a particular subpopulation.

40 Local and global identification: Suppose m(X) is an identified, continuous function. Suppose θ is real, and m(θ) = 0. Consider three possible cases: Case a: Suppose we know m(x) is strictly monotonic. Then θ (if it exists) is globally identified. Strict monotonicity ensures only one value of θ can satisfy m(θ) = 0. Case b: Suppose m is known to be a J-th order polynomial for some integer J. Then θ will typically not be globally identified. There are up to J different values of θ that satisfy m(θ) = 0. However, in this case θ is locally identified. There always exists a neighborhood Θ of the true θ small enough to exclude all other roots of m(x) = 0. Case c: Suppose all we know about m is that it is continuous. Then θ might not even be locally identified, because m(x) could equal zero for all values of x in some interval.
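A tiny sketch of case (b), using a hypothetical cubic of my choosing: m(θ) = 0 has several roots, so θ is not globally identified, but each root is isolated, so θ is locally identified.

```python
import numpy as np

# Hypothetical known cubic: m(x) = x^3 - 4x + 1.
coeffs = [1.0, 0.0, -4.0, 1.0]
roots = np.roots(coeffs)
real_roots = np.sort(roots[np.abs(roots.imag) < 1e-9].real)
print(real_roots)  # three real roots: theta is NOT globally identified

# Each root is isolated: a small enough open neighborhood of the true theta
# excludes the other roots, so theta IS locally identified.
gaps = np.diff(real_roots)
print(gaps.min() > 0)
```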

41 If a real parameter vector θ is set identified, and the set has a finite number of elements, then θ is locally identified. Example: suppose there are multiple local optima when maximizing an extremum estimator objective function like maximum likelihood or GMM. If in the population there are a finite number of local optima, and they all have the same value of the objective function, then the parameters are locally but not globally identified.

42 In nonlinear models, it is often much easier to prove local rather than global identification. Local identification may be sufficient in practice if we have enough economic intuition about true parameter values to know that the correct θ should lie in a particular region. Example: Lewbel (2012). The parameter is a coefficient in a simultaneous system of equations, and the identified set has two elements, one positive and one negative. In this case we have only local identification, unless the economic model is sufficient to determine the sign of the parameter a priori (e.g., it is the price coefficient in a supply equation). Here we were helped because the regions where different θ's lie are known.

43 The notion of local identification is described by Fisher (1966) for linear models. Generalized to other parametric models by Rothenberg (1971) and Sargan (1983), as follows: Let θ be the vector of structural parameters, and let φ be the set of reduced form parameters. Consider structures of the form r(φ(θ), θ) = 0 for some known vector valued function r. Let R(θ) = r(φ(θ), θ). A sufficient condition for local identification of θ is that R(θ) be differentiable and that the rank of ∂R(θ)/∂θ equal the number of elements of θ. Sargan (1983) calls violation of this condition first order (lack of) identification.

44 For MLE models, the rank condition for local identification is equivalent to the condition that the information matrix evaluated at the true θ be nonsingular. Newey and McFadden (1994) and Chernozhukov, Imbens, and Newey (2007) extend local identification to other models. Chen, Chernozhukov, Lee, and Newey (2014) provide a rank condition for local identification of a finite parameter vector when the structure consists of conditional moment restrictions. Chen and Santos (2015) define a concept of local overidentification.

45 6.2 Generic Identification Like local identification, generic identification is a weaker condition than ordinary identification that is sometimes easier to prove. Let Φ be the set of all possible φ. Define Φ′ ⊂ Φ to be the set of all φ such that, for the given structure S, the parameters θ are not identified. Based on McManus (1992), θ is defined to be generically identified if Φ′ is a measure zero subset of Φ. Interpretation: Imagine nature picks a true φ_0 by drawing from a continuous distribution over all possible φ ∈ Φ. Then θ is generically identified if the probability that it is not point identified is zero.

46 Generic identification is closely related to the order condition. Consider a linear regression system where the order condition holds. Then the rank condition holds, giving identification, if a certain matrix of coefficients is nonsingular. In the set of all possible values for that matrix, if the subset that is singular has measure zero, then the order condition suffices for generic identification of the model. Similarly, the coefficients in a linear regression model Y = X′β + e with E(e | X) = 0 are generically identified if the probability is zero that nature chooses a distribution function for X with the property that E(XX′) is singular.
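A short sketch of this last point (my own illustration): a generically drawn design gives a nonsingular E(XX′), while an exactly collinear design, a measure-zero exception, does not.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 10_000, 3

# "Generic" draw: with probability one, the sample analog of E(XX') is nonsingular,
# so beta in Y = X'beta + e is identified.
X = rng.normal(size=(n, k))
M = X.T @ X / n
print(np.linalg.matrix_rank(M))  # full rank (3)

# Measure-zero exception: X concentrated on a lower-dimensional subspace
# (third column an exact linear combination of the first two).
X_bad = X.copy()
X_bad[:, 2] = X_bad[:, 0] + 2 * X_bad[:, 1]
M_bad = X_bad.T @ X_bad / n
print(np.linalg.matrix_rank(M_bad))  # rank 2: beta not identified
```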

47 Another example: suppose iid observations of Y, X are observed, with X = X* + U and Y = X*β + e, where the unobserved model error e, unobserved measurement error U, and unobserved true covariate X* are mutually independent with mean zero. Despite not having instruments, β is identified when Y, X have any joint distribution except a normal (Reiersøl 1950). So β is generically identified if the set of possible Y, X distributions is sufficiently large, e.g., if e could have been drawn from any continuous distribution.

48 Given the same independence of U, e, and X*, Schennach and Hu (2013) show that the function m in the nonparametric regression model Y = m(X*) + e is nonparametrically identified as long as m and the distribution of e are not members of a certain parametric class of functions. So again one could say mutual independence of e, U, and X* leads to generic identification of m, as long as m could have been any smooth function or e could have been drawn from any smooth distribution. Generic identification is commonly used in social interactions models. See, e.g., the survey by Blume, Brock, Durlauf, and Ioannides (2011).

49 7. Identification Concepts that Affect Inference Identification precedes estimation. We first show identification, then consider properties of estimators. However, sometimes the nature of the identification affects inference. The way θ is identified may impact the properties of any possible estimator θ̂. These identification concepts that affect inference include weak identification, identification at infinity, and ill-posedness. Note many previously discussed identification concepts also relate to estimation: extremum based identification, rates of nonparametric vs parametric estimation, set vs point estimation.

50 7.1 Weak vs. Strong Identification Weakly identified and strongly identified parameters are point identified. These should not, strictly speaking, be called forms of identification. They instead refer to specific difficulties relating to asymptotic inference. They're called forms of identification because they arise from features of the underlying structure s, and hence affect any possible estimator we might propose. Informally, weak identification arises in situations that are, in a particular way, close to being nonidentified.

51 Usual source of weak identification: low correlations among variables used to attain identification, like instrument Z and covariate X. Parameters are not identified if the correlation is zero. Identification is weak (or the instruments Z are weak) when this correlation is close to zero.
The first stage of 2SLS is to regress X on Z to get fitted values X̂. Some coefficients may be weakly identified if E(X̂X̂′) is ill conditioned (nearly singular).
More generally, in a GMM model weak identification may occur if the moments used for estimation yield noisy or generally uninformative estimates of the underlying parameters.

52 The key feature of weak identification is NOT imprecise estimates with large standard errors (though weakly identified estimates do have that feature). Rather, standard asymptotics poorly approximate the true precision of parameter estimates when identification is weak. (Recall all asymptotics are merely approximations to true small sample distributions.)
Compare: nonparametric regressions are imprecise with large standard errors (due to slow convergence rates from the curse of dimensionality). But nonparametric regressions are not weakly identified, because standard asymptotic theory (e.g., a CLT with slow rates) adequately approximates their true estimation precision.

53 Suppose that θ is identified, and can be estimated at rate root-n using an extremum estimator (maximizing some objective function, e.g., least squares or GMM or maximum likelihood). If some parameters are weakly identified, then any objective function used to identify and estimate them will be close to flat in some directions.
Flatness yields imprecision, but more relevantly, flatness also means that standard errors and t-stats (either analytic or bootstrapped) will be poorly estimated. They depend on the inverse of a matrix of derivatives of the objective function, and flatness makes this matrix close to singular.
Strongly identified parameters are not weak: standard estimated asymptotic distributions provide good approximations to their actual finite sample distributions.

54 Weak identification resembles multicollinearity, which in linear regression means E(XX′) is ill-conditioned. Like multicollinearity, it is not that parameters either "are" or "are not" weakly identified. Weakness of identification depends on the sample size: for big enough n, standard asymptotic approximations must become valid. This is why weakness of identification is often judged by rules of thumb rather than formal tests.
Staiger and Stock (1997) suggest, as a rule of thumb for linear two stage least squares models, that one should be concerned about potential weak instruments if the F-statistic on the excluded regressors in the first stage is less than 10.
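The rule of thumb is easy to illustrate by simulation. The following Python sketch (a hypothetical DGP of my own, not from the survey) draws a first stage X = βZ + V with one instrument and computes the first-stage F-statistic, once with a weak β and once with a strong one:

```python
import numpy as np

def first_stage_F(beta, n=500, seed=0):
    """Simulate a first stage X = beta*Z + V (one instrument, all variables
    mean zero) and return the F-statistic for the regression of X on Z."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=n)
    V = rng.normal(size=n)
    X = beta * Z + V
    b_hat = (Z @ X) / (Z @ Z)        # OLS slope of X on Z
    resid = X - b_hat * Z
    s2 = (resid @ resid) / (n - 1)   # residual variance
    return b_hat**2 * (Z @ Z) / s2   # with one instrument, F = t-stat squared

print("weak first stage   (beta = 0.05): F =", round(first_stage_F(0.05), 1))
print("strong first stage (beta = 1.00): F =", round(first_stage_F(1.00), 1))
```

With β = 0.05 the F-statistic typically falls far below the cutoff of 10, flagging weak instruments; with β = 1 it is in the hundreds.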

55 Two separate things:
1. Parameters that are weakly identified at reasonable sample sizes.
2. The models econometricians use to deal with weak identification.
In real data, weak identification disappears when n gets sufficiently large. This makes it difficult to provide a general asymptotic theory to deal with the problem. The econometrician's trick is an alternative asymptotic theory known as drifting parameter models.

56 Consider Y = θX + U and X = βZ + V, where the data are iid observations of scalars Y, X, Z. The errors satisfy E(UZ) = E(VZ) = 0, and for simplicity all variables are mean zero.
If β ≠ 0, then θ is identified by θ = E(ZY)/E(ZX), corresponding to standard IV. Since E(ZX) = βE(Z²), if β is close to zero then both E(ZY) and E(ZX) are close to zero, making θ weakly identified. But how close is close? The bigger the sample size, the more accurately E(ZX) and E(ZY) can be estimated, and hence the closer β can be to zero without causing trouble.
To capture this idea asymptotically, pretend the true β is not a constant, but instead is β_n = b·n^(-1/2). The larger n gets, the smaller the coefficient β_n becomes. This gives us a model that has the weak identification problem at all sample sizes, and so can be analyzed asymptotically.
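A small Monte Carlo sketch (hypothetical DGP values) shows what the drifting sequence buys: with β fixed, the spread of the IV estimate θ̂ = ΣZY/ΣZX shrinks like n^(-1/2), while under β_n = b·n^(-1/2) it stays roughly constant in n, reproducing the weak-instrument problem at every sample size:

```python
import numpy as np

def iv_iqr(n, beta, reps=500, seed=0):
    """Interquartile range, across Monte Carlo replications, of the IV
    estimate theta_hat = sum(Z*Y)/sum(Z*X) when X = beta*Z + V."""
    rng = np.random.default_rng(seed)
    theta = 1.0                            # true parameter (hypothetical)
    est = np.empty(reps)
    for r in range(reps):
        Z = rng.normal(size=n)
        V = rng.normal(size=n)
        U = 0.5 * V + rng.normal(size=n)   # U correlated with V but not Z
        X = beta * Z + V
        Y = theta * X + U
        est[r] = (Z @ Y) / (Z @ X)
    q1, q3 = np.percentile(est, [25, 75])
    return q3 - q1

b = 1.0
for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: IQR with fixed beta = {iv_iqr(n, b):.3f}, "
          f"with drifting beta = b/sqrt(n) = {iv_iqr(n, b / np.sqrt(n)):.3f}")
```

Under the fixed β the dispersion of θ̂ falls with n as usual; under the drifting β it does not, which is exactly the feature the drifting-parameter asymptotics are built to preserve.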

57 If it were true that β_n = b·n^(-1/2), then b, and hence β_n, is often not identified, so tests and confidence regions for θ have been developed that are robust to weak instruments, that is, they do not depend on estimating β_n. See, e.g., Andrews, Moreira, and Stock (2006) for an overview of such methods.

58 In the literature, when we say θ "is" weakly identified, we mean we are providing asymptotic theory for inference that allows for identification to be based on a parameter that drifts towards zero.
So in the above model we would say θ is strongly identified when the asymptotics we use are based on β_n = β, and we would say θ is weakly identified when the asymptotics used for inference are based on (or at least allow for) β_n = b·n^(-1/2).

59 Usually, we don't believe that parameters are actually drifting towards zero as n grows. Rather, when assuming weak identification, we're expressing the belief that the drifting parameter model provides better asymptotic approximations to the true distribution than standard asymptotics.
Other related terms include semi-strong identification and nonstandard-weak identification. See, e.g., Andrews and Guggenberger (2014). These and other variants have parameters drift to zero at other rates, or have models with a mix of drifting, nondrifting, and purely unidentified parameters.

60 Historical notes on weak identification
The earliest recognition of the problem may be Sargan (1983): "we can conjecture that if the model is almost unidentifiable then in finite samples it behaves in a way which is difficult to distinguish from the behavior of an exactly unidentifiable model."
Other early related work: Phillips (1983), Rothenberg (1984). Bound, Jaeger, and Baker (1995) specifically raised the issue of weak instruments in an empirical context, and see Staiger and Stock (1997). A survey of the weak instruments problem is Stock, Wright, and Yogo (2002).

61 7.2 Identification at infinity or zero; Irregular and Thin set identification
Identification at infinity is when identification is based on the distribution of data at points where one or more variables goes to infinity.
Suppose iid scalar random variables Y, D, Z. Assume Y = Y*·D and Z ⊥ Y*, where Y* is a latent unobserved variable, D is binary, and lim_{z→∞} E(D | Z = z) = 1. The goal is identification and estimation of θ = E(Y*).
This is a selection model: Y* is selected (observed) only when D = 1. So D could be a treatment indicator, Y* is the outcome if one is treated, θ is the average outcome if everyone were treated, and Z is an observed variable (an instrument) that affects the probability of treatment, with the probability of treatment going to one as Z goes to infinity.

62 Recall Y = Y*·D, Z ⊥ Y*, Y* is latent, D is binary, lim_{z→∞} E(D | Z = z) = 1, and θ = E(Y*).
Y* and D may be correlated, so E(Y | D = 1) confounds the two; θ is identified only by lim_{z→∞} E(Y | Z = z). But everyone who has Z infinite is treated, so looking at the mean of just those people eliminates the selection problem.
In real data we would estimate θ as the average value of Y just for people that have Z > c for some chosen c, and then let c → ∞ as the sample size grows. Or we could estimate θ as a weighted average of Y, with weights that get arbitrarily large as Z gets arbitrarily large.
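A sketch of this trimming estimator under a hypothetical DGP (normal errors, with the fixed threshold c = 2 standing in for c → ∞). The naive mean of the treated is biased by selection; the mean over observations with large Z is close to θ:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
theta = 1.0                        # target: E(Y*) (hypothetical value)

Z = 2.0 * rng.normal(size=n)       # instrument shifting selection
eps = rng.normal(size=n)
Ystar = theta + eps                # latent outcome, correlated with selection
D = (Z + eps > 0)                  # P(D = 1 | Z = z) -> 1 as z -> infinity
Y = np.where(D, Ystar, 0.0)        # Y* observed only when D = 1

naive = Y[D].mean()                # biased: treated have high eps on average
trimmed = Y[D & (Z > 2.0)].mean()  # near "Z = infinity", selection vanishes

print(f"naive mean of treated:      {naive:.3f}")
print(f"mean of treated with Z > 2: {trimmed:.3f}  (true theta = 1.0)")
```

Note the cost hinted at in the slides: the trimmed estimator discards most of the sample, which is why such estimators usually converge slower than root-n.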

63 First shown by Chamberlain (1986): estimators based on identification at infinity usually converge slower than root-n.
The same estimation problem arises whenever identification is based on Z taking on a value or range of values that has probability zero. Khan and Tamer (2010) call this thin set identification.
Example: Manski's (1985) maximum score estimator for binary choice models is thin set identified, because it gets identifying power only from information at one point (the median) of a continuously distributed variable.

64 Khan and Tamer (2010) use irregular identification to specifically describe cases where thin set identification leads to slower than root-n rates of estimation.
Not all thin set identified or identification at infinity parameters are irregular. For example, estimates of θ = E(Y*) in the selection problem can converge at rate root-n if Z has a strictly positive probability of equaling infinity.
More subtly, the impossibility theorems of both Chamberlain and of Khan and Tamer, showing that some thin set identified models cannot converge at rate root-n, assume that the variables in the DGP have finite variances. This is one way that the Lewbel (2000) special regressor estimator can converge at rate root-n.

65 Irregular identification is not the same as weak identification. Irregularly identified parameters converge slowly for essentially the same reasons that kernel regressions converge slowly: they're based on a vanishingly small subset of the entire data set.
Also like nonparametric regression, given suitable regularity conditions, standard methods can usually be used to derive asymptotic theory for irregularly identified parameters. However, the rates of convergence of thin set identified parameters can vary widely, from extremely slow up to root-n, depending on details regarding the shape of the density function in the neighborhood of the identifying point.

66 Sometimes it's easier to prove a parameter is identified by looking at a thin set, but the parameter may still be strongly identified, because the data structure contains more information about the parameter than is being used in the proof.
Example: Heckman and Honoré (1989) showed that, without parameterizing error distributions, the competing risks model is identified by data where covariates can drive individual risks to zero. Lee and Lewbel (2013) later showed that this model is actually strongly identified, using information over the entire DGP, not just where individual risks go to zero.

67 Notes on irregular identification:
Identification at infinity of the selection model given above is discussed by Chamberlain (1986), Heckman (1990), Andrews and Schafgans (1998), Lewbel (2007), and Khan and Tamer (2010).
Khan and Tamer point out that the average treatment effect model of Hahn (1998) and Hirano, Imbens, and Ridder (2003) is also generally irregularly identified, and so will not attain the parametric root-n rates derived by those authors unless the instrument has thick tails. Similarly, Lewbel's (2000) special regressor estimator requires thick tails to avoid being irregular.

68 7.3 Ill-Posed Identification
Let θ be (ordinary) identified from info φ by a structure s. Let s be such that, for some g, θ = g(φ). Ill-posedness is when g is not sufficiently smooth.
Ill-posedness is an identification concept like weak identification or identification at infinity: it's a feature of φ and s, and hence a property of the population.
The concept of well-posedness (the opposite of ill-posedness) is due to Hadamard (1923). See Horowitz (2014) for a survey.

69 Example: iid observations of W, so φ is the distribution function of W, denoted F(w). A uniformly consistent, asymptotically root-n normal estimator of F is the empirical distribution function F̂(w) = Σᵢ I(Wᵢ ≤ w)/n.
Suppose θ = g(F) for some known g. Then θ is (ordinary) identified. If g is continuous, then θ̂ = g(F̂) is consistent. If g is not continuous, then θ̂ is generally not consistent, and we have ill-posedness.

70 When ill-posed, a consistent θ̂ requires "regularization," smoothing out the discontinuity in g. Regularization introduces bias, and consistency needs a way to asymptotically shrink this bias to zero. Result: ill-posedness leads to slower convergence rates.
Example: estimation of the density function f(w) = dF(w)/dw. There is no continuous g such that f = g(F): one can't differentiate the step function F̂(w) = Σᵢ I(Wᵢ ≤ w)/n.
Kernel estimation is regularization. An example is f̂(w) = (F̂(w + b) − F̂(w − b))/(2b), with bandwidth b → 0 as n → ∞. F̂ converges at rate n^(1/2); the kernel estimator f̂ converges at rate n^(2/5).
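The density example can be sketched directly: build the empirical CDF, then regularize by differencing over a bandwidth b. (The sketch assumes W is standard normal, so the true density at zero is 1/√(2π) ≈ 0.399.)

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=100_000)   # iid sample; true f(0) = 1/sqrt(2*pi) ~ 0.3989

def F_hat(w):
    """Empirical CDF: root-n consistent, but a discontinuous step function."""
    return np.mean(W <= w)

def f_hat(w, b):
    """Regularized density estimate: a finite difference of F_hat.
    The bandwidth b smooths the discontinuity; the smoothing bias
    vanishes as b -> 0."""
    return (F_hat(w + b) - F_hat(w - b)) / (2.0 * b)

for b in (0.5, 0.2, 0.05):
    print(f"b = {b:4}: f_hat(0) = {f_hat(0.0, b):.4f}")
```

Shrinking b reduces the smoothing bias but raises variance; balancing the two is what produces the n^(2/5) rate for f̂ rather than the n^(1/2) rate of F̂ itself.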

71 Nonparametric regression is similarly ill-posed. Regularization is kernel and bandwidth choice, or the selection of basis functions and the number of terms to include in nonparametric sieve estimators.
In some problems, ill-posedness is more severe, causing much slower convergence rates, like ln(n). Examples of problems that are often severely ill-posed are:
Nonparametric instrumental variables, e.g., Newey and Powell (2003).
Models containing mismeasured variables where the measurement error distribution is unknown.
Random coefficient models where the distribution of the random coefficients is nonparametrically estimated.

72 7.4 Bayesian Identification
Ordinary identification is sometimes called frequentist identification or sampling identification, to contrast with Bayesian identification.
Bayesian: the parameter θ is random, and has a prior and a posterior.
Lindley (1971): identification is irrelevant for Bayesians: it has no effect on whether one can specify a prior and obtain a posterior.
Florens, Mouchart, and Rolin (1990) and Florens and Simoni (2011): θ is Bayes identified if its posterior differs from its prior distribution. So θ is Bayes identified if the data provide any info that updates the prior.

73 Ordinary identification implies Bayes identification, because then the population updates the prior to a degenerate distribution.
Set identified parameters are also typically Bayes identified: enough info to confine θ to a set should be enough to update a prior.
Example: Moon and Schorfheide (2012): when θ is set identified, the support of the posterior will typically lie inside the identified set. Intuition: the data tell us nothing about where θ could lie inside the identified set, but the prior provides information inside the set, and the posterior reflects that information.

74 8. Conclusions
Why is identification a rapidly growing area within econometrics (as evidenced by the growing terminology)?
The rise of big data and computation is creating ever more complicated models needing identification. Examples of increasing complexity: games and auction models, social interactions models, forward looking dynamic models, and models with nonseparable errors assigning behavioral meaning to unobservables.
At the other extreme, the so-called credibility revolution: the search for sources of randomization in constructed or natural experiments, to gain identification under minimal modeling assumptions.

75 Unlike statistical inference, there is not a large body of general tools or techniques for proving identification. Identification theorems tend to be highly model specific.
Some general techniques for identification:
Control function methods as in Blundell and Powell (2004).
Special regressors as in Lewbel (2000).
Completeness as in Newey and Powell (2003).
Observational equivalence characterizations as in Matzkin (2005).
Characterizations of moments over unobservables as in Schennach (2014).
Development of more such general techniques and principles would be a valuable area for future research. Given the increasing recognition of the importance of identification in econometrics, the identification zoo is likely to keep expanding.


More information

Online Appendix Feedback Effects, Asymmetric Trading, and the Limits to Arbitrage

Online Appendix Feedback Effects, Asymmetric Trading, and the Limits to Arbitrage Online Appendix Feedback Effects, Asymmetric Trading, and the Limits to Arbitrage Alex Edmans LBS, NBER, CEPR, and ECGI Itay Goldstein Wharton Wei Jiang Columbia May 8, 05 A Proofs of Propositions and

More information

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

More information

1 Prior Probability and Posterior Probability

1 Prior Probability and Posterior Probability Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

The equivalence of logistic regression and maximum entropy models

The equivalence of logistic regression and maximum entropy models The equivalence of logistic regression and maximum entropy models John Mount September 23, 20 Abstract As our colleague so aptly demonstrated ( http://www.win-vector.com/blog/20/09/the-simplerderivation-of-logistic-regression/

More information

From the help desk: Bootstrapped standard errors

From the help desk: Bootstrapped standard errors The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution

More information

Part 2: One-parameter models

Part 2: One-parameter models Part 2: One-parameter models Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Chapter 7. Sealed-bid Auctions

Chapter 7. Sealed-bid Auctions Chapter 7 Sealed-bid Auctions An auction is a procedure used for selling and buying items by offering them up for bid. Auctions are often used to sell objects that have a variable price (for example oil)

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

2. Information Economics

2. Information Economics 2. Information Economics In General Equilibrium Theory all agents had full information regarding any variable of interest (prices, commodities, state of nature, cost function, preferences, etc.) In many

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Bayesian Analysis for the Social Sciences

Bayesian Analysis for the Social Sciences Bayesian Analysis for the Social Sciences Simon Jackman Stanford University http://jackman.stanford.edu/bass November 9, 2012 Simon Jackman (Stanford) Bayesian Analysis for the Social Sciences November

More information

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

More information

1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let

1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as

More information

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

More information

Inflation. Chapter 8. 8.1 Money Supply and Demand

Inflation. Chapter 8. 8.1 Money Supply and Demand Chapter 8 Inflation This chapter examines the causes and consequences of inflation. Sections 8.1 and 8.2 relate inflation to money supply and demand. Although the presentation differs somewhat from that

More information

Markov Chain Monte Carlo Simulation Made Simple

Markov Chain Monte Carlo Simulation Made Simple Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical

More information

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Robert J. Rossana Department of Economics, 04 F/AB, Wayne State University, Detroit MI 480 E-Mail: [email protected]

More information

Economics 1011a: Intermediate Microeconomics

Economics 1011a: Intermediate Microeconomics Lecture 12: More Uncertainty Economics 1011a: Intermediate Microeconomics Lecture 12: More on Uncertainty Thursday, October 23, 2008 Last class we introduced choice under uncertainty. Today we will explore

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania Moral Hazard Itay Goldstein Wharton School, University of Pennsylvania 1 Principal-Agent Problem Basic problem in corporate finance: separation of ownership and control: o The owners of the firm are typically

More information

Not Only What But also When: A Theory of Dynamic Voluntary Disclosure

Not Only What But also When: A Theory of Dynamic Voluntary Disclosure Not Only What But also When: A Theory of Dynamic Voluntary Disclosure Ilan Guttman, Ilan Kremer, and Andrzej Skrzypacz Stanford Graduate School of Business September 2012 Abstract The extant theoretical

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Imbens/Wooldridge, Lecture Notes 5, Summer 07 1

Imbens/Wooldridge, Lecture Notes 5, Summer 07 1 Imbens/Wooldridge, Lecture Notes 5, Summer 07 1 What s New in Econometrics NBER, Summer 2007 Lecture 5, Monday, July 30th, 4.30-5.30pm Instrumental Variables with Treatment Effect Heterogeneity: Local

More information

Quantile Regression under misspecification, with an application to the U.S. wage structure

Quantile Regression under misspecification, with an application to the U.S. wage structure Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and Fernandez-Val Reading Group Econometrics November 2, 2010 Intro: initial problem The

More information

FIRST YEAR CALCULUS. Chapter 7 CONTINUITY. It is a parabola, and we can draw this parabola without lifting our pencil from the paper.

FIRST YEAR CALCULUS. Chapter 7 CONTINUITY. It is a parabola, and we can draw this parabola without lifting our pencil from the paper. FIRST YEAR CALCULUS WWLCHENW L c WWWL W L Chen, 1982, 2008. 2006. This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990. It It is is

More information

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Yong Bao a, Aman Ullah b, Yun Wang c, and Jun Yu d a Purdue University, IN, USA b University of California, Riverside, CA, USA

More information

6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games

6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games 6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games Asu Ozdaglar MIT February 4, 2009 1 Introduction Outline Decisions, utility maximization Strategic form games Best responses

More information

Introduction to time series analysis

Introduction to time series analysis Introduction to time series analysis Margherita Gerolimetto November 3, 2010 1 What is a time series? A time series is a collection of observations ordered following a parameter that for us is time. Examples

More information

Magne Mogstad and Matthew Wiswall

Magne Mogstad and Matthew Wiswall Discussion Papers No. 586, May 2009 Statistics Norway, Research Department Magne Mogstad and Matthew Wiswall How Linear Models Can Mask Non-Linear Causal Relationships An Application to Family Size and

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Revised Version of Chapter 23. We learned long ago how to solve linear congruences. ax c (mod m)

Revised Version of Chapter 23. We learned long ago how to solve linear congruences. ax c (mod m) Chapter 23 Squares Modulo p Revised Version of Chapter 23 We learned long ago how to solve linear congruences ax c (mod m) (see Chapter 8). It s now time to take the plunge and move on to quadratic equations.

More information

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Bayesian Statistics in One Hour. Patrick Lam

Bayesian Statistics in One Hour. Patrick Lam Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical

More information

1 Teaching notes on GMM 1.

1 Teaching notes on GMM 1. Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

Introduction to nonparametric regression: Least squares vs. Nearest neighbors

Introduction to nonparametric regression: Least squares vs. Nearest neighbors Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,

More information

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

More information