Introduction to Quantile Regression


 MargaretMargaret Lee
 1 years ago
 Views:
Transcription
1 Introduction to Quantile Regression CHUNGMING KUAN Department of Finance & CRETA National Taiwan University June 13, 2011 C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
2 Lecture Outline 1 Introduction 2 Quantile Regression Quantiles Quantile Regression (QR) Method QR Models 3 Algebraic Properties Equivariance Gooness of Fit 4 Asymptotic Properties Heuristics QR Estimator as a GMM Estimator Asymptotic Distribution C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
3 Lecture Outline (cont d) 5 Estimation of Asymptotic Covariance Matrix 6 Hypothesis Testing Wald Tests Likelihood Ratio Tests 7 Treatment Effect Average Treatment Effect (ATE) Quantile Treatment Effect (QTE) 8 Empirical Applications ReturnVolume Relations Effects of NHI on Saving C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
4 Introduction The behavior of a random variable is governed by its distribution. Moment or summary measures: Location measures: mean, median Dispersion measures: variance, range Other moments: skewness, kurtosis, etc. Quantiles: quartiles, deciles, percentiles Except in some special cases, a distribution can not be completely characterized by its moments or by a few qunatiles. Mean and median characterize the average and center of y but may provide little info about the tails. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
5 Conventional Methods For the behavior of y conditional on x, consider regression y t = x tβ + e t. Least squares (LS): Legendre (1805) Minimizing T t=1 (y t x tβ) 2 to obtain ˆβ T. x ˆβ T approximates the conditional mean of y given x. Least absolute deviation (LAD): Boscovich (1755) Minimizing T t=1 y t x tβ to obtain ˇβ T. x ˇβT approximates the conditional median of y given x. Both the LS and LAD methods provide only partial description of the conditional distribution of y. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
6 Mosteller F. and J. Tukey, Data Analysis and Regression: What the regression curve does is (to) give a grand summary for the averages of the distributions corresponding to the set of xs. We could go further and compute several different regression curves corresponding to the various percentage points of the distributions and thus get a more complete picture of the set. Ordinarily this is not done, and so regression often gives a rather incomplete picture. Just as the mean gives an incomplete picture of a single distribution, so the regression curve gives a correspondingly incomplete picture for a set of distributions. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
7 Quantiles The θ th (0 < θ < 1) quantile of F Y is q Y (θ) := F 1 Y (θ) = inf{y : F Y (y) θ}. q Y (θ) is an order statistic, and it can also be obtained by minimizing an asymmetric (linear) loss function: θ y q df Y (y) + (1 θ) y>q y<q y q df Y (y). The first order condition of this minimization problem is 0 = θ df Y (y) + (1 θ) df Y (y) y>q y<q = θ[1 F Y (q)] + (1 θ)f Y (q) = θ + F Y (q). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
8 Sample Quantiles The sample counterpart of the asymmetric linear loss function is 1 T T ρ θ (y t q) = 1 [θ y T t q + (1 θ) t=1 t:y t q t:y t<q where ρ θ (u) = (θ 1 {u<0} )u is known as the check function. ] y t q, Given θ, minimizing this function yields the θ th sample quantile of y. Key point: Other than sorting the data, a sample quantile can also be found via an optimization program. Given various θ values, we can compute a collection of sample quantiles, from which the distribution can be traced out. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
9 ρ θ θ =0.8 θ =0.5 θ =0.2 Figure: Check function ρ θ (u) = (θ 1 {u<0} )u. u C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
10 Quantile Regression (QR) Method Koenker and Basset (1978) Given y t = x tβ + e t, the θ th QR estimator ˆβ(θ) minimizes V T (β; θ) = 1 T T ρ θ (y t x tβ) t=1 where ρ θ (e) = (θ 1 {e<0} )e. For θ = 0.5, V T is symmetric, and ˆβ(0.5) is the LAD estimator. x ˆβ(θ) approximates the θ th conditional quantile function Q y x (θ), with ˆβ i (θ) the estimated marginal effect of the i th regressor on Q y x (θ). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
11 Finding the Solution to V T Difficulties in estimation: The QR estimator ˆβ(θ) does not have a closed form. V T is not everywhere differentiable, so that standard numerical algorithms do not work. A minimizer of V T (β; θ) is such that the directional derivatives at that point are nonnegative in all directions w: d dδ V T (β + δw; θ) = 1 δ=0 T T ψθ (y t x tβ, x tw)x tw, t=1 ψ θ (a, b) = θ 1 {a<0} if a 0, ψ θ (a, b) = θ 1 {b<0} if a = 0. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
12 Let b be the point such that y t = x tb for t = 1,..., k. This is a minimizer of V k because the directional derivative is 1 k k ( ) θ 1{ x t x w<0} t w, t=1 which must be nonnegative for any w. Thus, b a basic solution to the minimization of V T. Other basic solutions: b(κ) = X(κ) 1 y(κ), each yielding a perfect fit of k observations. The desired estimator ˆβ(θ) can be obtained by searching among those basic solutions, for which a linear programming algorithm will do. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
13 Linear Programming y = Xβ + e can be expressed as y = X(β + β ) + (e + e ) = Az, where A = [X, X, I T, I T ] and z = [ β +, β, e +, e ], with β + and β the positive and negative parts of β, respectively. Let c = [0, 0, θ1, (1 θ)1 ]. Minimizing V T (β; θ) with respect to β is equivalent to the following linear program: 1 min z T c z, s.t. y = Az, z 0. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
14 Some Remarks ˆβ(θ) is also the QMLE based on an asymmetric Laplace density: f θ (e) = θ(1 θ) exp[ ρ θ (e)]. Due to linear loss function, ˆβ(θ) is more robust to outliers than the LS estimator. The estimated θ th quantile regression hyperplane must interpolate k observations in the sample. (Why?) QR is not the same as the regressions based on split samples because every quantile regression utilizes all sample data (with different weights). Thus, QR also avoids the sample selection problem arising from sample splitting. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
15 QR: Location Shift Model DGP: y t = x tβ o + ε t = β 0 + x tβ 1 + ε t, where ε t are i.i.d. with the common distribution function F ε. The θth quantile function of y is Q y x (θ) = β 0 + x β 1 + Fε 1 (θ), and hence quantile functions differ only by the intercept term and are a vertical displacement of one another. The model can also be expressed as y t = [β 0 + Fε 1 (θ)] + x } {{ } tβ 1 + ε t,θ, β 0 (θ) where Q εθ x(θ) = 0. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
16 QR: LocationScale Shift Model DGP: y t = x tβ o + (x tγ o )ε t, where ε t are i.i.d. with the df F ε. The θ th quantile function of y is Q y x (θ) = x tβ o + (x tγ o )Fε 1 (θ), and hence quantile functions differ not only by the intercept but also the slope term. The model can also be expressed as y t = x t[β o + γ o Fε 1 (θ)] + ε } {{ } t,θ, β(θ) where Q εθ x(θ) = 0. The QR estimator ˆβ(θ) converges to β(θ), and x ˆβ(θ) approximates the θ th quantile function of y given x, Q y x (θ). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
17 Algebraic Properties: Equivariance Let ˆβ(θ) be the QR estimator of the quantile regression of y t on x t. Scale equivariance: For y t = c y t, let ˆβ (θ) be the QR estimator of the quantile regression of y t on x t. For c > 0, ˆβ (θ) = c ˆβ(θ). For c < 0, ˆβ (1 θ) = c ˆβ(θ). ˆβ (0.5) = c ˆβ(0.5), regardless of the sign of c. Shift equivariance: For y t = y t + x tγ, let ˆβ (θ) be the QR estimator of the quantile regression of y t on x t. Then, ˆβ (θ) = ˆβ(θ) + γ. Equivariance to reparameterization of design: Given X = XA for some nonsingular A, ˆβ (θ) = A 1 ˆβ(θ). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
18 Equivariance to monotonic transformations: For a nondecreasing function h, IP{y a} = IP{h(y) h(a)}, so that Q h(y) x (θ) = h ( Q y x (θ) ). Note that the expectation operator does not have this property because IE[h(y)] h(ie(y)) in general, except when h is linear. Example: If x β is the θ th conditional quantile of ln y, then exp(x β) is the θ th conditional quantile of y. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
19 Goodness of Fit Specification: y t = x 1t β 1 + x 2t β 2 + e t. A measure of the relative contribution of additional regressors x 2t is 1 V T (ˆβ1 (θ), ˆβ 2 (θ); θ ) V T ( β1 (θ), 0; θ ), where V T ( β 1 (θ), 0; θ ) is computed under the constraint β 2 = 0. A measure of the goodnessoffit of a specification is thus R 1 (θ) = 1 V T (ˆβ(θ); θ ) ). V T (ˆq(θ), 0; θ where ˆq(θ) is the sample quantile and V T (ˆq(θ), 0; θ ) is obtained from the model with the constant term only. Clearly, 0 < R 1 (θ) < 1. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
20 Asymptotic Properties: Heuristics Ignoring y t = q, the FOC of minimizing T 1 T t=1 ρ θ(y t q) is g T (q) := 1 T T (1 {yt<q} θ). t=1 Clearly, g T (q) is nondecreasing in q (why?), so that ˆq(θ) > q iff g T (q) < 0. Thus, We have IP [ T (ˆq(θ) q(θ) > c ] = IP [ g T ( q(θ) + c/ T ) < 0 ]. IE [g T ( q(θ) + ( var [g T q(θ) + c )] ( = F q(θ) + T c ) θ f (q(θ)) c T T c )] = 1 T T F (1 F ) 1 θ(1 θ). T C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
21 Setting λ 2 = θ(1 θ)/f 2 (q(θ)), IP [ T (ˆq(θ) q(θ) ) > c ] [ ( ) gt q(θ) + c/ T = IP θ(1 θ)/t < 0 [ gt ( q(θ) + c/ T ) = IP c θ(1 θ)/t λ < c λ [ ( ) gt q(θ) + c/ T f (q(θ))c/ T = IP θ(1 θ)/t ] ] < c λ ] D 1 Φ(c/λ), by a CLT. This implies T (ˆq(θ) q(θ) ) D N (0, λ 2 ). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
22 GMM Estimation Given q moment conditions IE[m(w t ; β o )] = 0, β o (k 1) is exactly identified if q = k and overidentified if q > k. When β o is exactly identified, the GMM estimator ˆβ of β o solves T 1 T t=1 m(w t; β) = 0. Asymptotic Distribution of the GMM Estimator Given the GMM estimator ˆβ of β o, T (ˆβ β o ) A N ( 0, G 1 o Σ o G 1 o with Σ o = IE[m(w t ; β o )m(w t ; β o ) ], and 1 T T IP β m(w t ; β o ) G o := IE[ β m(w t ; β o )]. t=1 ), C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
23 QR Estimator as a GMM Estimator The QR estimator ˆβ(θ) satisfies the asymptotic FOC : 1 T T t=1 ϕ θ (y t x t ˆβ(θ)) := 1 T T t=1 The (approximate) estimating function is thus 1 T T ( ) x t θ 1{yt x t. β<0} t=1 The expectation of the estimating function is x t ( θ 1{yt x t ˆβ(θ)<0}) = oip (1). IE { x t [ θ IE ( 1{yt x t β<0} x t )]} = IE { xt [ θ Fy x (x tβ) ]}. When β is evaluated at β(θ), F y x (x tβ) must be θ so that the moment conditions are IE [ ϕ θ ( yt x tβ(θ) )] = 0. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
24 Asymptotic Distribution When integration and differentiation can be interchanged, G(β) = IE [ β ϕ θ (y t x tβ) ] = β IE { x t [ θ Fy x (x tβ) ]} = IE [ x t x tf y x (x tβ) ]. Then, G(β(θ)) = IE [ x t x tf eθ x(0) ]. 1 {yt x t β(θ)<0} is Bernoulli with mean θ and variance θ(1 θ), so that ( Σ(β) = IE x t x t IE [( ) 2 ] ) θ 1 {yt x t β<0} xt. Then, Σ(β(θ)) = θ(1 θ) IE(x t x t) =: θ(1 θ)m xx. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
25 Asymptotic Normality of the QR Estimator T [ˆβ(θ) β(θ) ] D N (0, θ(1 θ)g ( β(θ) ) 1 Mxx G ( β(θ) ) 1 ), where M xx = IE(x t x t) and G(β(θ)) = IE [ x t x tf eθ x(0) ]. Conditional heterogeneity is characterized by the conditional density f eθ x(0) in G(β(θ)), which is not limited to heteroskedasticity. If f eθ x(0) = f eθ (0), i.e., conditional homogeneity, T [ˆβ(θ) β(θ) ] ( D N 0, ) θ(1 θ) [f eθ (0)] 2 M 1 xx. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
26 Estimation of Asymptotic Covariance Matrix Consistent estimation of D(β(θ)) = G ( β(θ) ) 1 Mxx G ( β(θ) ) 1. Estimation of M xx : M T = T 1 T t=1 x tx t. Digression: Differentiating both sides of F (F 1 (θ)) = θ: df 1 (θ) dθ = 1 f (F 1 =: s(θ), (θ)) differentiating a quantile function yields a sparsity function. Estimating the sparsity function: Using a difference quotient of empirical quantiles ŝ T (θ) = [ F 1 T (θ + h 1 T ) F T (θ h T ) ] /(2h T ). 1 F T (θ): Letting ê (i) be the i th order statistic of QR residuals ê t, F 1 T (τ) = ê (i), τ [(i 1)/T, i/t ). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
27 Hendricks and Koenker (1991): Estimating f e(θ) x (0) in G(β(θ)) by ˆf t = x t 2h T [ˆβ(θ + h T ) ˆβ(θ h T ) ], and estimating G by ĜT = 1 T T t=1 ˆf t x t x t. Powell (1991): Estimating G(β(θ)) by ĜT = 1 2Tc T T 1 { êt(θ) <ct }x t x t, t=1 where c T 0 and T 1/2 c T as T. STATA: Bootstrap C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
28 Standard Wald Test H 0 : Rβ(θ) = r, where R is q k and r is q 1. T [ˆβ(θ) β(θ) ] D ( ) N 0, θ(1 θ)d(β(θ)). Under the null, T R (ˆβ(θ) β(θ) ) = T ( Rˆβ(θ) r ) where Γ(β(θ)) = RD(β(θ))R. The Null Distribution of the Wald Test D N ( 0, θ(1 θ)γ(β(θ)) ), W T (θ) = T [ Rˆβ(θ) r ] Γ(θ) 1 [ Rˆβ(θ) r ] /[θ(1 θ)] D χ 2 (q), where Γ(θ) = R D(θ)R, with D(θ) a consistent estimator of D(β(θ)). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
29 SupWald Test H 0 : Rβ(θ) = r for all θ S (0, 1). The Brownian bridge: B q (θ) d = [θ(1 θ)] 1/2 N (0, I q ), and hence Γ(θ) 1/2 T [ Rˆβ(θ) r ] Thus, W T (θ) D Bq (θ). D B q (θ)/ θ(1 θ) 2, uniformly in θ. The Null Distribution of the SupWald Test sup θ S B q (θ) 2 W T (θ) sup, θ(1 θ) θ S where S is a compact set in (0, 1). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
30 To test Rβ(θ) = r, θ [a, b], set a = θ 1 <... < θ n = b and compute sup W T = sup W T (θ i ). i=1,...,n Koenker and Machado (1999): [a, b] = [ɛ, 1 ɛ] with ɛ small. For s = θ/(1 θ), B(θ)/ θ(1 θ) = d W (s)/ s, so that IP sup B q (θ) 2 { } < c θ(1 θ) = IP W sup q (s) 2 < c, s θ [a,b] with s 1 = a/(1 a), s 2 = b/(1 b). s [1,s 2 /s 1 ] Some critical values were tabulated in DeLong (1981) and Andrews (1993); the other can be obtained via simulations. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
31 Likelihood Ratio Tests Let ˆβ(θ) and β(θ) be the constrained and unconstrained estimators and ˆV T (θ) = V T (ˆβ(θ); θ) and Ṽ T (θ) = V T ( β(θ); θ) be the corresponding objective functions. Given the asymmetric Laplace density: f θ (u) = θ(1 θ) exp[ ρ θ (u)], the loglikelihood is T L T (β; θ) = T log(θ(1 θ)) ρ θ (y t x tβ). t=1 2 times the loglikelihood ratio is 2 [ L T (ˆβ(θ); θ) L T ( β(θ); θ) ] = 2 [ Ṽ T (θ) ˆV T (θ) ]. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
32 Koenker and Machado (1999): LR T (θ) = 2[ Ṽ T (θ) ˆV T (θ) ] θ(1 θ)[f eθ (0)] 1 D χ 2 (q). This test is also known as the quantile ρ test. Koenker and Bassett (1982): For median regression, LR T (0.5) = 8[ Ṽ T (0.5) ˆV T (0.5) ] because f e0.5 (0) = 1/4. [f e0.5 (0)] 1 = 2 [ Ṽ T (0.5) ˆV T (0.5) ], C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
33 Average Treatment Effect (ATE) Evaluating the impact of a treatment (program, policy, intervention). Let D be the binary indicator of treatment and X be covariates. Y 1 (Y 0 ) is the potential outcome when an agent is (is not) exposed to the treatment. The observed outcome is Y = DY 1 + (1 D)Y 0. We observe only one potential outcome (Y 1i or Y 0i ) and hence can not identify the individual treatment effect, Y 1i Y 0i. We may estimate the ATE: IE(Y 1 Y 0 ). Under conditional independence: (Y 1, Y 0 ) D X, IE(Y D = 1, X ) IE(Y D = 0, X ) = IE(Y 1 Y 0 X ), so that the ATE is IE(Y 1 Y 0 ) = IE[IE(Y 1 Y 0 X )]. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
34 Using the sample counterpart of IE(Y D = 1, X ) IE(Y D = 0, X ) we have ÂTE = 1 N [ˆµ1 (X N i ) ˆµ 0 (X i ) ]. i=1 For the dummyvariable regression: Y i = α + D i γ + X i β +e } {{ } i, i = 1,..., n, µ D the LS estimate of γ is ÂTE. Other estimators: Kernel matching, nearest neighbor matching, propensity score matching (based on p(x) = IP(D = 1 X = x)), etc. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
35 Quantile Treatment Effect (QTE) Let F 0 and F 1 be, resp., the distributions of control and treatment responses. Let (η) be the horizontal shift from F 0 to F 1 : F 0 (η) = F 1 (η + (η)). Then, (η) = F 1 1 (F 0 (η)) η, and the θ th QTE is, for F 0 (η) = θ, QTE(θ) = F 1 1 (θ) F 1 0 (θ) = q Y1 (θ) q Y0 (θ), the difference between the quantiles of two distributions. We may apply the QR method to Y i = α + D i γ + X i β + e i, the resulting QR estimate ˆγ(θ) is the estimated θ th QTE. Other: A weighting estimator based on the propensity score. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
36 Difference in Differences The impact of a program (policy) may be observed after certain period of time. To identify the true treatment effect, the potential change due to time (other factors) must be excluded first. Define the following dummy variables: (i) D i,τ = 1 if the i th individual receives the treatment; (ii) D i,a = 1 if the i th individual s in the postprogram period; (iii) D i,aτ = D i,τ D i,a. Model: Y i = α + α 1 D i,τ + α 2 D i,a + α 3 D i,aτ + X i β + e i. For the treatment group in pre and postprogram periods, the time effect is α 2 + α 3. For the control group in pre and postprogram periods, the time effect is α 2. The treatment effect is the difference between these two effects: α 3. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
37 Empirical Study: ReturnVolume Relations Granger noncausality is defined in terms of distribution. Existing tests focus on its implications: Noncausality in mean (linear model): Granger (1969, 1980). Noncausality in variance: Granger et al. (1986), Cheung and Ng (1996). Nonlinear causality: Hiemstra and Jones (1994). Noncausality in some quantiles: Lee and Yang (2006), Hong et al. (2006). Empirical studies on returnvolume relations are typically based on leastsquares regressions and find that volume does not Granger cause return. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
38 Notions of Granger NonCausality x does not Granger cause y in distribution if F yt (η (Y, X ) t 1 ) = F yt (η Y t 1 ), η R. x does not Granger cause y in mean if IE[y t (Y, X ) t 1 ] = IE(y t Y t 1 ). x does not Granger cause y in all quantiles if Q yt (τ (Y, X ) t 1 ) = Q yt (τ Y t 1 ), τ [0, 1]; cf. Lee and Yang (2006), Hong et al. (2006). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
39 Notions of Granger NonCausality x does not Granger cause y in distribution if F yt (η (Y, X ) t 1 ) = F yt (η Y t 1 ), η R. x does not Granger cause y in mean if IE[y t (Y, X ) t 1 ] = IE(y t Y t 1 ). x does not Granger cause y in all quantiles if Q yt (τ (Y, X ) t 1 ) = Q yt (τ Y t 1 ), τ [0, 1]; cf. Lee and Yang (2006), Hong et al. (2006). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
40 Notions of Granger NonCausality x does not Granger cause y in distribution if F yt (η (Y, X ) t 1 ) = F yt (η Y t 1 ), η R. x does not Granger cause y in mean if IE[y t (Y, X ) t 1 ] = IE(y t Y t 1 ). x does not Granger cause y in all quantiles if Q yt (τ (Y, X ) t 1 ) = Q yt (τ Y t 1 ), τ [0, 1]; cf. Lee and Yang (2006), Hong et al. (2006). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
41 Causal Relations between Return and Volume Chuang, Kuan, and Lin (2009, JBF): 2 stock market indices: NYSE and S&P 500, from Jan 1990 to June 30, 2006 with 4135 and 4161 obs. Model: r t = a(τ)+b(τ) t ( t ) 2+ q q T +c(τ) α T j (τ)r t j + β j (τ) ln v t j +e t, where β j (τ) represents the quantile causal effect of ln v t j on r t, cf. Gallant, Rossi, and Tauchen (1992). Null hypothesis: β j (τ) = 0 for all τ in (0, 1), i.e., Granger noncausality in quantiles. A supwald test will do. j=1 j=1 C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
42 S&P 500: β 1 (τ) S&P 500: β 2 (τ) Figure: QR and LS estimates of the causal effects of log volume on return. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
43 NYSE: β 1 (τ) NYSE: β 2 (τ) NYSE: β 3 (τ) C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
44 A Summary There is no causality in mean, but there are significant and heterogeneous causal effects of volume on return quantiles. Causal effects have opposite signs at the two sides of dist. Causal effects are stronger at more extreme quantiles. The pairwise causal effects are symmetric about the median. With log volume (return) on the vertical (horizontal) axis, causal relations exhibit symmetric V shapes across quantiles, cf. Karpoff (1987), Gallant et al. (1992), and Blume et al. (1994). This implies that return dispersion (volatility) increases with volume. These results remain valid when the model includes rt j 2, and they are robust in different sample periods. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
45 NYSE: β 1 (τ) S&P 500: β 1 (τ) Figure: QR and LS estimates of the causal effects of log volume on return: C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
46 NYSE: β 1 (τ) S&P 500: β 1 (τ) Figure: QR and LS estimates of the causal effects of log volume on return: C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
47 Empirical Study: Effects of NHI on Saving There were 3 major health insurance programs in Taiwan: Labor Insurance (1950), Government Employees Insurance (1958), and Farmers Health Insurance (1985). Government Employees Insurance (GEI): Covers the employees (including retirees) in the public sector; the coverage was extended to spouses in 1982, to parents in 1989, and to children in Labor Insurance: Covers only the employees in the private sector whose ages are 15 60, but not their spouse, parents, and children. The NHI launched in 1995 covers every citizen in Taiwan without discrimination, and its coverage is the same as that of GEI. We want to examine the effect of the enforcement of NHI on the precautionary saving in Taiwan. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
48 Data from the SFIE The data are taken from the Survey of Family Income and Expenditure (SFIE). This nationwide survey is not panel and uses new sampling every year. The sample size is about 16,000 each year in the early 90 s and drops to about 14,000 from the mid90 s. The modules in the SFIE include individual sociodemographic and socioeconomic characteristics along with income and outlays of each income earner within the household. The SFIE provides details on each household member s income sources and job attributes that are crucial for estimation of permanent income and identification of treatment/control groups. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
49 We employ 10 waves of SFIE from 1990 through 2000 (excluding 1995), with and as the periods before and after the enforcement of NHI. Our sample includes only the households with nonfarm heads aged years. There are 68,738 and 58,355 observations for the periods and , respectively. While the average household real income increases moderately from NTD 0.97 million during to more than NTD 1.2 million during , the corresponding average saving rates decrease by 50%, from 0.23 during to 0.12 during C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
50 Treatment and Control Groups Groups are partitioned according to the working status of household head and his/her spouse. 1 Case 1: 2 Case 2: Control: At least one of head and spouse in the public sector; Treatment: Neither head nor spouse in the public sector. Strict control: Head and spouse in the public sector; Quasi control: One of head and spouse in the public sector; Treatment 1: Head and spouse in the private sector; Treatment 2: One of head and spouse in the private sector. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
51 Unconditional Group Means and ATE Sample Obs. Mean S.D. Obs. Mean S.D. Difference Full Sample 68, , (0.001) Control 13, , (0.003) Treatment 55, , (0.002) Strict Control 2, , (0.006) Quasi Control 11, , (0.003) Treatment 1 24, , (0.002) Treatment 2 30, , (0.003) C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
52 Comparison with Chou, Liu, & Hammitt (2003) 1 Data: CLH03 include 1995 but we do not. CLH03 exclude the sample with negative saving but we do not. Note that such sample was 11.8% in but jumped to 27.2% in , leading to an average of 18.9% of the entire sample. 2 Dependent variable: CLH03 uses ln(y C), but we approximate the saving rate by ln Y ln C. 3 Covariates: We partition some covariates (income and head age) into groups and study the groupwise ATE and QTEs. 4 Model: CLH03 make pairwise comparisons, but we estimate a model with multiple treatment groups. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
53 Empirical Models 1 Control vs. one treatment group: S i = α + X i β + D i γ + (NHI i )δ + [(NHI i ) D i ]ζ + e i. 2 Control vs. multiple treatment groups: S i = α + X i β D i (j)γ j + (NHI i )δ j=0 2 [(NHI i ) D i (j)]ζ j + e i. j=0 where D(j) are the indicators of the jth treatment group (j = 0 for quasicontrol group, j = 1 and 2 for treatment 1 and 2 groups). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
54 1 To explore possible heterogeneity of the treatment effects, we partition the data into 5 permanent income groups (based on income quintiles) and 5 age groups (20 29,..., 60 69). We estimate the following model: S i = α + X i β D i K i (j)γ j + j=1 5 (NHI i K i (j))δ j j=1 5 [(NHI i ) D i K i (j)]ζ j + e i, j=1 where D is the binary indicator of the treatment group, and K(j), j = 1,..., 5, are the indicators of 5 income groups or 5 age groups. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
55 Empirical Result theta Figure: DID estimates of the ATE and QTE: 1 treatment group. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
56 Empirical Result theta theta theta Figure: DID estimates of the ATE and QTE: Quasi control (left), treatment 1 (middle), and treatment 2 (right). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
57 Empirical Result Yq1 Yq2 Yq3 Yq4 Yq Figure: DID estimates of the ATE and QTE: 5 income groups (left) and 5 age groups (right). C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
58 A Summary The NHI has significantly negative (crowdout) effect on precautionary saving, and such impact (QTE) is stronger for higher savers (magnitude increasing with saving quantile). The estimated conditional ATEs are quite close to the unconditional ATEs. For example, for the treatment 1 and treatment 2 groups, the estimated ATEs are 4.3% and 3.5% (vs. unconditional ATEs: 4.3% and 4.1%). For the quasi control group, the ATE is insignificant. The QTE patterns for treatment 1 and 2 groups are opposite to those of CLH03. The negative impact is the largest for the highest income group and the oldest (age 60 69) group, cf. CLH04. C.M. Kuan (Finance & CREAT, NTU) Intro. to Quantile Regression June 13, / 56
Effects of National Health Insurance on precautionary saving: new evidence from Taiwan
Empir Econ (2013) 44:921 943 DOI 10.1007/s0018101105335 Effects of National Health Insurance on precautionary saving: new evidence from Taiwan ChungMing Kuan ChienLiang Chen Received: 22 January 2010
More informationEconometric Analysis of Cross Section and Panel Data Second Edition. Jeffrey M. Wooldridge. The MIT Press Cambridge, Massachusetts London, England
Econometric Analysis of Cross Section and Panel Data Second Edition Jeffrey M. Wooldridge The MIT Press Cambridge, Massachusetts London, England Preface Acknowledgments xxi xxix I INTRODUCTION AND BACKGROUND
More informationOnline Appendices to the Corporate Propensity to Save
Online Appendices to the Corporate Propensity to Save Appendix A: Monte Carlo Experiments In order to allay skepticism of empirical results that have been produced by unusual estimators on fairly small
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulationbased method for estimating the parameters of economic models. Its
More informationLecture 8: Gamma regression
Lecture 8: Gamma regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Models with constant coefficient of variation Gamma regression: estimation and testing
More informationThe VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.
Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium
More informationGamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationDetekce změn v autoregresních posloupnostech
Nové Hrady 2012 Outline 1 Introduction 2 3 4 Change point problem (retrospective) The data Y 1,..., Y n follow a statistical model, which may change once or several times during the observation period
More informationLecture 6: Poisson regression
Lecture 6: Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction EDA for Poisson regression Estimation and testing in Poisson regression
More informationOverview Classes. 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7)
Overview Classes 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7) 24 Loglinear models (8) 54 1517 hrs; 5B02 Building and
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More information**BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1.
**BEGINNING OF EXAMINATION** 1. You are given: (i) The annual number of claims for an insured has probability function: 3 p x q q x x ( ) = ( 1 ) 3 x, x = 0,1,, 3 (ii) The prior density is π ( q) = q,
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 SigmaRestricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationLogistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) 
More informationAssessing the Relative Power of Structural Break Tests Using a Framework Based on the Approximate Bahadur Slope
Assessing the Relative Power of Structural Break Tests Using a Framework Based on the Approximate Bahadur Slope Dukpa Kim Boston University Pierre Perron Boston University December 4, 2006 THE TESTING
More informationQuantile Treatmenet Effects of National Health Insurance on Household Precautionary Saving: Evidence from a Regime Switch in Taiwan
Quantile Treatmenet Effects of National Health Insurance on Household Precautionary Saving: Evidence from a Regime Switch in Taiwan CHUNGMING KUAN a Institute of Economics Academia Sinica CHIENLIANG
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models  part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby
More informationPlease follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software
STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationTESTING THE ONEPART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWOPART MODEL
TESTING THE ONEPART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWOPART MODEL HARALD OBERHOFER AND MICHAEL PFAFFERMAYR WORKING PAPER NO. 201101 Testing the OnePart Fractional Response Model against
More informationRegression III: Advanced Methods
Lecture 4: Transformations Regression III: Advanced Methods William G. Jacoby Michigan State University Goals of the lecture The Ladder of Roots and Powers Changing the shape of distributions Transforming
More informationCOMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
More informationNumerical Summarization of Data OPRE 6301
Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit nonresponse. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationTechnology StepbyStep Using StatCrunch
Technology StepbyStep Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate
More informationNonlinear Regression:
Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity HalfDay : Improved Inference
More informationChicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this
More informationGeneralized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)
Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN Linear Algebra Slide 1 of
More informationSYSTEMS OF REGRESSION EQUATIONS
SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations
More informationReview of Random Variables
Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random
More informationCCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal  the stuff biology is not
More informationBNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
More information2. Simple Linear Regression
Research methods  II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationGeostatistics Exploratory Analysis
Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt
More informationLOGNORMAL MODEL FOR STOCK PRICES
LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as
More informationName: Date: Use the following to answer questions 23:
Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student
More informationChapter 3: The Multiple Linear Regression Model
Chapter 3: The Multiple Linear Regression Model Advanced Econometrics  HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics
More informationSome Essential Statistics The Lure of Statistics
Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived
More informationLabour Income Dynamics and the Insurance from Taxes, Transfers, and the Family
Labour Income Dynamics and the Insurance from Taxes, Transfers, and the Family Richard Blundell 1 Michael Graber 1,2 Magne Mogstad 1,2 1 University College London and Institute for Fiscal Studies 2 Statistics
More informationIntroduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
More informationAn extension of the factoring likelihood approach for nonmonotone missing data
An extension of the factoring likelihood approach for nonmonotone missing data Jae Kwang Kim Dong Wan Shin January 14, 2010 ABSTRACT We address the problem of parameter estimation in multivariate distributions
More informationProbability and Statistics
CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b  0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute  Systems and Modeling GIGA  Bioinformatics ULg kristel.vansteen@ulg.ac.be
More informationTEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND
I J A B E R, Vol. 13, No. 4, (2015): 15251534 TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND Komain Jiranyakul * Abstract: This study
More information1 The Pareto Distribution
Estimating the Parameters of a Pareto Distribution Introducing a Quantile Regression Method Joseph Lee Petersen Introduction. A broad approach to using correlation coefficients for parameter estimation
More informationTHE IMPACT OF 401(K) PARTICIPATION ON THE WEALTH DISTRIBUTION: AN INSTRUMENTAL QUANTILE REGRESSION ANALYSIS
THE IMPACT OF 41(K) PARTICIPATION ON THE WEALTH DISTRIBUTION: AN INSTRUMENTAL QUANTILE REGRESSION ANALYSIS VICTOR CHERNOZHUKOV AND CHRISTIAN HANSEN Abstract. In this paper, we use the instrumental quantile
More informationThe Best of Both Worlds:
The Best of Both Worlds: A Hybrid Approach to Calculating Value at Risk Jacob Boudoukh 1, Matthew Richardson and Robert F. Whitelaw Stern School of Business, NYU The hybrid approach combines the two most
More informationForecasting in STATA: Tools and Tricks
Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be
More informationStochastic Inventory Control
Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the
More informationQuantitative Methods for Finance
Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain
More informationComparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors
Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston
More informationQuantile Regression under misspecification, with an application to the U.S. wage structure
Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and FernandezVal Reading Group Econometrics November 2, 2010 Intro: initial problem The
More informationA Trading Strategy Based on the LeadLag Relationship of Spot and Futures Prices of the S&P 500
A Trading Strategy Based on the LeadLag Relationship of Spot and Futures Prices of the S&P 500 FE8827 Quantitative Trading Strategies 2010/11 MiniTerm 5 Nanyang Technological University Submitted By:
More informationAuxiliary Variables in Mixture Modeling: 3Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationAssumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model
Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationRecent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago
Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:
More informationModels for Count Data With Overdispersion
Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extrapoisson variation and the negative binomial model, with brief appearances
More informationThe aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree
PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and
More informationPITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU
PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard
More informationLocal outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #47/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationProblem Set 6  Solutions
ECO573 Financial Economics Problem Set 6  Solutions 1. Debt Restructuring CAPM. a Before refinancing the stoc the asset have the same beta: β a = β e = 1.2. After restructuring the company has the same
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationDATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationInterpretation of Somers D under four simple models
Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationExercise 1.12 (Pg. 2223)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More informationESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics
ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Quantile Treatment Effects 2. Control Functions
More informationAverage Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation
Average Redistributional Effects IFAI/IZA Conference on Labor Market Policy Evaluation Geert Ridder, Department of Economics, University of Southern California. October 10, 2006 1 Motivation Most papers
More informationDongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:
More informationBig data in Finance. Finance Research Group, IGIDR. July 25, 2014
Big data in Finance Finance Research Group, IGIDR July 25, 2014 Introduction Who we are? A research group working in: Securities markets Corporate governance Household finance We try to answer policy questions
More informationproblem arises when only a nonrandom sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a nonrandom
More informationInternational Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA Email: peverso1@swarthmore.edu 1. Introduction
More informationA Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals
A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals Achim Zeileis Wirtschaftsuniversität Wien Abstract Three classes of structural change tests (or tests for
More informationOptimization of technical trading strategies and the profitability in security markets
Economics Letters 59 (1998) 249 254 Optimization of technical trading strategies and the profitability in security markets Ramazan Gençay 1, * University of Windsor, Department of Economics, 401 Sunset,
More informationOption Values. Determinants of Call Option Values. CHAPTER 16 Option Valuation. Figure 16.1 Call Option Value Before Expiration
CHAPTER 16 Option Valuation 16.1 OPTION VALUATION: INTRODUCTION Option Values Intrinsic value  profit that could be made if the option was immediately exercised Call: stock price  exercise price Put:
More informationViolent crime total. Problem Set 1
Problem Set 1 Note: this problem set is primarily intended to get you used to manipulating and presenting data using a spreadsheet program. While subsequent problem sets will be useful indicators of the
More informationNonLinearity in the Determinants of Capital Structure: Evidence from UK Firms
NonLinearity in the Determinants of Capital Structure: Evidence from UK Firms by Bassam Fattouh Pasquale Scaramozzino Laurence Harris CeFiMS, SOAS CeFiMS, SOAS CeFiMS, SOAS Abstract Drawing on the agency
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationCorrelation and Regression
Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look
More informationNonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 16233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study loglinear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In nonlinear regression models, such as the heteroskedastic
More informationIs the Basis of the Stock Index Futures Markets Nonlinear?
University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC)  Conference Papers Faculty of Engineering and Information Sciences 2011 Is the Basis of the Stock
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More information