Regression Analysis. MIT 18.S096, Dr. Kempthorne, Fall 2013

1 Lecture 6: Regression Analysis. MIT 18.S096, Dr. Kempthorne, Fall 2013

2 Outline: 1. Regression Analysis

3 Multiple Linear Regression: Setup
Data set: n cases, i = 1, 2, ..., n
- Response (dependent) variable: y_i, i = 1, 2, ..., n
- p explanatory (independent) variables: x_i = (x_{i,1}, x_{i,2}, ..., x_{i,p})^T, i = 1, 2, ..., n
Goal of regression analysis: extract/exploit the relationship between y_i and x_i.
Examples: prediction; causal inference; approximation; functional relationships.

4 General Linear Model
For each case i, the conditional distribution [y_i | x_i] is given by
  y_i = ŷ_i + ɛ_i, where ŷ_i = β_1 x_{i,1} + β_2 x_{i,2} + ... + β_p x_{i,p}.
- β = (β_1, β_2, ..., β_p)^T are the p regression parameters (constant over all cases).
- ɛ_i is the residual (error) variable (varies over all cases).
Extensive breadth of possible models:
- Polynomial approximation: x_{i,j} = (x_i)^j; the explanatory variables are different powers of the same variable x = x_i.
- Fourier series: x_{i,j} = sin(j x_i) or cos(j x_i); the explanatory variables are different sin/cos terms of a Fourier series expansion.
- Time series regressions: time is indexed by i, and the explanatory variables include lagged response values.
Note: linearity of ŷ_i (in the regression parameters) is maintained even with non-linear x; a sketch of such design matrices follows below.
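
A minimal numpy sketch of the point above: both polynomial and Fourier regressors keep the model linear in β, since only the design-matrix columns are non-linear in x. The variable names (x, p, X_poly, X_fourier) are illustrative, not from the lecture.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)   # a single underlying variable x_i
p = 3                           # number of basis terms per family

# Polynomial regressors: columns x, x^2, x^3 (x_{i,j} = x_i^j)
X_poly = np.column_stack([x ** j for j in range(1, p + 1)])

# Fourier regressors: columns sin(j x), cos(j x) for j = 1..p
X_fourier = np.column_stack(
    [f(j * x) for j in range(1, p + 1) for f in (np.sin, np.cos)]
)
```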

5 Steps for Fitting a Model
(1) Propose a model in terms of:
- the response variable Y (specify the scale),
- the explanatory variables X_1, X_2, ..., X_p (include different functions of the explanatory variables if appropriate), and
- assumptions about the distribution of ɛ over the cases.
(2) Specify/define a criterion for judging different estimators.
(3) Characterize the best estimator and apply it to the given data.
(4) Check the assumptions in (1).
(5) If necessary, modify the model and/or assumptions and go to (1).

6 Specifying Assumptions in (1) for the Residual Distribution
- Gauss-Markov: zero mean, constant variance, uncorrelated.
- Normal-linear models: the ɛ_i are i.i.d. N(0, σ²) r.v.s.
- Generalized Gauss-Markov: zero mean and a general covariance matrix (possibly correlated, possibly heteroscedastic).
- Non-normal/non-Gaussian distributions (e.g., Laplace, Pareto, contaminated normal: some fraction (1 − δ) of the ɛ_i are i.i.d. N(0, σ²) r.v.s, and the remaining fraction δ follows some contamination distribution).

7 Specifying the Estimator Criterion in (2)
- Least squares
- Maximum likelihood
- Robust (contamination-resistant)
- Bayes (assume the β_j are r.v.s with a known prior distribution)
- Accommodating incomplete/missing data
Case Analyses for (4), Checking Assumptions
- Residual analysis: the model errors ɛ_i are unobservable; the model residuals for the fitted regression parameters β̂_j are e_i = y_i − (β̂_1 x_{i,1} + β̂_2 x_{i,2} + ... + β̂_p x_{i,p}).
- Influence diagnostics (identify cases which are highly influential)
- Outlier detection

8 Outline: 1. Regression Analysis

9 Ordinary Least Squares Estimates
Least squares criterion: for β = (β_1, β_2, ..., β_p)^T, define
  Q(β) = Σ_{i=1}^n [y_i − ŷ_i]² = Σ_{i=1}^n [y_i − (β_1 x_{i,1} + β_2 x_{i,2} + ... + β_p x_{i,p})]².
The ordinary least-squares (OLS) estimate β̂ minimizes Q(β).
Matrix notation:
  y = (y_1, y_2, ..., y_n)^T, X = the n × p matrix with (i, j) entry x_{i,j} (i-th row x_i^T), β = (β_1, ..., β_p)^T.

10 Solving for the OLS Estimate β̂
  ŷ = (ŷ_1, ŷ_2, ..., ŷ_n)^T = Xβ and Q(β) = Σ_{i=1}^n (y_i − ŷ_i)² = (y − ŷ)^T (y − ŷ) = (y − Xβ)^T (y − Xβ).
The OLS estimate β̂ solves ∂Q(β)/∂β_j = 0, j = 1, 2, ..., p:
  ∂Q(β)/∂β_j = (∂/∂β_j) Σ_{i=1}^n [y_i − (x_{i,1} β_1 + x_{i,2} β_2 + ... + x_{i,p} β_p)]²
             = Σ_{i=1}^n 2 (−x_{i,j}) [y_i − (x_{i,1} β_1 + x_{i,2} β_2 + ... + x_{i,p} β_p)]
             = −2 (X_{[j]})^T (y − Xβ), where X_{[j]} is the j-th column of X.

11 Solving for the OLS Estimate β̂ (continued)
  ∂Q/∂β = (∂Q/∂β_1, ∂Q/∂β_2, ..., ∂Q/∂β_p)^T = −2 (X_{[1]}^T (y − Xβ), X_{[2]}^T (y − Xβ), ..., X_{[p]}^T (y − Xβ))^T = −2 X^T (y − Xβ).
So the OLS estimate β̂ solves the normal equations
  X^T (y − Xβ̂) = 0  ⟺  X^T X β̂ = X^T y  ⟹  β̂ = (X^T X)^{−1} X^T y.
N.B. For β̂ to exist (uniquely), (X^T X) must be invertible, i.e., X must have full column rank.
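
The following numpy sketch sets up a running example reused in the snippets below and solves the normal equations directly. The data-generating choices (n, p, beta, the seed) are illustrative, not from the lecture; in practice a dedicated least-squares solver is numerically preferable to forming (X^T X)^{−1}.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))             # full column rank (with probability 1)
beta = np.array([1.0, -2.0, 0.5])       # "true" regression parameters
y = X @ beta + rng.normal(size=n)       # y = X beta + eps, eps ~ N(0, I_n)

# Normal equations: solve (X^T X) beta_hat = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Numerically safer equivalent: a dedicated least-squares solver
beta_hat_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_hat_lstsq)
```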

12 (Ordinary) Least Squares Fit
OLS estimate: β̂ = (β̂_1, β̂_2, ..., β̂_p)^T = (X^T X)^{−1} X^T y.
Fitted values: ŷ = (ŷ_1, ŷ_2, ..., ŷ_n)^T, where ŷ_i = x_{i,1} β̂_1 + ... + x_{i,p} β̂_p, i.e.,
  ŷ = Xβ̂ = X (X^T X)^{−1} X^T y = H y,
where H = X (X^T X)^{−1} X^T is the n × n Hat Matrix.

13 (Ordinary) Least Squares Fit (continued)
The Hat Matrix H projects R^n onto the column space of X.
Residuals: ɛ̂_i = y_i − ŷ_i, i = 1, 2, ..., n, i.e.,
  ɛ̂ = (ɛ̂_1, ɛ̂_2, ..., ɛ̂_n)^T = y − ŷ = (I_n − H) y.
Normal equations: X^T (y − Xβ̂) = X^T ɛ̂ = 0_p.
N.B. The least-squares residual vector ɛ̂ is orthogonal to the column space of X.
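
Continuing the running sketch (X, y from the snippet above), the projection and orthogonality properties can be checked numerically; H, y_hat, and resid are illustrative names.

```python
H = X @ np.linalg.inv(X.T @ X) @ X.T    # n x n hat matrix
y_hat = H @ y                           # fitted values
resid = y - y_hat                       # residuals (I_n - H) y

assert np.allclose(H @ H, H)            # H is idempotent: a projection matrix
assert np.allclose(X.T @ resid, 0.0)    # residuals orthogonal to col(X)
```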

14 Outline: 1. Regression Analysis

15 Gauss-Markov Theorem: Assumptions
Data y = (y_1, y_2, ..., y_n)^T and X = the n × p matrix with entries x_{i,j} follow a linear model satisfying the Gauss-Markov assumptions if y is an observation of a random vector Y = (Y_1, Y_2, ..., Y_n)^T with
- E(Y | X, β) = Xβ, where β = (β_1, β_2, ..., β_p)^T is the p-vector of regression parameters, and
- Cov(Y | X, β) = σ² I_n, for some σ² > 0.
I.e., the random variables generating the observations are uncorrelated and have constant variance σ² (conditional on X and β).

16 Gauss-Markov Theorem
For known constants c_1, c_2, ..., c_p, c_{p+1}, consider the problem of estimating
  θ = c_1 β_1 + c_2 β_2 + ... + c_p β_p + c_{p+1}.
Under the Gauss-Markov assumptions, the estimator
  θ̂ = c_1 β̂_1 + c_2 β̂_2 + ... + c_p β̂_p + c_{p+1},
where β̂_1, β̂_2, ..., β̂_p are the least-squares estimates, is
1) an unbiased estimator of θ, and
2) a linear estimator of θ, that is, θ̂ = Σ_{i=1}^n b_i y_i for some known (given X) constants b_i.
Theorem: Under the Gauss-Markov assumptions, the estimator θ̂ has the smallest ("best") variance among all linear unbiased estimators of θ, i.e., θ̂ is BLUE.

17 Gauss-Markov Theorem: Proof
Proof: Without loss of generality, assume c_{p+1} = 0 and define c = (c_1, c_2, ..., c_p)^T. The least-squares estimate of θ = c^T β is
  θ̂ = c^T β̂ = c^T (X^T X)^{−1} X^T y = d^T y,
a linear estimate in y given by the coefficients d = (d_1, d_2, ..., d_n)^T.
Consider an alternative linear estimate of θ, θ* = b^T y, with fixed coefficients given by b = (b_1, ..., b_n)^T. Define f = b − d and note that
  θ* = b^T y = (d + f)^T y = θ̂ + f^T y.
If θ* is unbiased then, because θ̂ is unbiased,
  0 = E(f^T y) = f^T E(y) = f^T (Xβ) for all β ∈ R^p
  ⟹ f is orthogonal to the column space of X
  ⟹ f is orthogonal to d = X (X^T X)^{−1} c.

18 Gauss-Markov Theorem: Proof (continued)
If θ* is unbiased, then the orthogonality of f to d implies
  Var(θ*) = Var(b^T y) = Var(d^T y + f^T y)
          = Var(d^T y) + Var(f^T y) + 2 Cov(d^T y, f^T y)
          = Var(θ̂) + Var(f^T y) + 2 d^T Cov(y) f
          = Var(θ̂) + Var(f^T y) + 2 d^T (σ² I_n) f
          = Var(θ̂) + Var(f^T y) + 2 σ² d^T f
          = Var(θ̂) + Var(f^T y) + 2 σ² · 0
          ≥ Var(θ̂).
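
A Monte Carlo sketch of the BLUE property, reusing X, beta, H, n, and rng from the running example: any linear unbiased competitor b = d + f (with f orthogonal to col(X)) has variance at least that of the OLS weights d. The target vector c and the simulation size are illustrative choices.

```python
c = np.array([1.0, 1.0, 1.0])           # estimate theta = c^T beta
d = X @ np.linalg.inv(X.T @ X) @ c      # OLS weights: theta_hat = d^T y

f = rng.normal(size=n)
f -= H @ f                              # project out col(X), so X^T f = 0
b = d + f                               # competing linear unbiased weights

draws_ols = [d @ (X @ beta + rng.normal(size=n)) for _ in range(5000)]
draws_alt = [b @ (X @ beta + rng.normal(size=n)) for _ in range(5000)]
print(np.var(draws_ols), np.var(draws_alt))   # OLS variance is the smaller
```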

19 Outline: 1. Regression Analysis

20 Generalized Least Squares (GLS) Estimates
Consider generalizing the Gauss-Markov assumptions for the linear regression model to Y = Xβ + ɛ, where the random n-vector ɛ satisfies E[ɛ] = 0_n and E[ɛɛ^T] = σ² Σ:
- σ² is an unknown scale parameter.
- Σ is a known (n × n) positive definite matrix specifying the relative variances and correlations of the component observations.
Transform the data (Y, X) to Y* = Σ^{−1/2} Y and X* = Σ^{−1/2} X, and the model becomes
  Y* = X* β + ɛ*, where E[ɛ*] = 0_n and E[ɛ* (ɛ*)^T] = σ² I_n.
By the Gauss-Markov Theorem, the BLUE ("GLS" estimate) of β is
  β̂ = [(X*)^T (X*)]^{−1} (X*)^T Y* = [X^T Σ^{−1} X]^{−1} (X^T Σ^{−1} Y).
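
A sketch of the GLS computation on the running example, with an assumed AR(1)-style correlation matrix standing in for the known Σ. A Cholesky factor is used in place of the symmetric square root Σ^{−1/2}; the resulting estimate is identical.

```python
rho = 0.5
idx = np.arange(n)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])   # assumed known, pos. def.

# Direct form: beta_gls = (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1} y
Sigma_inv = np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)

# Whitening form: transform with L^{-1}, where Sigma = L L^T, then run OLS
L = np.linalg.cholesky(Sigma)
X_star, y_star = np.linalg.solve(L, X), np.linalg.solve(L, y)
beta_gls2, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
assert np.allclose(beta_gls, beta_gls2)
```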

21 Outline: 1. Regression Analysis

22 Normal Linear Regression Models: Distribution Theory
  Y_i = x_{i,1} β_1 + x_{i,2} β_2 + ... + x_{i,p} β_p + ɛ_i = µ_i + ɛ_i
Assume {ɛ_1, ɛ_2, ..., ɛ_n} are i.i.d. N(0, σ²). Then
  [Y_i | x_{i,1}, x_{i,2}, ..., x_{i,p}, β, σ²] ~ N(µ_i, σ²), independent over i = 1, 2, ..., n.
Conditioning on X, β, and σ²:
  Y = Xβ + ɛ, where ɛ = (ɛ_1, ɛ_2, ..., ɛ_n)^T ~ N_n(0_n, σ² I_n).

23 Distribution Theory (continued)
  µ = (µ_1, ..., µ_n)^T = E(Y | X, β, σ²) = Xβ

24 Distribution Theory (continued)
  Σ = Cov(Y | X, β, σ²) = diag(σ², σ², ..., σ²) = σ² I_n,
that is, Σ_{i,j} = Cov(Y_i, Y_j | X, β, σ²) = σ² δ_{i,j}.
Apply moment-generating functions (MGFs) to derive:
- the joint distribution of Y = (Y_1, Y_2, ..., Y_n)^T, and
- the joint distribution of β̂ = (β̂_1, β̂_2, ..., β̂_p)^T.

25 MGF of Y
For the n-variate r.v. Y and a constant n-vector t = (t_1, ..., t_n)^T,
  M_Y(t) = E(e^{t^T Y}) = E(e^{t_1 Y_1 + t_2 Y_2 + ... + t_n Y_n})
         = E(e^{t_1 Y_1}) E(e^{t_2 Y_2}) ··· E(e^{t_n Y_n})
         = M_{Y_1}(t_1) M_{Y_2}(t_2) ··· M_{Y_n}(t_n)
         = Π_{i=1}^n e^{t_i µ_i + (1/2) t_i² σ²}
         = e^{Σ_{i=1}^n t_i µ_i + (1/2) Σ_{i,k=1}^n t_i Σ_{i,k} t_k}
         = e^{t^T µ + (1/2) t^T Σ t}
  ⟹ Y ~ N_n(µ, Σ), multivariate normal with mean µ and covariance Σ.

26 MGF of β̂
For the p-variate r.v. β̂ and a constant p-vector τ = (τ_1, ..., τ_p)^T,
  M_β̂(τ) = E(e^{τ^T β̂}) = E(e^{τ_1 β̂_1 + τ_2 β̂_2 + ... + τ_p β̂_p}).
Defining A = (X^T X)^{−1} X^T, we can express β̂ = (X^T X)^{−1} X^T y = AY, and
  M_β̂(τ) = E(e^{τ^T β̂}) = E(e^{τ^T AY}) = E(e^{t^T Y}), with t = A^T τ
          = M_Y(t) = e^{t^T µ + (1/2) t^T Σ t}.

27 MGF of β̂ (continued)
For
  M_β̂(τ) = E(e^{τ^T β̂}) = e^{t^T µ + (1/2) t^T Σ t},
plug in t = A^T τ = X (X^T X)^{−1} τ, µ = Xβ, and Σ = σ² I_n. This gives:
  t^T µ = τ^T β
  t^T Σ t = τ^T (X^T X)^{−1} X^T [σ² I_n] X (X^T X)^{−1} τ = τ^T [σ² (X^T X)^{−1}] τ
So the MGF of β̂ is
  M_β̂(τ) = e^{τ^T β + (1/2) τ^T [σ² (X^T X)^{−1}] τ}
  ⟹ β̂ ~ N_p(β, σ² (X^T X)^{−1}).

28 Marginal Distributions of the Least Squares Estimates
Because β̂ ~ N_p(β, σ² (X^T X)^{−1}), the marginal distribution of each β̂_j is
  β̂_j ~ N(β_j, σ² C_{j,j}),
where C_{j,j} is the j-th diagonal element of (X^T X)^{−1}.
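
In the running sketch, the marginal standard deviations come straight off the diagonal of (X^T X)^{−1}; sigma is treated as known here purely for illustration.

```python
sigma = 1.0                              # assumed known error std. dev.
C = np.linalg.inv(X.T @ X)               # C[j, j] = C_{j,j}
se_beta = sigma * np.sqrt(np.diag(C))    # std. dev. of each beta_hat_j
```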

29 The Q-R Decomposition of X
Consider expressing the (n × p) matrix X of explanatory variables as X = QR, where
- Q is an (n × p) orthonormal matrix, i.e., Q^T Q = I_p, and
- R is a (p × p) upper-triangular matrix.
The columns of Q = [Q_{[1]}, Q_{[2]}, ..., Q_{[p]}] can be constructed by performing the Gram-Schmidt orthonormalization procedure on the columns of X = [X_{[1]}, X_{[2]}, ..., X_{[p]}].

30 If R is the p × p upper-triangular matrix with entries r_{i,j} (and r_{i,j} = 0 for i > j), then
  X_{[1]} = Q_{[1]} r_{1,1} ⟹ r_{1,1}² = X_{[1]}^T X_{[1]} and Q_{[1]} = X_{[1]} / r_{1,1}
  X_{[2]} = Q_{[1]} r_{1,2} + Q_{[2]} r_{2,2}
  ⟹ Q_{[1]}^T X_{[2]} = Q_{[1]}^T Q_{[1]} r_{1,2} + Q_{[1]}^T Q_{[2]} r_{2,2} = 1 · r_{1,2} + 0 · r_{2,2} = r_{1,2}
  (known once Q_{[1]} is specified).

31 With r_{1,2} and Q_{[1]} specified, we can solve for r_{2,2}:
  Q_{[2]} r_{2,2} = X_{[2]} − Q_{[1]} r_{1,2}.
Take the squared norm of both sides:
  r_{2,2}² = X_{[2]}^T X_{[2]} − 2 r_{1,2} Q_{[1]}^T X_{[2]} + r_{1,2}²
(all terms on the RHS are known). With r_{2,2} specified,
  Q_{[2]} = (1 / r_{2,2}) [X_{[2]} − r_{1,2} Q_{[1]}].
Etc. (solve successively for the elements of R and the columns of Q).

32 With the Q-R decomposition X = QR (Q^T Q = I_p, and R is p × p upper-triangular):
- β̂ = (X^T X)^{−1} X^T y = R^{−1} Q^T y (plug in X = QR and simplify)
- Cov(β̂) = σ² (X^T X)^{−1} = σ² R^{−1} (R^{−1})^T
- H = X (X^T X)^{−1} X^T = QQ^T (giving ŷ = Hy and ɛ̂ = (I_n − H) y)
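
These identities can be checked on the running sketch with numpy's built-in reduced QR factorization, which plays the role of the Gram-Schmidt construction above.

```python
Q, R = np.linalg.qr(X)                   # X = Q R, Q is n x p, R is p x p
beta_qr = np.linalg.solve(R, Q.T @ y)    # beta_hat = R^{-1} Q^T y
R_inv = np.linalg.inv(R)
XtX_inv = R_inv @ R_inv.T                # (X^T X)^{-1} = R^{-1} (R^{-1})^T

assert np.allclose(beta_qr, beta_hat)
assert np.allclose(Q @ Q.T, H)           # hat matrix H = Q Q^T
assert np.allclose(XtX_inv, np.linalg.inv(X.T @ X))
```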

33 More Distribution Theory
Assume y = Xβ + ɛ, where the {ɛ_i} are i.i.d. N(0, σ²), i.e., ɛ ~ N_n(0_n, σ² I_n), or y ~ N_n(Xβ, σ² I_n).
Theorem*: For any (m × n) matrix A of rank m ≤ n, the random normal vector y transformed by A, z = Ay, is also a random normal vector:
  z ~ N_m(µ_z, Σ_z), where µ_z = A E(y) = AXβ and Σ_z = A Cov(y) A^T = σ² AA^T.
Earlier, A = (X^T X)^{−1} X^T yielded the distribution of β̂ = Ay. With a different definition of A (and z) we give an easy proof of the following theorem.

34 Theorem: For the normal linear regression model y = Xβ + ɛ, where X (n × p) has rank p and ɛ ~ N_n(0_n, σ² I_n):
(a) β̂ = (X^T X)^{−1} X^T y and ɛ̂ = y − Xβ̂ are independent r.v.s.
(b) β̂ ~ N_p(β, σ² (X^T X)^{−1})
(c) Σ_{i=1}^n ɛ̂_i² = ɛ̂^T ɛ̂ ~ σ² χ²_{n−p} (a scaled chi-squared r.v.)
(d) For each j = 1, 2, ..., p,
  t_j = (β̂_j − β_j) / (σ̂ √C_{j,j}) ~ t_{n−p} (t distribution),
where σ̂² = (1/(n − p)) Σ_{i=1}^n ɛ̂_i² and C_{j,j} = [(X^T X)^{−1}]_{j,j}.
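
Following the theorem's notation on the running sketch (resid, C, n, p, beta_hat from above), the t-statistics for testing each β_j = 0 are a one-liner; the hypothesis-test use is an illustrative extension, not stated on the slide.

```python
sigma2_hat = (resid @ resid) / (n - p)   # unbiased sigma^2 estimate from (c), (d)
t_j = beta_hat / np.sqrt(sigma2_hat * np.diag(C))
# Under H0: beta_j = 0, each t_j follows a t distribution with n - p d.o.f.
```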

35 Proof: Note that (d) follows immediately from (a), (b), and (c).
Define the (n × n) matrix A by stacking Q^T on top of W^T, written A = (Q^T; W^T), where
- A is an (n × n) orthogonal matrix (i.e., A^T = A^{−1}), and
- Q is the column-orthonormal matrix in a Q-R decomposition of X.
Note: W can be constructed by continuing the Gram-Schmidt orthonormalization process (which was used to construct Q from X) with the augmented matrix [X | I_n].
Then consider
  z = Ay = (Q^T y; W^T y) = (z_Q; z_W), where z_Q is (p × 1) and z_W is ((n − p) × 1).

36 The distribution of z = Ay is N_n(µ_z, Σ_z), where
  µ_z = A (Xβ) = (Q^T; W^T)(QRβ) = (Q^T Q; W^T Q)(Rβ) = (I_p; 0_{(n−p)×p})(Rβ) = (Rβ; 0_{n−p})
  Σ_z = A (σ² I_n) A^T = σ² (AA^T) = σ² I_n, since A^T = A^{−1}.

37 Thus z = (z_Q; z_W) ~ N_n((Rβ; 0_{n−p}), σ² I_n)
  ⟹ z_Q ~ N_p(Rβ, σ² I_p) and z_W ~ N_{n−p}(0_{n−p}, σ² I_{n−p}),
and z_Q and z_W are independent. The theorem follows by showing:
(a*) β̂ = R^{−1} z_Q and ɛ̂ = W z_W (i.e., β̂ and ɛ̂ are functions of different, independent vectors);
(b*) deducing the distribution of β̂ = R^{−1} z_Q by applying Theorem* with A = R^{−1} and y = z_Q;
(c*) ɛ̂^T ɛ̂ = z_W^T z_W = a sum of (n − p) squared r.v.s which are i.i.d. N(0, σ²), i.e., σ² χ²_{n−p}, a scaled chi-squared r.v.

38 Proof of (a*):
β̂ = R^{−1} z_Q follows from β̂ = (X^T X)^{−1} X^T y and X = QR with Q^T Q = I_p.
  ɛ̂ = y − ŷ = y − Xβ̂ = y − (QR)(R^{−1} z_Q) = y − Q z_Q = y − QQ^T y
    = (I_n − QQ^T) y = WW^T y (since I_n = A^T A = QQ^T + WW^T)
    = W z_W.

39 Outline: 1. Regression Analysis

40 Maximum-Likelihood Estimation
Consider the normal linear regression model y = Xβ + ɛ, where the {ɛ_i} are i.i.d. N(0, σ²), i.e., ɛ ~ N_n(0_n, σ² I_n), or y ~ N_n(Xβ, σ² I_n).
Definitions: The likelihood function is L(β, σ²) = p(y | X, β, σ²), where p(y | X, β, σ²) is the joint probability density function (pdf) of the conditional distribution of y given the data X (known) and the parameters (β, σ²) (unknown).
The maximum likelihood estimates of (β, σ²) are the values maximizing L(β, σ²), i.e., those which make the observed data y most likely in terms of its pdf.

41 Because the y_i are independent r.v.s with y_i ~ N(µ_i, σ²), where µ_i = Σ_{j=1}^p β_j x_{i,j},
  L(β, σ²) = Π_{i=1}^n p(y_i | β, σ²) = Π_{i=1}^n (2πσ²)^{−1/2} e^{−(y_i − Σ_{j=1}^p β_j x_{i,j})² / (2σ²)}
           = (2πσ²)^{−n/2} e^{−(1/2)(y − Xβ)^T (σ² I_n)^{−1} (y − Xβ)}.
The maximum likelihood estimates (β̂, σ̂²) maximize the log-likelihood function (dropping constant terms)
  log L(β, σ²) = −(n/2) log(σ²) − (1/2)(y − Xβ)^T (σ² I_n)^{−1} (y − Xβ)
               = −(n/2) log(σ²) − Q(β) / (2σ²),
where Q(β) = (y − Xβ)^T (y − Xβ) (the least-squares criterion!).
⟹ The OLS estimate β̂ is also the ML estimate.
The ML estimate of σ² solves ∂ log L(β̂, σ²) / ∂(σ²) = 0, i.e.,
  −(n/2)(1/σ²) − (1/2)(−1)(σ²)^{−2} Q(β̂) = 0
  ⟹ σ̂²_ML = Q(β̂)/n = (Σ_{i=1}^n ɛ̂_i²)/n (biased!)
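
On the running sketch, the biased ML estimate and the degrees-of-freedom-corrected estimate of σ² differ only in the divisor:

```python
Q_min = resid @ resid                    # Q(beta_hat): sum of squared residuals
sigma2_ml = Q_min / n                    # ML estimate; E[.] = sigma^2 (n - p)/n, biased
sigma2_unbiased = Q_min / (n - p)        # the usual degrees-of-freedom correction
```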

42 Outline: 1. Regression Analysis

43 For data (y, X), fit the linear regression model
  y_i = x_i^T β + ɛ_i, i = 1, 2, ..., n,
by specifying β = β̂ to minimize Q(β) = Σ_{i=1}^n h(y_i, x_i, β, σ²). The choice of the function h(·) distinguishes different estimators.
(1) Least squares: h(y_i, x_i, β, σ²) = (y_i − x_i^T β)²
(2) Mean absolute deviation (MAD): h(y_i, x_i, β, σ²) = |y_i − x_i^T β|
(3) Maximum likelihood (ML): assume the y_i are independent with pdfs p(y_i | β, x_i, σ²); h(y_i, x_i, β, σ²) = −log p(y_i | β, x_i, σ²)
(4) Robust M-estimator: h(y_i, x_i, β, σ²) = χ(y_i − x_i^T β), where χ(·) is even and monotone increasing on (0, ∞)

44 (5) Quantile estimator: for a fixed quantile τ, 0 < τ < 1,
  h(y_i, x_i, β, σ²) = τ |y_i − x_i^T β| if y_i ≥ x_i^T β, and (1 − τ) |y_i − x_i^T β| if y_i < x_i^T β.
E.g., τ = 0.90 corresponds to the 90th quantile / upper decile; τ = 0.50 corresponds to the MAD estimator.
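
A sketch of criteria (1), (2), and (5) as loss functions on a residual vector. This only evaluates the losses (here at the OLS fit from the running example); an actual quantile estimator would minimize the check loss over β.

```python
def h_ls(r):                   # (1) least squares
    return r ** 2

def h_mad(r):                  # (2) mean absolute deviation
    return np.abs(r)

def h_quantile(r, tau):        # (5) quantile "check" loss; tau = 0.5 is MAD / 2
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

r = y - X @ beta_hat
print(h_ls(r).sum(), h_mad(r).sum(), h_quantile(r, 0.90).sum())
```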

45 MIT OpenCourseWare, 18.S096 Topics in Mathematics with Applications in Finance, Fall 2013.
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms
