Statistics in Geophysics: Linear Regression II


 Tiffany Melanie Stevenson
 2 years ago
 Views:
Transcription
1 Statistics in Geophysics: Linear Regression II Steffen Unkel Department of Statistics LudwigMaximiliansUniversity Munich, Germany Winter Term 2013/14 1/28
2 Model definition Suppose we have the following model under consideration: y = Xβ + ɛ, where y = (y 1..., y n ) is an n 1 vector of observations on the response, β = (β 0, β 1,..., β k ) is a (k + 1) 1 vector of parameters, ɛ = (ɛ 1,..., ɛ n ) is an n 1 vector of random errors, and X is the n (k + 1) design matrix with X = 1 x 11 x 1k... 1 x n1 x nk. Winter Term 2013/14 2/28
3 Model assumptions The following assumptions are made: 1 E(ɛ) = 0. 2 Cov(ɛ) = σ 2 I. 3 The design matrix has full column rank, that is, rank(x) = k + 1 = p. 4 The normal regression model is obtained if additionally ɛ N (0, σ 2 I). For stochastic covariates these assumptions are to be understood conditionally on X. It follows that E(y) = Xβ and Cov(y) = σ 2 I. In case of assumption 4, we have y N (Xβ, σ 2 I). Winter Term 2013/14 3/28
4 Modelling the effects of continuous covariates We can fit nonlinear relationships between continuous covariates and the response within the scope of linear models. Two simple methods for dealing with nonlinearity: 1 Variable transformation; 2 Polynomial regression. Sometimes it is customary to transform the continuous response as well. Winter Term 2013/14 4/28
5 Modelling the effects of categorical covariates We might want to include a categorical variable with two or more distinct levels. In such a case we cannot set up a continuous scale for the categorical variable. A remedy is to define new covariates, socalled dummy variables, and estimate a separate effect for each category of the original covariate. We can deal with c categories by the introduction of c 1 dummy variables. Winter Term 2013/14 5/28
6 Example: Turkey data weightp agew origin G G G G V V V V W W W W W Turkey weights in pounds, ages in weeks, and origin (Georgia (G); Virgina (V); Wisconsin (W)), of 13 Thanksgiving turkeys. Winter Term 2013/14 6/28
7 Dummy coding for categorical covariates Given a covariate x {1,..., c} with c categories, we define the c 1 dummy variables x i1 = { 1 xi = 1 0 otherwise,... x i,c 1 = { 1 xi = c 1 0 otherwise, for i = 1..., n, and include them as explanatory variables in the regression model y i = β 0 + β 1 x i β i,c 1 x i,c ɛ i. For reasons of identifiability, we omit the dummy variable for category c, where c is the reference category. Winter Term 2013/14 7/28
8 Design matrix for the turkey data using dummy coding > X (Intercept) agew originv originw Winter Term 2013/14 8/28
9 Interactions between covariates An interaction between predictor variables exists if the effect of a covariate depends on the value of at least one other covariate. Consider the following model between a response y and two covariates x 1 and x 2 : y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2 + ɛ, where the term β 3 x 1 x 2 is called an interaction between x 1 and x 2. The terms β 1 x 1 and β 2 x 2 depend on only one variable and are called main effects. Winter Term 2013/14 9/28
10 Least squares estimation of regression coefficients The error sum of squares is ɛ ɛ = (y Xβ) (y Xβ) = y y β X y y Xβ + β X Xβ = y y 2β X y + β X Xβ. The least squares estimator of β is the value ˆβ, which minimizes ɛ ɛ. Minimizing ɛ ɛ with respect to β yields ˆβ = (X X) 1 X y, which is obtained irrespective of any distribution properties of the errors. Winter Term 2013/14 10/28
11 Maximum likelihood estimation of regression coefficients Assuming normally distributed errors yields the likelihood ( L(β, σ 2 1 ) = exp 1 ) (2πσ 2 n/2 ) 2σ 2 (y Xβ) (y Xβ). The loglikelihood is given by l(β, σ 2 ) = n 2 log(2π) n 2 log(σ2 ) 1 2σ 2 (y Xβ) (y Xβ). Maximizing l(β, σ 2 ) with respect to β is equivalent to minimizing (y Xβ) (y Xβ), which is the least squares criterion. The MLE, ˆβ ML, therefore is identical to the least squares estimator. Winter Term 2013/14 11/28
12 Fitted values and residuals Based on ˆβ = (X X) 1 X y, we can estimate the (conditional) mean of y by Ê(y) = ŷ = Xˆβ. Substituting the least squares estimator further results in ŷ = X(X X) 1 X y = Hy, where the n n matrix H is called the hat matrix. The residuals are ˆɛ = y ŷ = y Hy = (I H)y. Winter Term 2013/14 12/28
13 Estimation of the error variance Maximization of the loglikelihood with respect to σ 2 yields ˆσ 2 ML = 1 n (y Xˆβ) (y Xˆβ) = 1 n (y ŷ) (y ŷ) = 1 n ˆɛ ˆɛ. Since E(ˆσ ML) 2 = n p n the MLE of σ 2 is biased. σ 2, An unbiased estimator is ˆσ 2 = 1 n p ˆɛ ˆɛ. Winter Term 2013/14 13/28
14 Properties of the least squares estimator For the least squares estimator we have E(ˆβ) = β. Cov(ˆβ) = σ 2 (X X) 1. GaussMarkov Theorem: Among all linear and unbiased estimators ˆβ L, the least squares estimator has minimal variances, implying Var( ˆβ j ) Var( ˆβ L j ), j = 0,..., k. If ɛ N (0, σ 2 I), then ˆβ N (β, σ 2 (X X) 1 ). Winter Term 2013/14 14/28
15 ANOVA table for multiple linear regression and R 2 Source Degrees of freedom Sum of squares Mean square F value of variation (df) (SS) (MS) Regression k SSR MSR = SSR/k Residual n p = n (k + 1) SSE ˆσ 2 = SSE n p Total n 1 SST The multiple coefficient of determination is still computed as R 2 = SST SSR, but is no longer the square of the Pearson correlation between the response and any of the predictor variables. MSR ˆσ 2 Winter Term 2013/14 15/28
16 Interval estimation and tests We would like to construct confidence intervals and statistical tests for hypotheses regarding the unknown regression parameters β. A requirement for the construction of tests and confidence intervals is the assumption of normally distributed errors. However, tests and confidence intervals are relatively robust to mild departures from the normality assumption. Moreover, tests and confidence intervals, derived under the assumption of normality, remain valid for large sample size even with nonnormal errors. Winter Term 2013/14 16/28
17 Testing linear hypotheses Hypotheses: 1 General linear hypothesis: H 0 : Cβ = d against H 1 : Cβ d, where C is a r p matrix with rank(c) = r p (r linear independent restrictions) and d is a p 1 vector. 2 Test of significance (ttest): H 0 : β j = 0 against H 1 : β j 0. Winter Term 2013/14 17/28
18 Testing linear hypotheses Hypotheses: 3 Composite test of subvector: H 0 : β 1 = 0 against H 1 : β Test for significance of regression: H 0 : β 1 = β 2 = = β k = 0 against H 1 : β j 0 for at least one j {1,..., k}. Winter Term 2013/14 18/28
19 Testing linear hypotheses Test statistics: 1 F = 1 r (Cˆβ d) (ˆσ 2 C(X X) 1 C ) 1 (Cˆβ d) F r,n p. 2 t j = ˆβ j ˆσ ˆβj t n p, where ˆσ ˆβ j denotes the standard error of ˆβ j. 3 F = 1 r (ˆβ 1 ) Ĉov(ˆβ 1 ) 1 (ˆβ 1 ) F r,n p. 4 F = n p k SSR SSE = n p k R 2 1 R 2 F k,n p. Winter Term 2013/14 19/28
20 Testing linear hypotheses Critical values: Reject H 0 in the case of: 1 F > F r,n p (1 α). 2 t > t n p (1 α/2). 3 F > F r,n p (1 α). 4 F > F k,n p (1 α) Winter Term 2013/14 20/28
21 Confidence intervals and regions for regression coefficients Confidence interval for β j : A confidence interval for β j with level 1 α is given by [ ˆβ j t n p (1 α/2)ˆσ ˆβ j, ˆβ j + t n p (1 α/2)ˆσ ˆβj ]. Confidence region for subvector β 1 : A confidence ellipsoid for β 1 = (β 1,..., β r ) with level 1 α is given by { β 1 : 1 r (ˆβ 1 β 1 ) Ĉov(ˆβ } 1 ) 1 (ˆβ 1 β 1 ) F r,n p (1 α). Winter Term 2013/14 21/28
22 Prediction intervals Confidence interval for the (conditional) mean of a future observation: A confidence interval for E(y 0 ) of a future observation y 0 at location x 0 with level 1 α is given by x 0 ˆβ ± t n p (1 α/2)ˆσ(x 0 (X X) 1 x 0 ) 1/2. Prediction interval for a future observation: A prediction interval for a future observation y 0 at location x 0 with level 1 α is given by x 0 ˆβ ± t n p (1 α/2)ˆσ(1 + x 0 (X X) 1 x 0 ) 1/2. Winter Term 2013/14 22/28
23 The corrected coefficient of determination We already defined the coefficient of determination, R 2, as a measure for the goodnessoffit to the data. The use of R 2 is limited, since it will never decrease with the addition of a new covariate into the model. The corrected coefficient of determination, Radj 2, adjusts for this problem, by including a correction term for the number of parameters. It is defined by R 2 adj = 1 n 1 n p (1 R2 ). Winter Term 2013/14 23/28
24 Akaike information criterion The Akaike information criterion (AIC) is one of the most widely used criteria for model choice within the scope of likelihoodbased inference. The AIC is defined by AIC = 2l(ˆβ ML, σ 2 ML) + 2(p + 1), where l(ˆβ ML, σml 2 ) is the maximum value of the loglikelihood. Smaller values of the AIC correspond to a better model fit. Winter Term 2013/14 24/28
25 Bayesian information criterion The Bayesian information criterion (BIC) is defined by BIC = 2l(ˆβ ML, σ 2 ML) + log(n)(p + 1). The BIC multiplied by 1/2 is also known as Schwartz criterion. Smaller values of the BIC correspond to a better model fit. The BIC penalizes complex models much more than the AIC. Winter Term 2013/14 25/28
26 Practical use of model choice criteria To select the most promising models from candidate models, we first obtain a preselection of potential models. All potential models can now be assessed with the aid of one of the various model choice criteria (AIC, BIC). This method is not always practical, since the number of regressor variables and modelling variants can be very large in many applications. In this case, we can use the following partially heuristic methods. Winter Term 2013/14 26/28
27 Practical use of model choice criteria II Complete model selection: In case that the number of predictor variables is not too large, we can determine the best model with the leapsandbounds algorithm. Forward selection: 1 Based on a starting model, forward selection includes one additional variable in every iteration of the algorithm. 2 The variable which offers the greatest reduction of a preselected model choice criterion is chosen. 3 The algorithm terminates if no further reduction is possible. Winter Term 2013/14 27/28
28 Practical use of model choice criteria III Backward elimination: 1 Backward elimination starts with the full model containing all predictor variables. 2 Subsequently, in every iteration, the covariate which provides the greatest reduction of the model choice criterion is eliminated from the model. 3 The algorithm terminates if no further reduction is possible. Stepwise selection: Stepwise selection is a combination of forward selection and backward elimination. In every iteration of the algorithm, a predictor variable may be added to the model or removed from the model. Winter Term 2013/14 28/28
Lecture 5 Hypothesis Testing in Multiple Linear Regression
Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationOutline. Topic 4  Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4  Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test  Fall 2013 R 2 and the coefficient of correlation
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 4448935 email:
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationMultiple Hypothesis Testing: The Ftest
Multiple Hypothesis Testing: The Ftest Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost
More informationVariance of OLS Estimators and Hypothesis Testing. Randomness in the model. GM assumptions. Notes. Notes. Notes. Charlie Gibbons ARE 212.
Variance of OLS Estimators and Hypothesis Testing Charlie Gibbons ARE 212 Spring 2011 Randomness in the model Considering the model what is random? Y = X β + ɛ, β is a parameter and not random, X may be
More informationRegression, least squares
Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20thcentury statistics dealt with maximum likelihood
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 SigmaRestricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationttests and Ftests in regression
ttests and Ftests in regression Johan A. Elkink University College Dublin 5 April 2012 Johan A. Elkink (UCD) t and Ftests 5 April 2012 1 / 25 Outline 1 Simple linear regression Model Variance and R
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models  part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby
More informationFixed vs. Random Effects
Statistics 203: Introduction to Regression and Analysis of Variance Fixed vs. Random Effects Jonathan Taylor  p. 1/19 Today s class Implications for Random effects. Oneway random effects ANOVA. Twoway
More informationWooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares
Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit
More informationStatistics 112 Regression Cheatsheet Section 1B  Ryan Rosario
Statistics 112 Regression Cheatsheet Section 1B  Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything
More informationNew Work Item for ISO 35345 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 35345 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationUniversity of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014.
University of Ljubljana Doctoral Programme in Statistics ethodology of Statistical Research Written examination February 14 th, 2014 Name and surname: ID number: Instructions Read carefully the wording
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 16233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is Rsquared? Rsquared Published in Agricultural Economics 0.45 Best article of the
More informationSIMPLE REGRESSION ANALYSIS
SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two
More informationELECE8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems
Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More information5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #47/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More informationStatistiek (WISB361)
Statistiek (WISB361) Final exam June 29, 2015 Schrijf uw naam op elk in te leveren vel. Schrijf ook uw studentnummer op blad 1. The maximum number of points is 100. Points distribution: 23 20 20 20 17
More informationINTRODUCTORY STATISTICS
INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore
More informationInstrumental Variables & 2SLS
Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20  Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationChapter 6: Multivariate Cointegration Analysis
Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationOneWay Analysis of Variance: A Guide to Testing Differences Between Multiple Groups
OneWay Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measuresoffit in multiple regression Assumptions
More informationDEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
More informationRegression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.
Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationANOVA. February 12, 2015
ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit nonresponse. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationApplied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
More informationLogistic Regression. http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests
Logistic Regression http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests Overview Binary (or binomial) logistic regression is a form of regression which is used when the dependent is a dichotomy
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) 
More informationInstrumental Variables Regression. Instrumental Variables (IV) estimation is used when the model has endogenous s.
Instrumental Variables Regression Instrumental Variables (IV) estimation is used when the model has endogenous s. IV can thus be used to address the following important threats to internal validity: Omitted
More informationFactor analysis. Angela Montanari
Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number
More informationLecture 2: Simple Linear Regression
DMBA: Statistics Lecture 2: Simple Linear Regression Least Squares, SLR properties, Inference, and Forecasting Carlos Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching
More informationPart II. Multiple Linear Regression
Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a yvariable relates to two or more xvariables (or transformations
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationEstimation and Inference in Cointegration Models Economics 582
Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study loglinear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationEstimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
More informationMultivariate Statistical Inference and Applications
Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A WileyInterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim
More informationMultiple Linear Regression. Multiple linear regression is the extension of simple linear regression to the case of two or more independent variables.
1 Multiple Linear Regression Basic Concepts Multiple linear regression is the extension of simple linear regression to the case of two or more independent variables. In simple linear regression, we had
More informationCollinearity of independent variables. Collinearity is a condition in which some of the independent variables are highly correlated.
Collinearity of independent variables Collinearity is a condition in which some of the independent variables are highly correlated. Why is this a problem? Collinearity tends to inflate the variance of
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informatione = random error, assumed to be normally distributed with mean 0 and standard deviation σ
1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationWebbased Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Webbased Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationUsing JMP with a Specific
1 Using JMP with a Specific Example of Regression Ying Liu 10/21/ 2009 Objectives 2 Exploratory data analysis Simple liner regression Polynomial regression How to fit a multiple regression model How to
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationS6: Spatial regression models: : OLS estimation and testing
: Spatial regression models: OLS estimation and testing Course on Spatial Econometrics with Applications Profesora: Coro Chasco Yrigoyen Universidad Autónoma de Madrid Lugar: Universidad Politécnica de
More informationInferences About Differences Between Means Edpsy 580
Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at UrbanaChampaign Inferences About Differences Between Means Slide
More informationThe Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information
Chapter 8 The Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information An important new development that we encounter in this chapter is using the F distribution to simultaneously
More informationGeneral Regression Formulae ) (N2) (1  r 2 YX
General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Betahat: β = r YX (1 Standard
More informationL10: Probability, statistics, and estimation theory
L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian
More informationMultiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
More information2013 MBA Jump Start Program. Statistics Module Part 3
2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just
More informationLesson Lesson Outline Outline
Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and
More informationHedonism example. Our questions in the last session. Our questions in this session
Random Slope Models Hedonism example Our questions in the last session Do differences between countries in hedonism remain after controlling for individual age? How much of the variation in hedonism is
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More information1 Theory: The General Linear Model
QMIN GLM Theory  1.1 1 Theory: The General Linear Model 1.1 Introduction Before digital computers, statistics textbooks spoke of three procedures regression, the analysis of variance (ANOVA), and the
More informationREGRESSION LINES IN STATA
REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression
More informationRegression Analysis. Regression Analysis MIT 18.S096. Dr. Kempthorne. Fall 2013
Lecture 6: Regression Analysis MIT 18.S096 Dr. Kempthorne Fall 2013 MIT 18.S096 Regression Analysis 1 Outline Regression Analysis 1 Regression Analysis MIT 18.S096 Regression Analysis 2 Multiple Linear
More information1 Another method of estimation: least squares
1 Another method of estimation: least squares erm: estim.tex, Dec8, 009: 6 p.m. (draft  typos/writos likely exist) Corrections, comments, suggestions welcome. 1.1 Least squares in general Assume Y i
More informationReview Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 03 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
More informationRegression III: Advanced Methods
Lecture 5: Linear leastsquares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression
More informationElements of statistics (MATH04871)
Elements of statistics (MATH04871) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis 
More informationNotes for STA 437/1005 Methods for Multivariate Data
Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.
More informationSYSTEMS OF REGRESSION EQUATIONS
SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations
More informationClass 19: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationSydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.
Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under
More informationInstrumental Variables & 2SLS
Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20  Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More information1. χ 2 minimization 2. Fits in case of of systematic errors
Data fitting Volker Blobel University of Hamburg March 2005 1. χ 2 minimization 2. Fits in case of of systematic errors Keys during display: enter = next page; = next page; = previous page; home = first
More informationThe VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.
Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium
More informationLectures 8, 9 & 10. Multiple Regression Analysis
Lectures 8, 9 & 0. Multiple Regression Analysis In which you learn how to apply the principles and tests outlined in earlier lectures to more realistic models involving more than explanatory variable and
More informationStatistics in Retail Finance. Chapter 6: Behavioural models
Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics: Behavioural
More informationMultiple Regression  Selecting the Best Equation An Example Techniques for Selecting the "Best" Regression Equation
Multiple Regression  Selecting the Best Equation When fitting a multiple linear regression model, a researcher will likely include independent variables that are not important in predicting the dependent
More information3.1 Least squares in matrix form
118 3 Multiple Regression 3.1 Least squares in matrix form E Uses Appendix A.2 A.4, A.6, A.7. 3.1.1 Introduction More than one explanatory variable In the foregoing chapter we considered the simple regression
More informationInferential Statistics
Inferential Statistics Sampling and the normal distribution Zscores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are
More informationTESTING THE ASSUMPTIONS OF AGETOAGE FACTORS. Abstract
TESTING THE ASSUMPTIONS OF AGETOAGE FACTORS GARY G. VENTER Abstract The use of agetoage factors applied to cumulative losses has been shown to produce leastsquares optimal reserve estimates when certain
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationOneWay Analysis of Variance (ANOVA) Example Problem
OneWay Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesistesting technique used to test the equality of two or more population (or treatment) means
More informationStatistical Analysis with Missing Data
Statistical Analysis with Missing Data Second Edition RODERICK J. A. LITTLE DONALD B. RUBIN WILEY INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents Preface PARTI OVERVIEW AND BASIC APPROACHES
More information