Week 5: Multiple Linear Regression


1 BUS41100 Applied Regression Analysis
Week 5: Multiple Linear Regression
Parameter estimation and inference, forecasting, diagnostics, dummy variables
Robert B. Gramacy
The University of Chicago Booth School of Business
faculty.chicagobooth.edu/robert.gramacy/teaching
2 Beyond SLR
Many problems involve more than one independent variable or factor that affects the dependent (response) variable.
  Multifactor asset pricing models (beyond CAPM).
  Demand for a product given prices of competing brands, advertising, household attributes, etc.
  More than size to predict house price!
In SLR, the conditional mean of Y depends on X. The multiple linear regression (MLR) model extends this idea to include more than one independent variable.
3 Categorical effects/dummy variables
Week 1 already introduced one type of multiple regression: regression of Y onto a categorical X:
\[ E[Y \mid \text{group} = r] = \beta_0 + \beta_r, \quad \text{for } r = 1, \ldots, R-1. \]
I.e., ANOVA for grouped data.
To represent these qualitative factors in multiple regression, we use dummy (binary, or indicator) variables.
4 Dummy variables allow the mean (intercept) to shift by taking on the value 0 or 1.
Examples:
  temporal effects (1 if holiday season, 0 if not)
  spatial effects (1 if in Midwest, 0 if not)
If a factor X takes R possible levels, we can represent X through R - 1 dummy variables:
\[ E[Y \mid X] = \beta_0 + \beta_1 \mathbb{1}_{[X=2]} + \beta_2 \mathbb{1}_{[X=3]} + \cdots + \beta_{R-1} \mathbb{1}_{[X=R]} \]
($\mathbb{1}_{[X=r]} = 1$ if $X = r$, 0 if $X \neq r$.)
What is $E[Y \mid X = 1]$?
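The dummy coding above can be sketched in R on a tiny made-up data set. The factor `region` and response `y` below are hypothetical and not part of the course data; the sketch only illustrates that a factor with R = 3 levels yields R - 1 = 2 indicator columns, and that the intercept plays the role of the mean for the baseline level (X = 1 in the notation above).

```r
# Hypothetical data: a factor with R = 3 levels and a numeric response.
region <- factor(c("A", "A", "B", "B", "C", "C"))
y <- c(1, 3, 4, 6, 9, 11)

# model.matrix() shows the default coding used by lm():
# an intercept column plus R - 1 = 2 indicator columns (level "A" is the baseline).
D <- model.matrix(~ region)
print(D)

fit <- lm(y ~ region)
b <- coef(fit)

# The intercept is the baseline-group mean; each dummy coefficient is
# that group's mean minus the baseline mean.
group_means <- tapply(y, region, mean)
print(b)
print(group_means)
```

Under this coding the fitted mean for the baseline level is just the intercept, which is one way to read the question that closes the slide.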
5 Recall the pickup truck data example:
> pickup <- read.csv("pickup.csv")
> t1 <- lm(log(price) ~ make, data=pickup)
> summary(t1) ## abbreviated output
Call:
lm(formula = log(price) ~ make, data = pickup)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                               <2e-16 ***
makeford
makegmc
The coefficient values correspond to our dummy variables.
6 What if you also want to include mileage? No problem.
> t2 <- lm(log(price) ~ make + log(miles), data=pickup)
> summary(t2) ## abbreviated output
Call:
lm(formula = log(price) ~ make + log(miles), data = pickup)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                 e-16 ***
makeford
makegmc
log(miles)                                  e-05 ***
7 The MLR Model
The MLR model is the same as always, but with more covariates:
\[ Y \mid X_1, \ldots, X_d \overset{\text{ind}}{\sim} N(\beta_0 + \beta_1 X_1 + \cdots + \beta_d X_d, \ \sigma^2) \]
Recall the key assumptions of our linear regression model:
(i) The conditional mean of Y is linear in the $X_j$ variables.
(ii) The additive errors (deviations from the line) are
  normally distributed,
  independent from each other,
  identically distributed (i.e., they have constant variance).
8 Our interpretation of regression coefficients can be extended from the simple single-covariate regression case:
\[ \beta_j = \frac{\partial E[Y \mid X_1, \ldots, X_d]}{\partial X_j} \]
Holding all other variables constant, $\beta_j$ is the average change in Y per unit change in $X_j$.
($\partial$ is from calculus and means "change in".)
9 If d = 2, we can plot the regression surface in 3D. Consider sales of a product as predicted by the price of this product (P1) and the price of a competing product (P2). Everything is measured on the log scale, of course.
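10 The data and least squares
The data in multiple regression is a set of points with values for the output Y and for each input variable.
Data: $Y_i$ and $x_i = [X_{1i}, X_{2i}, \ldots, X_{di}]$, for $i = 1, \ldots, n$.
Or, as a data array (i.e., data.frame),
\[ \text{Data} = \begin{bmatrix} Y_1 & X_{11} & X_{21} & \cdots & X_{d1} \\ \vdots & \vdots & \vdots & & \vdots \\ Y_n & X_{1n} & X_{2n} & \cdots & X_{dn} \end{bmatrix} \]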
11 So the model is
\[ Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_d X_{di} + \varepsilon_i, \quad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2). \]
How do we estimate the MLR model parameters? The principle of least squares is unchanged; define:
  fitted values $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_d X_{di}$
  residuals $e_i = Y_i - \hat{Y}_i$
  standard error $s = \sqrt{\frac{\sum_{i=1}^n e_i^2}{n - p}}$, where $p = d + 1$.
Then find the best fitting plane, i.e., coefs $b_0, b_1, b_2, \ldots, b_d$, by minimizing the sum of squared residuals (equivalently, $s^2$).
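12 What are the LS coefficient values? Say that
\[ Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix}, \qquad \hat{X} = \begin{bmatrix} 1 & X_{11} & X_{21} & \cdots & X_{d1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{1n} & X_{2n} & \cdots & X_{dn} \end{bmatrix}. \]
Then the estimates are
\[ b = [b_0 \ \cdots \ b_d]' = (\hat{X}'\hat{X})^{-1} \hat{X}'Y. \]
Same intuition as for SLR: b captures the covariance between $X_j$ and Y ($\hat{X}'Y$), normalized by the input sum of squares ($\hat{X}'\hat{X}$).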
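The matrix formula above can be checked numerically. The following is a minimal sketch on simulated data; the variable names, sample size, and "true" coefficients are made up for illustration. It builds the design matrix with a leading column of ones, solves the normal equations, and compares the result with `lm`.

```r
set.seed(1)
n <- 50
X1 <- rnorm(n)
X2 <- rnorm(n)
Y <- 1 + 2 * X1 - 3 * X2 + rnorm(n, sd = 0.5)  # assumed coefficients, illustration only

# Design matrix: a column of ones (intercept) followed by the covariates.
Xhat <- cbind(1, X1, X2)

# b = (X'X)^{-1} X'Y, computed by solving the normal equations
# rather than forming the inverse explicitly.
b_manual <- solve(t(Xhat) %*% Xhat, t(Xhat) %*% Y)

# The same estimates from lm().
b_lm <- coef(lm(Y ~ X1 + X2))

print(cbind(manual = drop(b_manual), lm = b_lm))
```

`lm` itself uses a QR decomposition rather than the normal equations, so the two columns agree only up to floating-point error.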
13 Obtaining these estimates in R is very easy:
> salesdata <- read.csv("sales.csv")
> attach(salesdata)
> salesmlr <- lm(Sales ~ P1 + P2)
> salesmlr
Call:
lm(formula = Sales ~ P1 + P2)
Coefficients:
(Intercept)           P1           P2
14 Multiple vs simple linear regression
Basic concepts and techniques translate directly from SLR.
  Individual parameter inference and estimation is the same, conditional on the rest of the model.
  ANOVA is exactly the same, and the F-test still applies.
  Our diagnostics and transformations apply directly.
  We still use lm, summary, rstudent, predict, etc.
  We have been plotting HD data since Week 1.
The hardest part would be moving to matrix algebra to translate all of our equations. Luckily, R does all that for you.
15 Residual standard error
First off, the calculation for $s^2 = \mathrm{var}(e)$ is exactly the same:
\[ s^2 = \frac{\sum_{i=1}^n (Y_i - \hat{Y}_i)^2}{n - p}, \]
where $\hat{Y}_i = b_0 + \sum_{j=1}^d b_j X_{ji}$ and $p = d + 1$.
The residual standard error is $\hat{\sigma} = s = \sqrt{s^2}$.
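As a quick check of the formula above, this sketch computes s by hand on simulated data (all names and values are made up for illustration) and compares it with the `sigma` component that `summary` reports for an `lm` fit.

```r
set.seed(2)
n <- 40
d <- 2
p <- d + 1                      # number of coefficients, including the intercept
X1 <- runif(n)
X2 <- runif(n)
Y <- 0.5 + 1.5 * X1 - X2 + rnorm(n, sd = 0.3)

fit <- lm(Y ~ X1 + X2)
e <- resid(fit)                 # e_i = Y_i - Yhat_i

s2 <- sum(e^2) / (n - p)        # s^2 with n - p degrees of freedom
s <- sqrt(s2)

print(c(manual = s, summary = summary(fit)$sigma))
```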
16 Residuals in MLR
As in the SLR model, the residuals in multiple regression are purged of any relationship to the independent variables. We decompose Y into the part predicted by X and the part due to idiosyncratic error.
\[ Y = \hat{Y} + e, \qquad \mathrm{corr}(X_j, e) = 0, \qquad \mathrm{corr}(\hat{Y}, e) = 0 \]
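The decomposition and the zero correlations can be verified numerically. The sketch below uses simulated data (names and coefficients are illustrative assumptions); the correlations come out as zero up to floating-point error.

```r
set.seed(3)
n <- 80
X1 <- rnorm(n)
X2 <- rnorm(n)
Y <- 2 - X1 + 0.5 * X2 + rnorm(n)

fit <- lm(Y ~ X1 + X2)
Yhat <- fitted(fit)
e <- resid(fit)

# Y decomposes exactly into fitted values plus residuals.
max_gap <- max(abs(Y - (Yhat + e)))

# Residuals are uncorrelated with each input and with the fitted values.
cors <- c(X1 = cor(X1, e), X2 = cor(X2, e), fitted = cor(Yhat, e))

print(max_gap)
print(signif(cors, 3))
```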
17 Residual diagnostics for MLR
Consider the residuals from the sales data:
[Figure: three panels of residuals plotted against the fitted values, P1, and P2]
We use the same residual diagnostics (scatterplots, QQ plots, etc.).
  Plot residuals (raw or studentized) against $\hat{Y}$ to see overall fit.
  Compare e or r against each X to identify problems.
18 Another great plot for MLR problems is to look at Y (true values) against $\hat{Y}$ (fitted values).
> plot(salesmlr$fitted, Sales,
+   main="Fitted vs True Response for Sales data",
+   pch=20, col=4, ylab="Y", xlab="Y.hat")
> abline(0, 1, lty=2, col=8)
[Figure: "Fitted vs True Response for Sales data", Y plotted against Y.hat with a dashed reference line]
If things are working, these values should form a nice straight line.
19 Transforming variables in MLR
The transformation techniques are also the same as in SLR.
  For Y nonlinear in $X_j$, consider adding $X_j^2$ and other polynomial terms.
  For nonconstant variance, use $\log(Y)$, $\sqrt{Y}$, or another transformation such as $Y/X_j$ that moves the data to a linear scale.
  Use the log-log model for price/sales data and other multiplicative relationships.
Again, diagnosing the problem and finding a solution involves looking at lots of residual plots (against different $X_j$'s).
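The first suggestion above (adding a squared term) can be sketched on simulated data with deliberate curvature. The data-generating model here is an assumption made purely for illustration. The residuals from the straight-line fit remain correlated with X^2, which is the numeric counterpart of a curved residual plot, while the quadratic fit removes that pattern.

```r
set.seed(5)
n <- 60
X <- runif(n, -2, 2)
Y <- 1 + X + 2 * X^2 + rnorm(n, sd = 0.5)   # assumed curved relationship

fit_lin <- lm(Y ~ X)
fit_quad <- lm(Y ~ X + I(X^2))              # I() protects the square inside a formula

r2 <- c(linear = summary(fit_lin)$r.squared,
        quadratic = summary(fit_quad)$r.squared)

# Leftover curvature: correlation of each model's residuals with X^2.
leftover <- c(linear = cor(resid(fit_lin), X^2),
              quadratic = cor(resid(fit_quad), X^2))

print(round(r2, 3))
print(signif(leftover, 3))
```

In practice the same comparison would be made visually, by plotting `rstudent(fit_lin)` against X and looking for a bend.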
20 For example, the Sales, P1, and P2 variables were pre-transformed from raw values to a log scale. On the original scale, things don't look so good:
> expsalesmlr <- lm(exp(Sales) ~ exp(P1) + exp(P2))
[Figure: three panels of residuals plotted against the fitted values, exp(P1), and exp(P2)]
21 In particular, the studentized residuals are heavily right skewed. ("Studentizing" is the same, but leverage is now distance in d dimensions.)
> hist(rstudent(expsalesmlr), col=7,
+   xlab="Studentized Residuals", main="")
[Figure: histogram of studentized residuals; x-axis "Studentized Residuals", y-axis "Frequency"]
Our log-log transform fixes this problem.
22 Inference for coefficients
As before in SLR, the LS linear coefficients are random (different for each sample) and correlated with each other.
The LS estimators are unbiased: $E[b_j] = \beta_j$ for $j = 0, \ldots, d$.
In particular, the sampling distribution for b is a multivariate normal, with mean $\beta = [\beta_0 \ \cdots \ \beta_d]'$ and covariance matrix $S_b$:
\[ b \sim N_p(\beta, S_b) \]
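23 Coefficient covariance matrix
var(b): the $p \times p$ covariance matrix for the random vector b is
\[ S_b = \begin{bmatrix} \mathrm{var}(b_0) & \mathrm{cov}(b_0, b_1) & & & \\ \mathrm{cov}(b_1, b_0) & \mathrm{var}(b_1) & & & \\ & & \ddots & & \\ & & & \mathrm{var}(b_{d-1}) & \mathrm{cov}(b_{d-1}, b_d) \\ & & & \mathrm{cov}(b_d, b_{d-1}) & \mathrm{var}(b_d) \end{bmatrix} \]
Standard errors are the square root of the diagonal of $S_b$.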
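24 Calculating the covariance is easy: $S_b = s^2 (\hat{X}'\hat{X})^{-1}$
> X <- cbind(1, P1, P2)
> cov.b <- summary(salesmlr)$sigma^2*solve(t(X)%*%X)
> print(cov.b)
[Output: 3 x 3 covariance matrix with rows and columns for the intercept, P1, and P2; numeric values lost in transcription]
> se.b <- sqrt(diag(cov.b))
> se.b
[Output: standard errors for the intercept, P1, and P2; numeric values lost in transcription]
Variance decreases with n and var(X); increases with $s^2$.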
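Since the numeric output of the slide did not survive, here is a self-contained sketch of the same calculation on simulated data. The variables reuse the names `Sales`, `P1`, and `P2` only for readability; the values and coefficients are made up and are not the course's sales data. The hand-computed matrix is compared against R's built-in `vcov`.

```r
set.seed(4)
n <- 100
P1 <- rnorm(n)                                   # simulated stand-in for log price
P2 <- rnorm(n)                                   # simulated stand-in for competitor log price
Sales <- 1 - P1 + P2 + rnorm(n, sd = 0.1)        # assumed coefficients, illustration only

fit <- lm(Sales ~ P1 + P2)

# S_b = s^2 (X'X)^{-1}
X <- cbind(1, P1, P2)
cov_b <- summary(fit)$sigma^2 * solve(t(X) %*% X)

# Standard errors are the square roots of the diagonal.
se_b <- sqrt(diag(cov_b))

print(cov_b)
print(se_b)
print(all.equal(cov_b, vcov(fit), check.attributes = FALSE))
```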
25 Standard errors
Conveniently, R's summary gives you all the standard errors.
> summary(salesmlr) ## abbreviated output
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                               <2e-16 ***
P1                                        <2e-16 ***
P2                                        <2e-16 ***
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 97 degrees of freedom
Multiple R-squared: 0.998, Adjusted R-squared:
F-statistic: 2.392e+04 on 2 and 97 DF, p-value: < 2.2e-16
26 Inference for individual coefficients
Intervals and t-statistics are exactly the same as in SLR.
  A $(1 - \alpha)100\%$ C.I. for $\beta_j$ is $b_j \pm t_{\alpha/2, n-p} \, s_{b_j}$.
  $z_{b_j} = (b_j - \beta_j^0)/s_{b_j} \sim t_{n-p}(0, 1)$ is the number of standard errors between the LS estimate and the null value $\beta_j^0$.
Intervals/testing via $b_j$ and $s_{b_j}$ are one-at-a-time procedures: you are evaluating the jth coefficient conditional on the other X's being in the model, but regardless of the values you've estimated for the other b's.
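The interval and test statistic above can be reproduced by hand. The sketch below uses simulated data (names and coefficients are illustrative assumptions), takes the null value to be 0 for every coefficient, and compares the results with `confint` and the coefficient table from `summary`.

```r
set.seed(6)
n <- 60
X1 <- rnorm(n)
X2 <- rnorm(n)
Y <- 1 + 0.8 * X1 + 0 * X2 + rnorm(n)   # X2 has no true effect in this simulation

fit <- lm(Y ~ X1 + X2)
b <- coef(fit)
se <- sqrt(diag(vcov(fit)))             # s_{b_j}
df <- df.residual(fit)                  # n - p

# (1 - alpha) 100% confidence intervals: b_j +/- t_{alpha/2, n-p} * s_{b_j}
alpha <- 0.05
tcrit <- qt(1 - alpha / 2, df)
ci_manual <- cbind(lower = b - tcrit * se, upper = b + tcrit * se)

# Test statistic against the null value 0, with two-sided p-values.
z <- (b - 0) / se
pval <- 2 * pt(-abs(z), df)

print(ci_manual)
print(confint(fit, level = 1 - alpha))
print(cbind(t = z, p = pval))
```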
27 $R^2$ for multiple regression
Recall that we already view $R^2$ as a measure of fit:
\[ R^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^n (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^n (Y_i - \bar{Y})^2}. \]
And the correlation interpretation is very similar:
\[ R^2 = \mathrm{corr}^2(\hat{Y}, Y) = r^2_{\hat{y}y}. \]
($r^2_{\hat{y}y} = r^2_{xy}$ in SLR since $|\mathrm{cor}(X, \hat{Y})| = 1$.)
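Both readings of R^2 above can be checked against what `summary` reports. This is a minimal sketch on simulated data; the names and coefficients are assumptions for illustration.

```r
set.seed(7)
n <- 70
X1 <- rnorm(n)
X2 <- rnorm(n)
Y <- 3 + X1 - 2 * X2 + rnorm(n)

fit <- lm(Y ~ X1 + X2)
Yhat <- fitted(fit)

SSR <- sum((Yhat - mean(Y))^2)   # regression sum of squares
SST <- sum((Y - mean(Y))^2)      # total sum of squares

r2 <- c(ratio = SSR / SST,
        corr2 = cor(Yhat, Y)^2,
        summary = summary(fit)$r.squared)
print(r2)
```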
28 Consider our marketing model:
> summary(salesmlr)$r.square
[1]
P1 and P2 explain most of the variability in log volume.
Consider the pickup regressions (1: make, 2: make + miles).
> summary(trucklm1)$r.square
[1]
> summary(trucklm2)$r.square
[1]
Make is pretty useless, but miles gets us up to $R^2 = 36\%$.
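29 Forecasting in MLR
Prediction follows exactly the same methodology as in SLR. For new data $x_f = [X_{1,f} \ \cdots \ X_{d,f}]$,
\[ E[Y_f \mid x_f] = \hat{Y}_f = b_0 + b_1 X_{1,f} + \cdots + b_d X_{d,f} \]
\[ \mathrm{var}[Y_f \mid x_f] = \mathrm{var}(\hat{Y}_f) + \mathrm{var}(e_f) = s_{fit}^2 + s^2 = s_{pred}^2. \]
With $\hat{X}$ our design matrix (a column of ones followed by the covariate columns) and $\hat{x}_f = [1, X_{1,f} \ \cdots \ X_{d,f}]'$,
\[ s_{fit}^2 = s^2 \, \hat{x}_f' (\hat{X}'\hat{X})^{-1} \hat{x}_f. \]
A $(1 - \alpha)$ level prediction interval is still $\hat{Y}_f \pm t_{\alpha/2, n-p} \, s_{pred}$.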
30 The syntax in R is also exactly the same as before:
> predict(salesmlr, data.frame(P1=1, P2=1),
+   interval="prediction", level=0.95)
  fit lwr upr
And we can get $s_{fit}$ using our equation, or from R.
> xf <- matrix(c(1,1,1), ncol=1)
> X <- cbind(1, P1, P2)
> s <- summary(salesmlr)$sigma
> sqrt(drop(t(xf)%*%solve(t(X)%*%X)%*%xf))*s
[1]
> predict(salesmlr, data.frame(P1=1, P2=1),
+   se.fit=TRUE)$se.fit
[1]
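Because the numeric output above was lost, the following self-contained sketch reproduces the whole forecasting calculation on simulated data. The names `Sales`, `P1`, and `P2` mirror the slides, but the values and coefficients are made up for illustration. It builds the 95% prediction interval from the formulas for s_fit and s_pred and compares it with `predict`.

```r
set.seed(8)
n <- 100
P1 <- rnorm(n)
P2 <- rnorm(n)
Sales <- 1 - P1 + P2 + rnorm(n, sd = 0.2)   # assumed coefficients, illustration only

fit <- lm(Sales ~ P1 + P2)
X <- cbind(1, P1, P2)
s <- summary(fit)$sigma

# New point P1 = 1, P2 = 1, with a leading 1 for the intercept.
xf <- c(1, 1, 1)

yhat_f <- sum(xf * coef(fit))
s_fit <- s * sqrt(drop(t(xf) %*% solve(t(X) %*% X) %*% xf))
s_pred <- sqrt(s_fit^2 + s^2)

tcrit <- qt(0.975, df.residual(fit))
manual <- c(fit = yhat_f,
            lwr = yhat_f - tcrit * s_pred,
            upr = yhat_f + tcrit * s_pred)

newdata <- data.frame(P1 = 1, P2 = 1)
pr <- predict(fit, newdata, interval = "prediction", level = 0.95)
se_r <- predict(fit, newdata, se.fit = TRUE)$se.fit

print(rbind(manual = manual, predict = pr[1, ]))
print(c(s_fit_manual = s_fit, s_fit_predict = unname(se_r)))
```

Note that `se.fit` corresponds to s_fit only; the extra s^2 term enters when `interval="prediction"` is requested.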
31 Glossary and equations
MLR updates to the LS equations:
  $b = (\hat{X}'\hat{X})^{-1} \hat{X}'Y$
  $\mathrm{var}(b) = S_b = s^2 (\hat{X}'\hat{X})^{-1}$
  $s_{fit}^2 = s^2 \, \hat{x}_f' (\hat{X}'\hat{X})^{-1} \hat{x}_f$
  $R^2 = SSR/SST = \mathrm{cor}^2(\hat{Y}, Y) = r^2_{\hat{y}y}$
CTL.SC1x Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental
More informationVI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationTopic 3. Chapter 5: Linear Regression in Matrix Form
Topic Overview Statistics 512: Applied Linear Models Topic 3 This topic will cover thinking in terms of matrices regression on multiple predictor variables case study: CS majors Text Example (NKNW 241)
More informationIntroduction to Generalized Linear Models
to Generalized Linear Models Heather Turner ESRC National Centre for Research Methods, UK and Department of Statistics University of Warwick, UK WU, 2008 04 2224 Copyright c Heather Turner, 2008 to Generalized
More informationStat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015
Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field
More informationLecture 5 Hypothesis Testing in Multiple Linear Regression
Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall
More informationInternational Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA Email: peverso1@swarthmore.edu 1. Introduction
More informationMultiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
More informationHypothesis Testing or How to Decide to Decide Edpsy 580
Hypothesis Testing or How to Decide to Decide Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at UrbanaChampaign Hypothesis Testing or How to Decide to Decide
More informationBivariate Regression Analysis. The beginning of many types of regression
Bivariate Regression Analysis The beginning of many types of regression TOPICS Beyond Correlation Forecasting Two points to estimate the slope Meeting the BLUE criterion The OLS method Purpose of Regression
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #47/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More information4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4
4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Nonlinear functional forms Regression
More informationEPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM
EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable
More informationn + n log(2π) + n log(rss/n)
There is a discrepancy in R output from the functions step, AIC, and BIC over how to compute the AIC. The discrepancy is not very important, because it involves a difference of a constant factor that cancels
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationElements of statistics (MATH04871)
Elements of statistics (MATH04871) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis 
More information