4 Hypothesis Tests

After estimating the parameters of the model, we should check whether the model is supported by the data. The first question to answer is whether any of the predictors has an effect on the mean response. If the evidence shows this to be true, we should then ask which of the predictors has an effect.

4.1 Model Utility Test

To test whether any of the predictors has an effect, test the null hypothesis that all slopes are 0:

    H_0: β_1 = β_2 = ... = β_k = 0   versus   H_a: at least one slope is not 0

The test is based on an ANOVA approach, where we analyse the total sum of squares in Y and find how much of the variation is due to the variables in the model and how much of the variation remains unexplained.

The Total Sum of Squares: let J_n be the n x n matrix with all entries equal to 1. Then

    SS_T = Y'Y - (1/n) Y'J_n Y = Σ_{i=1}^n (Y_i - Ȳ)²

The Residual Sum of Squares (see above):

    SS_Res = e'e = Y'(I_n - X(X'X)^{-1}X')Y = Y'Y - β̂'X'Y = Σ_{i=1}^n (Y_i - Ŷ_i)²

The Sum of Squares for Regression:

    SS_R = SS_T - SS_Res = β̂'X'Y - (1/n) Y'J_n Y = Σ_{i=1}^n (Ŷ_i - Ȳ)²

The F-score relating the Sum of Squares for Regression to the Sum of Squares for Residuals is used for the test:

    F = (SS_R / k) / (SS_Res / (n - k - 1)) = MS_R / MS_Res

Under the assumptions of the MLRM, and if β_1 = β_2 = ... = β_k = 0, this F-score is F-distributed with df_n = k and df_d = n - k - 1. The proof of this result uses the fact that SS_R/σ² is also χ²-distributed with df = k and that SS_R and SS_Res are independent. The information for this test is usually summarized in an ANOVA table.

    Source      SS       df         MS       F
    Regression  SS_R     k          MS_R     MS_R/MS_Res
    Residual    SS_Res   n - k - 1  MS_Res
    Total       SS_T     n - 1
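The sum-of-squares decomposition and the F-score above can be computed directly from the matrix formulas. A minimal sketch in Python with numpy (the notes' worked example uses R; the data here are synthetic, not the blood pressure data):

```python
import numpy as np

# Synthetic data: a column of ones for the intercept plus k predictors.
rng = np.random.default_rng(0)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 2.0, 0.0])
Y = X @ beta + rng.normal(size=n)

# Least squares fit: beta_hat = (X'X)^{-1} X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat

SS_T = np.sum((Y - Y.mean()) ** 2)      # total sum of squares
SS_Res = np.sum((Y - Y_hat) ** 2)       # residual sum of squares
SS_R = SS_T - SS_Res                    # regression sum of squares

MS_R = SS_R / k
MS_Res = SS_Res / (n - k - 1)
F = MS_R / MS_Res

# With an intercept in the model, SS_R also equals sum((Y_hat - Ybar)^2).
assert np.isclose(SS_R, np.sum((Y_hat - Y.mean()) ** 2))
```

Under H_0 the value of `F` is compared against the F(k, n - k - 1) distribution.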
ANOVA F-test

1. Hyp.: Choose α. H_0: β_1 = β_2 = ... = β_k = 0 versus H_a: at least one slope is not 0.
2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: F_0 = (SS_R / k) / (SS_Res / (n - k - 1)) = MS_R / MS_Res, df_n = k = p - 1, df_d = n - k - 1 = n - p
4. P-value: P(F > F_0)

Continue Example ??: Continue the example on the effect of age and weight on blood pressure and the analysis of the model

    bp = β_0 + β_1(weight) + β_2(age) + ε,  ε ~ N(0, σ²)

First find the ANOVA table. We already calculated SS_Res = .8. The total sum of squares is

    SS_T = y'y - (1/n) y'J y

(use that (1/n) y'J y = (Σ_{i=1}^n y_i)² / n). Therefore SS_R = SS_T - SS_Res.

ANOVA table:

    Source      SS     df     MS    F
    Regression
    Residual    .8            .9
    Total       4.8

ANOVA F-test:

1. Hyp.: Choose α. H_0: β_1 = β_2 = 0 versus H_a: at least one slope is not 0.
2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: F_0 = MS_R / MS_Res, df_n = 2, df_d = n - 3
4. P-value: P(F > F_0); use the F-distribution table from my website. Since F_0 is greater than the tabulated critical value of the F distribution with df_n = 2 and df_d = n - 3, the P-value is smaller than that tail probability, and therefore smaller than α.
5. Decision: The P-value is smaller than α, reject H_0.
6. Context: At the chosen significance level the data provide sufficient evidence that at least one of the slopes is different from zero, i.e. that at least one of age or weight has an effect on the mean blood pressure. From here arises the question which of the slopes is not zero.

The model utility test does not indicate that the model is a good description of the population, only that, if it is a good fit, then at least one of the regressors has an effect on the mean response. A residual analysis will help to judge if the model is appropriate for the population under consideration.

Adjusted coefficient of determination

The adjusted coefficient of determination, R²_adj, indicates how well the data are represented by the estimated model. In comparison to the coefficient of determination, R², it is adjusted for the number of variables in the model. When adding an additional variable to an existing model, the coefficient of determination will always increase, regardless of the relevance of the variable for the response. The adjusted coefficient of determination includes an adjustment for the number of variables in the model to correct for this effect, but otherwise is interpreted in the same way as R²:

    R²_adj = 1 - MS_Res / MS_T

Continue Example ??: The adjusted coefficient of determination for the blood pressure example is

    R²_adj = 1 - .9 / (4.8 / (n - 1)) = 0.9,

and this fraction of the sample variation in blood pressure can be explained by the changes in weight and age. The adjusted coefficient of determination is often used in the model building process.
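The adjustment is easy to see numerically. A hedged sketch in Python with numpy on made-up data (not the example's values), writing R²_adj out via the mean squares as in the definition above:

```python
import numpy as np

# Synthetic data; the last predictor has a true slope of 0.
rng = np.random.default_rng(1)
n, k = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = X @ np.array([2.0, 1.0, 0.5, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

SS_T = np.sum((Y - Y.mean()) ** 2)
SS_Res = np.sum(resid ** 2)

R2 = 1 - SS_Res / SS_T                               # ordinary R^2
R2_adj = 1 - (SS_Res / (n - k - 1)) / (SS_T / (n - 1))  # 1 - MS_Res/MS_T

# The df adjustment penalizes extra predictors, so R2_adj <= R2.
assert R2_adj <= R2
```

Because MS_Res can increase when an irrelevant variable is added, R²_adj (unlike R²) can decrease, which is what makes it useful for model building.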
4.2 Test for Individual Slopes

In order to find out if a variable x_i has a significant linear effect on the mean response when correcting for the other variables in the model, a test for H_0: β_i = 0, 1 ≤ i ≤ k, should be conducted. The test can be based on the t-score standardizing β̂_i:

Theorem 4.1. In the MLRM,

    t = (β̂_i - β_i) / se(β̂_i) = (β̂_i - β_i) / (σ̂ √C_ii),

where C_ii is the i-th diagonal entry of (X'X)^{-1}, is t-distributed with df = n - p.
Conclusion: Let i ∈ {1, 2, ..., k}. A (1 - α)100% confidence interval for the slope β_i is given by

    β̂_i ± t^{n-p}_{α/2} σ̂ √C_ii

Test for individual slopes, correcting for the other variables in the model:

1. Hypotheses: Choose α.

    Type        Hypotheses
    upper tail  H_0: β_i = β_i0 versus H_a: β_i > β_i0
    lower tail  H_0: β_i = β_i0 versus H_a: β_i < β_i0
    two tail    H_0: β_i = β_i0 versus H_a: β_i ≠ β_i0

2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: t_0 = (β̂_i - β_i0) / (σ̂ √C_ii), df = n - p
4. P-value:

    Type        P-value
    upper tail  P(t > t_0)
    lower tail  P(t < t_0)
    two tail    2 P(t > |t_0|)

Continue Example ??: Test if the mean blood pressure increases with increasing weight (x_1) within the proposed model

    bp = β_0 + β_1(weight) + β_2(age) + ε,  ε ~ N(0, σ²)

1. Hypotheses: H_0: β_1 = 0 versus H_a: β_1 > 0, choose α.
2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: t_0 = β̂_1 / (σ̂ √C_11) = .8, df = n - 3
4. P-value: P(t > t_0); using the t-table with df = n - 3, find that the P-value is smaller than α.
5. Decision: Reject H_0.
6. Context: At the chosen significance level the data provide sufficient evidence that on average the blood pressure increases with weight when correcting for age.
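The standard errors and t-scores come from the diagonal of C = (X'X)^{-1}, as in the theorem above. A sketch in Python with numpy on synthetic data (the notes' example uses the blood pressure data; everything here is illustrative):

```python
import numpy as np

# Synthetic data: intercept plus two predictors, true slopes 3.0 and 0.0.
rng = np.random.default_rng(2)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = X @ np.array([1.0, 3.0, 0.0]) + rng.normal(size=n)
p = k + 1

C = np.linalg.inv(X.T @ X)              # C = (X'X)^{-1}
beta_hat = C @ X.T @ Y
sigma_hat = np.sqrt(np.sum((Y - X @ beta_hat) ** 2) / (n - p))

se = sigma_hat * np.sqrt(np.diag(C))    # se(beta_hat_i) = sigma_hat * sqrt(C_ii)
t0 = beta_hat / se                      # t-scores for H0: beta_i = 0, df = n - p
```

Each entry of `t0` is then compared against the t distribution with n - p degrees of freedom (one- or two-tailed, depending on the alternative).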
4.3 Test for a Subset of Slopes

In some applications it is of interest to test whether, in the presence of some (usually confounding) variables, at least one in a set of slopes for different variables is not zero. Such a question would arise, for example, when it is of interest to find the effect of the amount of humour, violence, and drama on the popularity of a movie, after correcting for the effects of age and sex (the confounding variables).

For the development of the test statistic the extra sum of squares principle is applied. It is based on the comparison of the residual sum of squares for two models:

1. The full model: the model which includes all variables.
2. The reduced model: the model including only the variables to be corrected for.

Essentially it is tested whether the inclusion of the extra variables results in a significant decrease in the residual sum of squares.

1. The full model with k predictors, and therefore p = k + 1 parameters (including the intercept):

    Y = Xβ + ε

where Y and ε are n-dimensional random vectors, the design matrix X ∈ R^{n×p}, and the parameter vector β ∈ R^p. To test for the effect of r < k predictors, partition the design matrix and the parameter vector such that

    Y = [X_1 X_2] (β_1; β_2) + ε = X_1β_1 + X_2β_2 + ε

with partitioned design matrix, where X_1 ∈ R^{n×(p-r)} and X_2 ∈ R^{n×r}, and partitioned parameter vector, where β_1 ∈ R^{p-r} and β_2 ∈ R^r. For the full model with β̂ = (X'X)^{-1}X'Y,

    SS_Res(β̂) = Y'Y - β̂'X'Y,  with df = n - p.

We wish to test H_0: β_2 = 0.

2. The reduced model, where we assume β_2 = 0, with k - r predictors and k - r + 1 = p - r parameters:

    Y = X_1β_1 + ε

with β̂_1 = (X_1'X_1)^{-1}X_1'Y and

    SS_Res(β̂_1) = Y'Y - β̂_1'X_1'Y,  with df = n - (p - r) = n - p + r.
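The two residual sums of squares above can be computed directly from the two design matrices. A sketch in Python with numpy (the notes' example uses R and the blood pressure data; the data and names here are synthetic):

```python
import numpy as np

# Synthetic data: x1 is kept in both models, x2 is the predictor under test.
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
Y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)

X_red = np.column_stack([np.ones(n), x1])        # reduced model: X1 only
X_full = np.column_stack([np.ones(n), x1, x2])   # full model: [X1 X2]

def ss_res(X, Y):
    """Residual sum of squares Y'Y - beta_hat' X' Y for a least squares fit."""
    beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
    return float(Y @ Y - beta_hat @ (X.T @ Y))

# Adding predictors can only decrease the residual sum of squares,
# so the extra sum of squares SS_Res(b1) - SS_Res(b) is nonnegative.
extra = ss_res(X_red, Y) - ss_res(X_full, Y)
assert extra >= 0
```

Dividing `extra` by r and by MS_Res of the full model yields the F statistic developed next.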
The difference in the Sum of Squares for Residuals when adding the variables x_{k-r+1}, ..., x_k to the model with variables x_1, ..., x_{k-r} is

    SS_Res(β_2 | β_1) = SS_Res(β̂_1) - SS_Res(β̂)

with df = (n - p + r) - (n - p) = r. This is called the extra sum of squares due to β_2 (the extra sum of squares for regression, since the decrease in the residual sum of squares implies an increase by the same amount in the regression sum of squares). One can prove that SS_Res(β_2 | β_1) is independent of MS_Res, and therefore

    F = (SS_Res(β_2 | β_1) / r) / MS_Res

is F-distributed with df_n = r and df_d = n - p.

ANOVA Extra Sum of Squares Test:

1. Hypotheses: Choose α. H_0: β_{k-r+1} = ... = β_k = 0 (i.e. β_2 = 0) versus H_a: at least one of these slopes is not 0.
2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: F_0 = (SS_Res(β_2 | β_1) / r) / MS_Res, df_n = r, df_d = n - p
4. P-value: P(F > F_0)

Continue Example ??: The model is

    bp = β_0 + β_1(weight) + β_2(age) + ε,  ε ~ N(0, σ²)

Use the extra sum of squares test to see if age has a significant effect on the mean blood pressure when correcting for weight.

The full model:     bp = β_0 + β_1(weight) + β_2(age) + ε,  ε ~ N(0, σ²)
The reduced model:  bp = β_0 + β_1(weight) + ε,  ε ~ N(0, σ²)

1. Hypotheses: Choose α. H_0: β_2 = 0 versus H_a: β_2 ≠ 0
2. Assumptions: Random samples, the MLRM is a proper description of the population, including homoscedasticity and normality of the error.
3. Test statistic: r = 1. From previous work SS_Res(β̂) = .8 and MS_Res = .9. For finding SS_Res(β̂_1), compute X_1'X_1, (X_1'X_1)^{-1}, and X_1'y, which give

    SS_Res(β̂_1) = y'y - (X_1'y)'(X_1'X_1)^{-1}(X_1'y) = 8.

    F_0 = (SS_Res(β̂_1) - SS_Res(β̂)) / MS_Res = 4.4,  df_n = 1, df_d = n - p

4. P-value: P(F > F_0); according to the F-distribution table it is bracketed below α.
5. Decision: Reject H_0.
6. Context: At the chosen significance level the data provide sufficient evidence that age has a significant effect on the mean blood pressure when correcting for weight.

In R:

    reduced <- lm(bp ~ weight)
    full <- lm(bp ~ weight + age)
    anova(reduced, full)

4.4 Orthogonal Columns in the Design Matrix

Definition 4.2. Matrices A ∈ R^{n×m} and B ∈ R^{n×k} are orthogonal iff A'B = 0_{m×k}.

Theorem 4.3. If the design matrix can be partitioned so that X = [X_1 X_2] with X_1'X_2 = 0, then the least squares estimator for β = (β_1', β_2')' can be partitioned into β̂ = (β̂_1', β̂_2')', so that

    β̂_1 = (X_1'X_1)^{-1}X_1'Y  and  β̂_2 = (X_2'X_2)^{-1}X_2'Y
Proof: It is β̂ = (X'X)^{-1}X'Y. First find (X'X)^{-1}:

    X'X = [X_1 X_2]'[X_1 X_2]
        = [ X_1'X_1   X_1'X_2 ]   [ X_1'X_1      0_{(p-r)×r} ]
          [ X_2'X_1   X_2'X_2 ] = [ 0_{r×(p-r)}  X_2'X_2     ]

Then

    A = [ (X_1'X_1)^{-1}   0_{(p-r)×r}    ]
        [ 0_{r×(p-r)}      (X_2'X_2)^{-1} ]

is the inverse of X'X, because

    (X'X)A = [ X_1'X_1      0_{(p-r)×r} ] [ (X_1'X_1)^{-1}   0_{(p-r)×r}    ]
             [ 0_{r×(p-r)}  X_2'X_2     ] [ 0_{r×(p-r)}      (X_2'X_2)^{-1} ]

           = [ (X_1'X_1)(X_1'X_1)^{-1}   0_{(p-r)×r}             ]
             [ 0_{r×(p-r)}               (X_2'X_2)(X_2'X_2)^{-1} ]

           = [ I_{p-r}      0_{(p-r)×r} ]
             [ 0_{r×(p-r)}  I_r         ] = I_p

With this result now find β̂:

    β̂ = (X'X)^{-1}X'Y = [ (X_1'X_1)^{-1}   0_{(p-r)×r}    ] [ X_1'Y ]
                         [ 0_{r×(p-r)}      (X_2'X_2)^{-1} ] [ X_2'Y ]

       = [ (X_1'X_1)^{-1}X_1'Y ]   [ β̂_1 ]
         [ (X_2'X_2)^{-1}X_2'Y ] = [ β̂_2 ]

q.e.d.

Conclusion: Because of the theorem we find, in the case of orthogonal columns in the design matrix, that the parameter vector can be partitioned so that the parts are independent of each other and can be found independently. The extra sum of squares test for H_0: β_2 = 0 is, in the case of orthogonal columns, the same as the model utility test in the partitioned model Y = X_2β_2 + ε. For the full model it is now easy to prove that

    SS_R(β̂) = SS_R(β̂_1) + SS_R(β̂_2)
That is, the sum of squares for regression for the complete parameter vector equals the sum of the sums of squares for regression for the parts of the partitioned parameter vector; therefore, according to the extra sum of squares principle, the two parts are independent, and tests for one part of the partitioned vector are independent of tests for the other.
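The partition theorem is easy to check numerically. A minimal sketch in Python with numpy, using a constant column and a centered predictor so that the two blocks are orthogonal by construction (synthetic data, names illustrative):

```python
import numpy as np

n = 8
x = np.arange(n, dtype=float)
X1 = np.ones((n, 1))                   # block 1: the intercept column
X2 = (x - x.mean()).reshape(-1, 1)     # block 2: centered predictor
assert np.allclose(X1.T @ X2, 0)       # the blocks are orthogonal

X = np.hstack([X1, X2])
rng = np.random.default_rng(4)
Y = 3.0 + 0.5 * X2[:, 0] + rng.normal(size=n)

# Joint least squares fit versus fitting each block separately.
beta_joint = np.linalg.solve(X.T @ X, X.T @ Y)
b1 = np.linalg.solve(X1.T @ X1, X1.T @ Y)   # (X1'X1)^{-1} X1'Y
b2 = np.linalg.solve(X2.T @ X2, X2.T @ Y)   # (X2'X2)^{-1} X2'Y

# With orthogonal blocks the separate fits reproduce the joint fit.
assert np.allclose(beta_joint, np.concatenate([b1, b2]))
```

Centering a predictor against the intercept column is the standard way to obtain this orthogonality in practice; with correlated (non-orthogonal) blocks the separate fits would not agree with the joint fit.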
More information10. Analysis of Longitudinal Studies Repeat-measures analysis
Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.
More information1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
More information1 Simple Linear Regression I Least Squares Estimation
Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and
More informationPart II. Multiple Linear Regression
Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a y-variable relates to two or more xvariables (or transformations
More informationChapter 23. Inferences for Regression
Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily
More informationConcepts of Experimental Design
Design Institute for Six Sigma A SAS White Paper Table of Contents Introduction...1 Basic Concepts... 1 Designing an Experiment... 2 Write Down Research Problem and Questions... 2 Define Population...
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More informationLecture 11: Confidence intervals and model comparison for linear regression; analysis of variance
Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance 14 November 2007 1 Confidence intervals and hypothesis testing for linear regression Just as there was
More informationSimple Linear Regression, Scatterplots, and Bivariate Correlation
1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.
More informationThe Method of Least Squares
Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this
More informationParametric and non-parametric statistical methods for the life sciences - Session I
Why nonparametric methods What test to use? Rank Tests Parametric and non-parametric statistical methods for the life sciences - Session I Liesbeth Bruckers Geert Molenberghs Interuniversity Institute
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationSIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.
SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation
More informationElementary Statistics Sample Exam #3
Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to
More informationUSING THE ETS MAJOR FIELD TEST IN BUSINESS TO COMPARE ONLINE AND CLASSROOM STUDENT LEARNING
USING THE ETS MAJOR FIELD TEST IN BUSINESS TO COMPARE ONLINE AND CLASSROOM STUDENT LEARNING Andrew Tiger, Southeastern Oklahoma State University, atiger@se.edu Jimmy Speers, Southeastern Oklahoma State
More informationE3: PROBABILITY AND STATISTICS lecture notes
E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................
More information2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationIntroduction to Principal Components and FactorAnalysis
Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a
More information