Lecture 5 Hypothesis Testing in Multiple Linear Regression
- Dustin Adams
BIOST 515, January 20, 2004

Types of tests

- Overall test
- Test for addition of a single variable
- Test for addition of a group of variables

Overall test

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{ip}\beta_p + \epsilon_i$$

Does the entire set of independent variables contribute significantly to the prediction of $y$?

Test for addition of a single variable

Does the addition of one particular variable of interest add significantly to the prediction of $y$ achieved by the other independent variables already in the model?

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{ip}\beta_p + \epsilon_i$$

Test for addition of a group of variables

Does the addition of some group of independent variables of interest add significantly to the prediction of $y$ obtained through other independent variables already in the model?

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{i,p-1}\beta_{p-1} + x_{ip}\beta_p + \epsilon_i$$
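
The ANOVA table

Source of variation   Sums of squares                        Degrees of freedom   Mean square          E[Mean square]
Regression            SSR = $\hat\beta' X' y - n\bar y^2$    $p$                  SSR$/p$              $\sigma^2 + \beta_R' X_C' X_C \beta_R / p$
Error                 SSE = $y'y - \hat\beta' X' y$          $n - (p+1)$          SSE$/(n - (p+1))$    $\sigma^2$
Total                 SSTO = $y'y - n\bar y^2$               $n - 1$

$X_C$ is the matrix of centered predictors:

$$X_C = \begin{pmatrix}
x_{11} - \bar x_1 & x_{12} - \bar x_2 & \cdots & x_{1p} - \bar x_p \\
x_{21} - \bar x_1 & x_{22} - \bar x_2 & \cdots & x_{2p} - \bar x_p \\
\vdots & \vdots & & \vdots \\
x_{n1} - \bar x_1 & x_{n2} - \bar x_2 & \cdots & x_{np} - \bar x_p
\end{pmatrix}$$

and $\beta_R = (\beta_1, \ldots, \beta_p)'$.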

The ANOVA table for

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p + \epsilon_i$$

is often provided in the output from statistical software as

Source of variation                              Sums of squares   Degrees of freedom   F
Regression
  $x_1$                                                            1
  $x_2 \mid x_1$                                                   1
  ...
  $x_p \mid x_{p-1}, x_{p-2}, \ldots, x_1$                         1
Error                                            SSE               $n - (p+1)$
Total                                            SSTO              $n - 1$

where

$$\mathrm{SSR} = \mathrm{SSR}(x_1) + \mathrm{SSR}(x_2 \mid x_1) + \cdots + \mathrm{SSR}(x_p \mid x_{p-1}, x_{p-2}, \ldots, x_1)$$

and has $p$ degrees of freedom.

Overall test

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$$
$$H_1: \beta_j \neq 0 \text{ for at least one } j,\ j = 1, \ldots, p$$

Rejection of $H_0$ implies that at least one of the regressors, $x_1, x_2, \ldots, x_p$, contributes significantly to the model. We will use a generalization of the F-test in simple linear regression to test this hypothesis.
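
Under the null hypothesis, $\mathrm{SSR}/\sigma^2 \sim \chi^2_p$ and $\mathrm{SSE}/\sigma^2 \sim \chi^2_{n-(p+1)}$ are independent. Therefore, we have

$$F_0 = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n-p-1)} = \frac{\mathrm{MSR}}{\mathrm{MSE}} \sim F_{p,\,n-p-1}$$

Note: as in simple linear regression, we are assuming that $\epsilon_i \sim N(0, \sigma^2)$ or relying on large sample theory.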
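
CHS example, cont.

$$y_i = \beta_0 + \text{weight}_i \beta_1 + \text{height}_i \beta_2 + \epsilon_i$$

> anova(lmwtht)
Analysis of Variance Table

Response: DIABP
           Df Sum Sq Mean Sq F value Pr(>F)
WEIGHT                                      **
HEIGHT
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$$F_0 = \frac{\left(\mathrm{SSR}(\text{weight}) + \mathrm{SSR}(\text{height} \mid \text{weight})\right)/2}{\mathrm{SSE}/495} = 5.59 > F_{2,495,.95}$$

We reject the null hypothesis at $\alpha = .05$ and conclude that at least one of $\beta_1$ or $\beta_2$ is not equal to 0.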

The overall F statistic is also available from the output of summary().

> summary(lmwtht)
Call:
lm(formula = DIABP ~ WEIGHT + HEIGHT, data = chs)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                              e-10 ***
WEIGHT                                        *
HEIGHT
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 495 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic: on 2 and 495 DF, p-value:
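
The overall F test above can be sketched in R as follows. The CHS data are not reproduced here, so this example simulates a stand-in data set; the data frame `sim`, its coefficients and its noise level are assumptions made only for illustration, with n chosen so that the error degrees of freedom match the 495 in the output above.

```r
# Simulated stand-in for the CHS data (hypothetical values, not the real study).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10)
)
sim$DIABP <- 60 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT + rnorm(n, sd = 11)

fit <- lm(DIABP ~ WEIGHT + HEIGHT, data = sim)
p <- 2

# Sums of squares taken from the sequential ANOVA table
tab <- anova(fit)
SSR <- sum(tab[c("WEIGHT", "HEIGHT"), "Sum Sq"])
SSE <- tab["Residuals", "Sum Sq"]

# Overall F statistic, critical value at alpha = .05, and p-value
F0 <- (SSR / p) / (SSE / (n - p - 1))
Fcrit <- qf(0.95, df1 = p, df2 = n - p - 1)
pval <- pf(F0, df1 = p, df2 = n - p - 1, lower.tail = FALSE)

# The same statistic as reported by summary()
F_summary <- unname(summary(fit)$fstatistic["value"])
```

Here `F0` is built by hand from the ANOVA sums of squares, and `F_summary` is the value R prints on the `F-statistic` line, so the two should agree up to rounding.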

Tests on individual regression coefficients

Once we have determined that at least one of the regressors is important, a natural next question might be which one(s)?

Important considerations:

- Is the increase in the regression sums of squares sufficient to warrant an additional predictor in the model?
- Additional predictors will increase the variance of $\hat y$, so include only predictors that explain the response. (Note: we may not know this through hypothesis testing, as confounders may not test significant but would still be necessary in the regression model.)
- Adding an unimportant predictor may increase the residual mean square, thereby reducing the usefulness of the model.

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{ij}\beta_j + \cdots + x_{ip}\beta_p + \epsilon_i$$

$$H_0: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$

As in simple linear regression, under the null hypothesis

$$t_0 = \frac{\hat\beta_j}{\widehat{\mathrm{se}}(\hat\beta_j)} \sim t_{n-p-1}.$$

We reject $H_0$ if $|t_0| > t_{n-p-1,\,1-\alpha/2}$.

This is a partial test because $\hat\beta_j$ depends on all of the other predictors $x_i$, $i \neq j$, that are in the model. Thus, this is a test of the contribution of $x_j$ given the other predictors in the model.
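
CHS example, cont.

$$y_i = \beta_0 + \text{weight}_i \beta_1 + \text{height}_i \beta_2 + \epsilon_i$$

$H_0: \beta_2 = 0$ vs $H_1: \beta_2 \neq 0$, given that weight is in the model.

From the ANOVA table we obtain $\hat\sigma^2$, and with $C = (X'X)^{-1}$,

$$t_0 = \frac{\hat\beta_2}{\sqrt{\hat\sigma^2 C_{22}}}, \qquad |t_0| < t_{495,.975} = 1.96,$$

where $C_{22}$ is the diagonal element of $C$ corresponding to $\beta_2$. Therefore, we fail to reject the null hypothesis.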
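
As a rough illustration of the partial t-test, the sketch below computes $\hat\sigma^2$, $C = (X'X)^{-1}$ and the t statistics by hand and compares them with what `lm()` reports. The data are simulated (the data frame `sim` and its coefficients are hypothetical stand-ins for the CHS data), so the numbers will not match the slides.

```r
# Simulated stand-in for the CHS data (hypothetical values).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10)
)
sim$DIABP <- 60 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT + rnorm(n, sd = 11)

X <- model.matrix(~ WEIGHT + HEIGHT, data = sim)  # includes the intercept column
y <- sim$DIABP

C <- solve(crossprod(X))                 # C = (X'X)^{-1}
beta_hat <- C %*% crossprod(X, y)        # least squares estimates
res <- y - X %*% beta_hat
sigma2_hat <- sum(res^2) / (n - ncol(X)) # MSE on n - p - 1 degrees of freedom

se <- sqrt(sigma2_hat * diag(C))         # se(beta_j) = sqrt(sigma2_hat * C_jj)
t0 <- as.vector(beta_hat) / se
names(t0) <- colnames(X)

tcrit <- qt(0.975, df = n - ncol(X))
reject_height <- abs(t0["HEIGHT"]) > tcrit

# t values reported by lm() for comparison
t_lm <- coef(summary(lm(DIABP ~ WEIGHT + HEIGHT, data = sim)))[, "t value"]
```

The hand-computed `t0` should agree with `t_lm`; `reject_height` records whether the test of $H_0: \beta_2 = 0$ rejects at $\alpha = .05$ for this simulated sample.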

Tests for groups of predictors

Often it is of interest to determine whether a group of predictors contributes to predicting $y$ given that another predictor or group of predictors is in the model.

- In the CHS example, we may want to know if age, height and sex are important predictors given weight is in the model when predicting blood pressure.
- We may want to know if additional powers of some predictor are important in the model given the linear term is already in the model.
- Given a predictor of interest, are interactions with other confounders of interest as well?

Using sums of squares to test for groups of predictors

Determine the contribution of a predictor or group of predictors to SSR given that the other regressors are in the model using the extra-sums-of-squares method.

Consider the regression model with $p$ predictors

$$y = X\beta + \epsilon.$$

We would like to determine if some subset of $r < p$ predictors contributes significantly to the regression model.
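
Partition the vector of regression coefficients as

$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}$$

where $\beta_1$ is $(p + 1 - r) \times 1$ and $\beta_2$ is $r \times 1$. We want to test the hypothesis

$$H_0: \beta_2 = 0 \qquad H_1: \beta_2 \neq 0$$

Rewrite the model as

$$y = X\beta + \epsilon = X_1\beta_1 + X_2\beta_2 + \epsilon, \qquad (1)$$

where $X = [X_1 \; X_2]$.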

Equation (1) is the full model with SSR expressed as

$$\mathrm{SSR}(X) = \hat\beta' X' y \quad (p + 1 \text{ degrees of freedom})$$

and

$$\mathrm{MSE} = \frac{y'y - \hat\beta' X' y}{n - p - 1}.$$

To find the contribution of the predictors in $X_2$, fit the model assuming $H_0$ is true. This reduced model is

$$y = X_1\beta_1 + \epsilon,$$

where

$$\hat\beta_1 = (X_1' X_1)^{-1} X_1' y$$

and

$$\mathrm{SSR}(X_1) = \hat\beta_1' X_1' y \quad (p + 1 - r \text{ degrees of freedom}).$$

The regression sums of squares due to $X_2$ when $X_1$ is already in the model is

$$\mathrm{SSR}(X_2 \mid X_1) = \mathrm{SSR}(X) - \mathrm{SSR}(X_1)$$

with $r$ degrees of freedom. This is also known as the extra sum of squares due to $X_2$.

$\mathrm{SSR}(X_2 \mid X_1)$ is independent of MSE. We can test $H_0: \beta_2 = 0$ with the statistic

$$F_0 = \frac{\mathrm{SSR}(X_2 \mid X_1)/r}{\mathrm{MSE}} \sim F_{r,\,n-p-1}.$$
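
CHS example, cont.

Full model: $y_i = \beta_0 + \text{weight}_i \beta_1 + \text{height}_i \beta_2 + \epsilon_i$

$H_0: \beta_2 = 0$

          Df Sum Sq Mean Sq F value Pr(>F)
WEIGHT
HEIGHT
Residuals

$$F_0 = \frac{\mathrm{SSR}(\text{height} \mid \text{weight})/1}{\mathrm{MSE}} = 0.95 < F_{1,495,0.95} = 3.86$$

This should look very similar to the t-test for $H_0$: when a single predictor is added, $F_0 = t_0^2$.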
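
The link between the extra-sum-of-squares F test and the partial t-test can be checked numerically. The sketch below uses simulated data (the data frame `sim` and its coefficients are hypothetical, not the CHS data) and relies on the fact that SSR(X) - SSR(X_1) equals the drop in residual sum of squares from the reduced to the full model, since $y'y$ is the same in both.

```r
# Simulated stand-in for the CHS data (hypothetical values).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10)
)
sim$DIABP <- 60 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT + rnorm(n, sd = 11)

full    <- lm(DIABP ~ WEIGHT + HEIGHT, data = sim)
reduced <- lm(DIABP ~ WEIGHT, data = sim)

# Extra sum of squares due to HEIGHT given WEIGHT:
# deviance() returns the residual sum of squares of an lm fit.
SSR_extra <- deviance(reduced) - deviance(full)
MSE <- deviance(full) / df.residual(full)

F0 <- (SSR_extra / 1) / MSE
Fcrit <- qf(0.95, df1 = 1, df2 = df.residual(full))

# Partial t statistic for HEIGHT from the full model
t0 <- coef(summary(full))["HEIGHT", "t value"]

# Built-in nested model comparison gives the same F
F_anova <- anova(reduced, full)$F[2]
```

With one added predictor, `F0`, `t0^2` and `F_anova` should coincide, and `Fcrit` should be close to the 3.86 quoted above.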

$$\text{BP}_i = \beta_0 + \text{weight}_i \beta_1 + \text{height}_i \beta_2 + \text{age}_i \beta_3 + \text{gender}_i \beta_4 + \epsilon_i$$

> summary(lm(DIABP~WEIGHT+HEIGHT+AGE+GENDER,data=chs))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                              e-08 ***
WEIGHT
HEIGHT
AGE                                           ***
GENDER
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 493 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic: on 4 and 493 DF, p-value:
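
$H_0: \beta_2 = \beta_3 = \beta_4 = 0$ vs $H_1: \beta_j \neq 0$ for at least one $j$, $j = 2, 3, 4$

          Df Sum Sq Mean Sq F value Pr(>F)
WEIGHT
HEIGHT
AGE
GENDER
Residuals

SSR(intercept, weight, height, age, gender) $= \hat\beta' X' y$

SSR(intercept, weight) $= \hat\beta_1' X_1' y$

SSR(height, age, gender | intercept, weight) = SSR(intercept, weight, height, age, gender) - SSR(intercept, weight) = 1670

Notice we can also get this from the ANOVA table above:

SSR(height, age, gender | intercept, weight) = SSR(height | weight) + SSR(age | weight, height) + SSR(gender | weight, height, age) = 1670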

The observed F statistic is

$$F_0 = \frac{1670/3}{\mathrm{MSE}} = 13.5 > F_{3,493,.95} = 2.62,$$

and we reject the null hypothesis, concluding that at least one of $\beta_2$, $\beta_3$ or $\beta_4$ is not equal to 0.

This should look very similar to the overall F test if we considered the intercept to be a predictor and all the covariates to be the additional variables under consideration.
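
A minimal sketch of this group test in R, under the assumption of simulated data: the data frame `sim`, the helper `ssr_uncorrected()` and all coefficients are hypothetical and chosen only to mirror the structure of the CHS example. It computes the uncorrected regression sums of squares $\hat\beta' X' y$ for the full and reduced models, forms the extra sum of squares, and compares the resulting F statistic with R's nested model comparison.

```r
# Simulated stand-in for the CHS data (hypothetical values).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10),
  AGE    = rnorm(n, mean = 73, sd = 5),
  GENDER = rbinom(n, size = 1, prob = 0.5)
)
sim$DIABP <- 90 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT -
  0.3 * sim$AGE + 0.5 * sim$GENDER + rnorm(n, sd = 11)

full    <- lm(DIABP ~ WEIGHT + HEIGHT + AGE + GENDER, data = sim)
reduced <- lm(DIABP ~ WEIGHT, data = sim)
y <- sim$DIABP

# Hypothetical helper: uncorrected regression sum of squares beta_hat' X' y
ssr_uncorrected <- function(fit) {
  drop(coef(fit) %*% crossprod(model.matrix(fit), y))
}

r <- 3  # number of coefficients set to zero under H0
SSR_extra <- ssr_uncorrected(full) - ssr_uncorrected(reduced)
MSE <- deviance(full) / df.residual(full)

F0 <- (SSR_extra / r) / MSE
Fcrit <- qf(0.95, df1 = r, df2 = df.residual(full))

# Built-in comparison of the nested models
F_anova <- anova(reduced, full)$F[2]
```

`F0` and `F_anova` should agree, and `Fcrit` should be close to the 2.62 used above, since the full model here also has 493 error degrees of freedom.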

What if we had put the predictors in the model in a different order?

$$\text{diabp}_i = \beta_0 + \text{height}_i \beta_2 + \text{age}_i \beta_3 + \text{weight}_i \beta_1 + \text{gender}_i \beta_4 + \epsilon_i$$

          Df Sum Sq Mean Sq F value Pr(>F)
HEIGHT
AGE
WEIGHT
GENDER
Residuals

Could we use this table to test $H_0: \beta_2 = \beta_3 = \beta_4 = 0$?

What if we had the ANOVA table for the reduced model?

          Df Sum Sq Mean Sq F value Pr(>F)
WEIGHT
Residuals

Given that

$$\mathrm{SSR} = \mathrm{SSR}(x_2) + \mathrm{SSR}(x_3 \mid x_2) + \mathrm{SSR}(x_1 \mid x_2, x_3) + \mathrm{SSR}(x_4 \mid x_3, x_2, x_1)$$

and that $\mathrm{SSR}(x_1)$ can be read from the reduced-model table, then

$$\mathrm{SSR}(x_2, x_3, x_4 \mid x_1) = \mathrm{SSR} - \mathrm{SSR}(x_1) = 1680.$$
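
One way to see why the reduced-model table helps is to compare sequential ANOVA tables for two orderings of the same predictors. The sketch below uses simulated data (the data frame `sim` and its coefficients are hypothetical stand-ins for the CHS data): the individual sequential sums of squares change with the order, the total SSR does not, and subtracting SSR(x_1) from the reduced model recovers the group sum of squares.

```r
# Simulated stand-in for the CHS data (hypothetical values).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10),
  AGE    = rnorm(n, mean = 73, sd = 5),
  GENDER = rbinom(n, size = 1, prob = 0.5)
)
sim$DIABP <- 90 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT -
  0.3 * sim$AGE + 0.5 * sim$GENDER + rnorm(n, sd = 11)

preds <- c("WEIGHT", "HEIGHT", "AGE", "GENDER")

# Same model, two orderings of the predictors
tab_a <- anova(lm(DIABP ~ WEIGHT + HEIGHT + AGE + GENDER, data = sim))
tab_b <- anova(lm(DIABP ~ HEIGHT + AGE + WEIGHT + GENDER, data = sim))

# Total regression sum of squares does not depend on the order
SSR_a <- sum(tab_a[preds, "Sum Sq"])
SSR_b <- sum(tab_b[preds, "Sum Sq"])

# SSR(x1) from the ANOVA table of the reduced model (weight only)
tab_red <- anova(lm(DIABP ~ WEIGHT, data = sim))
SSR_x1 <- tab_red["WEIGHT", "Sum Sq"]

# Group sum of squares: SSR(x2, x3, x4 | x1) = SSR - SSR(x1)
SSR_group <- SSR_b - SSR_x1

# With weight entered first, the same quantity is the sum of the last three rows
SSR_group_a <- sum(tab_a[c("HEIGHT", "AGE", "GENDER"), "Sum Sq"])
```

In the second ordering the rows for height and age are not conditional on weight, so the group sum of squares cannot be read off that table alone; combining its total with `SSR_x1` gives the same value as `SSR_group_a`.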

One other question we might be interested in asking is whether there are any significant interactions in the model.

lm(DIABP~WEIGHT*HEIGHT*AGE*GENDER,data=chs)

                         Estimate Std. Error t value Pr(>|t|)
(Intercept)
WEIGHT
HEIGHT
AGE
GENDER
WEIGHT:HEIGHT
WEIGHT:AGE
HEIGHT:AGE
WEIGHT:GENDER
HEIGHT:GENDER
AGE:GENDER
WEIGHT:HEIGHT:AGE
WEIGHT:HEIGHT:GENDER
WEIGHT:AGE:GENDER
HEIGHT:AGE:GENDER
WEIGHT:HEIGHT:AGE:GENDER

ANOVA table

                         Df Sum Sq Mean Sq F value Pr(>F)
WEIGHT
HEIGHT
AGE
GENDER
WEIGHT:HEIGHT
WEIGHT:AGE
HEIGHT:AGE
WEIGHT:GENDER
HEIGHT:GENDER
AGE:GENDER
WEIGHT:HEIGHT:AGE
WEIGHT:HEIGHT:GENDER
WEIGHT:AGE:GENDER
HEIGHT:AGE:GENDER
WEIGHT:HEIGHT:AGE:GENDER
Residuals

We can simplify the ANOVA table to

              Df Sum Sq Mean Sq F value Pr(>F)
Main effects
Interactions
Residuals

How do we fill in the rest of this table?
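
One possible way to fill in the simplified table is to add up the degrees of freedom and sequential sums of squares within each group of rows and then form mean squares and F statistics against the residual mean square. The sketch below does this on simulated data; the data frame `sim`, the object names and the coefficients are assumptions for illustration, not the CHS results. It relies on R listing main effects before interactions in the sequential table.

```r
# Simulated stand-in for the CHS data (hypothetical values).
set.seed(515)
n <- 498
sim <- data.frame(
  WEIGHT = rnorm(n, mean = 160, sd = 30),
  HEIGHT = rnorm(n, mean = 165, sd = 10),
  AGE    = rnorm(n, mean = 73, sd = 5),
  GENDER = rbinom(n, size = 1, prob = 0.5)
)
sim$DIABP <- 90 + 0.05 * sim$WEIGHT + 0.01 * sim$HEIGHT -
  0.3 * sim$AGE + 0.5 * sim$GENDER + rnorm(n, sd = 11)

fit_int <- lm(DIABP ~ WEIGHT * HEIGHT * AGE * GENDER, data = sim)
tab <- anova(fit_int)

# Classify each row of the sequential ANOVA table
term <- rownames(tab)
group <- ifelse(term == "Residuals", "Residuals",
                ifelse(grepl(":", term, fixed = TRUE),
                       "Interactions", "Main effects"))

# Pool degrees of freedom and sums of squares within each group
ord <- c("Main effects", "Interactions", "Residuals")
collapsed <- data.frame(
  Df    = as.vector(tapply(tab[["Df"]], group, sum)[ord]),
  SumSq = as.vector(tapply(tab[["Sum Sq"]], group, sum)[ord]),
  row.names = ord
)
collapsed$MeanSq <- collapsed$SumSq / collapsed$Df

# F statistics and p-values use the residual mean square as denominator
MSE <- collapsed["Residuals", "MeanSq"]
collapsed$Fvalue <- c(collapsed$MeanSq[1:2] / MSE, NA)
collapsed$Pvalue <- pf(collapsed$Fvalue, collapsed$Df,
                       collapsed["Residuals", "Df"], lower.tail = FALSE)

# Cross-check: the interaction row is the extra-sum-of-squares test
main_only <- lm(DIABP ~ WEIGHT + HEIGHT + AGE + GENDER, data = sim)
F_inter_check <- anova(main_only, fit_int)$F[2]
```

The `Interactions` row of `collapsed` is then the extra-sum-of-squares test of all 11 interaction coefficients given the main effects, which is why it should match `F_inter_check`.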
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
More informationABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
More informationDifference of Means and ANOVA Problems
Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly
More informationBiostatistics Short Course Introduction to Longitudinal Studies
Biostatistics Short Course Introduction to Longitudinal Studies Zhangsheng Yu Division of Biostatistics Department of Medicine Indiana University School of Medicine Zhangsheng Yu (Indiana University) Longitudinal
More informationI n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s
I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS * Hun Myoung Park,
More informationStat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015
Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field
More informationTwo-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption
Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Last time, we used the mean of one sample to test against the hypothesis that the true mean was a particular
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More information1 Basic ANOVA concepts
Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the
More informationCausal Forecasting Models
CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental
More informationTRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics
UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002
More informationChicago Insurance Redlining - a complete example
Chapter 12 Chicago Insurance Redlining - a complete example In a study of insurance availability in Chicago, the U.S. Commission on Civil Rights attempted to examine charges by several community organizations
More information