# Soci708 Statistics for Sociologists

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Soci708 Statistics for Sociologists Module 11 Multiple Regression 1 François Nielsen University of North Carolina Chapel Hill Fall Adapted from slides for the course Quantitative Methods in Sociology (Sociology 6Z3) taught at McMaster University by Robert Andersen (now at University of Toronto) 1 / 21

2 Goals of This Module Review of least-squares regression analysis Simple and multiple regression Slope, intercept and R 2 Standard error of the regression Inference for Regression Confidence intervals and hypothesis tests for the slope F-test for the entire regression model Assumptions of regression and how to check them 2 / 21

3 Inference in Multiple Regression (1) Recall that least-squares regression fits the equation: Y = A + B 1 X 1 + B 2 X B k X k finding the values of A, B 1, B 2,..., B k that minimize the sum of the squared residuals: residual 2 = E 2 i = (y ŷ) 2 The slopes are now partial slopes (versus marginal slopes in simple regression) The slope coefficient B 1 represents the average change in y associated with a one-unit increase in x 1, holding the other x s constant 3 / 21

4 Inference in Multiple Regression (1) At this point explain again meaning of B k s in context of an example Remember standardized coefficients Explain R 2 and how it relates to correlation between Ŷ and Y 4 / 21

5 Inference in Multiple Regression (2) The standard deviation of y around the regression surface (standard error of the regression) is again simply estimated: S E = E 2 /(n k 1) i There are n k 1 degrees of freedom, where n is the sample size and k is the number of slopes to estimate As in the simple regression case, S E is used to find the standard errors for each slope and CI and hypothesis are tested in exactly the same way We will not go into details regarding this here because the formulas are complicated. A computer program will calculated the SE s for us It is important to note, however, that we now use the t-distribution with n k 1 df 5 / 21

6 Multiple Regression in R > # in R > ed<-c(12,13,12,14,12,15,12) > age<-c(45,35,27,60,23,30,49) > income<-c(20000,22000,23000,25000,18000,30000,26000) > reg.model<-lm(income~ed+age) > summary(reg.model) Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) ed age Signif. codes: 0 *** ** 0.01 * Residual standard error: 3248 on 4 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 2 and 4 DF, p-value: / 21

7 Analysis of Variance and F Test (1) Recall that for the (simple or multiple) regression model (y i y) = (ŷ i y) + (y i ŷ i ) It is an algebraic fact (that can be demonstrated) that the equality holds after one squares the deviations and sum them: (y i y) 2 = (ŷ i y) 2 + (y i ŷ i ) 2 This can be written SST = SSM + SSE where SST = (y i y) 2 (Sum of Squares Total) SSM = (ŷ i y) 2 (Sum of Squares Model) SSE = (y i ŷ i ) 2 (Sum of Squares Error) 7 / 21

8 Analysis of Variance and F Test (2) Recall that to calculate s 2 y, the sample variance of y, we divide SST by its degrees of freedom (n 1); likewise to calculate s 2, the error variance, we divide SSE by its degrees of freedom (n k 1) Each sum of squares has its degrees of freedom DF, and these add up to the total degrees of freedom DFT = (n 1): DFT = DFM + DFE (n 1) = k + (n k 1) For each source of variation the mean square MS is calculated as sum of squares MS = degrees of freedom 8 / 21

9 Analysis of Variance and F Test (3) The three mean squares are thus: s 2 SST (yi y = MST = DFT = y) 2 n 1 MSM = SSM (ŷi DFM = y) 2 k s 2 = MSE = SSE (yi DFE = ŷ i ) 2 n k 1 The null hypothesis that y is not linearly related to the x variables may be tested by comparing MSM and MSE using the F statistic F = MSM MSE Under the null hypothesis H 0 : β 1 = β 2 = = β k = 0 F is distributed as an F(k, n k 1) distribution 9 / 21

10 Analysis of Variance and F Test (4) A small P-value favors the alternative hypothesis 2 H a : at least one of the β j is not 0 2 From Moore & McCabe (2006, p.689) 10 / 21

11 Analysis of Variance and F Test (5) The Analysis of Variance (ANOVA) Table The ANOVA table summarizes sums of squares, mean squares and the F test ANOVA Table Degrees Source of freedom Sum of squares Mean square F Model k SSM = (ŷ i y) 2 SSM/DFM MSM/MSE Error n k 1 SSE = (y i ŷ i ) 2 SSE/DFE Total n 1 SST = (y i y) 2 SST/DFT 11 / 21

12 Analysis of Variance and F Test (6) Stata output with ANOVA table for regression of depression score. * in Stata. reg total l10inc age female cath jewi none Source SS df MS Number of obs = F( 6, 249) = 6.35 Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = total Coef. Std. Err. t P> t [95% Conf. Interval] l10inc age female cath jewi none _cons / 21

13 Analysis of Variance and F Test (7) Stata output with ANOVA table for regression of depression score. * in Stata. reg total l10inc age female cath jewi none Source SS df MS Model Residual Total Number of obs = 256 F( 6, 249) = 6.35 Prob > F = R-squared = Adj R-squared = Root MSE = Model = SSM = (ŷ i y) 2 Residual = SSE = (y i ŷ i ) 2 Total = SST = (y i y) 2 MS = SS/Df... What is SST/DFT = ? F(6, 249) = MSM/MSE R-squared = SSM/SST Rooth MSE = MSE 13 / 21

14 Analysis of Variance and F Test (8) Exercise: Can you recover the redacted figures?. * in Stata. reg total l10inc age female cath jewi none Source SS df MS Number of obs = F( 6, 249) = xxxx Model xxx Prob > F = Residual xxx R-squared = xxxxxx Adj R-squared = Total Root MSE = total Coef. Std. Err. t P> t [95% Conf. Interval] l10inc xxxxx xxxxx age xxxxx xxxxx female cath jewi none _cons / 21

15 Extended Regression Analysis Example Four regression models of CESD depression score (transformed) in Stata. * in Stata. use "D:\soci708\data\survey2b.dta", clear. * CESD depression scale score is called total. * check distribution of total and transform (next 6 lines). histogram total, kdensity. qnorm total. ladder total. generate sqrtdep=sqrt(total). histogram sqrtdep, kdensity. qnorm sqrtdep. * estimate 4 models and standardized coef. (next 5 lines). reg sqrtdep age female. reg sqrtdep age female l10inc educatn. reg sqrtdep age female l10inc educatn cath jewi none. reg sqrtdep age female l10inc educatn cath jewi none drinks. reg sqrtdep age female l10inc educatn cath jewi none drinks, beta. * get descriptive stats and correlations for presentation. su sqrtdep age female l10inc educatn cath jewi none drinks. cor sqrtdep age female l10inc educatn cath jewi none drinks 15 / 21

16 Checking the Assumptions (1) Although regression can be an effective method to summarize the relationship between quantitative variables, some assumptions must be met As with the other methods of inference we have discussed, these assumptions pertain to the population we want to make inferences about Of course, we do not have data on the whole population so cannot assess the assumptions directly We can, however, check whether the assumptions appear reasonable for the sample 16 / 21

17 Checking the Assumptions (2) Assumptions for linear regression are: 1. Linearity: Is there a linear relationship between y and x? We can assess this assumption in simple regression by looking at a scatterplot 2. Constant Spread: Is the spread of the y-values approximately the same regardless of the value of x? If the spread of y changes with x, then we have a problem. A scatterplot of y and x or of the residuals against x allows us to assess this. 3. Normality: Are the residuals normally distributed (is there a skew or outliers)? If not normally distributed, we have a problem. We can check this assumption using a histogram or stemplot of the residuals 17 / 21

18 Assessing Linearity > # in R > library(car) > data(prestige) > attach(prestige) > par(pty="s") # to make square plot > plot(education, prestige) > abline(lm(prestige~education), col="red", lwd=3) education prestige 18 / 21

19 Assessing Spread > # in R > data(prestige) > attach(prestige) > par(pty="s") # to make square plot > mod1<-lm(prestige~education) > plot(education, mod1\$residuals, xlab="education", ylab="residuals") > abline(h=0) Education Residuals 19 / 21

20 Assessing Normality > # in R > mod1<-lm(prestige~education) > hist(mod1\$residuals, main="histogram of residuals", xlab="residual", col="red") Frequency Histogram of residuals Residual 20 / 21

21 Choosing the Right Test 21 / 21

### MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

### Correlation and Regression

Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look

### Econ 371 Problem Set #3 Answer Sheet

Econ 371 Problem Set #3 Answer Sheet 4.3 In this question, you are told that a OLS regression analysis of average weekly earnings yields the following estimated model. AW E = 696.7 + 9.6 Age, R 2 = 0.023,

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

### Regression Analysis. Data Calculations Output

Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

### Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

### Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### STAT 350 Practice Final Exam Solution (Spring 2015)

PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

### Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

### Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

### SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### Regression in Stata. Alicia Doyle Lynch Harvard-MIT Data Center (HMDC)

Regression in Stata Alicia Doyle Lynch Harvard-MIT Data Center (HMDC) Documents for Today Find class materials at: http://libraries.mit.edu/guides/subjects/data/ training/workshops.html Several formats

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Rockefeller College University at Albany

Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### Lesson Lesson Outline Outline

Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and

### Regression, least squares

Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Using R for Linear Regression

Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

### Interaction effects between continuous variables (Optional)

Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat

### One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### Statistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen!

Information Science, Groningen j.nerbonne@rug.nl Slides improved a lot by Harmut Fitz, Groningen! March 24, 2010 Correlation and regression We often wish to compare two different variables Examples: compare

### International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

### MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Using JMP with a Specific

1 Using JMP with a Specific Example of Regression Ying Liu 10/21/ 2009 Objectives 2 Exploratory data analysis Simple liner regression Polynomial regression How to fit a multiple regression model How to

### STA 4163 Lecture 10: Practice Problems

STA 463 Lecture 0: Practice Problems Problem.0: A study was conducted to determine whether a student's final grade in STA406 is linearly related to his or her performance on the MATH ability test before

### Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

### UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

### Example: Boats and Manatees

Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Correlation and Simple Linear Regression

Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

### Comparing Nested Models

Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller

### Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015 References: Long 1997, Long and Freese 2003 & 2006 & 2014,

### Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Note: This handout assumes you understand factor variables,

### Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,

### ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### Chapter 9. Section Correlation

Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

### Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

### Quantitative Methods for Economics Tutorial 9. Katherine Eyal

Quantitative Methods for Economics Tutorial 9 Katherine Eyal TUTORIAL 9 4 October 2010 ECO3021S Part A: Problems 1. In Problem 2 of Tutorial 7, we estimated the equation ŝleep = 3, 638.25 0.148 totwrk

Using Your TI-NSpire Calculator: Linear Correlation and Regression Dr. Laura Schultz Statistics I This handout describes how to use your calculator for various linear correlation and regression applications.

### Lectures 8, 9 & 10. Multiple Regression Analysis

Lectures 8, 9 & 0. Multiple Regression Analysis In which you learn how to apply the principles and tests outlined in earlier lectures to more realistic models involving more than explanatory variable and

### t-tests and F-tests in regression

t-tests and F-tests in regression Johan A. Elkink University College Dublin 5 April 2012 Johan A. Elkink (UCD) t and F-tests 5 April 2012 1 / 25 Outline 1 Simple linear regression Model Variance and R

Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

### Data Analysis Methodology 1

Data Analysis Methodology 1 Suppose you inherited the database in Table 1.1 and needed to find out what could be learned from it fast. Say your boss entered your office and said, Here s some software project

### 2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.\$ and Sales \$: 1. Prepare a scatter plot of these data. The scatter plots for Adv.\$ versus Sales, and Month versus

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### 1 Simple Linear Regression I Least Squares Estimation

Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

### We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

### Chapter 4 and 5 solutions

Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

### 1.1. Simple Regression in Excel (Excel 2010).

.. Simple Regression in Excel (Excel 200). To get the Data Analysis tool, first click on File > Options > Add-Ins > Go > Select Data Analysis Toolpack & Toolpack VBA. Data Analysis is now available under

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### MTH 140 Statistics Videos

MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

### Simple Linear Regression

STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

### , has mean A) 0.3. B) the smaller of 0.8 and 0.5. C) 0.15. D) which cannot be determined without knowing the sample results.

BA 275 Review Problems - Week 9 (11/20/06-11/24/06) CD Lessons: 69, 70, 16-20 Textbook: pp. 520-528, 111-124, 133-141 An SRS of size 100 is taken from a population having proportion 0.8 of successes. An

### AP Statistics Section :12.2 Transforming to Achieve Linearity

AP Statistics Section :12.2 Transforming to Achieve Linearity In Chapter 3, we learned how to analyze relationships between two quantitative variables that showed a linear pattern. When two-variable data

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

MODELING AUTO INSURANCE PREMIUMS Brittany Parahus, Siena College INTRODUCTION The findings in this paper will provide the reader with a basic knowledge and understanding of how Auto Insurance Companies

### Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015

Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field

### Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

### Notes on Applied Linear Regression

Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### N-Way Analysis of Variance

N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Topic 3. Chapter 5: Linear Regression in Matrix Form

Topic Overview Statistics 512: Applied Linear Models Topic 3 This topic will cover thinking in terms of matrices regression on multiple predictor variables case study: CS majors Text Example (NKNW 241)