Curvilinear Regression Analysis


Curvilinear Regression Analysis
Lecture 18, April 7, 2005
Applied Analysis

Today's Lecture
ANOVA with a continuous independent variable. Curvilinear regression analysis. Interactions with continuous variables.

An Example
From Pedhazur, pp. 513-514: Assume that in an experiment on the learning of paired associates, the independent variable is the number of exposures to a list. Specifically, 15 subjects are randomly assigned, in equal numbers, to five levels of exposure to a list, so that one group is given one exposure, a second group is given two exposures, and so on to five exposures for the fifth group. The dependent variable is the number of correct responses on a subsequent test.

The Analysis
Running an ANOVA (from Analyze...General Linear Model...Univariate in SPSS) produces these results:
[ANOVA summary table from SPSS; the key values appear on the following slides.]

The Interpretation
From the example, we could test the hypothesis:
H0: µ1 = µ2 = µ3 = µ4 = µ5
Here, F(4,10) = 2.10, which gives a p-value of 0.156. Using any reasonable Type-I error rate (like 0.05), we would fail to reject the null hypothesis. We would then conclude that there is no effect of number of exposures on learning (as measured by test score). Note that for this analysis four coded vectors were produced, one for each numerator degree of freedom.

A New Analysis
Instead of running an ANOVA to test for differences between the means of the test scores at each level of X, couldn't we run a linear regression? In the words of Marv Albert: YES! For the linear regression to be valid, the means of the levels of X must fall on the linear regression line. The key point is that the means must follow a linear trend. Using the difference between the ANOVA and the regression sums of squares, I will show you how you can test for a linear trend in the analysis.

Multiple Regression Results
Running a linear regression (from Analyze...Regression...Linear in SPSS) produces these results:
[Regression output table from SPSS; the key values appear below.]

Multiple Regression Results
From the example, we could test the hypothesis:
H0: b1 = 0
Here, F(1,13) = 8.95, which gives a p-value of 0.010. Using any reasonable Type-I error rate (like 0.05), we would reject the null hypothesis. We would then conclude that there is a significant relationship between number of exposures and learning (as measured by test score). This conclusion is different from the one we drew before. What is different about our analysis?
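As a quick check, both reported p-values follow directly from the F distribution. A minimal sketch, assuming SciPy is available (the original analyses were run in SPSS, not Python):

```python
# Minimal sketch: reproduce the two reported p-values from their F statistics.
from scipy import stats

# One-way ANOVA on the five exposure groups: F(4, 10) = 2.10
print(stats.f.sf(2.10, dfn=4, dfd=10))   # ~0.156 -> fail to reject H0

# Simple linear regression of score on exposures: F(1, 13) = 8.95
print(stats.f.sf(8.95, dfn=1, dfd=13))   # ~0.010 -> reject H0
```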

SS Differences
Notice from the ANOVA analysis that SS treatment = 8.40. From the regression analysis, SS regression = 7.50. Note the difference between the two: SS treatment is larger. The difference between SS treatment and SS regression is termed SS deviation:
SS deviation = SS treatment - SS regression = 0.90
Take a look at how that difference comes about.

SS Differences
The estimated regression line is: Y' = 2.7 + 0.5X

X   N   Ȳ     Y'    Ȳ - Y'   (Ȳ - Y')^2   N(Ȳ - Y')^2
1   3   3.0   3.2   -0.2     0.04         0.12
2   3   4.0   3.7    0.3     0.09         0.27
3   3   4.0   4.2   -0.2     0.04         0.12
4   3   5.0   4.7    0.3     0.09         0.27
5   3   5.0   5.2   -0.2     0.04         0.12
                             Sum:         0.90
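The whole table reduces to a few lines of arithmetic. A minimal sketch, assuming NumPy, that reproduces SS deviation = 0.90 from the group means and the fitted line given above:

```python
# Reproduce the SS-deviation computation from the table above.
# Group means and the line Y' = 2.7 + 0.5X are taken from the slides; n = 3 per group.
import numpy as np

x     = np.array([1, 2, 3, 4, 5])               # number of exposures
n     = np.array([3, 3, 3, 3, 3])               # subjects per group
y_bar = np.array([3.0, 4.0, 4.0, 5.0, 5.0])     # group means (number correct)

y_hat = 2.7 + 0.5 * x                           # means predicted by the line
ss_deviation = np.sum(n * (y_bar - y_hat) ** 2)
print(round(ss_deviation, 2))                   # 0.9 = 8.40 - 7.50
```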

Data Scatterplot
[Scatterplot of number correct (2.00 to 6.00) against number of exposures (1.00 to 5.00), with the five group means marked M.]

SS Differences
The value obtained on the previous slide, 0.90, was equal to the SS deviation. The SS deviation is literally a statistic that measures deviation from linearity. This value serves as the basis for the question: "What is the difference between restricting the data to conform to a linear trend and placing no such restriction?" (Pedhazur, p. 517)

SS Differences
When SS treatment is calculated, there is no restriction on the means of the treatment groups. If the means fall on a (straight) line, there will be no difference between SS treatment and SS regression: SS deviation = 0. With departures from linearity, SS treatment will be much larger than SS regression. Do you feel a statistical hypothesis test coming on?

Hypothesis Test
The SS treatment can be partitioned into two components: the SS regression (also called the SS due to linearity) and the remainder, the SS due to deviation from linearity.

Source                      df     SS      MS     F
Between Treatments           4    8.40    2.10   2.10
  Linearity                  1    7.50    7.50   7.50
  Deviation From Linearity   3    0.90    0.30   0.30
Within Treatments           10   10.00    1.00
Total                       14   18.40

If the SS due to linearity leads to a significant F value while the deviation from linearity does not, then one can conclude that a linear trend exists and that linear regression is appropriate.
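Under stated assumptions (SciPy for the p-values; sums of squares copied from the table above), the two F tests on this slide look like this as a sketch:

```python
# Sketch of the trend / lack-of-fit tests built from the partitioned sums of squares.
from scipy import stats

ms_within = 10.00 / 10                       # pure error: MS = 1.00

F_lin = (7.50 / 1) / ms_within               # linearity:  F(1, 10) = 7.50
F_dev = (0.90 / 3) / ms_within               # deviation:  F(3, 10) = 0.30

print(stats.f.sf(F_lin, dfn=1, dfd=10))      # ~0.021 -> significant linear trend
print(stats.f.sf(F_dev, dfn=3, dfd=10))      # ~0.82  -> no detectable nonlinearity
```

A significant linearity F together with a nonsignificant deviation F is exactly the pattern that licenses the simple linear regression here.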

The Polynomial Model
The preceding example demonstrated how a linear trend could be detected using a statistical hypothesis test. A linear trend is something we are very familiar with, having encountered linear regression for most of this course. Curvilinear regression analysis can be used to determine whether not-so-linear trends exist between Y and X. Pedhazur distinguishes between two possible types of models: intrinsically linear and intrinsically nonlinear.

The Polynomial Model
An intrinsically linear model is one that is linear in its parameters but not linear in the variables. By transformation, such a model may be reduced to a linear model. Such models are the focus of the remainder of this lecture. An intrinsically nonlinear model is one that may not be coerced into linearity by transformation. Such models often require more complicated estimation algorithms than what is provided by least squares and the GLM.

The Polynomial Model
A simple extension of the regression model to curved relations is the polynomial model, such as the following second-degree polynomial:
Y' = a + b1X + b2X^2
One could also estimate a third-degree polynomial:
Y' = a + b1X + b2X^2 + b3X^3
Or a fourth-degree polynomial:
Y' = a + b1X + b2X^2 + b3X^3 + b4X^4
And so on...

The Polynomial Model: Estimation
The way of determining the extent to which a given model is applicable is similar to determining whether added variables significantly improve the predictive ability of a regression model. Beginning with a linear model (a first-degree polynomial), estimate the model, denoted R^2(y.x). The tests of incremental variance accounted for are then done at each level of the polynomial:
Linear: R^2(y.x)
Quadratic: R^2(y.x,x^2) - R^2(y.x)
Cubic: R^2(y.x,x^2,x^3) - R^2(y.x,x^2)
Quartic: R^2(y.x,x^2,x^3,x^4) - R^2(y.x,x^2,x^3)
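Each increment is tested with the usual F for added variance, F = (ΔR^2 / 1) / ((1 - R^2_full) / (N - k - 1)). A sketch of the sequence on made-up data (the x and y below are hypothetical, not Pedhazur's; NumPy and SciPy assumed):

```python
# Hypothetical data and a hierarchical polynomial fit with incremental F tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.repeat(np.arange(1.0, 7.0), 3)                # six levels, 3 subjects each
y = 1.0 + 2.5 * x - 0.15 * x**2 + rng.normal(0, 0.5, x.size)

def r_squared(x, y, degree):
    """R^2 of a least-squares polynomial fit of the given degree."""
    X = np.vander(x, degree + 1)                     # columns: x^degree ... x^0
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

prev_r2 = 0.0
for degree in (1, 2, 3):
    r2 = r_squared(x, y, degree)
    df_resid = x.size - degree - 1                   # N - k - 1
    F = (r2 - prev_r2) / ((1.0 - r2) / df_resid)     # 1 numerator df per step
    p = stats.f.sf(F, dfn=1, dfd=df_resid)
    print(f"degree {degree}: R^2={r2:.3f} increase={r2 - prev_r2:.3f} "
          f"F(1,{df_resid})={F:.2f} p={p:.3f}")
    prev_r2 = r2
# Stop at the highest degree whose increase is still significant.
```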

A New Example
From Pedhazur, p. 522: Suppose that we are interested in the effect of time spent in practice on the performance of a visual discrimination task. Subjects are randomly assigned to different levels of practice, following which a test of visual discrimination is administered, and the number of correct responses is recorded for each subject. As there are six levels, the highest-degree polynomial possible for these data is the fifth. Our aim, however, is to determine the lowest-degree polynomial that best fits the data.

Data Scatterplot
[Scatterplot of Task Score (5.00 to 20.00) against Practice Time (2.50 to 10.00).]

Estimation In SPSS
To estimate the degree of the polynomial, first one must create new variables in SPSS, each representing X raised to a given power. Then successive regression analyses must be run, each adding a level to the equation:

Model         R^2     Increase Over Previous   F
X             0.883   0.883                    121.029 *
X, X^2        0.943   0.060                    15.604 *
X, X^2, X^3   0.946   0.003                    0.911
(* significant increase)

Because adding X^3 did not significantly increase R^2, we stop with the quadratic model.

Estimation In SPSS
Of course, there is an easier way... In SPSS, go to Analyze...Regression...Curve Estimation.

Estimation In SPSS
MODEL: MOD_2. Independent: x

Dependent   Mth   Rsq    d.f.   F        Sigf   b0        b1       b2       b3
y           LIN   .883   16     121.03   .000    3.2667   1.5571
y           QUA   .943   15     123.55   .000   -1.9000   3.4946   -.1384
y           CUB   .946   14      82.18   .000     .6667   1.8803    .1290   -.0127

Data Scatterplot
[Figure: the scatterplot with the fitted curves from the Curve Estimation procedure.]

Parameter Interpretation
The b parameters in a polynomial regression are nearly impossible to interpret. An independent variable is represented by more than a single vector, so what is held constant? The relative magnitudes of the b parameters for different degrees cannot be compared, because the variance of the higher-degree polynomial terms explodes:
X: s^2(x);  X^2: (s^2(x))^2;  X^3: (s^2(x))^3;  ...

Variable Centering
Centering variables in a polynomial equation can avoid collinearity problems. Centering does not change the R^2 of a model, only the regression parameters.
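A minimal sketch (hypothetical x; NumPy assumed) makes both claims concrete: centering breaks the near-collinearity between X and X^2, and either parameterization fits equally well:

```python
# Centering a predictor before squaring it: same fit, far less collinearity.
import numpy as np

x = np.linspace(10.0, 20.0, 50)                  # hypothetical predictor far from 0
xc = x - x.mean()                                # centered version

print(np.corrcoef(x, x**2)[0, 1])                # ~0.999: x and x^2 nearly collinear
print(np.corrcoef(xc, xc**2)[0, 1])              # ~0.0:  collinearity removed

def r2(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

y = 3.0 + 0.4 * x - 0.02 * x**2 + np.random.default_rng(1).normal(0, 0.1, x.size)
X_raw = np.column_stack([np.ones_like(x), x, x**2])
X_cen = np.column_stack([np.ones_like(x), xc, xc**2])
print(np.isclose(r2(X_raw, y), r2(X_cen, y)))    # True: R^2 is unchanged
```

Both design matrices span the same column space, which is why the fit is identical even though the individual b parameters differ.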

Multiple Curvilinear Regression
Running multiple curvilinear regression models is a straightforward extension of what was shown today:
Y' = a + b1X + b2Z + b3XZ + b4X^2 + b5Z^2
Note the cross-product XZ. This cross-product term is tested above and beyond X and Z individually.
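A sketch of estimating this full second-degree model on hypothetical data (the values below are illustrative only; NumPy assumed):

```python
# Fit Y' = a + b1*X + b2*Z + b3*XZ + b4*X^2 + b5*Z^2 by least squares.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=100)                         # hypothetical predictors
Z = rng.normal(size=100)
Y = 1 + 0.8*X + 0.5*Z + 0.6*X*Z - 0.3*X**2 + rng.normal(0, 0.5, size=100)

design = np.column_stack([np.ones_like(X), X, Z, X*Z, X**2, Z**2])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(beta)   # a, b1, b2, b3 (cross-product), b4, b5
# The XZ term would be tested over and above X and Z by the same
# incremental-R^2 comparison used for the polynomial terms.
```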

Final Thought
Curvilinear regression can be accomplished using techniques we are familiar with. Interpretation can be tricky... We are all lucky to be students during this season...

Next Time
No class next week (I'm in Montreal...if you are there, say hello). Chapter 14: Continuous and categorical independent variables. Comedy provided by this guy: [photo]