# 1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand how to conduct a moderated multiple regression using SPSS Understand how to interpret moderated multiple regression Learn to generate predicted values for interaction using Excel Learn to run simple slopes tests in SPSS Learn how to test higher-order interactions Effect of a predictor variable (X) on an outcome variable (Y) depends on a third variable (M) The term moderation is synonymous with interaction effect M X Y 1

2 Effect of presence of others on performance depends on the dominance of response tendencies (i.e., social facilitation; Zajonc, 1965) Effects of stress on health depends on social support (Cohen & Wills, 1985) Effect of provocation on aggression depends on trait aggressiveness (Marshall & Brown, 2006) Y = A + BX + e X Y Y = A + B 1 X + B 2 M + e X M Y 2

3 Y = A + B 1 X + B 2 M + e Y B 1 High M Medium M Low M B 2 The intercept of the regression of Y on X depends on the specific value of M Slope of the regression of Y on X (b 1 ) stays constant X Y = A + B 1 X + B 2 M + B 3 X*M + e X M Y X*M Y = A + B 1 X + B 2 M + B 3 X*M + e Y X High M Medium M Low M The slope and intercept of the regression of Y on X depends on the specific value of M There is a different line for every individual value of M (simple regression line) 3

4 Y = A + B 1 X + B 2 M + B 3 X*M + e The interaction is carried by the X*M term which is the product of X and M The B 3 coefficient reflects the interaction between X and M only if the lower order terms B 1 X and B 2 M are included in the equation! Leaving out these terms confounds the additive and multiplicative effects which produces misleading results ALWAYS INCLUDE THE MAIN EFFECT TERMS THAT CONSTITUTE YOUR INTERACTION Each individual has a score on X and M To form the X*M term, simply multiply together the individual s scores on X and M There are two equivalent ways to evaluate whether an interaction is present: Test whether the coefficient B 3 differs significantly from zero Test whether the increment in the squared multiple correlation ( R2) given by the interaction is significantly greater than zero Interactions work with continuous or categorical predictor variables For categorical variables, we have to agree on a coding scheme (dummy vs. effects coding) X: Self-Esteem Level M: Self-Esteem Instability Y: Psychological Well-Being SE Level Does the association between self-esteem level and psychological wellbeing depend on self-esteem instability? SE instability SE Level x SE Instability Well-Being 4

5 Why not simply split both X and M into two groups and conduct ordinary ANOVA to test for moderation? Disadvantage #1: Median splits are highly sample dependent Disadvantage #2: This will reduce our power to detect interaction effects because we are ignoring useful information about the predictors Disadvantage #3: Median splits can bias the results of moderation 5

6 Unstandardized means that the original metrics of the variables are preserved This is accomplished by centering both X and M around their respective sample means Centering refers to subtracting the mean of the variable from each score Centering provides a meaningful zero-point for X and M (gives you effects at the mean of X and M, respectively) Centering predictors does not affect the interaction term, but all of the other coefficients in the model Compute crossproduct of cx (centered X) and cm (centered M) Y = A + B 1 cx + B 2 cm + B 3 cx*cm + e This statement creates variable a which will be the centered score for self-esteem level This statement creates variable b which will be the centered score for self-esteem instability 6

7 This statement creates the ab product term which simply multiplies the centered scores for self-esteem level and self-esteem instability This statement is the regression command This statement identifies psychological well-being as the outcome variable 7

8 This statement is the command that identifies which variables will be entered on the first step of the hierarchical multiple regression This statement is the command that identifies which variable will be entered on the second step of the hierarchical multiple regression Descriptive statistics will be generated for all of the variables in your model 8

9 A correlation matrix will be generated for all of the variables in your model The output will display which variables were entered on each step This is the R for Model 1 which reflects the degree of association between our first regression model (the main effects of self-esteem level and self-esteem instability) and the criterion variable 9

10 This is R 2 for Model 1 which represents the amount of variance in the criterion variable that is explained by the model (SS M ) relative to how much variance there was to explain in the first place (SS T ). To express this value as a percentage, you should multiply this value by 100 (which will tell you the percentage of variation in the criterion variable that can be explained by the model). This is the Adjusted R 2 for the first model which corrects R 2 for the number of predictor variables included in the model. It will always be less than or equal to R 2. It is a more conservative estimate of model fit because it penalizes researchers for including predictor variables that are not strongly associated with the criterion variable This is the standard error of the estimate for the first model which is a measure of the accuracy of predictions. The larger the standard error of the estimate, the more error in our regression model 10

11 The change statistics are not very useful for the first step because it is comparing this model with an empty model (i.e., no predictors) which is going to be the same as the R 2 This is the R for Model 2 which reflects the degree of association between this regression model (main effects of self-esteem level and self-esteem instability as well as their interaction) and the criterion variable This is R 2 for Model 2 which represents the amount of variance in the criterion variable that is explained by the model (SS M ) relative to how much variance there was to explain in the first place (SS T ). To express this value as a percentage, you should multiply this value by 100 (which will tell you the percentage of variation in the criterion variable that can be explained by the model). 11

12 This is the Adjusted R 2 for Model 2 which corrects R 2 for the number of predictor variables included in the model. It will always be less than or equal to R 2. It is a more conservative estimate of model fit because it penalizes researchers for including predictor variables that are not strongly associated with the criterion variable This is the standard error of the estimate for Model 2 which is a measure of the accuracy of predictions. The larger the standard error of the estimate, the more error in our regression model The change statistics tell us whether the addition of the interaction term in Model 2 significantly improved the model fit. A significant F Change value means that there has been a significant improvement in model fit (i.e., more variance in the outcome variable has been explained by Model 2 than Model 1) 12

13 SS M MS M SS R MS R SS T The ANOVA for Model 1 tells us whether the model, overall, results in a significantly good degree of prediction of the outcome variable. SS M MS M SS R MS R SS T The ANOVA for Model 2 tells us whether the model, overall, results in a significantly good degree of prediction of the outcome variable. However, the ANOVA does not tell us about the individual contribution of variables in the model This is the Y-intercept for Model 1. This means that when a (centered self-esteem level) and b (centered self esteem instability) are 0 (i.e., those variables are at their means), then we would expect the score for psychological well-being to be

14 This t-test compares the value of the Y-intercept with 0. If it is significant, then it means that the value of the Y-intercept (4.606 in this example) is significantly different from 0 This is the unstandardized slope or gradient coefficient. It is the change in the outcome associated with a one unit change in the predictor variable. This means that for every 1 point that self-esteem level is increased the model predicts an increase of for psychological well-being This is the standardized regression coefficient (β). It represents the strength of the association between the predictor and the criterion variable 14

15 This t-test compares the magnitude of the standardized regression coefficient (β) with 0. If it is significant, then it means that the value of β (0.542 in this example) is significantly different from 0 (i.e., the predictor variable is significantly associated with the criterion variable) This is the unstandardized slope or gradient coefficient. It is the change in the outcome associated with a one unit change in the predictor variable. This means that for every 1 point that self-esteem instability is increased the model predicts a decrease of for psychological well-being This is the standardized regression coefficient (β). It represents the strength of the association between the predictor and the criterion variable 15

16 This t-test compares the magnitude of the standardized regression coefficient (β) with 0. If it is significant, then it means that the value of β ( in this example) is significantly different from 0 (i.e., the predictor variable is significantly associated with the criterion variable) This is the Y-intercept for Model 2. This means that when centered self-esteem level, centered self-esteem instability, and their interaction are all 0, then the model predicts that the score for psychological well-being will be This t-test compares the value of the Y-intercept for Model 2 with 0. If it is significant, then it means that the value of the Y-intercept (4.576 in this example) is significantly different from 0 16

17 This is the regression information for self-esteem level when selfesteem instability and the self-esteem level x self-esteem instability product term are also included in the model This is the regression information for self-esteem instability when self-esteem level and the self-esteem level x self-esteem instability product term are also included in the model This is the unstandardized slope or gradient coefficient for the interaction term 17

18 SPSS calculates the standardized regression coefficient for the product term incorrectly so this value is not quite right but it is what most people use This t-test compares the magnitude of the standardized regression coefficient (β) with 0. If it is significant, then it means that the value of β ( in this example) is significantly different from 0 (i.e., the interaction term is significantly associated with the criterion variable) SPSS takes the z-score of the product (z XM ) when calculating the standardized scores Except in unusual circumstances, z XM is different from z x z m which is the product of the two z-scores in which we are interested Friedrich (1982) suggested the following solution: Run a regression using standardized variables 18

19 This is the correct standardized regression coefficient for the product term of self-esteem level x self-esteem instability (even though it is in the unstandardized column). NOTE: It is important that the predictors AND the outcome variable are standardized! SPSS does not provide a straightforward module for plotting interactions There is an infinite number of slopes we could compute for different combinations of X and M The most useful is to calculate values for high (+1 SD) and low (-1 SD) X as a function of high (+1 SD) and low (-1 SD) values of M For a two-way interaction, we will plot four predicted values: (1) high SE level and high SE instability, (2) high SE level and low SE instability, (3) low SE level and high SE instability, and (4) low SE level and low SE instability 19

20 Each of the four predicted values will use the unstandardized regression coefficients from our regression analysis To get high SE level, you will add 1 SD to the mean (i.e., = ) To get high SE instability, you will add 1 SD to the mean (i.e., = ) 20

21 To get the value for the interaction, the values for SE level and SE instability are multiplied (i.e., =C6*C7) To get the predicted value for high SE level and high SE instability, multiply the unstandardized regression coefficient by the value in the mean column (e.g., B6*C6) and then add those product terms together (i.e., =SUMPRODUCT(B5:B8,C5:C8) Repeat the process for this set of cells 21

22 then this set of cells and finally this set of cells This spreadsheet is set up to use the predicted values to create the plot 22

23 It appears that individuals with high self-esteem (dotted line) generally report higher levels of psychological well-being than those with low selfesteem (solid line) and that this is especially true for those with stable high self-esteem but we need to test this to make sure Test of interaction term: Does the association between X and Y depend on M? Simple slope testing: Do the regression coefficients for high (+1 SD) and low (-1 SD) values of X (and M) differ significantly from zero? Simple slope testing for low self-esteem (-1 SD) To test this simple slope, we +1 SD to the values of SE level Simple slope testing for high self-esteem (+1 SD) To test this simple slope, we -1 SD to the values of SE level This seems backward but by adding (or subtracting) a standard deviation from each score, we are shifting the zero points Now run separate regression analysis with each transformed score 23

24 High SE Level We are testing the association between SE instability and psychological well-being when SE level is high and low Low SE Level High SE Level This value shows us the strength of the association between SE instability and psychological wellbeing when SE level is high Low SE Level 24

25 High SE Level This value shows us the strength of the association between SE instability and psychological wellbeing when SE level is low Low SE Level It is important to note that we can also test the hidden simple slopes. They are hidden because of the way we chose to plot the data but they can still be tested in the same way we tested the obvious simple slopes 25

26 High SE Instability We are testing the association between SE level and psychological well-being when SE instability is high and low Low SE Instability High SE Instability This value shows us the strength of the association between SE level and psychological well-being when SE instability is high Low SE Instability High SE Instability This value shows us the strength of the association between SE level and psychological well-being when SE instability is low Low SE Instability 26

27 You may want to control for other variables (i.e., covariates) Simply add centered continuous covariates as predictors to the regression equation In case of categorical control variables, you may want to use effect coding With effect coding, the categorical variable is analyzed as a set of (nonorthogonal) contrasts which opposes all but one category to another category With effect coding, the intercept is equal to the grand mean and the slope for a contrast expresses the difference between a group and the grand mean Example with 2 categories (single variable) Men (+1), Women (-1) Example with 3 categories (two variables) Democrats (1, 0) Independents (0,1) Republicans (-1, -1) Standardized regression coefficient (β) is already an effect size statistic but it is not perfect f 2 (see Aiken & West, 1991, p. 157) r 2 Y.AI: Squared multiple correlation resulting from combined prediction of Y by the additive set of predictors (A) and their interaction (I) (= full model) r 2 Y.A: Squared multiple correlation resulting from prediction by set A only (= model without interaction term) f 2 gives you the proportion of systematic variance accounted for by the interaction relative to the unexplained variance in the outcome variable Conventions by Cohen (1988) f 2 =.02: small effect f 2 =.15: medium effect f 2 =.26: large effect 27

28 ... =.. =.083 small effect of interaction. Very similar process to what we described for continuous x continuous The major difference involves the coding scheme for the categorical variable (i.e., effect coding v. dummy coding) We will talk more about coding schemes when we talk about curvilinear regression Higher-order interactions refers to interactions among more than 2 variables (e.g., a three-way interaction) All basic principles (centering, coding, probing, simple slope testing, effect size) generalize to higher-order interactions The major difference is that you begin probing the three-way interaction by examining whether the two-way interactions emerge at high and low values of one of your moderators Example: SE level x SE instability x Sex Well-being I would begin probing the interaction by determining whether the SE level x SE instability interaction emerged for men and women separately...then follow the basic procedures as outlined earlier 28

29 Plot first-level moderator effect (e.g., SE level x SE instability) at different levels of the third variable (e.g., sex) It is best to use two graphs with each showing a two-way interaction There are 6 different ways to plot the three-way interaction Best presentation should be determined by theory In the case of categorical variables (e.g., sex) it often makes sense to plot the separate graphs as a function of group (one graph for men and another for women) The logic to compute the values for different combinations of high and low values on predictors is the same as in the two-way case If variables were measured without error, the following sample sizes are needed to detect small, medium, and large interaction effects with adequate power (80%) Large effect (f 2 =.26): N = 26 Medium effect (f 2 =.13): N = 55 Small effect (f 2 =.02): N = 392 Busemeyer & Jones (1983): reliability of product term of two uncorrelated variables is the product of the reliabilites of the two variables.80 x.80 =.64 Required sample size is more than doubled when predictor reliabilites drop from 1 to.80 Problem gets even worse for higher-order interactions 29

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### Moderation. Moderation

Stats - Moderation Moderation A moderator is a variable that specifies conditions under which a given predictor is related to an outcome. The moderator explains when a DV and IV are related. Moderation

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### January 26, 2009 The Faculty Center for Teaching and Learning

THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

### Simple Linear Regression, Scatterplots, and Bivariate Correlation

1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

### Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

### Testing and Interpreting Interactions in Regression In a Nutshell

Testing and Interpreting Interactions in Regression In a Nutshell The principles given here always apply when interpreting the coefficients in a multiple regression analysis containing interactions. However,

### Using Excel for Statistical Analysis

Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

### Regression-Based Tests for Moderation

Regression-Based Tests for Moderation Brian K. Miller, Ph.D. 1 Presentation Objectives 1. Differentiate between mediation & moderation 2. Differentiate between hierarchical and stepwise regression 3. Run

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### 10. Analysis of Longitudinal Studies Repeat-measures analysis

Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

### For example, enter the following data in three COLUMNS in a new View window.

Statistics with Statview - 18 Paired t-test A paired t-test compares two groups of measurements when the data in the two groups are in some way paired between the groups (e.g., before and after on the

### Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### When to use Excel. When NOT to use Excel 9/24/2014

Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

### Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2

Bivariate Analysis Variable 2 LEVELS >2 LEVELS COTIUOUS Correlation Used when you measure two continuous variables. Variable 2 2 LEVELS X 2 >2 LEVELS X 2 COTIUOUS t-test X 2 X 2 AOVA (F-test) t-test AOVA

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### Running head: ASSUMPTIONS IN MULTIPLE REGRESSION 1. Assumptions in Multiple Regression: A Tutorial. Dianne L. Ballance ID#

Running head: ASSUMPTIONS IN MULTIPLE REGRESSION 1 Assumptions in Multiple Regression: A Tutorial Dianne L. Ballance ID#00939966 University of Calgary APSY 607 ASSUMPTIONS IN MULTIPLE REGRESSION 2 Assumptions

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

### Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

Consider a study in which How many subjects? The importance of sample size calculations Office of Research Protections Brown Bag Series KB Boomer, Ph.D. Director, boomer@stat.psu.edu A researcher conducts

### Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

### Introduction to Data Analysis in Hierarchical Linear Models

Introduction to Data Analysis in Hierarchical Linear Models April 20, 2007 Noah Shamosh & Frank Farach Social Sciences StatLab Yale University Scope & Prerequisites Strong applied emphasis Focus on HLM

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

### Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

Profile Analysis Introduction Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases: ) Comparing the same dependent variables

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Assessing Measurement System Variation

Assessing Measurement System Variation Example 1: Fuel Injector Nozzle Diameters Problem A manufacturer of fuel injector nozzles installs a new digital measuring system. Investigators want to determine

### Chapter 14: Analyzing Relationships Between Variables

Chapter Outlines for: Frey, L., Botan, C., & Kreps, G. (1999). Investigating communication: An introduction to research methods. (2nd ed.) Boston: Allyn & Bacon. Chapter 14: Analyzing Relationships Between

### Everything You Wanted to Know about Moderation (but were afraid to ask) Jeremy F. Dawson University of Sheffield

Everything You Wanted to Know about Moderation (but were afraid to ask) Jeremy F. Dawson University of Sheffield Andreas W. Richter University of Cambridge Resources for this PDW Slides SPSS data set SPSS

Table of Contents Introduction to Minitab...2 Example 1 One-Way ANOVA...3 Determining Sample Size in One-way ANOVA...8 Example 2 Two-factor Factorial Design...9 Example 3: Randomized Complete Block Design...14

### 5. Multiple regression

5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

### Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

### Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Excel Tutorial Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information. Working with Data Entering and Formatting Data Before entering data

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Contrast Coding in Multiple Regression Analysis: Strengths, Weaknesses, and Utility of Popular Coding Structures

Journal of Data Science 8(2010), 61-73 Contrast Coding in Multiple Regression Analysis: Strengths, Weaknesses, and Utility of Popular Coding Structures Matthew J. Davis Texas A&M University Abstract: The

### Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

Analysis of Data Claudia J. Stanny PSY 67 Research Design Organizing Data Files in SPSS All data for one subject entered on the same line Identification data Between-subjects manipulations: variable to

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

### Forecasting in STATA: Tools and Tricks

Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be

### MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM. Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών

Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών Το υλικό αυτό προέρχεται από workshop που οργανώθηκε σε θερινό σχολείο της Ευρωπαϊκής

### Multiple Regression Analysis (ANCOVA)

Chapter 16 Multiple Regression Analysis (ANCOVA) In many cases biologists are interested in comparing regression equations of two or more sets of regression data. In these cases, the interest is in whether

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### 4. Descriptive Statistics: Measures of Variability and Central Tendency

4. Descriptive Statistics: Measures of Variability and Central Tendency Objectives Calculate descriptive for continuous and categorical data Edit output tables Although measures of central tendency and

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### 1 Theory: The General Linear Model

QMIN GLM Theory - 1.1 1 Theory: The General Linear Model 1.1 Introduction Before digital computers, statistics textbooks spoke of three procedures regression, the analysis of variance (ANOVA), and the

### Binary Logistic Regression

Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

### Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or oneway ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

### Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

### MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

### Bivariate Regression Analysis. The beginning of many types of regression

Bivariate Regression Analysis The beginning of many types of regression TOPICS Beyond Correlation Forecasting Two points to estimate the slope Meeting the BLUE criterion The OLS method Purpose of Regression

### Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

### The method of Least Squares using the Excel Solver

The method of Least Squares using the Excel Solver Michael Wood (Michael.wood@port.ac.uk) 22 October 2012 (Introduction and links to electronic versions of this document and the other parts at http://woodm.myweb.port.ac.uk/stats.

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### Algebra 1 Chapter 3 Vocabulary. equivalent - Equations with the same solutions as the original equation are called.

Chapter 3 Vocabulary equivalent - Equations with the same solutions as the original equation are called. formula - An algebraic equation that relates two or more real-life quantities. unit rate - A rate

### Moderator and Mediator Analysis

Moderator and Mediator Analysis Seminar General Statistics Marijtje van Duijn October 8, Overview What is moderation and mediation? What is their relation to statistical concepts? Example(s) October 8,

College Readiness LINKING STUDY A Study of the Alignment of the RIT Scales of NWEA s MAP Assessments with the College Readiness Benchmarks of EXPLORE, PLAN, and ACT December 2011 (updated January 17, 2012)

### Testing Moderator and Mediator Effects in Counseling Psychology Research

Journal of Counseling Psychology Copyright 2004 by the American Psychological Association, Inc. 2004, Vol. 51, No. 1, 115 134 0022-0167/04/\$12.00 DOI: 10.1037/0022-0167.51.1.115 Testing Moderator and Mediator

### Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Full Factorial Design of Experiments

Full Factorial Design of Experiments 0 Module Objectives Module Objectives By the end of this module, the participant will: Generate a full factorial design Look for factor interactions Develop coded orthogonal

### Schools Value-added Information System Technical Manual

Schools Value-added Information System Technical Manual Quality Assurance & School-based Support Division Education Bureau 2015 Contents Unit 1 Overview... 1 Unit 2 The Concept of VA... 2 Unit 3 Control

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

### AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

### Getting Started with HLM 5. For Windows

For Windows August 2012 Table of Contents Section 1: Overview... 3 1.1 About this Document... 3 1.2 Introduction to HLM... 3 1.3 Accessing HLM... 3 1.4 Getting Help with HLM... 4 Section 2: Accessing Data

### Eviews Tutorial. File New Workfile. Start observation End observation Annual

APS 425 Professor G. William Schwert Advanced Managerial Data Analysis CS3-110L, 585-275-2470 Fax: 585-461-5475 email: schwert@schwert.ssb.rochester.edu Eviews Tutorial 1. Creating a Workfile: First you

### NOTES ON HLM TERMINOLOGY

HLML01cc 1 FI=HLML01cc NOTES ON HLM TERMINOLOGY by Ralph B. Taylor breck@rbtaylor.net All materials copyright (c) 1998-2002 by Ralph B. Taylor LEVEL 1 Refers to the model describing units within a grouping:

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

### The Basic Two-Level Regression Model

2 The Basic Two-Level Regression Model The multilevel regression model has become known in the research literature under a variety of names, such as random coefficient model (de Leeuw & Kreft, 1986; Longford,

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### Penalized regression: Introduction

Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

### SPSS: Descriptive and Inferential Statistics. For Windows

For Windows August 2012 Table of Contents Section 1: Summarizing Data...3 1.1 Descriptive Statistics...3 Section 2: Inferential Statistics... 10 2.1 Chi-Square Test... 10 2.2 T tests... 11 2.3 Correlation...

### Mediation with Dichotomous Outcomes

April 19, 2013 This is a draft document to assist researchers. Please do not cite or quote without the author s permission. I thank both Nathaniel R. Herr and Craig Nathanson who suggested corrections

### Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation