CHAPTER 2 AND 10: Least Squares Regression
- Adrian Pope
- 7 years ago
In Chapters 2 and 10 we will be looking at the relationship between two quantitative variables measured on the same individual.

General Procedure:
1. Make a scatterplot and describe the form, direction and strength of the relationship. Note: fitting a line only makes sense if the overall pattern of the scatterplot is roughly linear.
2. Look for outliers and influential observations on the scatterplot.
   a. Note: Inference is not safe if there are influential points, because the results depend strongly on these few points. It is often helpful to rework the data without the influential observations and compare the results.
3. Find the correlation r to get a numerical measure of the direction and strength of the linear relationship.
4. Find r^2, the fraction of the variation in the values of y that is explained by the least squares regression of y on x.
5. If the data are reasonably linear, find the least squares regression line for the data. Note: The line can be used to predict y for a given x.
6. Make a residual plot and a normal probability plot to check the regression assumptions.
7. If and only if your data were collected using random sampling techniques, you can look at the hypothesis tests and confidence intervals for the correlation, slope and intercept.
8. If and only if your data were collected using random sampling techniques, you can look at the confidence intervals for the mean response and the prediction intervals.
Association Between Variables:
Two variables measured on the same individuals are associated if some values of one variable tend to occur more often with some values of the second variable than with other values of that variable.

Just because two variables are associated doesn't mean that a change in one variable causes a change in the other (Causation, section 2.6). Also, the relationship between two variables might not tell the whole story. Other variables may affect the relationship. These other variables are called lurking variables.

Positive association: Above average values of one variable tend to accompany above average values of the other, and below average values also tend to occur together.
Negative association: Above average values of one variable tend to accompany below average values of the other, and vice versa.
No association: It is hard to find a pattern showing a relationship between the variables.

Response variable: measures an outcome of a study. Also called the dependent variable, Y.
Explanatory variable: explains or causes changes in the response variable. Also called the independent variable, X.
Example 1: A forester has become adept at estimating the volume (in cubic feet) of trees on a particular site prior to a timber sale. Since his operation has now expanded, he would like to train another person to assist in estimating the cubic foot volume of trees. He decided to create a model that will allow him to predict the actual tree volume from his assistant's estimate. The forester selects a random sample of trees to be felled. For each tree, the assistant guesses the cubic foot volume of the tree. The forester also obtains the actual cubic foot volume after the tree has been chopped down. Below is his data (columns: Tree, Estimated Volume, Actual Volume):

STEP 1: Make a scatterplot; describe the form, direction and strength of the relationship.

Before making the scatterplot you need to decide which variable is the explanatory variable and which is the response variable.

For Example 1, identify the explanatory and the response variables.
Explanatory Variable:
Response Variable:

A scatterplot shows the relationship between two quantitative variables measured on the same individual. The explanatory variable is plotted on the x axis; the response variable is plotted on the y axis.

Look at the overall pattern. The overall pattern can be described by form, direction and strength.
Form: is the scatterplot linear, quadratic, etc.?
Direction: is the association positive or negative?
Strength: how closely do the points follow the form?

Describe the scatterplot in Example 1.
Form:
Strength:
Direction:
Scatterplot Using SPSS:
>Graphs >Scatter/Dot
Select simple and click define.
Pull estimate into the X Axis box and actual into the Y Axis box, then click OK.
Note: To get the fitted line you need to double click on your graph to bring up the chart editor. You will then need to click on a button that looks like a scatterplot with a fitted line through it. Select linear and then close.

[Figure: scatterplot "Estimated Volume versus Actual Volume of Trees" with fitted line; x axis: Estimate, y axis: Actual]

STEP 2: Look for outliers and influential observations on the scatterplot.

Look for striking deviations from the overall pattern.

Outlier: An observation that lies outside of the overall pattern of the other observations. Points that are outliers in the y direction of a scatterplot have large regression residuals, but other outliers need not have large residuals.

Influential observation: An observation that, if removed, would markedly change the results of the regression calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least squares regression line.

Are there any outliers or influential observations in our data?

Note: To add a categorical variable to a scatterplot, use a different plot color or symbol for each category.
STEP 3: Find the correlation r.

The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$$

Properties of correlation:
- It makes no difference which variable you call x and which you call y, since correlation does not make use of the distinction between the explanatory variable and the response variable.
- Both variables need to be quantitative to calculate correlation.
- The correlation r does not change if we change the units of measurement of x, y, or both.
- A positive r corresponds to a positive relationship between the variables. A negative r corresponds to a negative relationship between the variables.
- The correlation always satisfies $-1 \le r \le 1$. Values near 0 indicate a weak linear relationship, and values close to -1 or 1 indicate a strong linear relationship.
- Correlation measures the strength of only a linear relationship.
- Like the mean and standard deviation, the correlation is not resistant. The correlation r is strongly affected by a few outlying observations. Use r with caution when outliers appear in the scatterplot.
- Correlation is not a complete description of two variable data. You should give the means and standard deviations of both x and y along with the correlation.

For Example 1, find the correlation between estimated volume and actual volume. We will use the Pearson Correlation output from SPSS here.
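As a rough illustration of the correlation formula and two of the properties listed above, here is a minimal Python sketch. The data values and the function name `correlation` are hypothetical (the forester's measurements are not reproduced here), and the notes themselves use SPSS rather than Python.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation: average product of standardized values, using n - 1."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)
    total = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return total / (n - 1)

# Hypothetical data for illustration only.
estimate = [1, 2, 3, 4, 5]
actual = [2, 4, 5, 4, 5]

r = correlation(estimate, actual)
r_swapped = correlation(actual, estimate)                     # swapping x and y
r_rescaled = correlation([10 * v for v in estimate], actual)  # changing units of x

print(round(r, 4), round(r_swapped, 4), round(r_rescaled, 4))
```

All three printed values agree, which matches the properties that r ignores which variable is called x and does not depend on the units of measurement.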
Note: The SPSS manual tells you where to find r using the least squares regression output, but this r is actually the ABSOLUTE VALUE OF r, so you need to figure out the sign yourself by looking at the association (positive or negative) of your data.

[SPSS output: Correlations table for Estimate and Actual (Pearson Correlation, Sig. (2-tailed), N). Pearson Correlation between Estimate and Actual = .936**. ** Correlation is significant at the 0.01 level (2-tailed).]

STEP 4: Find r^2.

r^2 is the percent of variation in y explained by the regression line (the closer to 100%, the better). We can get this from the regression output or by squaring the correlation r.

For Example 1, find the percent of variation in actual volume of trees explained by the regression line.

STEP 5: Find the least squares regression line for the data.

The least squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.

We have data on an explanatory variable x and a response variable y for n individuals. The means and standard deviations of the data are $\bar{x}$ and $s_x$ for x, and $\bar{y}$ and $s_y$ for y; the correlation between x and y is r.

The regression model for the population is:
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

The sample prediction equation of the least squares regression is:
$$\hat{y} = b_0 + b_1 x$$

The slope is:
$$b_1 = r \, \frac{s_y}{s_x}$$
The slope measures the predicted change in the response variable when the explanatory variable is increased by one unit.

The intercept is:
$$b_0 = \bar{y} - b_1 \bar{x}$$
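The slope and intercept formulas above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the data are hypothetical, the helper name `least_squares_line` is made up for this sketch, and SPSS remains the tool the notes actually rely on.

```python
from statistics import mean, stdev

def least_squares_line(x, y):
    """Return (b0, b1, r) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    # Correlation written as a single sum divided by (n - 1) * s_x * s_y.
    r = sum((xi - x_bar) * (yi - y_bar)
            for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x          # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1, r

# Hypothetical data for illustration only.
estimate = [1, 2, 3, 4, 5]
actual = [2, 4, 5, 4, 5]

b0, b1, r = least_squares_line(estimate, actual)
r_squared = r ** 2  # fraction of the variation in y explained by the line

print(f"y-hat = {b0:.2f} + {b1:.2f} x, r^2 = {r_squared:.2f}")
```

For these made-up values the fitted line is roughly y-hat = 2.2 + 0.6x with r^2 near 0.6, so about 60% of the variation in y would be explained by the line.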
Using SPSS:
>Analyze >Regression >Linear
Put estimate into the independent box and actual into the dependent box. Click OK.

[SPSS output: Model Summary (R = .936, with R Square, Adjusted R Square and Std. Error of the Estimate), ANOVA table (Regression, Residual, Total; Sum of Squares, df, Mean Square, F, Sig.) and Coefficients table (rows (Constant) and Estimate; columns B, Std. Error, Beta, t, Sig.). Predictors: (Constant), Estimate. Dependent Variable: Actual.]

For Example 1, find the least squares regression line.

Based on the r^2 value you found previously, do you think this line will be useful for predicting actual tree volumes?

For Example 1, use the regression line to predict the actual volume of a tree with an estimated volume of 3 cubic feet.

We can use a regression line to make predictions as long as we follow these rules:
- Only use the least squares regression line to find y for a specific value of x. (Don't use it to find x for a specific value of y!)
- Extrapolation involves using the line to find y values corresponding to x values that are outside the range of our data x values. Typically we want to avoid this, since the linear pattern may not hold outside the range of x values we observed.
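The two prediction rules above can be made concrete with a small Python sketch. The coefficients and data below are hypothetical placeholders (they are not the SPSS estimates for the tree data), and `predict_actual` is an illustrative helper name, not part of any library.

```python
def predict_actual(b0, b1, x_new, x_observed):
    """Return (predicted y, extrapolation flag) for the line y-hat = b0 + b1 * x."""
    lowest, highest = min(x_observed), max(x_observed)
    # A prediction is an extrapolation when x_new lies outside the observed x range.
    is_extrapolation = x_new < lowest or x_new > highest
    return b0 + b1 * x_new, is_extrapolation

# Hypothetical coefficients and x values for illustration only.
b0, b1 = 2.2, 0.6
estimates = [1, 2, 3, 4, 5]

inside = predict_actual(b0, b1, 3, estimates)    # x within the data range
outside = predict_actual(b0, b1, 20, estimates)  # x far beyond the data range

print(inside)   # prediction with the flag set to False
print(outside)  # prediction with the flag set to True: treat with caution
```

The function still returns a number when extrapolating; the flag is only a reminder that the line may not describe the relationship outside the observed range.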
Example 2: (From Moore and McCabe, fourth edition) During the period after birth, a male white rat gains 40 grams (g) per week until about 10 weeks of age. (This rat is unusually regular in his growth, but 40 g per week is a realistic rate.)
a. If the rat weighed 100 g at birth, give an equation for his weight after x weeks. What is the slope of this line?
b. Would you be willing to use this line to predict the rat's weight at age 2 years? Do the prediction and think about the reasonableness of the results. (There are 454 g in a pound. To help you assess the results, a large cat weighs about 10 pounds.)

Prediction Intervals
A prediction interval predicts a future observation under conditions similar to those used in the study. Since there is variability involved in using a model created from sample data, a prediction interval is better than a single prediction. Prediction intervals are related to confidence intervals. Use SPSS to calculate these intervals.

Residual: The vertical distance between the observed y value and the corresponding predicted y value.
$$\text{residual} = e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i)$$

For example 2, find the residual for tree number and tree number 7.

Assumptions for Regression Inference and Regression Model:
1. Repeated responses of y are independent of each other. This basically means the data come from a simple random sample. (To check this assumption, examine the way in which the units were selected.)
2. For any fixed value of x, the response y varies according to a normal distribution. (To check the assumption of normality, you can make a normal probability plot of the residuals in SPSS.)
3. The relationship is linear. (To check the linearity assumption, you can make a scatterplot or a residual plot of the data.)
4. The standard deviation σ of y is the same for all values of x. The value of σ is unknown. (To check for constant variability, you can look at the residual plot of the data.)
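To make the residual definition above concrete, here is a minimal Python sketch that fits a line and then computes each residual as observed minus predicted. The data are hypothetical, and the slope is computed from sums of squares, which is algebraically the same as the r times s_y over s_x form.

```python
from statistics import mean

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx           # least squares slope
b0 = y_bar - b1 * x_bar  # least squares intercept

# Residual e_i = y_i - (b0 + b1 * x_i): observed minus predicted.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, ei in zip(x, residuals):
    print(f"x = {xi}: residual = {ei:+.2f}")
print(f"sum of residuals = {sum(residuals):.10f}")
```

The residuals from a least squares line with an intercept sum to zero (up to rounding), which is why a residual plot is centered on the horizontal line at zero.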
STEP 6: Make a residual plot and normal probability plot to check the regression assumptions.

It is always important to check that the assumptions of the regression model have been met to determine whether your results are valid. This is also important to do before you proceed with inference.

Normal Probability Plots: If your points fall in a relatively straight line, then you can assume that your response is relatively normal and the second assumption has been met.

To check the normality assumption we make a normal probability plot by doing the following:
Using SPSS:
>Analyze >Regression >Linear
Put estimate into the independent box and actual into the dependent box. Then select plots, click on the box for normal probability plot, and click continue followed by OK.

[Figure: Normal P-P Plot of Regression Standardized Residual; Dependent Variable: Actual; x axis: Observed Cum Prob, y axis: Expected Cum Prob]

For example 2, has the normality assumption been met?

Residual Plots: A residual plot is a scatterplot of the regression residuals against the explanatory variable. It is used to assess the fit of a regression line and to check for constant variability. The residual plot magnifies the deviations from the line to make the pattern easy to read.

If the points are random with no pattern and approximately the same number of points lie above and below the center line, you can feel confident that assumptions three and four have been met. If you have a funnel shape, this shows you that the assumption of constant variance has not been met. If you have some other pattern, like a parabola, this shows you that the linearity assumption has not been met.
Using SPSS:
>Analyze >Regression >Linear
Put estimate into the independent box and actual into the dependent box. Click on the save button, check unstandardized residuals, and click continue. The residuals will appear in the data editor. Make a scatterplot of the residuals on the y axis against the estimated volume on the x axis. To get the line at y = 0, right click on the graph in the chart editor and select Add Y Axis reference line. Then select reference line and enter zero for the y axis position.

[Figure: Residual Plot; x axis: Estimate, y axis: Unstandardized Residual]

For example 2, have the assumptions of linearity and constant variability been met?

Lastly, it is important to check for outliers and influential observations. Look for large residuals and for points that are far from the other points. Often we will want to do the analysis both with and without the outliers, particularly if they are influential observations as well.

Are there any outliers or influential observations in example 2?
[Figure: Scatterplot Without the Influential Observation, with fitted line; x axis: Estimate, y axis: Actual]

[SPSS output: Model Summary (R, R Square, Adjusted R Square, Std. Error of the Estimate) and Coefficients table (rows (Constant) and Estimate; columns B, Std. Error, Beta, t, Sig.) for the data without the influential observation. Predictors: (Constant), Estimate. Dependent Variable: Actual.]
STEP 7: Look at the hypothesis tests and confidence intervals

Up until this point we have looked at some regression related concepts that can be used in an exploratory data analysis setting as well as a more formal setting. We will now look at inference for regression. Before we do this, however, it must be understood that the tests and confidence intervals from here on are only valid for data that have been collected using a random sampling technique such as simple random sampling. If we did not collect our data using a random sample, or if we have conducted a census, these techniques are meaningless.

Test for a Zero Population Correlation:
- State the null and alternative hypotheses: $H_0: \rho = 0$ versus $H_a: \rho > 0$, $H_a: \rho < 0$, or $H_a: \rho \neq 0$.
- Find the test statistic
  $$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
  where n is the sample size and r is the sample correlation.
- Calculate the P value in terms of a random variable T having the $t(n-2)$ distribution. The P value for a test of $H_0$ against
  $H_a: \rho > 0$ is $P(T \ge t)$
  $H_a: \rho < 0$ is $P(T \le t)$
  $H_a: \rho \neq 0$ is $2P(T \ge |t|)$
- Compare the P value to the α level. If P value ≤ α, then reject $H_0$. If P value > α, then fail to reject $H_0$.
- State your conclusions in terms of the problem.

For example, test $H_0: \rho = 0$ versus the alternative $H_a$.
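The test statistic for a zero population correlation can be sketched as a short Python function. The inputs below (r = 0.7746, n = 5) are hypothetical and are not taken from the tree data, and the function name is made up for illustration. The P value is left to SPSS or a t table, since the Python standard library has no t distribution.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0, with t = r*sqrt(n-2)/sqrt(1-r^2)."""
    if n < 3:
        raise ValueError("need at least 3 observations so that df = n - 2 >= 1")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t, n - 2

# Hypothetical inputs for illustration only.
t, df = correlation_t_statistic(r=0.7746, n=5)
print(f"t = {t:.4f} with df = {df}")
```

The sign of t always matches the sign of r, so a one-sided alternative should be checked against the direction of the observed correlation before reading off a P value.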
Confidence Intervals for Regression Slope and Intercept:

A level C confidence interval for the intercept $\beta_0$ is
$$b_0 \pm t^* \, SE_{b_0}$$

A level C confidence interval for the slope $\beta_1$ is
$$b_1 \pm t^* \, SE_{b_1}$$

SPSS will give you these confidence intervals for 95%, but for other confidence levels you may have to use the estimates of the coefficients and their standard errors. (Use the t table with n − 2 degrees of freedom to get t*.)

Using SPSS:
>Analyze >Regression >Linear
Put estimate into the independent box and actual into the dependent box. Click on statistics, select confidence intervals, and click continue followed by OK.

[SPSS output: Coefficients table (rows (Constant) and Estimate; columns B, Std. Error, Beta, t, Sig., and 95% Confidence Interval for B with Lower Bound and Upper Bound). Dependent Variable: Actual.]

For example 2, find the 95% and 99% confidence intervals for the slope and the y intercept.
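As a sketch of how the two intervals above are assembled, the Python below computes the estimates, their standard errors, and the intervals from raw data. Several things here are assumptions: the data are hypothetical, the standard error formulas are the usual simple linear regression ones (s divided by the root of the sum of squared x deviations for the slope), and t* = 3.182 is the t table value for 95% confidence with 3 degrees of freedom.

```python
import math
from statistics import mean

def slope_intercept_intervals(x, y, t_star):
    """Return confidence intervals b +/- t* SE for the slope and the intercept."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # s estimates sigma: root of the residual sum of squares over n - 2.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return {
        "slope": (b1 - t_star * se_b1, b1 + t_star * se_b1),
        "intercept": (b0 - t_star * se_b0, b0 + t_star * se_b0),
    }

# Hypothetical data; t* = 3.182 is the 95% t table value for df = n - 2 = 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intervals = slope_intercept_intervals(x, y, t_star=3.182)

for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

With only five made-up points the slope interval includes zero, so these data alone would not give convincing evidence of a linear relationship. A 99% interval uses the same estimates and standard errors with a larger t*.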
Hypothesis test for the regression slope:
- State the null and alternative hypotheses: $H_0: \beta_1 = 0$ versus $H_a: \beta_1 > 0$, $H_a: \beta_1 < 0$, or $H_a: \beta_1 \neq 0$.
- Find the test statistic
  $$t = \frac{b_1}{SE_{b_1}}, \quad \text{with } df = n - 2$$
  (SPSS will give you the test statistic.)
- Calculate the P value. (SPSS will give you the two sided P value. If you have a one sided test and the estimated slope lies in the direction stated by $H_a$, divide the P value by 2.)
- Compare the P value to the α level. If P value ≤ α, then reject $H_0$. If P value > α, then fail to reject $H_0$.
- State your conclusions in terms of the problem.

The test statistic for the correlation is numerically identical to the test statistic used to test the slope. Therefore, you can read the test statistic and P value off of the SPSS output for the slope when doing a test for the correlation.

For example 2, perform a significance test to see whether the slope of the regression line is positive.
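The claim that the slope test and the correlation test share one test statistic can be checked numerically. The sketch below uses hypothetical data and the usual standard error of the slope (an assumption, since the notes read it from SPSS output), then compares the two versions of t.

```python
import math
from statistics import mean

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Slope test: t = b1 / SE_b1 with df = n - 2.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
t_slope = b1 / se_b1

# Correlation test: t = r * sqrt(n - 2) / sqrt(1 - r^2).
r = sxy / math.sqrt(sxx * syy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"t from slope = {t_slope:.4f}, t from correlation = {t_corr:.4f}")
```

Both values come out the same (about 2.12 for these made-up points), which is why the P value reported for the slope can be reused for the correlation test.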
Example 4: This example uses data that are part of a data set from Dr. T.N.K. Raju, Department of Neonatology, University of Illinois at Chicago.

IMR = Infant Mortality Rate
PQLI = Physical Quality of Life Index (indicator of average wealth)

Data columns: Case, PQLI, IMR

How does the physical quality of life index affect the infant mortality rate? Answer the questions below based on the output that follows.
a. Describe the form, direction and strength of the relationship.
b. What is the correlation?
c. What percent of the variation in infant mortality rate is explained by the regression line?
d. Give an estimate for the standard deviation of the model. (Find s.)
e. Do a hypothesis test of $H_0$ versus $H_a$, where $H_0$ sets the parameter equal to 0.
f. What is the equation of the least squares regression line?
g. Use the regression line to predict the IMR for a PQLI of 25.
h. Is the prediction in part g good? Why?
i. Find the residual for case .
j. Find a 99% confidence interval for the slope.
k. What assumptions need to be met for the above to be of use?

[Figure: scatterplot "How Physical Quality of Life affects Infant Mortality Rate"; x axis: PQLI, y axis: IMR]

[SPSS output: Model Summary (R = .300, with R Square, Adjusted R Square and Std. Error of the Estimate) and Coefficients table (rows (Constant) and PQLI; columns B, Std. Error, Beta, t, Sig., and 95% Confidence Interval for B with Lower Bound and Upper Bound). Predictors: (Constant), PQLI. Dependent Variable: IMR.]
[Figure: Residual Plot; x axis: PQLI, y axis: Unstandardized Residual]

[Figure: Normal P-P Plot of Regression Standardized Residual; Dependent Variable: IMR; x axis: Observed Cum Prob, y axis: Expected Cum Prob]

Does the relationship make sense? For example, does it make sense that the infant mortality rate will go up as the physical quality of life index gets better? What could be a potential lurking variable here?

Now let's look at what happens if we add a categorical variable to the picture.

Data columns: Case, PQLI, IMR, Location
Location values as extracted: rural, rural, rural, urban, rural, urban, rural, urban, urban, rural, urban
[Figure: scatterplot "How Physical Quality of Life Affects Infant Mortality Rate"; x axis: PQLI, y axis: IMR; points marked by LOCATION (urban, rural)]
Chapter 23. Inferences for Regression
Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily
More informationSPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationSection 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationSimple Linear Regression, Scatterplots, and Bivariate Correlation
1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.
More informationChapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
More informationDoing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:
Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationCorrelation and Regression
Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationExercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationMultiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.
Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More information2013 MBA Jump Start Program. Statistics Module Part 3
2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just
More informationFormula for linear models. Prediction, extrapolation, significance test against zero slope.
Formula for linear models. Prediction, extrapolation, significance test against zero slope. Last time, we looked the linear regression formula. It s the line that fits the data best. The Pearson correlation
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More information2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationLecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation
Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationSTAT 350 Practice Final Exam Solution (Spring 2015)
PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects
More informationLinear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares
Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationCorrelation and Regression Analysis: SPSS
Correlation and Regression Analysis: SPSS Bivariate Analysis: Cyberloafing Predicted from Personality and Age These days many employees, during work hours, spend time on the Internet doing personal things,
More informationExample: Boats and Manatees
Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant
More informationAn analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression
Chapter 9 Simple Linear Regression An analysis appropriate for a quantitative outcome and a single quantitative explanatory variable. 9.1 The model behind linear regression When we are examining the relationship
More information1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationLinear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples
More informationRelationships Between Two Variables: Scatterplots and Correlation
Relationships Between Two Variables: Scatterplots and Correlation Example: Consider the population of cars manufactured in the U.S. What is the relationship (1) between engine size and horsepower? (2)
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationCHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
More informationMTH 140 Statistics Videos
MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative
More informationChapter 2 Probability Topics SPSS T tests
Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the One-Sample T test has been explained. In this handout, we also give the SPSS methods to perform
More informationThe importance of graphing the data: Anscombe s regression examples
The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective
More informationMultiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationThe Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
More informationWe are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?
Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
Chapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let's look at a similar situation Take an SRS of size n Normal Population : N(μ, σ). Both μ and σ are unknown parameters. Unlike what we used
Premaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction Principal Components Regression is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Correlation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl Pearson's coefficient of correlation c) Spearman's Rank correlation coefficient d)
An analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed
1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
Part 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
Directions for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS 2. Transferring Files to N:\drive or your computer 3. Importing Data from Another File Format...
Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
Using R for Linear Regression
Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional
Homework 8 Solutions
Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (a-d), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.
COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
ABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
SPSS Explore procedure
SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,
Module 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašić University of Delaware Sarajevo Graduate School of Business Often our interest in data analysis
MULTIPLE REGRESSION EXAMPLE
MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother's height ( momheight ) X 2 = father's height ( dadheight ) X 3 = 1 if
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
Nonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) But the TestScore Income relation looks nonlinear... Nonlinear Regression General
Introduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 Descriptive Statistics 2.1 Frequency Tables 2.2 Measures of Central Tendencies
How To Run Statistical Tests in Excel
How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting
Interaction between quantitative predictors
Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors
Factors affecting online sales
Factors affecting online sales Table of contents Summary Research questions The dataset Descriptive statistics: The exploratory stage Confidence intervals Hypothesis tests
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM. R, analysis of variance, Student test, multivariate analysis
Journal of tourism [No. 8] MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM Assistant Ph.D. Erika KULCSÁR Babeş Bolyai University of Cluj Napoca, Romania Abstract This paper analysis
Regression step-by-step using Microsoft Excel
Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
Independent t-Test (Comparing Two Means)
Independent t-Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent
SCHOOL OF HEALTH AND HUMAN SCIENCES DON'T FORGET TO RECODE YOUR MISSING VALUES
SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON'T FORGET TO RECODE YOUR
Predictability Study of ISIP Reading and STAAR Reading: Prediction Bands. March 2014
Predictability Study of ISIP Reading and STAAR Reading: Prediction Bands March 2014 Chalie Patarapichayatham 1, Ph.D. William Fahle 2, Ph.D. Tracey R. Roden 3, M.Ed. 1 Research Assistant Professor in the
Regression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
Moderator and Mediator Analysis
Moderator and Mediator Analysis Seminar General Statistics Marijtje van Duijn October 8, Overview What is moderation and mediation? What is their relation to statistical concepts? Example(s) October 8,
Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance
2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there's no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample
This chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
Chapter 7. One-way ANOVA
Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks
Projects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)
Spring 204 Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
AP STATISTICS REVIEW (YMS Chapters 1-8)
AP STATISTICS REVIEW (YMS Chapters 1-8) Exploring Data (Chapter 1) Categorical Data nominal scale, names e.g. male/female or eye color or breeds of dogs Quantitative Data rational scale (can +,,, with
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) All but one of these statements contain a mistake. Which could be true? A) There is a correlation
SPSS TUTORIAL & EXERCISE BOOK
UNIVERSITY OF MISKOLC Faculty of Economics Institute of Business Information and Methods Department of Business Statistics and Economic Forecasting PETRA PETROVICS SPSS TUTORIAL & EXERCISE BOOK FOR BUSINESS
An SPSS companion book. Basic Practice of Statistics
An SPSS companion book to Basic Practice of Statistics SPSS is owned by IBM. 6th Edition. Basic Practice of Statistics 6th Edition by David S. Moore, William I. Notz, Michael A. Fligner. Published by
When to use Excel. When NOT to use Excel 9/24/2014
Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
The correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the strength of the relationship or association between two quantitative
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not an OLS requirement. Therefore we have: Interpretation
Table of Contents. Preface
Table of Contents Preface Chapter 1: Introduction 1-1 Opening an SPSS Data File 1-2 Viewing the SPSS Screens o Data View o Variable View o Output View 1-3 Reading Non-SPSS Files o Convert
Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015
Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field
We extended the additive model in two variables to the interaction model by adding a third term to the equation.
Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic
Elements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate
1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,
Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Statistics Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business Statistics, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
Normality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com
ANALYSIS OF TREND CHAPTER 5
ANALYSIS OF TREND CHAPTER 5 ERSH 8310 Lecture 7 September 13, 2007 Today's Class Analysis of trends Using contrasts to do something a bit more practical. Linear trends. Quadratic trends. Trends in SPSS.