Regression and Correlation
- Fay Owen
- 7 years ago
STP3 Brief Class Notes
Instructor: Ela Jackiewicz

Chapter: Regression and Correlation

In this chapter we consider bivariate data in the form of ordered pairs $(x, y)$. Either both x and y are observed, or y is observed for specific values of x selected by the researcher. The data must display a linear trend on the scatterplot before we consider fitting a linear equation that expresses y in terms of x.

We will use the following vocabulary with respect to x and y:
y: response variable or dependent variable
x: predictor variable, explanatory variable, or independent variable

Our line is called the Least Squares Regression Line and has the form

$$\hat{y} = b_0 + b_1 x,$$

where $\hat{y}$ is the predicted value of y for a given x. The name reflects the fact that this line has the smallest possible sum of squared errors of all lines that could be fitted to the given data. Errors, or residuals, are $e = y - \hat{y}$.

Formulas for the slope and y-intercept of the Least Squares Regression equation $\hat{y} = b_0 + b_1 x$:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = r\,\frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1 \bar{x},$$

where

$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}, \qquad s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}}$$

are the standard deviations of the x's and y's respectively.

The following example illustrates the computations. We consider people on a diet: X = number of times per week the person eats out, Y = change in weight in pounds (before minus after) after a three-month period of dieting.

The computations are organized in a table with columns $x_i$, $y_i$, $x_i - \bar{x}$, $(x_i - \bar{x})^2$, $y_i - \bar{y}$, $(y_i - \bar{y})^2$, $(x_i - \bar{x})(y_i - \bar{y})$; the total of the $(y_i - \bar{y})^2$ column is SS(total).

$\bar{x} = 25/5 = 5$, $\bar{y} = 40/5 = 8$, $b_1 = -44/22 = -2$, $b_0 = 8 - (-2)(5) = 18$, so our equation is $\hat{y} = 18 - 2x$.

The equation tells us that for every unit increase in x (one more time eating out per week), y (weight lost) decreases by 2 pounds. The more often you eat out, the fewer pounds are lost. The units of the slope are (units of y)/(units of x).

Total Sum of Squares: $SS(total) = \sum (y_i - \bar{y})^2$ gives the total variability in the y values.
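The slope and intercept formulas can be sketched in a few lines of Python. The five (x, y) pairs below are hypothetical: they are not the table from the notes, only illustrative values chosen so that the mean of x is 5 and the mean of y is 8, as in the diet example. The function name is likewise an assumption made for this sketch.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross products and sum of squared x deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through (x_bar, y_bar)
    return b0, b1


# Hypothetical data (assumption): times eating out per week, pounds lost
times_out = [3, 3, 4, 7, 8]
weight_loss = [15, 9, 10, 4, 2]

b0, b1 = least_squares_line(times_out, weight_loss)
print(f"y_hat = {b0:g} + ({b1:g}) x")   # prints: y_hat = 18 + (-2) x
```

Because the intercept is computed as $\bar{y} - b_1\bar{x}$, the fitted line always passes through the point of means, which is a quick sanity check on any hand computation.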
Linear Correlation Coefficient of X and Y, r:

$$r = \frac{1}{n-1}\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

The linear correlation coefficient gives information about the strength and direction of a linear trend in our data. Values close to 1 or -1 indicate an extremely strong linear association between X and Y. The linear correlation r is symmetric: if X and Y are reversed, the value of r remains the same. The sign of r is the sign of the slope, r is unit free, and $-1 \le r \le 1$. In our example $r = -44/\sqrt{22 \cdot 106} = -0.91$, indicating a very strong negative linear trend.

Warning about correlation: a strong (positive or negative) correlation does not have to imply a causal relationship between x and y.

Making predictions using our equation: predict the number of pounds lost if a dieting person eats out 5 times per week: $\hat{y} = 18 - 10 = 8$ lb (this prediction is for x within the data range).

Warning about making predictions: it is safe to predict y for x within the data range. Extrapolation means making predictions for values of the predictor variable (x) outside the range of its observed values. Grossly incorrect predictions can result from extrapolation. For example, the prediction for x = 15 is $\hat{y} = -12$ (a weight gain of 12 lb). One may think that this value is reasonable, since weight gain is to be expected if you eat out so often, but because we have no data in that range, we cannot be sure that our particular trend holds that far outside the range of x values.

Besides the Total Sum of Squares there are two other important sums of squares that we will use. We can obtain the residual sum of squares and the regression sum of squares by computing the predicted value of y for each x from our equation. The table for these computations has columns $x_i$, $y_i$, $\hat{y}_i$, $y_i - \hat{y}_i$, $(y_i - \hat{y}_i)^2$, $\hat{y}_i - \bar{y}$, $(\hat{y}_i - \bar{y})^2$; the totals of the two squared columns are SS(resid) and SS(reg).
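The correlation coefficient and the table of predicted values can be sketched as follows. The data are hypothetical (illustrative pairs with mean 5 for x and mean 8 for y, not the table from the notes), and all variable names are assumptions of this sketch.

```python
from math import sqrt

# Hypothetical data (assumption): times eating out per week, pounds lost
xs = [3, 3, 4, 7, 8]
ys = [15, 9, 10, 4, 2]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Correlation: symmetric in x and y, unit free, between -1 and 1
r = sxy / sqrt(sxx * syy)

# Fitted line and predicted value for each observed x
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in xs]

# Columns of the sums-of-squares table
ss_resid = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))
ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)
ss_total = syy

print(f"r = {r:.3f}")
print("x   y   y_hat  resid")
for x, y, yh in zip(xs, ys, y_hat):
    print(f"{x:<3} {y:<3} {yh:<6g} {y - yh:g}")
print(f"SS(resid) = {ss_resid:g}, SS(reg) = {ss_reg:g}, SS(total) = {ss_total:g}")
```

Swapping `xs` and `ys` leaves `r` unchanged, since the formula treats the two variables identically, while the slope and intercept do change.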
Residual Sum of Squares: $SS(resid) = \sum (y_i - \hat{y}_i)^2$ gives the total variability in the y values unexplained by the regression equation.

Regression Sum of Squares: $SS(reg) = \sum (\hat{y}_i - \bar{y})^2$ gives the total variability in the y values explained by the regression equation.

Regression Identity: SS(total) = SS(reg) + SS(resid). We can verify that in our example: 106 = 88 + 18. We will use the three sums of squares in further definitions.

Problems we may encounter with our data:
Outlier: a data point that lies far from the regression line, relative to the other data points.
Influential observation: a data point whose removal causes the regression line to change considerably. It is usually separated in the x-direction from the other data points, and it pulls the regression line towards itself.

Assessing the fit of our regression line:

We can use SS(resid) to compute the Residual Standard Deviation:

$$s_e = \sqrt{\frac{SS(resid)}{n-2}}$$

This measure tells us how close the data points are to our fitted regression line: the smaller this value, the better the fit. If $s_e$ is smaller than $s_y$ (the standard deviation of the y's), it means a good fit. The units of $s_e$ are the same as the units of y. In our example $s_e = \sqrt{18/3} = 2.45$ lb $< 5.15$ lb $= s_y$, indicating a good fit ($s_y = \sqrt{106/4} = 5.15$).

In nice data, if n is not too small, we expect roughly:
68% of the observed y's to be within $\pm s_e$ of the regression line,
95% of the observed y's to be within $\pm 2 s_e$ of the regression line,
99.7% of the observed y's to be within $\pm 3 s_e$ of the regression line.
In other words, roughly these percentages of data points are expected to lie within those vertical distances (parallel to the y axis) above and below the regression line.

Another value that tells us about the fit of the regression line is the Coefficient of Determination (the square of the linear correlation coefficient r):

$$r^2 = \frac{SS(reg)}{SS(total)} = 1 - \frac{SS(resid)}{SS(total)}$$

This measure gives the percentage of the total variability in the y values that is explained by our regression line. If it is close to 1 (100%), that indicates a good fit. $0 \le r^2 \le 1$.
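The two fit measures depend only on the sums of squares and the sample size, which the short sketch below makes explicit. The function names are assumptions, and the inputs (n = 5, SS(total) = 106, SS(reg) = 88, SS(resid) = 18) are taken here as the diet example's values.

```python
from math import sqrt


def residual_sd(ss_resid, n):
    """Residual standard deviation s_e, with n - 2 degrees of freedom."""
    return sqrt(ss_resid / (n - 2))


def sample_sd_y(ss_total, n):
    """Ordinary standard deviation of the y values, s_y."""
    return sqrt(ss_total / (n - 1))


def r_squared(ss_resid, ss_total):
    """Coefficient of determination: share of variability explained."""
    return 1 - ss_resid / ss_total


# Values assumed for the diet example
n, ss_total, ss_reg, ss_resid = 5, 106, 88, 18

s_e = residual_sd(ss_resid, n)
s_y = sample_sd_y(ss_total, n)
r2 = r_squared(ss_resid, ss_total)

print(f"s_e = {s_e:.2f} lb, s_y = {s_y:.2f} lb, good fit: {s_e < s_y}")
print(f"r^2 = {r2:.2f}")
```

Note the different divisors: $s_e$ uses n - 2 because two parameters (slope and intercept) were estimated, while $s_y$ uses n - 1.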
In our example $r^2 = 1 - 18/106 = 88/106 = 0.83$, which indicates a good fit: 83% of the total variability in the y values is explained by the regression line.

There is an approximate relationship of r to $s_e$ and $s_y$. This approximation is good for large data sets:

$$\frac{s_e}{s_y} \approx \sqrt{1 - r^2}$$

If this ratio is small, it indicates a good fit. The formula can also be expressed as $r^2 \approx 1 - s_e^2 / s_y^2$ and used to approximate $r^2$ for large data sets. In our example $\sqrt{1 - 0.83} = 0.41$ and $s_e/s_y = 2.45/5.15 = 0.48$, and $r^2 \approx 0.77$: not a great approximation, since our data set is small.

Hypothesis test and CI for the slope and correlation coefficient:

Assume that the following linear model applies to X and Y:

$$Y = \mu_{Y|X} + \epsilon, \qquad \mu_{Y|X} = \beta_0 + \beta_1 X,$$

where
$\mu_{Y|X}$ = population mean Y value for a given X,
$\sigma_{Y|X}$ = population standard deviation of the Y's for a given X value,
$\rho$ (rho) = population correlation coefficient.
The term $\epsilon$ in our model represents random error; we include it in the model to reflect that Y varies even if X is fixed.

We can test hypotheses related to the true slope and the true correlation coefficient. The quantities we compute for the Least Squares Regression estimate the parameters in our model: $b_0$ estimates $\beta_0$, $b_1$ estimates $\beta_1$, r estimates $\rho$, and $s_e$ estimates $\sigma_{Y|X} = \sigma_\epsilon$, which is assumed not to depend on x.

If we want to test whether the slope is zero (indicating no linear relationship between x and y), our hypotheses are:

$H_0: \beta_1 = 0$ vs $H_a: \beta_1 \ne 0$ or $H_a: \beta_1 > 0$ or $H_a: \beta_1 < 0$

The choice of alternative depends on the question.

Test statistic: $t_s = \dfrac{b_1}{SE_{b_1}}$, df = n - 2.

Standard Error of the slope $b_1$:

$$SE_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{s_e}{s_x \sqrt{n-1}}$$

The second formula is easier to compute.
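As a hedged illustration of the slope test statistic, the sketch below computes $SE_{b_1}$ and $t_s = b_1 / SE_{b_1}$ and also evaluates the algebraically equivalent form based on the correlation, $r\sqrt{(n-2)/(1-r^2)}$. The data are hypothetical (not the table from the notes) and the function name is an assumption; the P-value still has to be read from a t table or a calculator with df = n - 2.

```python
from math import sqrt


def slope_t_statistic(xs, ys):
    """Return (b1, se_b1, t_slope, t_corr, df) for simple linear regression."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = sxy / sxx
    ss_resid = syy - sxy ** 2 / sxx          # SS(total) - SS(reg)
    s_e = sqrt(ss_resid / (n - 2))           # residual standard deviation
    se_b1 = s_e / sqrt(sxx)                  # standard error of the slope
    t_slope = b1 / se_b1

    # Same statistic written in terms of r (undefined if r is exactly +/-1)
    r = sxy / sqrt(sxx * syy)
    t_corr = r * sqrt((n - 2) / (1 - r ** 2))
    return b1, se_b1, t_slope, t_corr, n - 2


# Hypothetical data (assumption)
xs = [3, 3, 4, 7, 8]
ys = [15, 9, 10, 4, 2]

b1, se_b1, t_slope, t_corr, df = slope_t_statistic(xs, ys)
print(f"b1 = {b1:g}, SE = {se_b1:.4f}, t = {t_slope:.3f}, df = {df}")
```

Both forms give the same value because $b_1 = r\,s_y/s_x$ and $s_e^2 (n-2) = (1 - r^2)\,SS(total)$.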
We can also compute a confidence interval for the true slope. The 95% CI for $\beta_1$ is

$$b_1 \pm t_{0.025}\, SE_{b_1}$$

Equivalently, we may test hypotheses involving the true correlation coefficient. Our hypotheses are:

$H_0: \rho = 0$ vs $H_a: \rho \ne 0$ or $H_a: \rho > 0$ or $H_a: \rho < 0$

Test statistic: $t_s = r\sqrt{\dfrac{n-2}{1-r^2}}$, df = n - 2. It is equivalent to the one before, but easier to compute.

In our example we may ask: assuming that a linear model applies, test the hypothesis that there is no relationship between x and y against the alternative that y decreases with increasing x. Use the 5% significance level.

Our test hypotheses are $H_0: \beta_1 = 0$ and $H_a: \beta_1 < 0$ (or $H_0: \rho = 0$ and $H_a: \rho < 0$).

We will use the second test statistic: with r = -0.91 and n = 5, $t_s = r\sqrt{(n-2)/(1-r^2)} \approx -3.8$, df = 3, and P-value = 0.0157 < 0.05. The null hypothesis is rejected; we have evidence for the alternative. Yes, y decreases with increasing x: there is a statistically significant negative linear relationship between x and y.

Different (computational) formulas for the slope, r, and all three sums of squares are:

$$b_1 = \frac{S_{xy}}{S_{xx}}, \quad SS(resid) = S_{yy} - \frac{S_{xy}^2}{S_{xx}}, \quad SS(total) = S_{yy}, \quad SS(reg) = \frac{S_{xy}^2}{S_{xx}},$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \quad r^2 = \frac{S_{xy}^2}{S_{xx} S_{yy}},$$

where

$$S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}, \quad S_{yy} = \sum y_i^2 - \frac{(\sum y_i)^2}{n}, \quad S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}.$$

CALCULATOR

1) To use the above computational formulas and/or find basic statistics for x and y: use STAT EDIT and place the x's and y's in L1 and L2 respectively, then STAT CALC, select 2-Var Stats <enter>, 2-Var Stats L1, L2 <enter>.

2) To find the regression line: use STAT EDIT and place the x's and y's in L1 and L2 respectively, then STAT CALC, select LinReg(ax+b) <enter>, LinReg(ax+b) L1, L2 <enter>.
The output should display the slope, the y-intercept, $r^2$ and r. If you do not see $r^2$ and r, go to the CATALOG, select DiagnosticOn <enter><enter>; the next time you run the regression it will show both values.

3) To perform the test: STAT TESTS, use LinRegTTest, set Xlist to L1 and Ylist to L2, then select the appropriate alternative hypothesis. (The calculator writes β for $\beta_1$; the test is for both β and ρ.) In the output the value of s is $s_e$; you also get the values of $r^2$ and r there.

4) To compute a CI for $\beta_1$: STAT TESTS, use LinRegTInt, set Xlist to L1 and Ylist to L2, then select the appropriate C-level. If you do not have this interval on your calculator, use s from the LinRegTTest output to compute $SE_{b_1} = \dfrac{s_e}{s_x\sqrt{n-1}}$ and compute the CI by hand.

Least Squares Linear Regression Example

Is the number of beers consumed a good predictor of BAC (blood alcohol content)? Can you drive legally after beers (legal limit is .08)? A random sample of students from Ohio State University participated in a study. Each student was randomly assigned a number of beers to drink, and after 30 minutes a police officer measured their BAC in grams of alcohol per deciliter of blood. The results are recorded as pairs: x = number of beers, y = BAC.

Answer the following questions:
Graph these data to confirm the existence of a linear trend.
Obtain the LS regression line for these data. What does the slope of your line tell you about the change in BAC after you drink one more beer? Do you think the y-intercept value is sensible?
Compute the linear correlation coefficient and the coefficient of determination. What percentage of the total variability in BAC is explained by the regression line? Do r and $r^2$ indicate a good fit?
Clearly answer both questions posed at the beginning of the problem.
Can we use our equation to safely predict BAC for someone who drinks 5 beers?
Given SS(resid) = .005, compute $s_e$, compare it to $s_y$, and decide whether it indicates a good fit.
Test the hypothesis of no linear relationship between x and y against the directional alternative hypothesis that y increases with increasing x.
Obtain a 95% CI for the true slope $\beta_1$.
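For an exercise of this kind, the computational formulas ($S_{xx}$, $S_{yy}$, $S_{xy}$) reduce everything to a handful of sums, which the sketch below mirrors. The BAC table is not reproduced here, so the call uses hypothetical numbers that are not from the study. The function name and the returned dictionary keys are assumptions, and the critical value `t_crit` must be supplied from a t table (3.182 is the two-sided 95% value for df = 3).

```python
from math import sqrt


def regression_from_sums(xs, ys, t_crit):
    """Slope, intercept, sums of squares and a CI for the slope,
    using the computational (raw-sum) formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n

    b1 = sxy / sxx
    b0 = sum_y / n - b1 * sum_x / n
    ss_reg = sxy ** 2 / sxx
    ss_resid = syy - ss_reg
    se_b1 = sqrt(ss_resid / (n - 2)) / sqrt(sxx)
    margin = t_crit * se_b1      # half-width of the confidence interval
    return {
        "sxx": sxx, "syy": syy, "sxy": sxy,
        "b0": b0, "b1": b1,
        "ss_reg": ss_reg, "ss_resid": ss_resid,
        "r": sxy / sqrt(sxx * syy),
        "ci": (b1 - margin, b1 + margin),
    }


# Hypothetical data (assumption), n = 5 so df = 3 and t_crit = 3.182
result = regression_from_sums([3, 3, 4, 7, 8], [15, 9, 10, 4, 2], t_crit=3.182)
low, high = result["ci"]
print(f"Sxx = {result['sxx']:g}, Syy = {result['syy']:g}, Sxy = {result['sxy']:g}")
print(f"95% CI for the slope: ({low:.3f}, {high:.3f})")
```

If the interval excludes 0, a two-sided test of $H_0: \beta_1 = 0$ at the matching significance level rejects the null hypothesis.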
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:
More informationSimple Methods and Procedures Used in Forecasting
Simple Methods and Procedures Used in Forecasting The project prepared by : Sven Gingelmaier Michael Richter Under direction of the Maria Jadamus-Hacura What Is Forecasting? Prediction of future events
More informationcontaining Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.
Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment
More informationRegression step-by-step using Microsoft Excel
Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
More informationA Primer on Forecasting Business Performance
A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationThe Point-Slope Form
7. The Point-Slope Form 7. OBJECTIVES 1. Given a point and a slope, find the graph of a line. Given a point and the slope, find the equation of a line. Given two points, find the equation of a line y Slope
More informationwith functions, expressions and equations which follow in units 3 and 4.
Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationDESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationStatistics 151 Practice Midterm 1 Mike Kowalski
Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and
More informationIntroduction to Statistics and Quantitative Research Methods
Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.
More informationIntroduction to Linear Regression
14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression
More informationPOLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.
Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression
More information