AP Statistics AP Exam Review Linear Regression & Computer Output: Interpreting Important Variables

I. Minitab / Computer Printouts

Below is a computer output. You will be expected to use and interpret computer output on the AP Exam. This output is from Minitab; however, most computer output looks very similar. We will discuss which numbers you need to know, what they mean, and how to interpret them.

SE Coef = 0.3839 represents the standard deviation of the slope.
S = 13.40 represents the standard deviation of the residuals.

Constant: -87.1
This is the y-intercept, the predicted value of the response variable when the explanatory variable is 0. Check the context of the situation; often there can be no such value. In this case, it is not possible to have a negative volume, nor is it possible to have a height of zero.

Height (slope): 1.5433
This is the coefficient of the explanatory variable, so it is the slope. This entire line of numbers deals with regression for the slope. For each increase of one unit in height, the volume is predicted to increase by approximately 1.5433 units. (Actual units were not provided.)

Prediction equation (i.e., least squares regression line): ŷ = -87.1 + 1.54x
This equation is used to make predictions and is based on only one sample.

SE Coef: 0.3839
This is the standard deviation (standard error) of the slope. Remember, these data came from only one sample. We would expect the slope to vary a little from sample to sample. Thus, if we gathered repeated samples, we would expect the sample slope relating tree volume to height to vary by approximately 0.3839 units from sample to sample.

S: 13.40
This is the standard deviation of the residuals. It measures the typical amount by which the observed values differ from the predicted values. Here, the observed volumes of trees typically differ from the predicted volumes by approximately 13.40 units.
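The way the prediction equation and a residual are used can be sketched in a few lines of Python. This is only an illustration: the intercept and slope are read from the printout above, while the tree with height 80 and observed volume 40, and the function names, are hypothetical and not part of the handout.

```python
# Coefficients read from the tree printout (volume predicted from height).
INTERCEPT = -87.1
SLOPE = 1.5433


def predict_volume(height):
    """Predicted volume (y-hat) from the least squares regression line."""
    return INTERCEPT + SLOPE * height


def residual(observed_volume, height):
    """Residual = observed value minus predicted value."""
    return observed_volume - predict_volume(height)


# Hypothetical tree, not from the handout: height 80, observed volume 40.
example_height = 80
example_volume = 40

print(f"predicted volume: {predict_volume(example_height):.2f}")
print(f"residual: {residual(example_volume, example_height):.2f}")

# The slope is the change in predicted volume per one-unit increase in height.
print(f"change per unit: {predict_volume(81) - predict_volume(80):.4f}")
```

A positive residual means the observed value lies above the regression line; a negative residual means it lies below.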

R-Sq (r²): 35.8%
This is the coefficient of determination, which is the fraction or proportion of variation in the y values that is explained by the least squares regression of y on x. About 35.8% of the variation in volume can be explained by the least squares regression of y (volume) on x (height).

r = √r² = √0.358 = 0.598
This is the correlation coefficient. It tells you the strength and direction of the relationship, and it takes the same sign as the slope. With an r value of 0.598, there is a moderate, positive relationship between height and volume of trees.

T: 4.02
This is the test statistic:
t = (statistic - parameter) / (std. dev. of statistic) = b / SE_b = 1.5433 / 0.3839 ≈ 4.02

P: 0
This is the p-value of a linear regression t test. With a p-value of approximately 0, which is less than any common alpha level (0.05, 0.01), reject the null hypothesis. There is evidence of a linear relationship between the volume of a tree and its height.

Example: Minitab / Computer Printouts

Regression Analysis: Height versus Mother Height

The regression equation is
Height = 24.7 + 0.640 Mother Height

Here Height is the dependent variable (y), 24.7 is the intercept (b0), 0.640 is the slope (b1), and Mother Height is the independent variable (x). The estimated regression equation has the form ŷ = b0 + b1x.

Predictor   Coef     SE Coef   T      P
Constant    24.690   8.978     2.75   0.009
Mother H    0.6405   0.1394    4.59   0.000

Coef = estimates, SE Coef = standard deviations of the estimates, T = test statistics, P = p-values.
Constant row: intercept, b0. IGNORE the T and P values in this row.
Mother H row: slope, b1. T and P test H0: β1 = 0 versus a two-tailed alternative. (This test is equivalent to testing for linear correlation between x and y.)

S = 2.973   R-Sq = 35.7%   R-Sq(adj) = 34.0%

S: standard error of the estimate, s_e.
R-Sq: coefficient of linear determination, r². *
R-Sq(adj): adjusted r², used for multiple regression.

* The coefficient of linear correlation is the square root of r², with the same sign as the slope, b1.

For a simple linear regression, Minitab only conducts tests with two-tailed alternatives.

S = 2.973: standard error of the estimate, or the standard deviation of the residuals (actual values versus predicted values).
SE Coef = 0.1394: standard deviation of the slope.
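The relationships between the printed columns can be checked with a short Python sketch. It assumes only the slope, SE Coef, and R-Sq values shown in the two printouts; the helper names are made up for this example.

```python
import math


def t_statistic(coef, se_coef, hypothesized=0.0):
    """t = (statistic - parameter) / (standard deviation of statistic)."""
    return (coef - hypothesized) / se_coef


def correlation_from_r_squared(r_squared, slope):
    """r is the square root of r-squared, with the same sign as the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# Tree printout: slope 1.5433, SE Coef 0.3839, R-Sq 35.8%.
tree_t = t_statistic(1.5433, 0.3839)
tree_r = correlation_from_r_squared(0.358, 1.5433)

# Mother Height printout: slope 0.6405, SE Coef 0.1394, R-Sq 35.7%.
mother_t = t_statistic(0.6405, 0.1394)
mother_r = correlation_from_r_squared(0.357, 0.6405)

print(f"trees:  t = {tree_t:.2f}, r = {tree_r:.3f}")
print(f"mother: t = {mother_t:.2f}, r = {mother_r:.3f}")
```

Dividing each coefficient by its standard error reproduces the T column, which is a quick way to check a printout that is hard to read.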

Practice Minitab / Computer Printouts

1. A sample of men agreed to participate in a study to determine the relationship between several variables including height, weight, waist size, and percent body fat. A scatterplot with percent body fat on the vertical axis and waist size (in inches) on the horizontal axis revealed a positive linear association between these variables. Computer output for the regression analysis is given below:

Dependent variable is: %BF
R-squared = 67.8%
S = 4.713 with 50 - 2 = 48 degrees of freedom

Variable   Coefficient   se of coeff   t-ratio   prob
Constant   -42.734       2.717         -15.7     <.0001
Waist      1.70          0.0743        22.9      <.0001

(a) Write the equation of the regression line (be sure to use correct notation and define your variables):

(b) Explain/interpret the information provided by R-squared in the context of this problem. Be specific.

(c) Calculate and interpret the correlation coefficient (r).

(d) One of the men who participated in the study had waist size 35 inches and 10% body fat. Calculate the residual associated with the point for this individual.
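One possible way to set up parts (c) and (d) is sketched below in Python. It assumes an intercept of -42.734 and a slope of 1.70; the intercept is an assumption chosen because, with a standard error of 2.717, it matches the printed t-ratio of -15.7. The variable and function names are illustrative only.

```python
import math

# Assumed coefficients for predicting percent body fat from waist size (inches).
INTERCEPT = -42.734  # assumption: consistent with t-ratio -15.7 and se 2.717
SLOPE = 1.70
R_SQUARED = 0.678


def predict_body_fat(waist_inches):
    """Predicted percent body fat from the regression line."""
    return INTERCEPT + SLOPE * waist_inches


# Part (c): r is the square root of R-squared, with the sign of the slope.
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

# Part (d): residual = observed - predicted for waist 35 in., 10% body fat.
predicted = predict_body_fat(35)
residual = 10 - predicted

print(f"r = {r:.3f}")
print(f"predicted %BF = {predicted:.3f}, residual = {residual:.3f}")
```

A negative residual here would mean the man has less body fat than the line predicts for his waist size.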

Determine the LSRL, the standard deviation of the slope, the correlation coefficient, and the standard error of the residuals for each:

#2.

#3.

II. More Practice with Linear Regression and Residual Plots

4. Fast food is often considered unhealthy because much fast food is high in fat and calories. The fat and calorie content for a sample of 5 fast-food burgers is provided below.

Fat (g)   Calories
31        580
35        590
39        640
39        680
43        660

a) Identify the explanatory and the response variables:

b) Use the calculator to make a scatter plot of these ordered pairs. Sketch the scatter plot here.

c) What information does the scatter plot provide? That is, use the scatter plot to describe the relationship between the fat grams and calories in a fast-food burger.

d) Find the following summary statistics for this data: x̄, ȳ, s_x, s_y

e) Now use your calculator to record the following statistics and to find the equation of the least squares line. Record the equation and use it for the remaining computations. a, b, r, r², ŷ

f) Examine a graph of the least squares line superimposed on your scatter plot.
Stat > Calc > 8:LinReg(a+bx) > L1, L2, Y1
To get the Y1 to show up: Vars > Y-Vars > 1:Function > 1:Y1
This will graph the LSRL along with your scatterplot. If you go to the Y= screen, you will now see the equation for the LSRL.

g) Does the line appear to be a good model for the data?

h) What is the value of your slope? What information does it provide? Be specific.

i) How many calories would you predict for a burger with 0 fat grams?

j) Calculate the residual for 35 fat grams.

k) Calculate the value of r. What information does it provide? Be specific.

l) What is the value of r²? What does it tell you in this situation?

m) Make a residual plot on your calculator. Be sure to label both axes with words and a friendly scale.

n) Based on this residual plot, do you think the least squares line is a good model for this data?
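For checking calculator work on the burger data, the least squares computations can be sketched in plain Python. The data are the five (fat, calories) pairs from problem 4; the function name and the use of Python in place of the calculator's LinReg(a+bx) command are assumptions of this sketch.

```python
import math

# Burger data from problem 4: fat in grams (x) and calories (y).
fat = [31, 35, 39, 39, 43]
calories = [580, 590, 640, 680, 660]


def least_squares(x, y):
    """Return (a, b, r) for the line y-hat = a + b*x and the correlation r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


a, b, r = least_squares(fat, calories)

# Residual = observed calories minus predicted calories, one per burger.
residuals = [yi - (a + b * xi) for xi, yi in zip(fat, calories)]

print(f"y-hat = {a:.2f} + {b:.3f}x, r = {r:.3f}, r^2 = {r * r:.3f}")
for xi, res in zip(fat, residuals):
    print(f"fat = {xi} g, residual = {res:.2f}")
```

Plotting the residuals against fat grams gives the residual plot asked for in part m; the residuals from a least squares line always sum to zero.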

5. Becky's parents have kept records of her height since she was born. The data set consists of Becky's age in months and her height in centimeters. The summary statistics for the data are provided below:

Mean age: 44 months   std. dev. age: 8.5 months
Mean ht: 8 cm         std. dev. ht: 4.1 cm

The correlation between age and height is 0.86.

(a) Find the equation of the least squares line that you would use to predict Becky's height from her age. Show all work.

(b) What real-world information does the slope provide? Be specific!

(c) Suppose height had been measured in inches rather than in centimeters. What would be the correlation between age and height in inches? Note: 1 inch = 2.54 cm
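The two ideas behind this problem, building the least squares line from summary statistics and the effect of a unit change on correlation, can be sketched in Python. All numbers in the sketch are hypothetical and chosen only for illustration; they are not Becky's data, and the function names are not from the handout.

```python
import math


def lsrl_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Least squares line from summary statistics.

    slope b = r * s_y / s_x, intercept a = y_bar - b * x_bar.
    """
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return intercept, slope


def correlation(x, y):
    """Pearson correlation coefficient r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical summary statistics (not from the problem).
intercept, slope = lsrl_from_summary(x_bar=10, s_x=2, y_bar=50, s_y=4, r=0.5)
print(f"y-hat = {intercept:.1f} + {slope:.1f}x")

# Hypothetical ages (months) and heights (cm), converted to inches.
ages = [36, 40, 44, 48, 52]
heights_cm = [90, 93, 95, 98, 100]
heights_in = [h / 2.54 for h in heights_cm]

r_cm = correlation(ages, heights_cm)
r_in = correlation(ages, heights_in)
print(f"r with cm: {r_cm:.4f}, r with inches: {r_in:.4f}")
```

Rescaling a variable by a positive constant changes the slope and the standard deviation but leaves the correlation unchanged, because r has no units.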