AP Statistics Regression Test PART I : TRUE OR FALSE.

Similar documents
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chapter 7: Simple linear regression Learning Objectives

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.

Simple linear regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Exercise 1.12 (Pg )

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Chapter 13 Introduction to Linear Regression and Correlation Analysis

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables

Father s height (inches)

Section 3 Part 1. Relationships between two numerical variables

Relationships Between Two Variables: Scatterplots and Correlation

Univariate Regression

Example: Boats and Manatees

Homework 8 Solutions

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

2. Simple Linear Regression

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

Describing Relationships between Two Variables

Name: Date: Use the following to answer questions 2-3:

Worksheet A5: Slope Intercept Form

STT 200 LECTURE 1, SECTION 2,4 RECITATION 7 (10/16/2012)

Straightening Data in a Scatterplot Selecting a Good Re-Expression Model

ch12 practice test SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

Correlation key concepts:

Session 7 Bivariate Data and Analysis

Homework 11. Part 1. Name: Score: / null

Chapter 9 Descriptive Statistics for Bivariate Data

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Econometrics Simple Linear Regression

Scatter Plot, Correlation, and Regression on the TI-83/84

The importance of graphing the data: Anscombe s regression examples

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

Unit 9 Describing Relationships in Scatter Plots and Line Graphs

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

AP STATISTICS REVIEW (YMS Chapters 1-8)

G r a d e 1 0 I n t r o d u c t i o n t o A p p l i e d a n d P r e - C a l c u l u s M a t h e m a t i c s ( 2 0 S ) Final Practice Exam

Data Mining Part 5. Prediction

Linear Models in STATA and ANOVA

Algebra 1 Course Information

Logs Transformation in a Regression Equation

table to see that the probability is (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: = 1.

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Lesson Using Lines to Make Predictions

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?

Lean Six Sigma Analyze Phase Introduction. TECH QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY

Dimensionality Reduction: Principal Components Analysis

2) The three categories of forecasting models are time series, quantitative, and qualitative. 2)

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

ALGEBRA I (Common Core) Thursday, January 28, :15 to 4:15 p.m., only

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Chapter 23. Inferences for Regression

Fairfield Public Schools

Regression and Correlation

Diagrams and Graphs of Statistical Data

Module 3: Correlation and Covariance

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

The Big Picture. Correlation. Scatter Plots. Data

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

AP Statistics Ch 3 Aim 1: Scatter Diagrams

AP Statistics. Chapter 4 Review

Using Excel for Statistical Analysis

Statistics 2014 Scoring Guidelines

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014

Getting Correct Results from PROC REG

Chapter 5 Estimating Demand Functions

Premaster Statistics Tutorial 4 Full solutions

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Transforming Bivariate Data

Simple Regression Theory II 2010 Samuel L. Baker

MEASURES OF VARIATION

MATH 100 PRACTICE FINAL EXAM

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

STAT 350 Practice Final Exam Solution (Spring 2015)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Statistics E100 Fall 2013 Practice Midterm I - A Solutions

Grade. 8 th Grade SM C Curriculum

Chapter 11: r.m.s. error for regression

Summarizing and Displaying Categorical Data

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Lesson 20. Probability and Cumulative Distribution Functions

Writing the Equation of a Line in Slope-Intercept Form

Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb

CURVE FITTING LEAST SQUARES APPROXIMATION

Linear Regression. use waist

Introduction to Regression and Data Analysis

MULTIPLE REGRESSION EXAMPLE

17. SIMPLE LINEAR REGRESSION II

THE COST OF COLLEGE EDUCATION PROJECT PACKET

MTH 140 Statistics Videos

AP * Statistics Review. Descriptive Statistics

Transcription:

AP Statistics Regression Test PART I : TRUE OR FALSE. 1. A high correlation between x and y proves that x causes y. 3. The correlation coefficient has the same sign as the slope of the least squares line fitted to the same data. 5. A correlation of -.1 and +.1 show the same degree of clustering around the regression line. 7. The mean of the residuals is always zero. PART II : MULTIPLE CHOICE. Name:. The two coefficients (a and b) for the line of best fit have the same sign.. An r-value greater than zero indicates that ordered pairs with high x- values will have low y- values.. A correlation of.75 indicates a relationship that is 3 times as linear as one for which the correlation is only.5.. A definite pattern in the residual plot is an indication that a nonlinear model will show a better fit to the data than the straight regression line. _ 9. Suppose one has collected data on X the diameter of tree trunk (in inches) and Y tree height (in feet). If the Regression equation is ŷ = -3. + 3.1x, what is your estimate of the average height of all trees having a trunk diameter of 7 inches? (A) 1.1 ft (B) 19.1 ft (C) 0.1 ft (D) 1.1 ft (E).1 ft _. In a random sample of older patients at a large medical practice, the age of a patient and a measure of that patient s hearing loss were recorded. The correlation between age and hearing loss of the patients in the sample was found to be 0.7. Which one of the following would be a correct statement if the age of a patient were used to predict the amount of hearing loss for a patient? (A) Forty-nine percent of the time, the least squares regression line accurately predicts hearing loss. (B) Forty-nine percent of the variation in hearing loss can be explained by the variation in the age of a patient. (C) About 70% of a person s hearing loss can be explained by age, according to the regression line relating hearing loss and age. (D) About 70% of the time, age will correctly predict the amount of hearing loss. (E) The least squares regression line relating hearing loss to age will have a slope of approximately 0.7. _ 11. A correlation between college entrance exam grades and scholastic achievement was found to be 1.0. On the basis of this you would tell the university that: (A) The entrance exam is a good predictor of academic success. (B) The exam is a poor predictor of academic success. (C) Students who do best on this exam will make the worst students. (D) Students at this school are underachieving. (E) They should hire a new statistician. _1. A study of the effects of television measured how many hours of television each of 15 grade school children watched per week during a school year and their reading scores. Which variable would you put on the horizontal axis of a scatterplot of the data? (A) Reading score, because it is the response variable. (B) Reading score, because it is the explanatory variable. (C) Hours of television, because it is the response variable. (D) Hours of television, because it is the explanatory variable. (E) It makes no difference, because there is no explanatory-response distinction in this study. _13. The study described in the previous question found that children who watch more television tend to have lower reading scores than children who watch fewer hours of television. The study report says that, Hours of television watched explained 9% of the observed variation in the reading scores of the 15 subjects. The correlation between hours of TV and reading score must be (A) r = -0.3 (B) r = 0.3 (C) r = -0.09 (D) r = 0.09 (E) Can t tell from the information given.

Height _1. A study of child development measures the age (in months) at which a child begins to talk and also the child s score on an ability test given several years later. The study asks whether the age at which a child talks helps predict the later test score. The least-squares regression line of test score y on age x is ŷ = 1 1.3x. According to this regression line, what happens (on average) each month a child delays speech? (A) The test score goes down 1 points. (B) The test score goes down 1.3 points. (C) The test score goes up 1 points. (D) The test score goes up 1.3 points. (E) The test score is.7. _15. Biologists have collected data relating the age (in years) of one variety of oak tree to its height (in feet). A scatterplot for 0 of these trees is given below. Scatterplot of Height vs Age 50 0 30 0 A 0 5 Age 15 0 If the point labeled A in the above scatterplot were changed to have a height of 0 feet, which one of the following statements would be true? (A) The slope of the least squares regression line would decrease and the correlation would increase. (B) The slope of the least squares regression line would decrease and the correlation would decrease. (C) The slope of the least squares regression line would increase and the correlation would increase. (D) The slope of the least squares regression line would increase and the correlation would decrease. (E) The slope of the least squares regression line and the correlation would remain the same. _1. If the correlation between body weight and annual income were high and positive, we could conclude that: (A) High incomes cause people to eat more food. (B) Low incomes cause people to eat less food. (C) High-income people tend to spend a greater proportion of their income on food than low-income people, on average. (D) High-income people tend to be heavier than low income people, on average. (E) High incomes cause people to gain weight. _17. Residuals are (A) (B) (C) (D) (E) possible models not explored by the researcher. variation in the data that is explained by the model. the difference between observed responses and values predicted by the model. data collected from individuals that is not consistent with the rest of the group. the sum of the error squares of all the data points. _1. Which of the following statements about influential scores are true? I. Influential scores have large residuals. II. Removal of an influential score sharply affects the regression line. III. An x-value that is an outlier in the x-variable is more indicative that a point is influential than a y-value that is an outlier in the y-variable. (A) I and II (B) I and III (C) II and III (D) I, II, and III (E) None of the above

Minutes Late Minutes Late _19. An efficiency expert wanted to see if there is a relationship between the number of people attending a meeting and the number of minutes late that the meeting started. The table shows the results with the accompanying scatter plot (Figure 1). Number of people attending the meeting 3 5 Number of minutes late the meeting started 3 1 7 Figure represents the scatter plot with the point (, 7) removed from the data. Scatterplot of Minutes Late vs Number attending Scatterplot of Minutes Late vs Number attending 1 1 1 1 1 3 5 7 Number attending 9 3 Number attending 5 Figure 1 Figure Which one of the following is TRUE about the point (, 7)? (A) It has the largest residual. (B) It is an influential observation. (C) It is an outlier in the x-direction. (D) The slope changes little when the point is removed. (E) The correlation is lower with the removal of the point. _0. Suppose the scatterplot of logx and logy shows a strong positive correlation close to 1. Which of the following must be true? I. The variables X and Y also have a correlation close to 1. II. A scatterplot of the variables X and Y shows a strong nonlinear pattern. III. The residual plot of the variables X and Y shows a random pattern. (A) I only (B) II only (C) III only (D) I and II (E) I, II, and III _1. The model can be used to predict the stopping distance (in feet) for a car traveling at a specific speed (in mph). According to this model, about how much distance will a car going 5 mph need to stop? (A).3 ft (B) 1. ft (C) 7 ft (D) 35 ft (E) 79 ft PART III: FREE RESPONSE. There is a relationship between the amount of money in a person s retirement account and the amount of time that money has been invested. Each of the following strategies has been used to straighten the data. Find the predicted value of Money using each model for Time = 30 years. (A) (B) (C)

3. According to the article, "First-Year Academic Success: A Prediction for Caucasian and African-American Students" (Journal of College Student Development (1999) there is a mild correlation between high school GPA (HS) and first-year college GPA (Coll). The data, which was collected at a "Southeastern public research university", can be summarized as follows: mean high school GPA = 3.7 mean college GPA =. standard dev of high school GPA =.7 standard dev of college GPA =.5 r. (a) Find the least square regression line that could be used to predict college GPA. (SHOW ALL WORK) (b) The correlation indicates a fairly strong relationship between high school and college GPA. Is this relationship causal or associative? Justify your answer.. A researcher believes that there is a relationship between the length of your pinky finger (in inches) and your height (in feet). After measuring many of these attributes of people, she came up with the LSRL of. (a) Adam has a pinky finger that measures 1.35 inches long. His residual from this LSRL is 0.0 feet. How tall is Adam? (b) Hailey is 5.75 ft tall and has a 1.5 inch pinky finger. What is Hailey s residual?

5. In a study of the application of a certain type of weed killer, 1 fields containing large numbers of weeds were treated. The weed killer was prepared at seven different strengths by adding 1, 1.5,,.5, 3, 3.5, or teaspoons to a gallon of water. Two randomly selected fields were treated with all the various strengths of weed- killer. After a few days, the percentage of weeds killed on each field was measured. The computer output obtained from fitting a least squares regression line to the data is shown below. A plot of the residuals is provided as well. (a) What is the equation of the least squares regression line given by this analysis? (b) Interpret the slope of the regression line in part (a). (c) Interpret the coefficient of determination (R ). (d) Is a linear model appropriate for this data? Explain your reasoning. (e) Use the regression equation from part (a) to predict the percentage of weeds killed when.5 teaspoons of weed killer are used AND decide which of the following you would expect to be a reasonable conclusion. Explain your reasoning. The prediction will be too large. The prediction will be too small. The prediction would have a residual of zero.