Name Period AP Statistics Unit 10 Review


Use the following to answer questions 1-4:

At what age do babies learn to crawl? Does it take longer for them to learn in the winter, when babies are often bundled in clothes that restrict their movements? Data were collected from parents who brought their babies into the University of Denver Infant Study Center to participate in one of a number of experiments. Parents reported the birth month and the age at which their child was first able to creep or crawl a distance of four feet within one minute. The resulting data were grouped by month of birth. The data below are for January, May, and September. (Crawling age is given in weeks.)

              Crawling Age
Birth Month   Mean    Std. Dev.   n
January       29.84   7.08        32
May           28.58   8.07        27
September     33.83   6.93        38

Assume that the data represent three independent SRSs, one from each of the three populations of interest (all babies born in a particular month), and that crawling ages are normally distributed for all three populations. A partial ANOVA table is given below.

Source        df   Sums of Squares   Mean Square   F-ratio
Birth month        505.26
Error                                53.45
Total

1. What are the degrees of freedom for birth month (numerator)?
A) 2
B) 3
C) 4
D) 94
E) 97

2. What are the degrees of freedom for error (denominator)?
A) 2
B) 3
C) 4
D) 94
E) 97

3. The null hypothesis for the ANOVA F test is that the population mean crawling ages are equal for all three birth months. Which of the following is an appropriate alternative hypothesis?
A) The population mean crawling age is larger for January than for the other two months.
B) The population mean crawling age is larger for May than for the other two months.
C) The three months all have different population mean crawling ages.
D) The population mean crawling ages for the three months are all within one standard deviation of each other.
E) The population mean crawling age is different for at least one of the three months.

4. Which of the following is the value of the ANOVA F test statistic for equality of the population means of the three birth months?
A) 3.15
B) 3.42
C) 4.73
D) 6.30
E) 9.45
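The arithmetic behind questions 1, 2, and 4 can be checked with a short sketch. It assumes that 505.26 is the sum of squares for birth month and 53.45 is the mean square for error, a reading that reproduces the keyed answer of 4.73. The variable names are illustrative only.

```python
# Sample sizes copied from the crawling-age table.
sample_sizes = {"January": 32, "May": 27, "September": 38}

# Assumed placement of the two values in the partial ANOVA table.
ss_groups = 505.26   # sum of squares for birth month
ms_error = 53.45     # mean square for error

k = len(sample_sizes)                 # number of groups
n_total = sum(sample_sizes.values())  # total number of babies

df_groups = k - 1        # numerator degrees of freedom
df_error = n_total - k   # denominator degrees of freedom

ms_groups = ss_groups / df_groups  # mean square for birth month
f_stat = ms_groups / ms_error      # ANOVA F statistic

print(df_groups, df_error, round(f_stat, 2))
```

The total degrees of freedom, n_total - 1, equal df_groups + df_error, which is a quick consistency check on any ANOVA table.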

Use the following for questions 5-8:

A high school teacher suspects that students of different ages estimate the ages of adults differently. He asks randomly selected sophomores, juniors, and seniors to guess the age of a person in a photograph and plans to compare the mean age guesses using one-way analysis of variance. Here is the computer output from his analysis:

Source        DF   SS       MS      F      P
School Year    2   135.53   67.76   6.85   0.003
Error         43   425.08    9.89
Total         45   560.61

S = 3.144   R-Sq = 24.17%   R-Sq(adj) = 20.65%

Individual 95% CIs For Mean Based on Pooled StDev

Level       N    Mean     StDev
Senior      15   43.800   2.426
Junior      16   47.875   3.948
Sophomore   15   46.733   2.789

[Interval plot: one 95% confidence interval per level, on an axis marked 44.0, 46.0, 48.0, 50.0]

Pooled StDev = 3.144

5. Which of the following is the appropriate null hypothesis for the ANOVA F-test in this situation?
A) The population mean age guess for seniors is higher than the mean age guess for juniors and sophomores.
B) The population mean age guesses for all three age groups are different.
C) The population mean age guess for at least one age group is different from the others.
D) The population mean age guesses for all three age groups are equal.
E) The population mean age guesses for at least two of the three age groups are equal.

6. Which of the following statements about required conditions for the ANOVA F-test is correct in this situation?
A) None of the three distributions of sample guesses should show signs of strong skew.
B) As long as there are no outliers, the ANOVA test is appropriate.
C) As long as two of the three distributions of sample guesses are close to Normally distributed, the test is robust with respect to strong skew in the third distribution.
D) The shapes of the distributions of sample guesses don't matter, because the condition of equal sample standard deviations is violated.
E) The shapes of the distributions of sample guesses don't matter, because the condition of independence has been violated, since the three grade levels were sampled from the same school.

7. Assuming all necessary conditions have been met, what is the appropriate conclusion for the ANOVA F test?
A) Reject Ho. These data do not provide enough evidence to conclude that there is a difference in the true mean age guesses in the three age groups.
B) Reject Ho. These data provide convincing evidence that there is a difference in the true mean age guesses in these three age groups.
C) Accept Ha. These data provide convincing evidence that there is a difference in the true mean age guesses in these three age groups.
D) Fail to reject Ho. These data do not provide enough evidence to conclude that there is a difference in the true mean age guesses in the three age groups.
E) Fail to reject Ho. These data provide convincing evidence that there is a difference in the true mean age guesses in these three age groups.
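The summary lines in the computer output for questions 5-8 follow directly from the SS and DF columns. The sketch below recomputes them from the printed values; small differences in the last digit are expected because the output itself is rounded. Variable names are illustrative.

```python
import math

# Values copied from the ANOVA output for the age-guess study.
ss_group, ss_error, ss_total = 135.53, 425.08, 560.61
df_group, df_error = 2, 43

ms_group = ss_group / df_group   # mean square for School Year
ms_error = ss_error / df_error   # mean square for Error

f_stat = ms_group / ms_error       # F = MS(group) / MS(error)
r_squared = ss_group / ss_total    # share of variation explained by group
pooled_sd = math.sqrt(ms_error)    # S, the pooled standard deviation

print(round(f_stat, 2), round(100 * r_squared, 2), round(pooled_sd, 3))
```

The P-value of 0.003 is the area to the right of this F statistic under an F distribution with 2 and 43 degrees of freedom; it is not recomputed here because that needs a statistics library.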

8. Based on the numerical summaries in the computer output, which of the following statements is true?
A) All three samples have about the same range.
B) The mean age guess by seniors is closest to the person's actual age.
C) There appears to be little difference between the age guesses of the three age groups.
D) Age guesses by juniors are significantly higher than age guesses by sophomores.
E) On average, the age guesses of seniors are much lower than those of the other two age groups.

Use the following for questions 9-10:

Below are three sets of parallel box plots, labeled Set A, Set B, and Set C. Each set of box plots describes the results of random samples of size n = 30 from three independent populations.

[Figure: three panels of parallel box plots, Set A (Groups A1, A2, A3), Set B (Groups B1, B2, B3), and Set C (Groups C1, C2, C3); horizontal axis in each panel: Scores, marked 4 to 24]

An ANOVA F test was performed on each set of samples to compare means. Assume conditions for performing the F test were met in each case.

9. Which one of the following statements is supported by these box plots?
A) Set A has much larger within-group variation than either Set B or Set C.
B) Set B has more between-group variation than Set C.
C) Set C has much larger within-group variation than either Set A or Set B.
D) Set B has the lowest between-group variation.
E) Set A has much less between-group variation than Set C.

10. Which of the following describes the relationship between the F statistics for these three ANOVA tests?
A) FSet A FSet C
B) FSet A FSet C
C) FSet B FSet A FSet C
D) FSet C FSet A
E) FSet C FSet A
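Questions 9 and 10 turn on how the F statistic responds to between-group and within-group variation. The sketch below computes a one-way F statistic from raw data for two made-up sets of groups that share the same within-group spread but differ in how far apart the group means are. The data are hypothetical and are not taken from the box plots.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical data: identical spread in every group, only the centers move.
base = [10, 12, 14, 16, 18]
close_means = [base, [x + 1 for x in base], [x + 2 for x in base]]
far_means = [base, [x + 4 for x in base], [x + 8 for x in base]]

f_close = one_way_f(close_means)
f_far = one_way_f(far_means)
print(f_close, f_far)
```

With the within-group variation held fixed, spreading the group means farther apart raises F; shrinking the within-group variation while holding the means fixed would have the same effect.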

11. Below is a partial computer output for simple linear regression of Fuel versus Car Weight.

Predictor   Coef         SE Coef     T        P
Constant    57.024       2.548       22.38    0.000
weight      -0.0084428   0.0007686   -10.98   0.000

S = 1.54785   R-Sq = 87.0%   R-Sq(adj) = 86.3%

Note that R-Sq is 87% for this regression, while R-Sq for the multiple regression model that included IndCyl and Weight*IndCyl is 92%. Which of the following is an appropriate interpretation of this information?
A) The number of cylinders a car has accounts for 92% of variation in Fuel efficiency.
B) The number of cylinders a car has accounts for 5% of variation in Fuel efficiency.
C) A regression model that includes both Car weight and Number of cylinders accounts for 5% more of the variation in Fuel efficiency than one that includes only Car weight.
D) A regression model that includes both Car weight and Number of cylinders accounts for 92% more of the variation in Fuel efficiency than one that includes only Car weight.
E) On average, a six-cylinder car uses 5% more fuel than a four-cylinder car.

Use the following for questions 12-13:

Suppose a medical researcher is investigating the relationship between age and systolic blood pressure (SBP) in men from a certain population. He intends to model the relationship with the equation $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$, where $x_1$ = age, $x_2 = 0$ for men under age 40, and $x_2 = 1$ for men over age 40.

12. Which of the following does the coefficient $\beta_1$ represent?
A) The y-intercept of a line describing the relationship between age and SBP for men under age 40 in this population.
B) The slope of the line describing the relationship between age and SBP for men under age 40 in this population.
C) The slope of the line describing the relationship between age and SBP for men over age 40 in this population.
D) The difference in slopes between the line describing the relationship between age and SBP for men under age 40 in this population and the line describing the same relationship for men over age 40.
E) The difference in y-intercepts between the line describing the relationship between age and SBP for men under age 40 in this population and the line describing the same relationship for men over age 40.

13. Which of the following does the coefficient $\beta_3$ represent?
A) The y-intercept of a line describing the relationship between age and SBP for men under age 40 in this population.
B) The slope of the line describing the relationship between age and SBP for men under age 40 in this population.
C) The slope of the line describing the relationship between age and SBP for men over age 40 in this population.
D) The difference in slopes between the line describing the relationship between age and SBP for men under age 40 in this population and the line describing the same relationship for men over age 40.
E) The difference in y-intercepts between the line describing the relationship between age and SBP for men under age 40 in this population and the line describing the same relationship for men over age 40.
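The roles of the coefficients in questions 12 and 13 become visible once each value of the indicator is substituted into the model. The derivation below assumes the model has the usual interaction form $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$, with $x_1$ = age and $x_2$ the over-40 indicator; the $\beta$ notation is an assumption, since the original symbols did not survive.

```latex
\begin{align*}
x_2 = 0 \ (\text{under 40}): \quad & y = \beta_0 + \beta_1 x_1 \\
x_2 = 1 \ (\text{over 40}): \quad & y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, x_1
\end{align*}
```

Under this reading, $\beta_1$ is the slope for men under 40, $\beta_2$ is the difference in y-intercepts between the two lines, and $\beta_3$ is the difference in slopes.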

Sunyan's favorite exercise machine is a stair climber. He can adjust the resistance level of the machine on a 1 to 10 scale, but he has discovered that the number of simulated floors the machine says he has climbed at any given level varies from session to session. Sunyan decides to explore the relationship between the machine's resistance level and the number of floors climbed for workouts of two durations: 20 minutes and 30 minutes. For each time length, he records the number of floors climbed at six different resistance levels. A scatterplot for number of floors versus resistance level reveals two linear relationships with equal slopes, one for each workout length. Output from a regression analysis of Sunyan's data is given below. The indicator variable Time takes value 0 for 20-minute workouts and 1 for 30-minute workouts.

Predictor   Coef     SE Coef   T       P
Constant    44.700   2.846     15.70   0.000
Level       7.6000   0.4739    16.04   0.000
Time        40.333   1.619     24.92   0.000

S = 2.80344   R-Sq = 99.0%   R-Sq(adj) = 98.8%

14. On average, what is the predicted increase in number of floors climbed for a one-unit increase in resistance level?
A) 2.80344
B) 2.846
C) 7.6000
D) 40.333
E) The answer depends on the length of the workout and can't be determined from the information given.

Based on a sample of the salaries of professors at a major university, you have performed a multiple regression relating salary to years of service and gender. The estimated multiple linear regression model is

Salary = $45000 + $3000(Years) + $4000(Gender) + $1000[(Years)(Gender)],

where Gender = 1 if the professor is male and Gender = 0 if the professor is female.

15. Using this model, which of the following is the predicted difference in the salaries of a male professor with three years of service and a female professor with three years of service?
A) $3000
B) $4000
C) $5000
D) $7000
E) $9000
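Both models above can be evaluated directly. The sketch below encodes the fitted stair-climber equation (no interaction term, so the slope for Level does not depend on Time) and the salary equation (with an interaction term, so the gender gap grows with years of service). The function names are illustrative and not part of the original worksheet.

```python
def predicted_floors(level, time):
    """Fitted stair-climber model; time is 0 (20 min) or 1 (30 min)."""
    return 44.700 + 7.6000 * level + 40.333 * time


def predicted_salary(years, gender):
    """Fitted salary model; gender is 1 for male, 0 for female."""
    return 45000 + 3000 * years + 4000 * gender + 1000 * years * gender


# Question 14: change in predicted floors per one-unit increase in level.
step_20_min = predicted_floors(6, 0) - predicted_floors(5, 0)
step_30_min = predicted_floors(6, 1) - predicted_floors(5, 1)

# Question 15: male minus female prediction at three years of service.
salary_gap = predicted_salary(3, 1) - predicted_salary(3, 0)

print(round(step_20_min, 4), round(step_30_min, 4), salary_gap)
```

The salary gap at a given number of years is 4000 + 1000(Years), so at three years it combines the Gender coefficient with three times the interaction coefficient.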

Use the following for questions 16-18:

To what extent can we predict the fuel efficiency of passenger cars on the basis of their weight and the number of cylinders the engine has? Below is computer output for a regression analysis of 20 randomly selected late-model family sedans. The explanatory variables are Weight (= car weight in pounds) and IndCyl (an indicator variable that takes the value 0 for four-cylinder engines and 1 for six-cylinder engines), and the response variable is Fuel (= highway fuel efficiency in miles per gallon). Assume that the conditions for multiple regression inference have been satisfied.

Predictor       Coef        SE Coef    T       P
Constant        69.916      4.625      15.12   0.000
weight          -0.012991   0.001591   -8.16   0.000
indcyl          -20.258     8.871      -2.28   0.036
weight*indcyl   0.006630    0.002602   2.55    0.021

S = 1.28477   R-Sq = 92.0%   R-Sq(adj) = 90.6%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression        3   305.79   101.93   61.75   0.000
Residual Error   16    26.41     1.65
Total            19   332.20

16. Which of the following is the correct regression equation for six-cylinder engines?
A) Fuel = 49.658 - 0.006361 Weight
B) Fuel = 69.916 - 0.006361 Weight
C) Fuel = 49.658 - 0.012991 Weight
D) Fuel = 49.658 - 0.019621 Weight
E) Fuel = 69.916 - 0.012991 Weight

17. Which of the following is the fuel efficiency predicted by this model for a four-cylinder car that weighs 2800 pounds?
A) 31.1
B) 31.8
C) 33.5
D) 52.1
E) 51.4

18. Which one of the following statements is supported by this regression model?
A) A one-pound increase in the weight of a four-cylinder engine reduces predicted fuel efficiency by the same amount as a one-pound increase in the weight of a six-cylinder engine.
B) For every one-pound increase in the weight of a four-cylinder engine, predicted fuel efficiency increases by 0.001591.
C) For every one-pound increase in the weight of a four-cylinder engine, predicted fuel efficiency increases by 0.006630.
D) For every one-pound increase in the weight of a six-cylinder engine, predicted fuel efficiency decreases by 0.012991 miles per gallon.
E) For every one-pound increase in the weight of a six-cylinder engine, predicted fuel efficiency decreases by 0.006361 miles per gallon.
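The separate lines for four- and six-cylinder cars in questions 16-18 come from setting the indicator to 0 or 1 in the fitted model. The sketch below does that with the coefficients from the output; the helper names are illustrative.

```python
# Coefficients copied from the regression output.
b_const = 69.916
b_weight = -0.012991
b_indcyl = -20.258
b_interaction = 0.006630


def line_for(indcyl):
    """Return (intercept, slope) of Fuel on Weight for a given IndCyl value."""
    intercept = b_const + b_indcyl * indcyl
    slope = b_weight + b_interaction * indcyl
    return intercept, slope


def predicted_fuel(weight, indcyl):
    """Predicted highway mpg for a car of the given weight in pounds."""
    intercept, slope = line_for(indcyl)
    return intercept + slope * weight


six_intercept, six_slope = line_for(1)          # six-cylinder line
four_cyl_2800 = predicted_fuel(2800, 0)         # four-cylinder, 2800 lb

print(round(six_intercept, 3), round(six_slope, 6), round(four_cyl_2800, 1))
```

Because the interaction coefficient is positive, the six-cylinder line has a shallower negative slope than the four-cylinder line.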

Answers 1. A 2. D 3. E 4. C 5. D 6. A 7. B 8. E 9. D 10. E 11. C 12. B 13. D 14. C 15. D 16. A 17. C 18. E