Draft 1, Attempted 2014 FR Solutions, AP Statistics Exam




Free response questions, 2014, first draft!

Some notes: Please make critiques, suggest improvements, and ask questions. This is just one AP stats teacher's initial attempt at solving these. I, like you, want to learn from this process. I simply construct these as a service for both students and teachers to start discussions. There is nothing official about these solutions. I certainly can't even guarantee that they are correct. They probably have typos and errors. But if they generate discussion and help others, then I've succeeded. I do this as a way to invite dialogue about the questions.

You can go here to access the problems. This is a public site. http://media.collegeboard.com/digitalservices/pdf/ap/ap14_frq_statistics.pdf

1. Proportion of those on campus who participated in at least one activity: (7 + 17)/33 ≈ 0.727

Proportion of those off campus who participated in at least one activity: (25 + 12)/67 ≈ 0.552

In this sample, students who reported living off campus were more likely to report participating in no extracurricular activities than those living on campus. Students who reported living on campus were more likely to report participating in exactly one extracurricular activity than those who reported living off campus. Students who reported living on campus were also a bit more likely to report participating in two or more extracurricular activities than those who reported living off campus.

Because the p-value of 0.23 is above any reasonable significance level (like 0.10), we do not have convincing evidence of an association between residential status and the level of extracurricular participation for students at this university.
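The proportions and the p-value in question 1 can be checked with a few lines of Python. This is only a sketch: the "no activity" counts are not quoted in these notes and are derived here by subtracting from the row totals of 33 and 67, and the code assumes the p-value of 0.23 came from a chi-square test of association. The variable and function names are made up for illustration.

```python
import math

# Columns: no activity, exactly one activity, two or more activities.
# The "no activity" cells are an assumption, derived by subtraction
# from the row totals (33 on campus, 67 off campus).
observed = {
    "on campus": [33 - (17 + 7), 17, 7],
    "off campus": [67 - (25 + 12), 25, 12],
}


def at_least_one(row):
    """Proportion of a row that reported one or more activities."""
    return (row[1] + row[2]) / sum(row)


p_on = at_least_one(observed["on campus"])
p_off = at_least_one(observed["off campus"])

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
grand_total = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = 0.0
for i, row in enumerate(rows):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (count - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)

# With exactly 2 degrees of freedom, the chi-square upper tail
# has the closed form exp(-x / 2), so no statistics library is needed.
p_value = math.exp(-chi_sq / 2)

print(f"on campus: {p_on:.3f}, off campus: {p_off:.3f}")
print(f"chi-square = {chi_sq:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The closed-form tail only works because this table has 2 degrees of freedom; a table of another size would need a proper chi-square distribution function.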

2. P(select woman, then another woman, then another woman) = (3/9)(2/8)(1/7) ≈ 0.0119

Yes, there is reason to doubt the manager's claim. A truly random draw rarely results in all three women being picked: less than 2% of all draws of three people would produce a result as unusual as this.

This simulation does not work, because the probability of selecting a woman (or a man) needs to change after each draw. In the real lottery we are selecting people without replacement from a small group (only 9 people), but by using a six-sided die to simulate the drawing, this simulation's plan assumes that we are drawing with replacement.

3. Let X = number of students absent at this school on a randomly selected day. X is approximately normal, with µ_X = 120 and σ_X = 10.5.

P(X > 140) = P(Z > (140 − 120)/10.5) ≈ P(Z > 1.9048) ≈ 0.0284

This question requires that we consider the sampling distribution of X̄, the sample mean, with n = 3. This distribution is approximately normal (because we took a random sample of 3 from an approximately normal population). In this case µ_X̄ = 120 and σ_X̄ = 10.5/√3.

Method I: Both sampling plans create estimates with the same mean: 120 students. But since the standard deviation of the sample mean is lower than the standard deviation of individual days from the population, we would expect that for part (b) we would be less likely to lose funding, because we are less likely to get a sample mean of 140 or higher from three days than we are to get a result of 140 or higher from a single day.

Method II: For part (b), P(X̄ > 140) = P(Z > (140 − 120)/(10.5/√3)) ≈ P(Z > 3.299) ≈ 0.000485. This is a lower probability than in part (a), so we are less likely to lose funding.

P(no Mondays or Fridays in three weeks) = (3/5)^3 = 0.216
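The arithmetic in questions 2 and 3 can be reproduced with Python's standard library. This is a rough check rather than an official solution: it assumes the mean of 120 and standard deviation of 10.5 used above, and the variable names are invented for illustration.

```python
from fractions import Fraction
from statistics import NormalDist

# Question 2: three women drawn without replacement from a group of
# 9 people that contains 3 women.
p_three_women = Fraction(3, 9) * Fraction(2, 8) * Fraction(1, 7)

# Question 3, single day: absences ~ Normal(120, 10.5).
single_day = NormalDist(mu=120, sigma=10.5)
p_single_day = 1 - single_day.cdf(140)

# Question 3, mean of three randomly chosen days: same center,
# but the standard deviation shrinks by a factor of sqrt(n).
n = 3
three_day_mean = NormalDist(mu=120, sigma=10.5 / n ** 0.5)
p_three_day_mean = 1 - three_day_mean.cdf(140)

# Question 3, last part: each week's random day avoids Monday and
# Friday with probability 3/5, independently for three weeks.
p_no_mon_fri = Fraction(3, 5) ** 3

print(f"P(three women)            = {float(p_three_women):.4f}")
print(f"P(X > 140), one day       = {p_single_day:.4f}")
print(f"P(mean > 140), three days = {p_three_day_mean:.6f}")
print(f"P(no Mondays or Fridays)  = {float(p_no_mon_fri):.3f}")
```

Using `Fraction` keeps the two product calculations exact until the final print, which avoids rounding partway through.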

4. A statistical advantage of selecting the median over the mean is that we get an estimate that tends to vary less than the mean does. Samples of incomes tend to be skewed to the right. They also tend to have a few unusually high incomes. Sample means tend to change more depending on whether outlier incomes are selected in your sample or not. Sample medians, however, are more resistant to outliers, and therefore tend to vary less in repeated sampling. When using random samples, either statistic can be unbiased, but medians will show less variability in repeated sampling than means will.

Note: It is unclear whether the mean or the median is a better estimate from a practical perspective. Is the dean trying to estimate total future donations? Is the dean trying to determine a plausible lower limit for the 50th percentile of incomes? It's unclear from the context what the purpose of the informal survey is.

Estimates are strong if they show low bias and low variability.

Assessing bias: Method 1 is likely to result in bias, because those who respond to the survey do so voluntarily. It's reasonable to predict that those who choose to report their income are more satisfied with their income than those who refuse to share. I would predict that Method 1 would produce an estimate that is too high. Method 2, however, is done via random sampling, with a good plan for following up with each person selected. As long as responses are kept confidential, I would predict that Method 2 has much less bias than Method 1.

Assessing variability: A sample of 100 (as in Method 2) would produce an estimate that has more variability than a sample of size 600 (as is predicted with Method 1). However, I would be skeptical of the unsupported claim that the people proposing Method 1 will get 600 responses for their online survey. Even allowing for the possibility of a larger sample size, their claim is not convincing.

I would choose Method 2 over Method 1: despite the possibility of lower variability with Method 1, its clear bias makes it the weaker choice.
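The claim in question 4 that sample medians vary less than sample means for right-skewed incomes can be illustrated with a small simulation. This is a sketch under a strong assumption: incomes are drawn from a lognormal distribution, which is a stand-in for "skewed right with a few very high values" and is not taken from the exam. The sample size of 100 matches Method 2, and the seed, number of repetitions and function name are arbitrary choices.

```python
import random
import statistics

random.seed(2014)  # fixed seed so the run is repeatable


def spread_of_mean_and_median(n=100, reps=2000, sigma=1.0):
    """Draw `reps` samples of size `n` from a lognormal population
    (an assumed model of right-skewed incomes) and return the standard
    deviation of the sample means and of the sample medians."""
    means, medians = [], []
    for _ in range(reps):
        sample = [random.lognormvariate(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)


sd_of_means, sd_of_medians = spread_of_mean_and_median()
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"SD of sample medians: {sd_of_medians:.3f}")
```

The result depends on how skewed the assumed population is. For roughly symmetric, normal-looking data the ordering reverses and the mean is the less variable statistic, so the simulation supports the argument only for strongly right-skewed incomes.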

5. For this problem, we performed a study on a sample of eight randomly selected car models. We recorded:

W = amount paid by a randomly selected woman for this model
M = amount paid by a randomly selected man for this model

Since we record two variables on each model that are clearly associated, this is a matched pairs design, and thus requires a one-sample t test on paired differences.

The parameter of interest is µ_(W−M) = the mean amount that a randomly selected woman pays for a randomly selected car model in this county above what a randomly selected man pays for an identical car of the same model.

We are testing, at a 5% significance level:
H_0: µ_(W−M) = 0
H_A: µ_(W−M) > 0

Conditions for a one-sample t test on paired differences:

Normality: The dot plot of differences appears to be roughly symmetric, with no outliers. It is reasonable to conclude that the underlying distribution of differences is approximately normal.

Randomness: A random sample of eight car models was chosen from tax records of all car purchases in the county. Within that, we randomly selected one woman and one man who purchased a similarly equipped car of that model.

Independence: Since each difference was computed from a randomly selected woman and a randomly selected man for a randomly selected model, I feel confident that individual differences within the sample are independent. I also believe that 8 < (1/10)(number of car models in the county).

Mechanics: Using a t test for paired differences with 7 degrees of freedom:
Sample mean: $585.00
Sample SD: $530.71
n = 8
t statistic: 3.1178
p-value: 0.008

Conclusion: Our p-value of 0.008 is less than 0.05, our significance (alpha) level. Therefore we have convincing evidence that women tend to pay more, on average, than men for car models in this county.
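The mechanics of the paired t test in question 5 can be checked from the summary statistics alone. The sketch below computes the t statistic from the stated mean, standard deviation and sample size, then approximates the one-sided p-value by numerically integrating the t density with Simpson's rule, since the Python standard library has no t distribution. The helper names are hypothetical, and a statistics package or calculator would normally do this step.

```python
import math

# Summary statistics of the eight paired differences (W - M).
n = 8
mean_diff = 585.00
sd_diff = 530.71

df = n - 1
t_stat = mean_diff / (sd_diff / math.sqrt(n))


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0: one half minus the area under
    the density from 0 to t, found with Simpson's rule (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


p_value = t_upper_tail(t_stat, df)
print(f"t = {t_stat:.4f}, df = {df}, one-sided p-value = {p_value:.4f}")
```

The upper tail is the right one here because the alternative hypothesis is that the mean difference is greater than zero.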

6. Predicted FCR = −1.595789 + 0.0372614(175) ≈ 4.925.

Residual = actual FCR − predicted FCR ≈ 5.880 − 4.925 ≈ 0.955

This car consumes about 0.955 gallons per 100 miles more than what the linear model would predict for cars that are as long as this (175 inches).

i. The residual for car A, based on the regression equation predicting FCR from length, is +0.955. This car has a wheel base of about 93 inches. This car is circled to the right.

ii. For car B, the prediction of FCR based on car length is nearly perfect: the actual FCR for car B is just a bit higher than the predicted FCR.

In graph II, we are looking at the association between engine size and the residuals from the regression model to predict FCR from car length. We can see a moderate, positive linear association between these variables. This means that the residuals tend to be negative when engine sizes are smaller (meaning that the regression model with car length tends to over-predict FCR for these cars). When engine sizes are larger, the residuals tend to be positive (meaning that the regression model with car length tends to under-predict FCR).

In graph III we look at the association between wheel base and the residuals from the regression model to predict FCR from car length. In this case, we see no clear pattern between these variables - this means that there is no systematic tendency for predictions from the regression equation to change when wheel base changes.

(d) I recommend that Jamal choose engine size over wheel base. Because the existing linear model's prediction errors vary systematically as engine size increases, we know that including engine size in a regression model will explain additional variability in FCR that is not currently explained by the existing regression model. However, because there is no systematic pattern between wheel base and the residuals, we would not expect wheel base to explain much additional variability in FCR.
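The prediction and residual at the start of question 6 come down to two lines of arithmetic, sketched here in Python. The intercept is taken as negative, since that is the sign that reproduces the stated prediction of about 4.925. The function names are made up for illustration.

```python
# Coefficients of the line predicting FCR (gallons per 100 miles)
# from car length (inches). The negative intercept is an assumption,
# chosen because it gives the stated prediction of about 4.925.
INTERCEPT = -1.595789
SLOPE = 0.0372614


def predict_fcr(length_inches):
    """Predicted fuel consumption rate for a car of the given length."""
    return INTERCEPT + SLOPE * length_inches


def residual(actual_fcr, length_inches):
    """Residual = actual value minus predicted value."""
    return actual_fcr - predict_fcr(length_inches)


predicted = predict_fcr(175)
car_residual = residual(5.880, 175)

print(f"predicted FCR = {predicted:.3f}")
print(f"residual      = {car_residual:.3f}")
```

A positive residual means the car uses more fuel than the line predicts for its length, which is the interpretation given above.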
