Statistics 101 Fall 2013 Midterm I Exam Solutions

1. (9 points) Multiple choice. No explanations needed.

Forbes magazine published data on the annual salary of the chief executive officer for 50 of the best small firms in 2012. The annual salary was reported in $1,000s. The salaries ranged from $44,000 to $2,105,000 and are depicted in the two plots below.

a) (3 points) The mean of these salaries is $730,000. What is your guess of the median: $200,000, $400,000, $600,000, or $1,000,000?

$600,000 (it's the line in the middle of the box in the boxplot).

b) (3 points) What is your guess of the standard deviation for this data: $200,000, $400,000, $600,000, or $1,000,000?

$400,000 (the whole range is about $2 million; divide that by 5 to get a guess of the SD).

c) (3 points) What is your guess of the interquartile range for this data: $200,000, $400,000, $600,000, or $1,000,000?

$600,000 (it's the width of the box in the boxplot).

2. (21 points) The General Social Survey, an observational study, measured the annual income (income: measured in dollars) and education (educ: measured in number of years) for n = 1758 adults in the US. The results of a regression for this data are shown below:

a) (5 points) What is the value for the slope in this model? What is its interpretation?

The slope in this model is $5168.32. This means that for every extra year of school completed, a person's family income is predicted to increase by $5168.32 on average.

b) (4 points) Interpret the value of R² for this regression.

R² for this model is reported to be 0.154. This means that 15.4% of the variability in income can be explained by the number of years of school completed.

c) (4 points) College graduates will typically have 16 years of education after receiving their Bachelor's degree. Based on this model, what is the predicted annual income for a college graduate?

ŷ = a + b(x) = -36133 + 5168(x) = -36133 + 5168(16) = $46,555
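As a quick check of part (c), the prediction can be reproduced in a few lines of Python. This is a minimal sketch that uses the rounded coefficients from the worked answer (intercept -36133, slope 5168); the name predict_income is illustrative and not part of the exam.

```python
# Rounded coefficients taken from the worked answer to part (c).
INTERCEPT = -36133  # fitted income (dollars) at 0 years of education
SLOPE = 5168        # fitted change in income per extra year of education


def predict_income(years_of_education):
    """Return the fitted value y-hat = a + b*x for the regression line."""
    return INTERCEPT + SLOPE * years_of_education


predicted = predict_income(16)
print(predicted)  # 46555
```

With the unrounded slope of 5168.32 the prediction would differ by a few dollars, so the rounded value is only as precise as the coefficients used.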

d) (4 points) A recently graduated friend of yours (with 16 years of education) gets a job and doesn't want to tell you her income, but you know her residual is $25,000. What is your friend's actual income?

e = y - ŷ. Solving for y we get: y = ŷ + e = 46555 + 25000 = $71,555

e) (4 points) This friend of yours interprets this model to mean that more education causes increased income later in life. Briefly comment on your friend's interpretation. Provide a specific explanation, if appropriate.

This may not be a correct conclusion; this relationship may not be causal. Since this is based on a survey (an observational study), there may be confounding factors in this association. For example, people who are hard-working or more driven may end up getting more education and have more income. If you took away this extra education, they might make the same amount of income anyway.

3. (24 points) The National Collegiate Athletic Association (NCAA) requires a Division I athlete (one that has an average high school GPA) to score at least 820 on the combined math and verbal parts of the SAT exam to compete in their first college year. In 2012, the scores of all students nationwide taking the SATs were approximately normally distributed with mean μ = 1012 and standard deviation σ = 219.

a) (6 points) What proportion of all students nationwide had scores less than 820?

P(X < 820) = P(Z < (820 - 1012)/219) = P(Z < -0.88) = P(Z > 0.88) = 0.1894
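The table lookup in part (a) can be cross-checked with Python's standard library. The sketch below is an illustration, not part of the exam: it computes the probability once with the z-score rounded to two decimals (as a printed z-table requires) and once without rounding, so the two values differ slightly in the third decimal place.

```python
from statistics import NormalDist

# SAT scores modeled as Normal(mean 1012, SD 219), as stated in the problem.
sat = NormalDist(mu=1012, sigma=219)

# Standardize the cutoff of 820.
z = (820 - sat.mean) / sat.stdev

# Probability using the z-score rounded to two decimals, as in a z-table.
p_table = NormalDist().cdf(round(z, 2))

# Probability computed directly from the unrounded z-score.
p_exact = sat.cdf(820)

print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```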

b) (6 points) The NCAA is considering raising this minimal SAT level to 900. What percent of all students nationwide would this new policy affect and were not affected by the old policy?

P(820 < X < 900) = P(X < 900) - P(X < 820) = P(Z < (900 - 1012)/219) - P(Z < (820 - 1012)/219) = P(Z < -0.51) - P(Z < -0.88) = 0.3050 - 0.1894 = 0.1156

c) (6 points) Harvard reports that 56% of Harvard's population scored 1400 or higher on the combined math and verbal parts of the SATs. What is the corresponding SAT score for all students nationwide? That is, what score on the SAT is needed to be below 56% of all students nationwide (not just Harvard)?

Here, we need to first find the z-value that puts 56% of the distribution to the right of it (so we know it should be negative). Looking it up on the standard normal z-table, we see that a z-value of z* = -0.15 is the value we want. Thus in terms of SAT scores in the general population:

X = μ + z*(σ) = 1012 - 0.15(219) = 979

d) (6 points) There are 29 players on Harvard's men's soccer team. Assuming they are a random sample from Harvard's population, what is the approximate probability that fewer than half of them scored 1400 or higher on their combined math and verbal parts of the SATs?

Let X = # soccer players with an SAT score above 1400. It is safe to assume that X follows a Binomial distribution. More specifically, X ~ Bin(n = 29, π = 0.56). First we need to calculate:

[calculation not recoverable from the source]

Note: we used the normal approximation to the Binomial here, and that's OK since nπ = 29(0.56) = 16.24 ≥ 10 and n(1 - π) = 29(0.44) = 12.76 ≥ 10.
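The worked calculation for part (d) is not legible in this copy, so the following is only a sketch of how a normal approximation to Bin(29, 0.56) could be carried out, not the exam's own arithmetic. It assumes a continuity correction (cutoff at 14.5), which the original solution may or may not have used, and compares the result with the exact binomial sum.

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi = 29, 0.56  # number of players and P(score >= 1400), from the problem

# Mean and SD of the binomial count X.
mean = n * pi
sd = sqrt(n * pi * (1 - pi))

# "Fewer than half" of 29 players means X <= 14.
# ASSUMPTION: use a continuity correction, i.e. evaluate the normal CDF at 14.5.
p_normal = NormalDist(mean, sd).cdf(14.5)

# Exact binomial probability P(X <= 14) for comparison.
p_exact = sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(15))

print(round(mean, 2), round(sd, 3), round(p_normal, 4), round(p_exact, 4))
```

Dropping the continuity correction (for example, evaluating the CDF at 14 or at 14.5 without rounding z) shifts the answer somewhat, which is why the cutoff is flagged as an assumption.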

4. (20 points) According to a study by the Harvard School of Public Health, 44% of college students engage in binge drinking, and 56% are not binge drinkers (they either drink moderately or abstain entirely). Another study has found that among student binge drinkers, 17% have been involved in an alcohol-related automobile accident. Among students who are not binge drinkers, 9% have been involved in such accidents.

a) (5 points) Are the events "binge drinking" and "being involved in an alcohol-related automobile accident" independent? Support your statement numerically.

Let's define the events:
A = auto accident
B = binge drinker

We were given P(B) = 0.44, P(A | B) = 0.17 and P(A | B^C) = 0.09. Since P(A | B) = 0.17 ≠ P(A | B^C) = 0.09, we know that A and B are dependent.

b) (5 points) What is the probability that a randomly selected college student will be both a binge drinker and will have been involved in an alcohol-related automobile accident?

P(A and B) = P(A | B)*P(B) = 0.17(0.44) = 0.0748

c) (5 points) What is the probability that a randomly selected college student will be a binge drinker or will have been involved in an alcohol-related automobile accident?

P(A or B) = P(A) + P(B) - P(A and B) = 0.1252 + 0.44 - 0.0748 = 0.4904

Note: P(A) = P(A and B) + P(A and B^C) = 0.44(0.17) + 0.56(0.09) = 0.0748 + 0.0504 = 0.1252

Or it can be solved by: P(A or B) = P(B) + P(A and B^C) = 0.44 + 0.56(0.09) = 0.4904
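The arithmetic in parts (a) through (c) can be verified with a short script. This is a minimal sketch using only the three probabilities given in the problem; the variable names are illustrative.

```python
# Probabilities given in the problem statement.
p_b = 0.44              # P(B): student is a binge drinker
p_a_given_b = 0.17      # P(A | B): accident, given binge drinker
p_a_given_not_b = 0.09  # P(A | B^C): accident, given not a binge drinker

# Multiplication rule for each branch.
p_a_and_b = p_a_given_b * p_b
p_a_and_not_b = p_a_given_not_b * (1 - p_b)

# Law of total probability, then the general addition rule.
p_a = p_a_and_b + p_a_and_not_b
p_a_or_b = p_a + p_b - p_a_and_b

# Independence would require P(A and B) = P(A) * P(B).
independent = abs(p_a_and_b - p_a * p_b) < 1e-12

print(round(p_a_and_b, 4), round(p_a, 4), round(p_a_or_b, 4), independent)
```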

d) (5 points) Given a college student has been involved in an alcohol-related automobile accident, what is the probability that he or she is a binge drinker?

Based on previous work: P(B | A) = P(A and B) / P(A) = 0.0748 / 0.1252 = 0.597

Note, this can be calculated based on the 2x2 table or directly by Bayes' theorem.

5. (9 points) Multiple Choice: no explanation is needed. Note, these problems are not related.

a) (3 points) Which of the following is NOT a property of r, the correlation coefficient?
i) r is always between 0 and 1.
ii) r does not depend on the units of y or x.
iii) r measures the strength of the linear relationship between x and y.
iv) r does not depend on which of the two variables is labeled as x.

b) (3 points) On a statistics exam with a mean of 76 and SD of 12, Tom scored one standard deviation above the mean, Mary had an exam score of 79, and Bill had a z-score of z = -0.5. Place these three students in order from lowest to highest score.
i) Bill, Mary, Tom
ii) Mary, Tom, Bill
iii) Tom, Bill, Mary
iv) Tom, Mary, Bill
v) Cannot be determined from the information given.

c) (3 points) A researcher calculated the values and probabilities for a random variable X as shown below. Unfortunately, he erased the last value and needs to figure out what it was. If the mean of X was 4, then what was the last value?
i) 6
ii) 10
iii) 14
iv) 18

x           0     1     5     ??
P(X = x)    0.4   0.2   0.2   0.2
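The answer markings for question 5 are not visible in this copy, so the sketch below simply works parts (b) and (c) from the numbers given in the questions. It converts each student's description to a raw score and solves the mean equation E(X) = sum of x*P(X = x) = 4 for the erased value; the variable names are illustrative.

```python
# 5(b): convert every student's description to a raw exam score.
exam_mean, exam_sd = 76, 12
scores = {
    "Tom": exam_mean + 1 * exam_sd,       # one SD above the mean
    "Mary": 79,                           # raw score given directly
    "Bill": exam_mean + (-0.5) * exam_sd, # z-score of -0.5
}
order = sorted(scores, key=scores.get)    # lowest to highest

# 5(c): solve sum(x * p) = 4 for the erased value.
known = [(0, 0.4), (1, 0.2), (5, 0.2)]    # (value, probability) pairs
target_mean, p_last = 4, 0.2
partial = sum(x * p for x, p in known)    # contribution of the known values
last_value = (target_mean - partial) / p_last

print(order, round(last_value, 6))
```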

6. (18 points) A study was conducted to determine the GPA of Harvard college students who have experienced a case of mononucleosis infection (mono). A sample of 49 students who experienced a case of mono had an average GPA of 3.02 points in the same semester they had mono, with a standard deviation of 0.56 points. It is known that the population distribution of GPA at Harvard has a mean of 3.25 points.

a) (6 points) Calculate the 95% confidence interval for the average GPA for Harvard students who experienced a case of mono.

x̄ ± t*(s/√n) = 3.02 ± 2.021(0.56/√49) = (2.86, 3.18)

b) (4 points) Ignoring any issue regarding study design, consider the 95% confidence limits you calculated in part (a) and comment on whether or not this appears to be evidence that experiencing mono affects GPA. Support your statement numerically.

Since μ = 3.25 is not inside the confidence interval, this value would be rejected as a null hypothesis. This is the mean GPA for all of Harvard, so it appears that the mean for our group, the mono-infected students, is different (in fact it is lower than that of all of Harvard).

c) (3 points) If the true population mean GPA for all students with mono was 3.10, what would be the correct type of conclusion from a 2-sided hypothesis test based on this data?
i) Type I error
ii) Type II error
iii) Correct conclusion

d) (4 points) Comment on the quality of this study design and whether or not this study has shown that experiencing mono affects GPA. Provide a specific explanation, if appropriate.

This is also an observational study (we cannot ethically randomize people to have mono or not), so there may be a confounding variable at the root of the lowered GPA for these students. One such example: these students may be partiers, which puts them at a higher risk of getting mono, and the partying may be why they have a lower GPA (not the mono).

- END OF EXAM - (remaining pages are tables)