MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010



Similar documents
AMS 5 CHANCE VARIABILITY

Chapter 26: Tests of Significance

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going?

Stat 20: Intro to Probability and Statistics

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

The overall size of these chance errors is measured by their RMS HALF THE NUMBER OF TOSSES NUMBER OF HEADS MINUS NUMBER OF TOSSES

John Kerrich s coin-tossing Experiment. Law of Averages - pg. 294 Moore s Text

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Elementary Statistics and Inference. Elementary Statistics and Inference. 16 The Law of Averages (cont.) 22S:025 or 7P:025.

Name: Date: Use the following to answer questions 3-4:

13.0 Central Limit Theorem

Chapter 20: chance error in sampling

Study Guide for the Final Exam

$ ( $1) = 40

Unit 27: Comparing Two Means

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.

Chapter 16: law of averages

NCSS Statistical Software

Statistics and Random Variables. Math 425 Introduction to Probability Lecture 14. Finite valued Random Variables. Expectation defined

Chapter 16. Law of averages. Chance. Example 1: rolling two dice Sum of draws. Setting up a. Example 2: American roulette. Summary.

In the situations that we will encounter, we may generally calculate the probability of an event

Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Stat 20: Intro to Probability and Statistics

Section 7C: The Law of Large Numbers

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Statistics 100A Homework 4 Solutions

The Math. P (x) = 5! = = 120.

MA 1125 Lecture 14 - Expected Values. Friday, February 28, Objectives: Introduce expected values.

Two-sample inference: Continuous data

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

University of Chicago Graduate School of Business. Business 41000: Business Statistics

Chapter 23 Inferences About Means

3.2 Roulette and Markov Chains

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

p-values and significance levels (false positive or false alarm rates)

NCSS Statistical Software

V. RANDOM VARIABLES, PROBABILITY DISTRIBUTIONS, EXPECTED VALUE

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 13. Random Variables: Distribution and Expectation

Regression Analysis: A Complete Example

MOVIES, GAMBLING, SECRET CODES, JUST MATRIX MAGIC

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 2 Probability Topics SPSS T tests

WISE Power Tutorial All Exercises

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

University of Chicago Graduate School of Business. Business 41000: Business Statistics Solution Key

Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test)

6.042/18.062J Mathematics for Computer Science. Expected Value I

Tutorial 5: Hypothesis Testing

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Expected Value. 24 February Expected Value 24 February /19

THE WINNING ROULETTE SYSTEM by

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 8 Section 1. Homework A

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Non-Inferiority Tests for One Mean

The Variability of P-Values. Summary

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

Testing a claim about a population mean

Betting systems: how not to lose your money gambling

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

Probability: The Study of Randomness Randomness and Probability Models. IPS Chapters 4 Sections

36 Odds, Expected Value, and Conditional Probability

Fairfield Public Schools

Term Project: Roulette

Module 4 (Effect of Alcohol on Worms): Data Analysis

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Probability and Expected Value

Hypothesis Testing. Steps for a hypothesis test:

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Choice Under Uncertainty

AP Statistics 7!3! 6!

Section 13, Part 1 ANOVA. Analysis Of Variance

Chapter 7 Section 1 Homework Set A

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Chapter 5. Discrete Probability Distributions

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Some Essential Statistics The Lure of Statistics

In the past, the increase in the price of gasoline could be attributed to major national or global

Math 108 Exam 3 Solutions Spring 00

Betting on Excel to enliven the teaching of probability

Hypothesis Tests for 1 sample Proportions

Example: Find the expected value of the random variable X. X P(X)

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

HYPOTHESIS TESTING WITH SPSS:

MTH 140 Statistics Videos

Understanding Confidence Intervals and Hypothesis Testing Using Excel Data Table Simulation

Quantitative Methods for Finance

Expected Value and the Game of Craps

Transcription:

MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times the average of the tickets in the box The SE is n times the SD of the tickets in the box (b) (0) What does the Central Limit Theorem say? Solution: The Central Limit Theorem says that the probability histogram for sum of n draws with replacement from any box will tend to a (shifted, scaled) normal curve as n, in the sense that the area in the histogram for for any range of sum values will tend to the area under the corresponding normal curve (c) (0) What is a 99% confidence interval for an average? Solution: The 99% confidence interval is the interval sample average ± 3 (d) (0) What is the p-value of a test of significance? sample SD n Solution: The p-value is the chance that the observed value of the test statistic, or something more extreme, would occur if the null hypothesis is true (e) (0) When is a t-test for an average used instead of a z-test? The t-test is used when the size of the sample is small (usually n < 30), and it is known that the sampled values come from a population that has a normal distribution A statistician tosses a coin 00 times His null hypothesis is that the coin is fair, and his alternative is that the coin is biased in favor of heads The coin comes up heads 60 times For each of the following, say whether the statement is true or false, and explain (a) (0) If the coin was fair, there would be about a 3% chance of getting 60 or more heads in 00 tosses Solution: This is TRUE Making 00 tosses of a fair coin is like drawing 00 times with replacement from the box [0,] The expected number of heads is 00 = 50, and the SE for the number of heads is 00 = 5 Using the normal approximation, the chance of getting 60 or more heads is approximated by the area in the upper tail of the normal curve starting at 595 50 5 = 9 From the normal table, his area is (00% A(9)) = (00% 946%) = 87%, which is about 3% (Recall that in computing chances for numbers of discrete events using the normal curve, the corresponding area for a single number extends from that number minus 5 to that number plus 5 This is the continuity correction )

(b) (0) Given that 60 heads are observed, there is only about a 3% chance that the coin is fair Solution: This is FALSE A particular coin is either fair or not There is no chance there The chance is in the tossing, not in the coin itself 3 A gambler is playing roulette On each play, he bets $ on a split (two adjacent numbers), which pays 7 to That is, if one of his numbers comes up, he will win $7 on a $ bet Otherwise, he loses the $ (a) (5) Construct a box model for the payoff on this bet (Hint: Recall there are a total of 38 slots on a roulette wheel) Solution: The box is [7, 7, 36 ] (b) (0) Compute the average and SD of the box Solution: The average is (7 + 7 + 34 ( ))/38 = 053 The SD, by the shortcut method, is: SD = (7 ( )) 38 36 = 40 38 (c) (0) If the bet is played 00 times, compute the EV and the SE for the payoff Solution: The EV is 00 053 = 53 (that is the gambler is expected to lose about $530 overall The SE is 00 $40 = $40 (d) (0) What are the gambler s chances of coming out ahead on 00 spins of the wheel? Solution: We want to estimate the chance that the net payoff is at least 0 In standard units, this corresponds to z = 0 ( 53) = 3 40 From the normal table, we can estimate the chance as being between (00% A(0)) = (00% 797) = 46% and (00% A(5)) = (00% 9) = 44% (Not bad, but the casino always wins in the end!) 4 An airline does a market research survey on travel patterns It takes a simple random sample of 5 people and works out the 95% confidence interval for the average distance they traveled on vacation in the previous year The computed interval extends from 488 to 59 miles Say whether each statement below is true or false and give reasons If there is not enough information given to decide, explain what other information you would need (a) (5) The average distance traveled was about 540 miles

Solution: TRUE As in question part c, the midpoint of the confidence interval is the observed average The midpoint is (488 + 59)/ = 540 (b) (5) The SD of the distances traveled was about 00 miles Solution: FALSE The confidence interval extends SD 5 in both directions from the midpoint 5 = SD 5 implies the SD had to be about 390 (c) (5) The probability histogram for averages of samples of size 5 drawn from the population is probably close to (scaled, shifted) normal curve Solution: TRUE This is far enough into the large sample range that the Central Limit Theorem would apply in almost every case (d) (5) The probability histogram for the distances traveled by randomly chosen individuals in the population is close to a (scaled, shifted) normal curve Solution: NEED MORE INFORMATION There is no way to tell whether that is true from the information given (Like many real-world distributions, this one could have a long right tail if there were a few people who traveled very long distances, while the majority traveled much smaller distances) (e) (5) If the sample size is increased to 450, the width of the confidence interval will always be about half as wide Solution: FALSE If the SD stayed the same, we would expect the width to decrease by a factor of = 707, not = 5 The SD of the sample could also change 5 A simple random sample of 000 persons is taken to determine the percentage of Democrats in a large population It turns out that 543 of the people in the sample are Democrats (a) (0) Compute a 95% confidence interval for the percentage of Democrats in the population Solution: The observed percentage in the sample is 543 by the bootstrap method: 000 = 543% The SE is estimated SE for percent = Then the confidence interval is (543)(457) 000 00% = 58% observed ± SE = 543% ± 3% (b) (0) True or False and explain: There is about a 5% chance that the actual percentage of Democrats is outside the interval found in part A Solution: FALSE The chance is in the sampling process, not in the location of the actual percentage of Democrats The correct interpretation is that about 5% of samples would yield intervals that did not contain the actual percentage of Democrats 3

6 The National Assessment of Educational Progress tested 7-year-olds on mathematics in 990 and again in 004 In 990, the average score was 305 and the SD was 34 In 004, the average score was 307 and the SD was 7 In both years, simple random samples of size 000 were used Can the difference be explained as chance variation, or have average mathematics scores really gone up? (a) (5) Should you use a one-sample or a two-sample z-test for this kind of question? Solution: This calls for a two-sample z-test, since there are two completely independent groups of students who would have taken the tests in the two different years (b) (0) Formulate null and alternative hypotheses in terms of a box model (Say how many boxes there are, how many tickets go in each, and what is on the tickets scores or 0/ s) Solution: There are two boxes, one for the population of students that took the test in each year There is a ticket for each student each time The tickets have their scores, not 0/ The null hypothesis would be that the averages of the two boxes are the same; the alternative is that the 004 average is higher (c) (0) Carry out the test, find the associated p-value, and discuss your results Solution: We use the -sample z-test formulas The test statistic is z = 307 305 (35) 000 + (7) 000 = 43 From the normal table, we see that the p-value is found between (00% A(40)) = (00% 8385) = 8% p = 08 and (00% A(45)) = (00% 859) = 74% p = 074 Usually p-values like these are not considered small enough to reject the null hypothesis We conclude the difference could have been due to chance (And the -point difference in average score hardly seems practically significant in any case!) 7 A New York Times/CBS poll conducted between April 5 and April, 00 included the question Do you approve or disapprove of the way your Representative in Congress is handling his or her job? The poll was carried out in two stages First a large national simple random sample was asked the question The results were that 46% of respondents said they approved, 36% said they disapproved, and 8% said they had no opinion Then the same question was asked of a separate sample of 88 people, all of whom had identified themselves as supporters of the Tea Party movement The responses broke down like this: Approve Disapprove No opinion Total Tea Party Sample 35 43 97 88 4

(Data is adapted from the New York Times web site) It seems from the numbers that Tea Party supporters may differ from the general population when it comes to their opinions concerning their Congressional Representatives But is this a real difference, or could it just be a product of chance variation in the sampling process? (a) (5) Formulate a null and alternative hypothesis for this question Solution: The null hypothesis would be that there is really no difference, and the observed differences were due to chance The alternative would be that Tea Partiers really do have a different opinion about the jobs that their Congressional Representatives are doing (b) (5) What significance test should you use to decide between these? Solution: This calls for a χ test The idea is to see whether the data from the Tea Partiers matches the national percentages out of n = 88 In other words, do a goodness of fit test (c) (0) Carry out your test of signficance What do you get for a test statistic and observed significance level (p-value)? Solution: If the Tea Partiers were really the same as the general population in this regard, then the expected numbers of people in the Approve, Disapprove, and No opinion categories would be: 46 88 = 4053, 36 88 = 37, 8 88 = 586 We use these to compute the χ statistic using the entries from the second row: χ = (35 4053) 4053 + (43 37) 37 + (97 586) 586 = 75 From the χ table with degrees of freedom, we see that p < 0 (much, much smaller!) (d) (5) Does this data give evidence of a real difference between Tea Party supporters and the public as a whole? Explain Solution: This is extremely strong evidence that Tea Partiers have a different set of opinions on this question (They are much less likely to approve, and also much less likely not to have an opinion!) Have a safe, enjoyable, and productive summer! 5