Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011




Name:
Section:

I pledge my honor that I have not violated the Honor Code.
Signature:

This exam has 34 pages. You have 3 hours to complete this exam. When time is called, please stop writing immediately.
There are 9 questions. Unless otherwise indicated, each part of each question is worth 2 points.
You may use a calculator and two letter-size (both sides) cheat sheets of your own notes.
Present your answers in a clear and concise manner.

Question 1: 13 parts, 26 points
Question 2: 5 parts, 10 points
Question 3: 6 parts, 12 points
Question 4: 3 parts, 6 points
Question 5: 7 parts, 15 points
Question 6: 7 parts, 15 points
Question 7: 3 parts, 6 points
Question 8: 9 parts, 18 points
Question 9: 6 parts, 12 points
Total: 120 points

Question # 1.

The data in this question are returns on three portfolios called Market, SMB, and HML. These portfolios were made famous by Eugene Fama and Kenneth French and have been widely used by finance practitioners for over 30 years. For this question I collected monthly annualized returns on the three portfolios from January 1983 through February 2008, for a total of n = 302 observations. All returns are in percent.

[Figure: time series plot of SMB; vertical axis from -20.0 to 25.0, horizontal axis from 198301 to 200701]

Use the time series plot above to answer the following questions.

(a.) The sample mean of the SMB returns is
(i.) -3.78 (ii.) 0.035 (iii.) 3.07 (iv.) 7.42
Answer: (ii)

(b.) The sample standard deviation of SMB returns is approximately
(i.) 1.56 (ii.) 3.27 (iii.) 5.38 (iv.) 9.12
Answer: (ii)
NOTE: You can see this by noting that about 5% of the returns lie outside -6.5 to 6.5, which would be roughly 2 standard deviations above and below the mean. Remember to use the empirical rule as a rough approximation.

(c.) Suppose I conduct a hypothesis test for the null hypothesis that the above data are i.i.d. The p-value associated with this test is approximately
(i.) 0.001 (ii.) 0.012 (iii.) 0.98 (iv.) 1.14
Answer: (iii)
NOTE: From the plot, the data look approximately i.i.d. Therefore we would expect a large p-value from the test. A large p-value would provide no evidence against the null hypothesis that they are i.i.d.

Below are histograms of the three variables. The horizontal and vertical axes are the same in all three plots.

[Figure: histograms of SMB, Market, and HML; vertical axis from 0 to 100 in each panel]

Answer the following questions using the histograms above.

(d.) Which of the three portfolio returns has the largest sample variance?
(i.) SMB (ii.) HML (iii.) Market
Answer: (iii)

(e.) Which of the three portfolio returns is most left-skewed?
(i.) SMB (ii.) HML (iii.) Market (iv.) none of them
Answer: (iii)

(f.) I conducted a test of normality on each variable. The null hypothesis is that the data are i.i.d. normal. Which produces the smallest p-value?
(i.) SMB (ii.) HML (iii.) Market (iv.) all three are the same
Answer: (iii)
NOTE: A small p-value provides evidence against the null hypothesis. Of the three histograms, the Market histogram looks the least bell-shaped, i.e. the most non-normal.

(g.) The sample variance of the HML portfolio returns is approximately
(i.) 3.12 (ii.) 6.57 (iii.) 9.52 (iv.) 18.04
Answer: (iii)
NOTE: Remember to use the empirical rule as a rough approximation. We can see that 95% of the HML returns are between -5.5 and 6.5. This means that the standard deviation must be about 3. We can see that $\sqrt{9.52} = 3.085$.

Below is a scatter plot of SMB versus HML returns.

[Figure: scatter plot of SMB (vertical axis, -20.0 to 20.0) versus HML (horizontal axis, -15 to 15); the February 2000 observation is labeled]

Answer the following questions using the scatter plot above.

(h.) The sample correlation between the SMB and HML returns is
(i.) -0.85 (ii.) -0.42 (iii.) 0.42 (iv.) 0.85
Answer: (ii)

(i.) If we deleted the February 2000 observation (indicated on the plot) from the sample, the sample correlation would be
(i) closer to -1 (ii) closer to 0 (iii) closer to 1 (iv) unchanged
Answer: (ii)

(j.) What is the sample covariance between SMB and HML?

$s_{SMB,HML} = s_{SMB} \, s_{HML} \, r = (3.27)(3.085)(-0.42) = -4.24$

Suppose that we estimate the following linear regression model:

$SMB_i = \alpha + \beta \, HML_i + \varepsilon_i$

(k.) The R-squared of this regression will be approximately
(i.) 0.17 (ii.) 0.42 (iii.) 0.73 (iv.) 0.85
Answer: (i)
NOTE: Remember that for simple linear regression, the R-squared is equal to the correlation squared: $(-0.42)^2 \approx 0.17$.

(l.) What is our estimate of the slope, b?
NOTE: You have to know the formula for the slope coefficient from class.

$b = \frac{s_{SMB,HML}}{s^2_{HML}} = \frac{-4.24}{9.52} \approx -0.445$

(m.) Suppose we know that the HML return next month will be 2%, and we use the regression above to construct a 95% plug-in predictive interval for next month's SMB return. Such an interval implicitly assumes that
(i.) HML is i.i.d. normal (ii.) ε is i.i.d. normal (iii.) both (i) and (ii) (iv.) none of the above
Answer: (ii)
NOTE: To compute the predictive interval for y, we don't need to make assumptions about the distribution of the right-hand side variable x, but we do need to make an assumption about the distribution of the errors ε.
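The arithmetic in parts (j) through (l) can be checked with a short Python sketch. It assumes only the rounded summary statistics quoted in the solutions (a standard deviation of 3.27 for SMB, a variance of 9.52 for HML, and a correlation of -0.42); the variable names are illustrative, and because the inputs are rounded, the last digit of each result may differ slightly from a calculation on the raw data.

```python
import math

# Rounded summary statistics taken from the solutions above (assumed inputs).
s_smb = 3.27      # sample standard deviation of SMB
var_hml = 9.52    # sample variance of HML
r = -0.42         # sample correlation between SMB and HML

s_hml = math.sqrt(var_hml)    # sample standard deviation of HML

cov = r * s_smb * s_hml       # sample covariance, part (j)
r_squared = r ** 2            # R-squared of the simple regression, part (k)
slope = cov / var_hml         # least-squares slope b, part (l)

print(f"s_HML     = {s_hml:.3f}")
print(f"cov       = {cov:.2f}")
print(f"R-squared = {r_squared:.2f}")
print(f"slope b   = {slope:.3f}")
```

The slope can equivalently be written as r times s_SMB divided by s_HML, which is a useful cross-check when only the correlation and the two standard deviations are available.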

Question # 2.

Multiple choice. For each question, choose one answer.

(a.) Fill in the blanks in the following phrase, in order:
Statistical methods draw conclusions about unknown ____ based on ____ computed from ____.
(i.) parameters, samples, a statistic.
(ii.) samples, statistics, a parameter.
(iii.) statistics, parameters, a sample.
(iv.) parameters, statistics, a sample.
(v.) samples, parameters, a statistic.
Answer: (iv)

(b.) Suppose that most of the observations in a given data set are of the same magnitude, except for a few data points that are substantially larger. Which of the following would be true?
(i.) The sample mean would be smaller than the median, and the histogram would be skewed with a long right tail.
(ii.) The sample mean would be larger than the median, and the histogram would be skewed with a long right tail.
(iii.) The sample mean would be smaller than the median, and the histogram would be skewed with a long left tail.
(iv.) The sample mean would be larger than the median, and the histogram would be skewed with a long left tail.
(v.) The sample mean and median would be approximately the same, and the histogram would be roughly symmetric.
Answer: (ii)
NOTE: We saw an example of this in the bank arrival time data. Large outliers affect the mean more than the median. The histogram had a pronounced right tail.

(c.) An achievement test is given each year to 3rd graders in a certain school district. Scores on the test are normally distributed with a mean of 100 points and a standard deviation of 15 points. If Jane's z-score was 1.2, how many points did she score on the test?
(i.) 82 (ii.) 88 (iii.) 100 (iv.) 112 (v.) 118
Answer: (v)

(d.) The Central Limit Theorem implies that:
(i.) If we simulate 5,000 i.i.d. draws from any probability distribution, the histogram will appear bell-shaped.
(ii.) If we simulate 5,000 i.i.d. draws from any probability distribution, the time series plot should not display any obvious patterns.
(iii.) The average of 5,000 i.i.d. draws from any probability distribution should exactly equal the population mean.
(iv.) If our sample consists of 5,000 i.i.d. draws from any probability distribution, the data points will be approximately normally distributed around the sample mean.
(v.) If we start with 5,000 i.i.d. draws from any probability distribution and let $\bar{x}_1$ be the average of the first 50 data points, $\bar{x}_2$ be the average of the 51st through 100th data points, $\bar{x}_3$ be the average of the 101st through 150th data points, etc., then a histogram of the numbers $\bar{x}_1$ through $\bar{x}_{100}$ should appear bell-shaped.
Answer: (v)
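As a rough illustration of parts (c) and (d), the sketch below converts Jane's z-score back to a raw score and then forms 100 batch averages of 50 draws each, as described in option (v). The exponential distribution and the random seed are arbitrary choices made for this example and are not part of the exam.

```python
import random
import statistics

# Part (c): convert a z-score back to a raw score.
mean_score, sd_score, z = 100, 15, 1.2
jane_score = mean_score + z * sd_score

# Part (d), option (v): averages of 100 non-overlapping batches of 50 draws.
# The exponential(1) distribution (mean 1, sd 1) and the seed are illustrative assumptions.
rng = random.Random(0)
draws = [rng.expovariate(1.0) for _ in range(5000)]
batch_means = [statistics.fmean(draws[i:i + 50]) for i in range(0, 5000, 50)]

print("Jane's score:", jane_score)
print("number of batch means:", len(batch_means))
print("mean of batch means:", round(statistics.fmean(batch_means), 3))
print("sd of batch means:", round(statistics.stdev(batch_means), 3))
```

Although the individual exponential draws are strongly right-skewed, the batch means should cluster around 1 with a standard deviation near $1/\sqrt{50} \approx 0.14$, which is what the Central Limit Theorem predicts for averages of 50 draws.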

(e.) You obtain a sample of 25 students from the same high school. Based on this data, a 95% confidence interval for the expected value of a student's SAT score is 900 to 1100. Which of the following is a valid interpretation of this interval?
(i.) 95% of the 25 students in the sample have an SAT score between 900 and 1100.
(ii.) 95% of the population of students at this high school will have an SAT score between 900 and 1100.
(iii.) Given the outcomes in this sample, there is a 95% probability that the true expected value of SAT scores is between 900 and 1100.
(iv.) If all high schools were the same and we repeated this procedure at many other schools, 95% of the resulting intervals would contain the true expected value of a student's SAT score.
(v.) If all high schools were the same and we repeated this procedure at many other schools, 95% of the sample means would be between 900 and 1100.
Answer: (iv)
NOTE: Before we see the data, the 95% confidence interval has a 95% probability of covering the true (population) value of the parameter. For a particular sample (after we have seen the data), the true value is either in the interval or it is not. Answer (iii) may seem correct, but it is technically wrong.

Question # 3.

The following joint probability distribution is based on survey data collected by a major financial publication in 2002. For a randomly selected person living in the U.S., define the random variable S as the percentage of retirement income invested in the stock market. Define the random variable A as

A = 1 if the person is below 30 years of age
A = 2 if the person is between 30 and 50 years old
A = 3 if the person is above 50 years old

Based on the survey, we have come up with the following joint probability distribution for S and A:

                    S
           10%     30%     60%
A = 1      0.04    0.05    0.01
A = 2      0.05    0.23    0.19
A = 3      0.10    0.26    0.07

(a.) What is the marginal probability that A = 3?

$P(A = 3) = P(A = 3, S = 10\%) + P(A = 3, S = 30\%) + P(A = 3, S = 60\%) = 0.10 + 0.26 + 0.07 = 0.43$

(b.) What is the expected value of S?

First, we need to know the marginal distribution of S. This is

s       p(s)
10%     0.19
30%     0.54
60%     0.27

$E[S] = 0.19(10) + 0.54(30) + 0.27(60) = 34.3\%$

(c.) What is the standard deviation of S?

$V[S] = 0.19(10 - 34.3)^2 + 0.54(30 - 34.3)^2 + 0.27(60 - 34.3)^2 = 300.51$

This implies that the standard deviation is $SD(S) = \sqrt{300.51} = 17.3\%$.

(d.) What is the probability that a randomly selected investor is below 50 years of age and has 30% or more of his retirement savings invested in the stock market?

$0.05 + 0.01 + 0.23 + 0.19 = 0.48$

(e.) Suppose we know a particular investor has only 10% of her retirement savings invested in the stock market. What is the probability she is over 50 years old?

$P(A = 3 \mid S = 10\%) = \frac{P(A = 3, S = 10\%)}{P(S = 10\%)} = \frac{0.10}{0.19} = 0.526$

(f.) Are A and S independent? Briefly justify your answer.

From part (e), $P(A = 3 \mid S = 10\%) = 0.526$, while the marginal probability of being over 50 is $P(A = 3) = 0.43$. Since the conditional and marginal probabilities differ, the random variables A and S are not independent.
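The calculations in Question 3 all follow mechanically from the joint table, which the sketch below makes explicit. It stores the joint probabilities in a dictionary keyed by (A, S), with S in percent; the helper name `marginal` and the variable names are illustrative choices, not part of the exam.

```python
import math

# Joint probabilities P(A = a, S = s) from the table above; S is in percent.
joint = {
    (1, 10): 0.04, (1, 30): 0.05, (1, 60): 0.01,
    (2, 10): 0.05, (2, 30): 0.23, (2, 60): 0.19,
    (3, 10): 0.10, (3, 30): 0.26, (3, 60): 0.07,
}

def marginal(index):
    """Sum the joint table over the other variable (index 0 = A, index 1 = S)."""
    out = {}
    for key, p in joint.items():
        out[key[index]] = out.get(key[index], 0.0) + p
    return out

p_a = marginal(0)   # part (a): P(A = 3) is p_a[3]
p_s = marginal(1)   # part (b): marginal distribution of S

mean_s = sum(s * p for s, p in p_s.items())                  # part (b)
var_s = sum(p * (s - mean_s) ** 2 for s, p in p_s.items())   # part (c)
sd_s = math.sqrt(var_s)

# Part (d): below 50 (A = 1 or 2) and at least 30% in stocks.
p_young_30plus = sum(p for (a, s), p in joint.items() if a <= 2 and s >= 30)

# Part (e): conditional probability P(A = 3 | S = 10%).
p_a3_given_s10 = joint[(3, 10)] / p_s[10]

# Part (f): independence requires P(A = a, S = s) = P(A = a) P(S = s) in every cell.
independent = all(
    math.isclose(p, p_a[a] * p_s[s], abs_tol=1e-9) for (a, s), p in joint.items()
)

print(round(p_a[3], 2), round(mean_s, 1), round(var_s, 2), round(sd_s, 1))
print(round(p_young_30plus, 2), round(p_a3_given_s10, 3), independent)
```

Checking every cell, as the last step does, is a stricter test than the single comparison used in the written solution, but one mismatch is already enough to rule out independence.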

Question # 4.

Suppose starting next Monday that I go to a casino every night for a week and play 125 hands of blackjack. Suppose that I bet $10 per hand, so on each hand I will either lose $10, push, win $10, or double down and win $20 (assume that house rules prohibit me from doubling down or splitting more than once). Suppose that my winnings on each hand, i = 1, 2, ..., 125, is a random variable $W_i$ with

$E[W_i] = \$0.10$
$\sigma(W_i) = \$12.30$

(a.) Suppose I look at my average winnings per hand on a given night,

$\bar{W} = \frac{W_1 + W_2 + \cdots + W_{125}}{125}$

where each hand $W_i$ is i.i.d. What are the expected value and variance of $\bar{W}$?

We can use our linear formulas from Lecture #4 to show that:

$E[\bar{W}] = 0.10$
$V[\bar{W}] = \frac{(12.3)^2}{125} = 1.21$

(b.) Now suppose I do this every night for a month (30 days). Assuming I play 125 hands each night, on approximately how many nights will I average a $1 or more loss per hand?

By the Central Limit Theorem, we know $\bar{W} \sim N(0.10, 1.21)$. On a normal distribution with µ = 0.1 and σ = 1.1, $P(\bar{W} < -1) = 0.16$ because -1 is 1 standard deviation to the left of the mean. Consequently, I average a $1 loss or more on about 0.16 × 30 = 4.8 nights.

(c.) Your answer to part (b) implicitly makes use of the Central Limit Theorem. Which of the following correctly justifies your use of the CLT?
(i.) Outcomes for each hand are very nearly normally distributed.
(ii.) Outcomes for each hand are i.i.d. and I am playing a large number of hands per night.
(iii.) Outcomes for each hand are i.i.d. and I am playing for a sufficiently large sample of days.
(iv.) Outcomes for each night are i.i.d. and I am averaging over a large sample of nightly outcomes.
Answer: (ii)
NOTE: We are taking an average over the hands played each night, which are assumed to be i.i.d.
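The numbers in parts (a) and (b) can be reproduced with Python's standard-library `statistics.NormalDist`. This is a minimal sketch under the same CLT approximation used in the solution; it evaluates the normal probability exactly rather than with the empirical rule, so the results match 0.16 and 4.8 only after rounding.

```python
import math
from statistics import NormalDist

mu_hand, sd_hand = 0.10, 12.30     # per-hand mean and standard deviation, in dollars
hands_per_night, nights = 125, 30

# Part (a): mean and variance of the nightly average of i.i.d. hands.
mean_avg = mu_hand
var_avg = sd_hand ** 2 / hands_per_night
sd_avg = math.sqrt(var_avg)

# Part (b): CLT approximation for P(average winnings < -$1 per hand).
p_loss = NormalDist(mean_avg, sd_avg).cdf(-1.0)
expected_nights = nights * p_loss

print(f"V[W-bar] = {var_avg:.2f}, sd = {sd_avg:.2f}")
print(f"P(W-bar < -1) = {p_loss:.3f}")
print(f"expected nights = {expected_nights:.1f}")
```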

Question # 5.

First People's Bank (FPB) has most of their commercial loan department working with small business clients. The bank's managers consider this their most important growth area and several years ago hired a consulting team to improve two aspects of their loan process. In particular, they want to decrease the default rate on the loans (that is, the proportion of loans for which the borrower is unable to make payments). They also want to improve customer service by decreasing the time it takes to process loan applications.

Historically (prior to the consultants being hired), management has found that the number of business days required to process a small business loan application is i.i.d. normal with a mean of 14 and a variance of 4.

(a.) Before the consultants were hired, approximately what percentage of loan applications were processed in 10 days or less?

Use the normal distribution with µ = 14 and σ = 2. $P(X < 10) = 0.025$ because 10 is 2 standard deviations to the left of the mean.

(b.) The consulting team identified and implemented a number of measures to speed up the application process. Management has reviewed a sample of 25 loan applications processed after these measures were implemented. The average processing time in the sample was 11.2 days and the sample standard deviation of processing times is 2.0 days. Were the consultants' measures effective? Formulate an appropriate hypothesis test or confidence interval and state your conclusions.

The null hypothesis is $H_0: \mu = 14$. This is like saying that the processing time is the same as it was before the consultants were hired.

$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{11.2 - 14}{2 / \sqrt{25}} = -7$

We reject the null hypothesis and conclude that the measures were effective.

(c.) If we treat the estimates in part (b) (mean of 11.2 and standard deviation of 2.0) as if they were the actual mean and standard deviation, approximately what percentage of loans will be processed in 10 days or less?
(i.) 5.3% (ii.) 15.9% (iii.) 27.4% (iv.) 51.1%
Answer: (iii)
NOTE: 10 is less than 1 standard deviation below the mean, so $P(X < 10)$ is bigger than 16% and less than 50%.

Historically (prior to the consultants being hired), 15% of FPB's small business loans resulted in default. The consulting team trained FPB's analysts to use software designed to reduce the default rate by more effectively identifying high risk businesses that are more likely to default.

(d.) In a typical year, FPB grants 120 loans to small businesses. Assume that defaults are i.i.d. events; that is, if two firms are granted loans, whether the first firm defaults is independent of whether the second firm defaults. Let Y be the number of loans granted in a typical year that will eventually end up in default. What is the distribution of Y?

Binomial(120, 0.15). We are looking at n = 120 i.i.d. Bernoulli outcomes where each one has p = 0.15.
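The processing-time calculations in parts (a) through (c) can be sketched with `statistics.NormalDist` from the Python standard library. The variable names are illustrative. Note that the exact normal probability for part (a) comes out a little below the 0.025 obtained from the empirical rule, which is only a rough approximation.

```python
import math
from statistics import NormalDist

# Part (a): historical processing times, normal with mean 14 and sd 2.
p_before = NormalDist(14, 2).cdf(10)

# Part (b): z statistic for H0: mu = 14.
x_bar, mu0, s, n = 11.2, 14, 2.0, 25
z = (x_bar - mu0) / (s / math.sqrt(n))

# Part (c): treat the sample estimates as the true mean and sd.
p_after = NormalDist(11.2, 2.0).cdf(10)

print(f"P(X < 10) before = {p_before:.4f}")
print(f"z = {z:.2f}")
print(f"P(X < 10) after  = {p_after:.3f}")
```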

(e.) Give an interval that is 95% likely to contain the number of loans granted in a typical year which will eventually end up in default.

$E[Y] = np = 120(0.15) = 18$
$V[Y] = np(1 - p) = 120(0.15)(0.85) = 15.3$

A 95% interval is $18 \pm 2\sqrt{15.3}$, which is approximately (10, 26).

(f.) Looking at a sample of 100 loans granted after FPB's analysts started using the new software, management finds that 7 of those loans ended up in default. Was the new software effective in reducing defaults? Formulate an appropriate hypothesis test and state your conclusions.

An appropriate null hypothesis is $H_0: p = 0.15$.

$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.07 - 0.15}{\sqrt{(0.15)(0.85)/100}} = -2.24$

We would reject the null hypothesis at the 5% level and conclude that the software is effective.
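Parts (e) and (f) use the binomial mean and variance and a one-sample z statistic for a proportion. A minimal sketch of both calculations, with illustrative variable names:

```python
import math

# Part (e): number of defaults Y ~ Binomial(120, 0.15).
n_loans, p0 = 120, 0.15
mean_y = n_loans * p0
var_y = n_loans * p0 * (1 - p0)
half_width = 2 * math.sqrt(var_y)
interval = (mean_y - half_width, mean_y + half_width)

# Part (f): z statistic for H0: p = 0.15 with 7 defaults in 100 loans.
n_sample, defaults = 100, 7
p_hat = defaults / n_sample
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n_sample)

print(f"E[Y] = {mean_y:.1f}, V[Y] = {var_y:.1f}")
print(f"interval = ({interval[0]:.1f}, {interval[1]:.1f})")
print(f"z = {z:.2f}")
```

The standard error in part (f) is computed under the null value p0 = 0.15 rather than the sample proportion, which matches the formula used in the solution.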

In reality, defaults on small business loans are probably not independent. One reason for this is that a broad economic downturn can cause lots of small businesses to default in a relatively short time period. Because of this, defaults may be positively correlated across firms. That is, if we look at a sample of n loans given in the same year, and let $X_i = 1$ if loan i ends up in default and 0 otherwise for i = 1, 2, 3, ..., n, we now assume that:

$cov(X_i, X_j) > 0$ for any loans $i \neq j$

(g.) [3 points] Let n = 120 and again define $Y = X_1 + X_2 + \ldots + X_{120}$ as the number of loans given in a particular year that end up in default. Suppose we still believe that any single loan has a 15% chance of ending up in default. This is the same random variable we considered in parts (d)-(e), except there we assumed that the individual loan defaults were i.i.d. and now we are assuming $cov(X_i, X_j) > 0$. How does this affect the expected value of Y? How does it affect the variance of Y? Briefly explain.

The expected value of a sum of random variables is always the sum of the expected values, so E[Y] is unaffected. However, the variance will be affected. We know that in general

$V[Y] = V[X_1] + V[X_2] + \ldots + V[X_{120}] + 2\,[Cov(X_1, X_2) + \ldots + Cov(X_{119}, X_{120})]$.

Since the covariances are positive, this means that the variance of Y is substantially larger.

NOTE: When discussing the effect on the variance, it would be fine if you just state the case for n = 2, i.e. $V[Y] = V[X_1] + V[X_2] + 2\,Cov(X_1, X_2)$.
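To get a feel for how much positive covariance can inflate the variance in part (g), the sketch below evaluates the general variance formula under one simplifying assumption that is not in the exam: every pair of loans shares the same correlation `rho`. The value rho = 0.05 is made up purely for illustration.

```python
def variance_of_sum(n, p, rho=0.0):
    """Variance of a sum of n Bernoulli(p) variables with common pairwise correlation rho."""
    var_single = p * (1 - p)
    cov_pair = rho * var_single      # Cov(X_i, X_j) = rho * sd_i * sd_j, with equal sds
    # n variance terms plus n * (n - 1) ordered covariance terms (two per unordered pair).
    return n * var_single + n * (n - 1) * cov_pair

var_iid = variance_of_sum(120, 0.15)              # independent case from part (e)
var_corr = variance_of_sum(120, 0.15, rho=0.05)   # hypothetical common correlation

print(f"independent: {var_iid:.2f}")
print(f"correlated:  {var_corr:.2f}")
```

Even a small common correlation matters here because there are 120 × 119 covariance terms but only 120 variance terms.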

Question # 6.

In this problem we estimate the market model using returns on an asset GE and returns on the S&P 500. The sample size is n = 254.

GE: returns on General Electric stock
Market: the market portfolio (the S&P 500)

I took monthly returns on each asset and ran the following regression:

$GE = \alpha + \beta \, Market + \varepsilon$

Some of the results from running this regression in StatPro are reported here:

ANOVA table
Source         df     SS       MS       F          p-value
Explained      1      0.1425   0.1425   295.2001   0.0000
Unexplained    252    0.1216   0.0005

Regression coefficients
             Coefficient   Std Err
Constant     0.0005        0.0014
SP500        1.2589        0.0733

(a.) Give a 95% confidence interval for β, the coefficient on Market.

$b \pm 2 s_b = 1.2589 \pm 2(0.0733) = 1.2589 \pm 0.1466 = (1.1123, 1.4055)$

(b.) Test the null hypothesis that the Market is not related to GE (β = 0) at the 5% level.

$t = \frac{b - \beta_0}{s_b} = \frac{1.2589 - 0}{0.0733} = 17.17$

We would reject the null hypothesis at the 5% level.

(c.) What is the standard deviation of the residuals $s_e$?

$s_e = \sqrt{\frac{\text{unexplained sum of squares}}{n - 2}} = \sqrt{\frac{0.1216}{252}} = 0.0220$

(d.) What is the sample correlation between the Fitted Values and Residuals?

The sample correlation between the fitted values ŷ and the residuals e is zero. This is one of the major properties of the residuals and is a result of using least squares.

Suppose returns for the Market next month are given by:

             SP500      Fitted Values   Residuals
4/1/2010     0.006152   ?               ?

(e.) [3 points] Construct a 95% plug-in predictive interval for GE for this month.

A 95% plug-in predictive interval is:

$(a + b x - 2 s_e, \; a + b x + 2 s_e) = 0.0005 + 1.2589(0.006152) \pm 2(0.022) = (-0.0358, 0.0522)$

(f.) In the table above, what is the fitted value? Can you calculate the residual?

The fitted value is:

$a + b x = 0.0005 + 1.2589(0.006152) = 0.00824$

No. You cannot calculate the residual because you have not observed the value of GE for this month yet.

(g.) Test the null hypothesis $H_0: \alpha = 0.0315$ at the 5% level.

$t = \frac{a - \alpha_0}{s_a} = \frac{0.0005 - 0.0315}{0.0014} = \frac{-0.031}{0.0014} = -22.14$

We would clearly reject this null hypothesis at the 5% level.
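Every number in Question 6 follows from the handful of values in the StatPro output, as the sketch below shows. The variable names are illustrative, and the interval is computed with the unrounded residual standard deviation, so its lower endpoint can differ from the written solution in the fourth decimal place (the solution rounds $s_e$ to 0.022 first).

```python
import math

# Values read from the StatPro output above.
n = 254
a, s_a = 0.0005, 0.0014     # intercept estimate and its standard error
b, s_b = 1.2589, 0.0733     # slope estimate and its standard error
ss_unexplained = 0.1216

ci_beta = (b - 2 * s_b, b + 2 * s_b)              # part (a)
t_beta = (b - 0) / s_b                            # part (b), H0: beta = 0
s_e = math.sqrt(ss_unexplained / (n - 2))         # part (c)

x_next = 0.006152
fitted = a + b * x_next                           # part (f)
pred_interval = (fitted - 2 * s_e, fitted + 2 * s_e)   # part (e), plug-in interval

t_alpha = (a - 0.0315) / s_a                      # part (g), H0: alpha = 0.0315

print(f"CI for beta: ({ci_beta[0]:.4f}, {ci_beta[1]:.4f})")
print(f"t (beta = 0): {t_beta:.2f}")
print(f"s_e: {s_e:.4f}")
print(f"fitted value: {fitted:.5f}")
print(f"predictive interval: ({pred_interval[0]:.4f}, {pred_interval[1]:.4f})")
print(f"t (alpha = 0.0315): {t_alpha:.2f}")
```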

Question # 7.

When coded messages are received, there are sometimes errors in transmission creating uncertainty about the message that was actually sent. In particular, Morse code uses dots and dashes as a way to encode messages. Specifically, each letter of the alphabet and each number are given a special sequence of dots and dashes.

Let the random variable S = 1 if a dot is sent and S = 0 if a dash is sent. Define the random variable R = 1 if a dot is received and R = 0 if a dash is received. Dots and dashes are known to occur in the proportion 3:4. This means that $P(S = 1) = \frac{3}{7}$ and $P(S = 0) = \frac{4}{7}$. Suppose there is interference on the transmission line, and with probability $\frac{1}{8}$ a dot is mistakenly received as a dash, and vice versa.

(a.) What is the probability that a dot was received given a dot was sent, $P(R = 1 \mid S = 1)$?

There is a $\frac{1}{8}$ chance of a mistake, which makes the probability of getting it right equal to:

$P(R = 1 \mid S = 1) = \frac{7}{8}$

(b.) What is the marginal probability that a dot is received, $P(R = 1)$?

The marginal probability is the sum of the two joint probabilities.

$P(R = 1) = P(R = 1 \mid S = 1)P(S = 1) + P(R = 1 \mid S = 0)P(S = 0) = \frac{7}{8} \cdot \frac{3}{7} + \frac{1}{8} \cdot \frac{4}{7} = \frac{25}{56}$

(c.) If we receive a dot, can we be sure that a dot was sent? Calculate the probability of a dot being sent given that a dot was received.

We use Bayes' Rule.

$P(S = 1 \mid R = 1) = \frac{P(R = 1 \mid S = 1)P(S = 1)}{P(R = 1)} = \frac{\left(\frac{7}{8}\right)\left(\frac{3}{7}\right)}{\frac{25}{56}} = \frac{21}{25}$
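The law of total probability and Bayes' Rule calculations in Question 7 can be reproduced exactly with Python's `fractions.Fraction`, which avoids any rounding. The variable names below are illustrative.

```python
from fractions import Fraction

p_dot_sent = Fraction(3, 7)     # P(S = 1)
p_dash_sent = Fraction(4, 7)    # P(S = 0)
p_error = Fraction(1, 8)        # probability a symbol is flipped in transmission

p_dot_recv_given_dot = 1 - p_error    # P(R = 1 | S = 1), part (a)
p_dot_recv_given_dash = p_error       # P(R = 1 | S = 0)

# Part (b): law of total probability.
p_dot_recv = (p_dot_recv_given_dot * p_dot_sent
              + p_dot_recv_given_dash * p_dash_sent)

# Part (c): Bayes' Rule.
p_dot_sent_given_dot_recv = p_dot_recv_given_dot * p_dot_sent / p_dot_recv

print(p_dot_recv_given_dot)         # 7/8
print(p_dot_recv)                   # 25/56
print(p_dot_sent_given_dot_recv)    # 21/25
```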

Question # 8.

Suppose I toss two six-sided dice, like those in this picture:

[Figure: two six-sided dice]

Let the random variable $X_1$ be the number shown on the first die. Let the random variable $X_2$ be the number shown on the second die. The possible outcomes for each of $X_1$ and $X_2$ are 1, 2, 3, 4, 5, or 6. Each of the six outcomes is equally likely. Assume that $X_1$ and $X_2$ are independent.

If it helps you visualize, the joint distribution of $X_1$ and $X_2$ would look like:

                          X_2
            1      2      3      4      5      6
X_1 = 1    1/36   1/36   1/36   1/36   1/36   1/36
X_1 = 2    1/36   1/36   1/36   1/36   1/36   1/36
X_1 = 3    1/36   1/36   1/36   1/36   1/36   1/36
X_1 = 4    1/36   1/36   1/36   1/36   1/36   1/36
X_1 = 5    1/36   1/36   1/36   1/36   1/36   1/36
X_1 = 6    1/36   1/36   1/36   1/36   1/36   1/36

(a.) What is $P(X_1 > 3)$?

Using the marginal distribution of $X_1$, we get

$P(X_1 > 3) = P(X_1 = 4) + P(X_1 = 5) + P(X_1 = 6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$

(b.) Given that $X_1 + X_2 = 10$, what is the probability that $X_1 = 5$ and $X_2 = 5$?

There are 3 ways for the sum to equal 10: (4,6), (5,5), (6,4), and these are all equally likely. Therefore, the probability is $\frac{1}{3}$.

(c.) Given that $X_1 = 5$, what is the expected value of $X_1 + X_2$? (In other words, suppose the first die shows a 5. What is the expected value of the sum of the two die rolls?)

$E[X_2] = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 3.5$

We know that $X_1 = 5$. Therefore, we get:

$E[X_1 + X_2] = E[5 + X_2] = 5 + E[X_2] = 8.5$
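Because all 36 outcomes are equally likely, parts (a) through (c) can also be checked by brute-force enumeration. The sketch below counts outcomes directly; the helper name `prob` and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (x1, x2) outcomes for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on an (x1, x2) pair."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Part (a): P(X1 > 3).
p_a = prob(lambda o: o[0] > 3)

# Part (b): P(X1 = 5, X2 = 5 | X1 + X2 = 10), by restricting to outcomes with sum 10.
sum_is_10 = [o for o in outcomes if sum(o) == 10]
p_b = Fraction(sum(1 for o in sum_is_10 if o == (5, 5)), len(sum_is_10))

# Part (c): E[X1 + X2 | X1 = 5], by averaging the sum over outcomes with x1 = 5.
first_is_5 = [o for o in outcomes if o[0] == 5]
e_c = Fraction(sum(sum(o) for o in first_is_5), len(first_is_5))

print(p_a)    # 1/2
print(p_b)    # 1/3
print(e_c)    # 17/2
```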

(d.) The popular dice game craps begins with a person (the "shooter") rolling two dice. If the sum of the two dice ($X_1 + X_2$) equals 7 or 11, the shooter is said to have rolled a "natural" and automatically wins. If the sum is 2, 3, or 12, the shooter is said to "crap out" and automatically loses. What is the probability that $X_1 + X_2$ equals 2, 3, or 12?

$P(X_1 = 1, X_2 = 1) + P(X_1 = 1, X_2 = 2) + P(X_1 = 2, X_2 = 1) + P(X_1 = 6, X_2 = 6) = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{4}{36} = \frac{1}{9}$

Let's suppose the first time the two dice are rolled, the total is ten ($X_1 + X_2 = 10$). In this case, 10 becomes the "point". The shooter then continues rolling the two dice over and over again (both dice are always thrown at the same time). Each time the two dice are thrown, one of the following three things happens:

If the total is 10, the game ends and the shooter wins.
If the total is 7, the game ends and the shooter loses.
Otherwise, the game continues, and the shooter rolls again.

Theoretically this could continue forever!

(e.) Each time the two dice are rolled, what is the probability the game continues?

The game ends if either a 7 or 10 is rolled. We need to compute these probabilities. There are six ways to roll a 7 and three ways to roll a 10.

$P(X_1 + X_2 = 7 \text{ or } X_1 + X_2 = 10) = \frac{6}{36} + \frac{3}{36} = \frac{9}{36} = \frac{1}{4}$

The probability that the game continues is $1 - P(X_1 + X_2 = 7 \text{ or } X_1 + X_2 = 10)$, or $\frac{3}{4}$.

(f.) Suppose we know the game is going to end on the next roll. What is the probability the shooter wins?

Similar to part (b), if we know we're going to get 7 or 10, there are 9 total possibilities, 3 of which result in a win. Therefore, it is $\frac{3}{9}$, or $\frac{1}{3}$.

(g.) Starting with i = 2, let $U_i = 1$ if the game ends on the i-th roll and 0 otherwise. What is the probability distribution of $U_i$? (Hint: use your answer to (e).)

$U_i \sim \text{Bernoulli}(0.25)$

Assume that each $U_i$ is i.i.d. Let R be a random variable equal to the number of rolls before the game ends. R can be any positive integer (1, 2, 3, ...). We assumed the first roll was 10. If the game ends on the second roll ($U_2 = 1$), then R = 1. If the game ends on the third roll (that is, $U_2 = 0$ and $U_3 = 1$), then R = 2. If the game ends on the fourth roll (that is, $U_2 = 0$, $U_3 = 0$, and $U_4 = 1$), then R = 3, etc. As an interesting side note, it turns out that the probability distribution of the random variable R is known as the geometric distribution.

(h.) What is the probability that R = 3, i.e. the craps game ends after four rolls?

$P(R = 3) = P(U_2 = 0, U_3 = 0, U_4 = 1) = P(U_2 = 0)P(U_3 = 0)P(U_4 = 1) = \frac{3}{4} \cdot \frac{3}{4} \cdot \frac{1}{4} = \frac{9}{64}$

Be careful! Actual rules for craps can differ from what we've assumed here (e.g., sometimes a 12 will end the game as well as a 7). In casinos, betting the "pass line" is equivalent to betting that the shooter wins as we defined it here. After the point is established, you can then "take odds", which here would mean betting that a 10 will be rolled before a 7. The interesting thing is the odds bet is actually a fair bet (if the point is 10, it would pay 2-to-1), i.e. there is no house advantage! Because of this many casinos limit odds bets to 6-7 times your bet on the pass line.

(i.) What is the probability that R > 3, i.e. the craps game lasts longer than four rolls?

Here, you must recognize that this is 1 minus the probability of being less than or equal to 3.

$P(R > 3) = 1 - P(R = 1) - P(R = 2) - P(R = 3) = 1 - \frac{1}{4} - \frac{3}{4} \cdot \frac{1}{4} - \frac{3}{4} \cdot \frac{3}{4} \cdot \frac{1}{4} = \frac{27}{64} \approx 0.422$

We could go on from here, and you'd see that, despite it being possible for craps to continue forever, there's a 90% probability the game ends within the first 8 rolls.
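The craps probabilities in parts (d) through (i) can be verified by counting dice totals and then applying the geometric distribution. This is a minimal sketch under the simplified rules assumed in the question (point of 10, game ends only on a 7 or a 10); the helper names `p_total` and `p_r_equals` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Totals for all 36 equally likely rolls of two dice.
totals = [x1 + x2 for x1, x2 in product(range(1, 7), repeat=2)]

def p_total(*values):
    """Probability that the total of two dice is one of the given values."""
    return Fraction(sum(1 for t in totals if t in values), len(totals))

p_crap_out = p_total(2, 3, 12)             # part (d): 1/9
p_end = p_total(7, 10)                     # part (e): 1/4
p_continue = 1 - p_end                     # part (e): 3/4
p_win_given_end = p_total(10) / p_end      # part (f): 1/3

def p_r_equals(k):
    """P(R = k): k - 1 rolls that continue the game, then one roll that ends it."""
    return p_continue ** (k - 1) * p_end

p_r3 = p_r_equals(3)                                   # part (h): 9/64
p_r_gt3 = 1 - sum(p_r_equals(k) for k in (1, 2, 3))    # part (i): 27/64
p_within_8 = 1 - p_continue ** 8                       # P(R <= 8)

print(p_crap_out, p_end, p_continue, p_win_given_end)
print(p_r3, p_r_gt3, float(p_r_gt3))
print(float(p_within_8))
```

Note that $P(R > 3)$ is simply the probability that the first three post-point rolls all continue the game, $(3/4)^3$, which gives the same value as subtracting the three terms one at a time.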

Question # 9.

Suppose we are working with 0-1 data (i.e., a dummy variable), and as usual we have assumed that $X_i \sim \text{Bernoulli}(p)$ i.i.d. We are going to look at a sample of size n, and use the sample proportion of ones, $\hat{p}$, as an estimator of p. Recall that for dummy variables, the sample proportion is an average:

$\hat{p} = \frac{X_1 + X_2 + \ldots + X_n}{n}$

(a.) Suppose that n is large enough for us to use the Central Limit Theorem. What is the sampling distribution of $\hat{p}$? (HINT: Your answer should depend on the unknown parameter p.)

The sampling distribution is $\hat{p} \sim N\left(p, \frac{p(1 - p)}{n}\right)$.

Now suppose we want to build a confidence interval for p, but we run into two issues. First, we have a sample of only n = 10 observations. Second, our actual data is

{1, 1, 1, 1, 1, 1, 1, 1, 1, 1}

All ten of our observations in the sample are equal to one, which means that $\hat{p} = 1$!

(b.) Give a 90% confidence interval for p using your answer to part (a). (NOTE: the appropriate critical value here is 1.64, but it doesn't matter, you still get a very silly answer!)

Based on the sampling distribution from part (a), the confidence interval is (1, 1). This is pretty obviously messed up... we are in no way absolutely certain that the true value of p is equal to 1 based on a sample of n = 10 observations!

Because n = 10 is a relatively small sample size and our data is highly non-normal, we should probably not rely on the Central Limit Theorem here. However, we can actually build a 90% confidence interval for p without using the CLT. Remember that p is a probability, so it must be somewhere between 0 and 1.

(c.) Suppose we knew the true value of p. Without using the CLT, we can find the sampling distribution of $\hat{p}$ by recognizing that $\hat{p} = \frac{Y}{n}$, where Y is a random variable. What is the probability distribution of Y?

The random variable Y is the sum of n = 10 i.i.d. Bernoulli random variables. Therefore, the distribution of Y is Binomial(n, p), or Binomial(10, p).

(d.) Suppose that p = 0.9. What is $P(\hat{p} = 1)$ in a sample of size n = 10? Should our 90% confidence interval include p = 0.9? (i.e., is 0.9 a reasonable value of p?)

With p = 0.9, $P(\hat{p} = 1) = (0.9)^{10} = 0.349$. Our confidence interval SHOULD include p = 0.9. With n = 10 observations and p = 0.9, it is definitely possible (there is a 35% chance) we would see a value of $\hat{p} = 1$.

(e.) Suppose that p = 0.7. What is $P(\hat{p} = 1)$ in a sample of size n = 10? Should our 90% confidence interval include p = 0.7?

With p = 0.7, $P(\hat{p} = 1) = (0.7)^{10} = 0.028$. Our confidence interval should probably NOT include p = 0.7. With n = 10 observations and p = 0.7, it is pretty unlikely (there is only a 3% chance) we would see a value of $\hat{p} = 1$.

(f.) Based on the sample of n = 10 observations given above, give a 90% confidence interval for p without using the Central Limit Theorem.

Our confidence interval should obviously include p = 1. What is the smallest value of p we'd call reasonable? Well, for a 90% CI, we'd rule out any p for which $P(\hat{p} = 1) < 0.10$. Solving $p^{10} = 0.1$ for p, we get $p = (0.1)^{1/10} = 0.794$. The exact 90% confidence interval is (0.794, 1).
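The contrast between the CLT-based interval in part (b) and the exact interval in part (f) can be sketched in a few lines of Python. The helper name `prob_all_ones` is illustrative; it uses the fact from part (c) that $\hat{p} = 1$ exactly when all n Bernoulli draws equal one.

```python
import math

n = 10
data = [1] * n
p_hat = sum(data) / n

# Part (b): CLT-based 90% interval. The estimated standard error is zero when p_hat = 1.
se = math.sqrt(p_hat * (1 - p_hat) / n)
clt_interval = (p_hat - 1.64 * se, p_hat + 1.64 * se)

def prob_all_ones(p, n=n):
    """P(p_hat = 1) when Y ~ Binomial(n, p), i.e. all n draws equal one."""
    return p ** n

# Parts (d) and (e): how likely is the observed sample under each candidate p?
p_d = prob_all_ones(0.9)
p_e = prob_all_ones(0.7)

# Part (f): smallest p with P(p_hat = 1) >= 0.10, from solving p**n = 0.10.
lower = 0.10 ** (1 / n)
exact_interval = (lower, 1.0)

print("CLT interval:", clt_interval)
print(f"P(p_hat = 1 | p = 0.9) = {p_d:.3f}")
print(f"P(p_hat = 1 | p = 0.7) = {p_e:.3f}")
print(f"exact interval: ({lower:.3f}, 1)")
```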
