August 01 EXAMINATIONS
Solution

Part I

(1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B)
(2) In a large random sample, the probability that less than 4% are in favour of this policy is 0.67. The sample size is closest to (A)
(3) The 90th percentile of daily sales is closest to (D)
(4) In the next 4 days, the probability that their average daily sales exceed $600 is closest to (A)
(5) In the next 4 days, the probability that the daily sales exceed $500 in only one of these days is closest to (E)
(6) If Line 1 and Line 2 are independent, the probability that Line 1 produces more parts than Line 2 in any single day is closest to (D)
(7) If Line 1 and Line 2 are independent, what is the probability that the average number of parts produced by Line 1 is greater than that produced by Line 2 in the next 5 days? (E)
(8) This firm specifies that the estimation of this proportion has a margin of error 0.05 with 90% confidence. The smallest sample size required is closest to (B)
(9) The smallest sample size required is closest to (A)
(10) A 90% confidence interval for the real proportion is closest to (D)
(11) What is the standard deviation of your sample? (D)
(12) Which one of the following statements is true? (B)
(13) How will it change your OLS estimate for the slope of the regression line, $\beta_1$? (B)
(14) How do you interpret the slope estimate for $x_3$? (C)
(15) Which one of these variables will cause perfect multicollinearity? (E)
(16) When the true value under the alternative hypothesis shifts closer to the value under the null hypothesis, while the critical value stays the same, (A)
(17) If you do not find out about his systematic mistakes, what consequences will they have on the results of your tests? (B)
(18) What kind of data is it? (D)
(19) What is your set of hypotheses corresponding to your research question? (A)
(20) What is the P-value of your test for the hypotheses that you identified in question (19)? (C)
UNIVERSITY OF TORONTO
Faculty of Arts and Science
August 01 EXAMINATIONS
ECO220Y1Y
Duration - 3 hours
Examination Aids: Calculator

Solution

Part II: Short Answer Questions [60 points]

(1) [12 points] You are hired as a consultant by the marketing department of Crown Bank and asked to analyze the data from a customer satisfaction survey. A key measure of customer satisfaction is the response, on a scale from 1 to 10, to the question "Considering all business you do with Crown Bank, what is your overall satisfaction with Crown Bank?" If the response is 9 or 10, the customer is considered delighted by Crown Bank. The department wants to know if customers are more likely to be delighted in the areas with more Crown Bank ATMs. They obtained random samples from two areas that have the same area, but vary in ATM density (number of ATMs per capita). The following table shows the result.

                             Area 1    Area 2
ATM density (per km²)          10         3
Total responses               175       175
Responses with 9 or 10        121       105

(a) [4 points] What is the set of hypotheses that the marketing department wants to test? [A set of hypotheses]

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$

where $p_1$ and $p_2$ are the proportions of delighted customers in Area 1 and Area 2.

(b) [8 points] Conduct the test for the hypotheses you identified in question (a) by the P-value method. Use the significance level $\alpha = 0.05$. Write a short report to the marketing department about the result. For full marks, you should clearly state the test statistic, the P-value, and the decision. [Analysis, 3 items & 3 or 4 sentences]
Since each sample contains at least 10 successes and 10 failures (121 and 54 in Area 1, 105 and 70 in Area 2), the success/failure conditions are satisfied. Thus, we can use the normal approximation for the distribution of the difference in sample proportions.

Test statistic:
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \dfrac{\frac{121}{175} - \frac{105}{175}}{\sqrt{0.646\,(1 - 0.646)\left(\frac{1}{175} + \frac{1}{175}\right)}} = 1.788$, where the pooled proportion is $\hat{p} = \dfrac{121 + 105}{175 + 175} = 0.646$.

P-value: $P(Z > 1.788) = 1 - 0.9633 = 0.0367$

Decision: Since the P-value is 0.037, which is less than 0.05, we reject the null hypothesis. There is sufficient evidence to suggest that customers are more likely to be delighted in the areas with more Crown Bank ATMs.

(2) [15 points] Insurance companies track life expectancy information to assist in determining the cost of life insurance policies. Last year the average life expectancy of all policyholders was 77 years. ABI Insurance wants to determine if their clients now have a longer life expectancy, on average, so they randomly sample some of their recently paid policies. The insurance company will only change their premium structure if there is evidence that people who buy their policies are living longer than before. The sample has 28 observations, a mean of 78.6 years, and a standard deviation of 4.48 years.

(a) [2 points] What set of hypotheses does ABI Insurance wish to test? [A set of hypotheses]

$H_0: \mu = 77$, $H_1: \mu > 77$

(b) [4 points] Conduct the test for the hypotheses you identified in question (a) by the rejection region method. For full marks, you should clearly state the rejection region, the test statistic, and the decision. Based on the result, what will the insurance company do to its premium structure? [Analysis, 3 items & 2-3 sentences]
Since the sample size is $n = 28$, the t statistic has 27 degrees of freedom. The critical value for a one-sided test with $\alpha = 0.05$ and 27 degrees of freedom is 1.703. Thus the rejection region is $t > 1.703$.

The test statistic is
$t = \dfrac{\bar{x} - \mu_0}{SE(\bar{x})} = \dfrac{78.6 - 77}{4.48/\sqrt{28}} = \dfrac{78.6 - 77}{0.847} = 1.890$.

Since $t = 1.890 > 1.703$, we reject the null hypothesis. There is sufficient evidence to suggest that the life expectancy of ABI Insurance policyholders has increased from 77 years. Thus, the company will change its premium structure.

(c) [5 points] Suppose the true mean life expectancy of policyholders is 80.18 years. Obtain the power of the test. [Analysis, one value]

Given $\alpha = 0.05$, the critical value of the test in the original units is $c = 77 + 1.703 \times 0.847 = 78.44$. Given that the mean of the distribution under the alternative is 80.18, the t statistic corresponding to the critical value is
$t = \dfrac{78.44 - 80.18}{0.847} = -2.053$.
Thus, the power of the test, the probability of rejecting the false null hypothesis, is given by $P(t > -2.052) = 1 - 0.025 = 0.975$.

[Figure: hypothesis test with $\alpha = 0.05$ ($H_0: \mu = \mu_0$, $H_A: \mu > \mu_0$), showing the sampling distributions centred at 77 and 80.18 with the critical value near 78.4 marked.]
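The arithmetic in (b) and (c) can be checked with a short Python sketch. It assumes n = 28 (the sample size implied by the 27 degrees of freedom behind the critical value 1.703 and by the standard error 0.847) and takes the critical value from the t table rather than computing it. The helper names `t_pdf` and `t_cdf` are illustrative, not part of the exam; the power calculation mirrors the solution's shifted-t approach rather than a noncentral t calculation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Sample summary (n = 28 is an assumption, see the note above)
n, x_bar, s = 28, 78.6, 4.48
mu_0, mu_true = 77.0, 80.18
t_crit = 1.703                    # one-sided 5% table value for 27 df

se = s / math.sqrt(n)             # standard error of the mean
t_stat = (x_bar - mu_0) / se      # test statistic for part (b)
reject = t_stat > t_crit

cutoff = mu_0 + t_crit * se       # critical value in years
t_shift = (cutoff - mu_true) / se # cutoff re-centred at the true mean
power = 1 - t_cdf(t_shift, n - 1) # part (c)

print(f"SE = {se:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
print(f"cutoff = {cutoff:.2f}, shifted t = {t_shift:.3f}, power = {power:.3f}")
```

Working with unrounded intermediate values can move the last digit slightly relative to the hand calculation, which rounds the standard error to 0.847.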
(d) [4 points] Obtain the 0.99 confidence interval for the mean life expectancy of the policyholders and interpret the result. [Analysis, a set of values & 1-2 sentences]

For $\nu = n - 1 = 27$, the critical value for $\alpha/2 = 0.005$ is 2.771.
$\bar{x} \pm 2.771 \times SE(\bar{X}) = 78.6 \pm 2.771 \times 0.847 = (76.254, 80.946)$
With 0.99 confidence, the mean life expectancy of policyholders of ABI Insurance is at least 76.254 years and at most 80.946 years.

(3) [18 points] A researcher would like to know if the productivity of factory workers changes with better lighting in the room. In order to investigate this question, he collected data from a factory. He randomly chose 17 workers and sent them to work in room 1. He randomly chose another set of 17 workers and sent them to work in room 2. Then he set the lighting of room 1 at the regular level and the lighting of room 2 to be brighter. Other than the lighting, work conditions in the two rooms were identical. He collected data on the daily productivity of each worker in the two rooms. The theoretical model to be estimated is as follows:

$productivity_i = \beta_0 + \beta_1\, room_i + \beta_2\, age_i + \beta_3\, experience_i + \varepsilon_i$

where
$productivity_i$ = number of units produced by worker i on that day
$room_i$ = 1 if worker i is in room 2, 0 if worker i is in room 1
$age_i$ = age of worker i
$experience_i$ = years of experience of worker i at the factory

The regression result is given as follows (standard errors in parentheses).

$productivity_i = -4.57 + 5.94\, room_i + 1.87\, age_i + 0.79\, experience_i + e_i$
                  (9.68)  (1.63)          (0.51)         (0.50)
$n = 34$, $R^2 = 0.8584$
(a) [4 points] What is the set of hypotheses that the researcher would like to test? [A set of hypotheses]

$H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$

(b) [4 points] Conduct the test you stated in (a) by the rejection region method. For full marks, you have to clearly state the rejection region, the test statistic, and the decision. Based on the result of the test, report and interpret the result of the research. [Analysis, 3 items, 2-3 sentences]

Given the significance level $\alpha = 0.05$ and $n - k - 1 = 34 - 3 - 1 = 30$ degrees of freedom, the corresponding critical value is 2.042. Thus, the rejection region is $t > 2.042$ or $t < -2.042$. Based on the regression result, the test statistic is
$t = \dfrac{b_1}{SE(b_1)} = \dfrac{5.94}{1.63} = 3.644$.
Since $t = 3.644 > 2.042$, it is in the rejection region. Thus we reject the null hypothesis in favor of the alternative. There is sufficient evidence to suggest that $\beta_1$ is statistically significantly different from zero. The estimate suggests that the lighting in a room changes the productivity of workers: holding age and experience fixed, workers in the brighter room produced about 5.94 more units per day on average.

(c) [4 points] Conduct the test of overall significance for this model by the rejection region method. Use the significance level $\alpha = 0.05$. For full marks, you have to clearly state the rejection region, the test statistic, and the decision. [Analysis, 3 items & 1-2 sentences]

With significance level $\alpha = 0.05$ and degrees of freedom $(k, n - k - 1) = (3, 30)$, the rejection region is $F > 2.92$. The F statistic is obtained as follows.
$F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)} = \dfrac{0.8584 / 3}{(1 - 0.8584)/30} = 60.62$
Since $F = 60.62 > 2.92$, we reject the null hypothesis in favor of the alternative. There is enough evidence to suggest that at least one slope coefficient is statistically significantly different from zero.
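The two test statistics in (b) and (c) follow directly from the reported regression output, as this small Python sketch illustrates. The critical values 2.042 and 2.92 are assumed table values for 30 and (3, 30) degrees of freedom at the 5% level; the variable names are illustrative.

```python
# Reported regression output for the lighting experiment
n, k = 34, 3                  # observations and number of slope coefficients
b_room, se_room = 5.94, 1.63  # estimate and standard error for room_i
r_squared = 0.8584

# Assumed table values at the 5% level
t_crit = 2.042                # two-sided, 30 degrees of freedom
f_crit = 2.92                 # F with (3, 30) degrees of freedom

df_resid = n - k - 1          # residual degrees of freedom

# Part (b): t test of H0: beta_1 = 0 against a two-sided alternative
t_stat = b_room / se_room
reject_t = abs(t_stat) > t_crit

# Part (c): overall significance F test computed from R-squared
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)
reject_f = f_stat > f_crit

print(f"df = {df_resid}, t = {t_stat:.3f}, reject H0: {reject_t}")
print(f"F = {f_stat:.2f}, reject H0: {reject_f}")
```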
(d) [6 points] Suppose that the researcher lets each worker choose whether to work in the room with the brighter lighting or the one with the regular lighting. Then, which assumptions, if any, of the multiple regression model will be violated? Can the coefficient estimates obtained from this sample be reliable? Explain. [4-5 sentences]

The exogeneity assumption on x is violated (i.e. $E(x_j \varepsilon_i) = 0$ for all i and j). If workers choose the room they work in, it is likely to create endogeneity. For example, workers who care about producing more may tend to choose the brighter room. This means that workers' morale may be a lurking variable that is positively correlated with both $x_i$ and $y_i$. In this case, the coefficient estimate for $room_i$ is biased upward and is not reliable as an estimate of the effect of brighter lighting on productivity.

(4) [15 points] A sales manager is interested in determining if there is a relationship between college GPA and sales performance among salespeople hired within the last year. He selected a sample of recently hired salespeople and recorded the number of units each salesperson sold in the last month. Variables obtained were:

$ID_i$ = identification number of salesperson i,
$unitssold_i$ = the number of units sold last month by salesperson i,
$GPA_i$ = college GPA of salesperson i.

The mean of $unitssold_i$ was 22.4 units and the mean of $GPA_i$ was 3.09. The table below shows the regression result.

. reg unitssold GPA

      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  1,    13) =   46.28
       Model |  123.041121     1  123.041121           Prob > F      =  0.0000
    Residual |   34.558879    13   2.6583753           R-squared     =  0.7807
-------------+------------------------------           Adj R-squared =  0.7638
       Total |       157.6    14  11.2571429           Root MSE      =  1.6305

------------------------------------------------------------------------------
   unitssold |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         GPA |   7.396965   1.087268     6.80   0.000     5.048066    9.745865
       _cons |  -.4270344   3.381615    -0.13   0.901     -7.73257    6.878501
------------------------------------------------------------------------------

Mean of GPA = 3.09

(a) [4 points] Write down the theoretical model that is being estimated. [An equation]

$unitssold_i = \beta_0 + \beta_1\, GPA_i + \varepsilon_i$
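(b) [5 points] Interpret the coefficient estimate for $GPA_i$. [Interpretation, 1 sentence]

An increase in GPA of 1 point is associated with an increase of 7.4 units, on average, in the number of units a salesperson sold last month.

(c) [6 points] Obtain the 0.9 prediction band of sales performance for a salesperson with a GPA of 3.0. [Analysis & a pair of values]

$\hat{y} \pm t_{\alpha/2} \sqrt{SE(b_1)^2 (x_\nu - \bar{x})^2 + \dfrac{s_e^2}{n} + s_e^2}$
$= (-0.427 + 7.397 \times 3.0) \pm 1.771 \sqrt{(1.087)^2 (3 - 3.09)^2 + \dfrac{(1.631)^2}{15} + (1.631)^2}$
$= (18.78, 24.75)$

Here $x_\nu = 3.0$ is the GPA of the new salesperson, $s_e = 1.631$ is the Root MSE, and 1.771 is the t critical value for $\alpha/2 = 0.05$ with $n - 2 = 13$ degrees of freedom.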
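The prediction interval in (c) can be reproduced with the Python sketch below. It assumes an intercept of -0.427 (the value consistent with the reported t statistic of -0.13 and the upper confidence limit 6.878501), a slope rounded to 7.397, and the table critical value 1.771 for 13 degrees of freedom; the variable names are illustrative.

```python
import math

# Regression output, rounded as in the hand calculation
b0, b1 = -0.427, 7.397        # intercept (assumed, see note above) and slope
se_b1 = 1.087                 # standard error of the slope
s_e = 1.631                   # Root MSE (residual standard deviation)
n, x_bar = 15, 3.09           # sample size and mean GPA
x_new = 3.0                   # GPA of the salesperson being predicted
t_crit = 1.771                # table value: 13 df, 5% in each tail

# Point prediction at x_new
y_hat = b0 + b1 * x_new

# Standard error for predicting one new observation:
# slope uncertainty + uncertainty in the mean + individual scatter
se_pred = math.sqrt(se_b1**2 * (x_new - x_bar)**2 + s_e**2 / n + s_e**2)

margin = t_crit * se_pred
lower, upper = y_hat - margin, y_hat + margin

print(f"y_hat = {y_hat:.2f}, SE = {se_pred:.3f}")
print(f"90% prediction interval: ({lower:.2f}, {upper:.2f})")
```

Dropping the final `s_e**2` term inside the square root would give the narrower confidence interval for the mean units sold at that GPA rather than the interval for an individual salesperson.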