August 2012 EXAMINATIONS Solution Part I




(1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to: (B)
(2) In a large random sample, the probability that less than 4% are in favour of this policy is 0.67. The sample size is closest to: (A)
(3) The 90th percentile of daily sales is closest to: (D)
(4) In the next 4 days, the probability that their average daily sales exceed $600 is closest to: (A)
(5) In the next 4 days, the probability that the daily sales exceed $500 in only one of these days is closest to: (E)
(6) If Line 1 and Line 2 are independent, the probability that Line 1 produces more parts than Line 2 in any single day is closest to: (D)
(7) If Line 1 and Line 2 are independent, what is the probability that the average number of parts produced by Line 1 is greater than that produced by Line 2 in the next 5 days? (E)
(8) This firm specifies that the estimation of this proportion has a margin of error 0.05 with 90% confidence. The smallest sample size required is closest to: (B)
(9) The smallest sample size required is closest to: (A)
(10) A 90% confidence interval for the real proportion is closest to: (D)
(11) What is the standard deviation of your sample? (D)
(12) Which one of the following statements is true? (B)
(13) How will it change your OLS estimate for the slope of the regression line, $b_1$? (B)
(14) How do you interpret the slope estimate for $x_3$? (C)
(15) Which one of these variables will cause perfect multicollinearity? (E)
(16) When the true value under the alternative hypothesis shifts closer to the value under the null hypothesis, while the critical value stays the same: (A)
(17) If you do not find out about his systematic mistakes, what consequences will they have on the results of your tests? (B)
(18) What kind of data is it? (D)
(19) What is your set of hypotheses corresponding to your research question? (A)
(20) What is the P-value of your test for the hypotheses that you identified in question (19)? (C)

UNIVERSITY OF TORONTO
Faculty of Arts and Science
August 2012 EXAMINATIONS
ECO220Y1Y
Duration - 3 hours
Examination Aids: Calculator

Solution Part II: Short Answer Questions [60 points]

(1) [12 points] You are hired as a consultant by the marketing department of Crown Bank and asked to analyze the data from a customer satisfaction survey. A key measure of customer satisfaction is the response, on a scale from 1 to 10, to the question "Considering all business you do with Crown Bank, what is your overall satisfaction with Crown Bank?" If the response is 9 or 10, the customer is considered delighted by Crown Bank. The department wants to know if customers are more likely to be delighted in the areas with more Crown Bank ATMs. They obtained random samples from two areas of equal size that differ in ATM density (number of ATMs per capita). The following table shows the result.

                            Area 1   Area 2
ATM density (per km^2)        10        3
Total responses              175      175
Responses with 9 or 10       121      105

(a) [4 points] What is the set of hypotheses that the marketing department wants to test? [A set of hypotheses]

$H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 > 0$

(b) [8 points] Conduct the test for the hypotheses you identified in question (a) by the P-value method. Use the significance level $\alpha = 0.05$. Write a short report to the marketing department about the result. For full marks, you should clearly state the test statistic, the P-value, and the decision. [Analysis, 3 items & 3 or 4 sentences]

Each area has at least 10 responses in each category (121 delighted and 54 not in Area 1; 105 and 70 in Area 2), so the success/failure conditions are satisfied. Thus, we can use the normal approximation for the distribution of the difference in sample proportions.

Test statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = 1.788, \quad \text{where } \hat{p} = \frac{121 + 105}{175 + 175} = 0.646$$

P-value: $P(Z > 1.788) = 1 - 0.9633 = 0.0367$

Decision: Since the P-value is 0.037, less than 0.05, we reject the null hypothesis. There is sufficient evidence to suggest that customers are more likely to be delighted in the areas with more Crown Bank ATMs.

(2) [15 points] Insurance companies track life expectancy information to assist in determining the cost of life insurance policies. Last year the average life expectancy of all policyholders was 77 years. ABI Insurance wants to determine if their clients now have a longer life expectancy, on average, so they randomly sample some of their recently paid policies. The insurance company will only change their premium structure if there is evidence that people who buy their policies are living longer than before. The sample has 28 observations, a mean of 78.6 years, and a standard deviation of 4.48 years.

(a) [2 points] What set of hypotheses does ABI Insurance wish to test? [A set of hypotheses]

$H_0: \mu = 77$, $H_1: \mu > 77$

(b) [4 points] Conduct the test for the hypotheses you identified in question (a) by the rejection region method. For full marks, you should clearly state the rejection region, the test statistic, and the decision. Based on the result, what will the insurance company do to its premium structure? [Analysis, 3 items & 2-3 sentences]

Since the sample size is $n = 28$, the degrees of freedom for the t statistic is 27. The critical value for $\alpha = 0.05$ for a one-sided test with 27 degrees of freedom is 1.703. Thus the rejection region is $t > 1.703$.

The test statistic is

$$t = \frac{\bar{x} - \mu_0}{SE(\bar{x})} = \frac{78.6 - 77}{4.48/\sqrt{28}} = \frac{78.6 - 77}{0.847} = 1.890$$

Since $t = 1.890 > 1.703$, we reject the null hypothesis. There is sufficient evidence to suggest that the life expectancy of policyholders of ABI Insurance has increased from 77 years. Thus, the company will change its premium structure.

(c) [5 points] Suppose the true mean life expectancy of policyholders is 80.18 years. Obtain the power of the test. [Analysis, one value]

Given $\alpha = 0.05$, the critical value of the test in the original units is $c = 77 + 1.703 \times 0.847 = 78.44$. Given that the mean of the distribution under the alternative is 80.18, the t statistic corresponding to the critical value is

$$t = \frac{78.44 - 80.18}{0.847} = -2.053$$

Thus, the power of the test, the probability of rejecting a false null hypothesis, is given by $P(t > -2.05) = 1 - 0.025 = 0.975$.

[Figure: hypothesis test at $\alpha = .05$ ($H_0: \mu = \mu_0$, $H_A: \mu > \mu_0$), with 77, the critical value (about 78.4), and 80.18 marked on the horizontal axis.]
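The arithmetic in (b) and (c) can be checked with a short script. This is a minimal sketch using only the Python standard library. The critical value 1.703 is a table value entered by hand, not computed, and the helpers `t_pdf` and `t_cdf` are hypothetical names introduced here: they approximate the Student t distribution function by Simpson's rule, so the power printed is an approximation of the table lookup used in the solution.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Sample summary and hypothesized mean from question (2)
n, xbar, s, mu0 = 28, 78.6, 4.48, 77.0
t_crit = 1.703   # one-sided 5% critical value, 27 df (table value, assumed)
mu_alt = 80.18   # true mean assumed in part (c)

se = s / math.sqrt(n)            # standard error of the sample mean
t_stat = (xbar - mu0) / se       # test statistic for part (b)
c = mu0 + t_crit * se            # critical value in years
t_alt = (c - mu_alt) / se        # critical value standardized under the alternative
power = 1 - t_cdf(t_alt, n - 1)  # P(reject H0 | mu = mu_alt)

print(f"SE = {se:.3f}, t = {t_stat:.3f}, reject H0: {t_stat > t_crit}")
print(f"c = {c:.2f}, t under alternative = {t_alt:.3f}, power = {power:.3f}")
```

Carrying the unrounded standard error through the calculation is why the standardized value agrees with the solution's -2.053 rather than the slightly different figure obtained from the rounded 78.44 and 0.847.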

(d) [4 points] Obtain the 0.99 confidence interval for the mean life expectancy of the policyholders and interpret the result. [Analysis, a set of values & 1-2 sentences]

For $\nu = n - 1 = 27$, the critical value for an upper-tail probability of $\alpha/2 = 0.005$ is 2.771.

$$\bar{x} \pm 2.771 \times SE(\bar{x}) = 78.6 \pm 2.771 \times 0.847 = (76.254, 80.946)$$

With 0.99 confidence, the mean life expectancy of policyholders of ABI Insurance is at least 76.254 years and at most 80.946 years.

(3) [18 points] A researcher would like to know if the productivity of factory workers changes with better lighting in the room. In order to investigate this question, he collected data from a factory. He randomly chose 17 workers and sent them to work in room 1. He randomly chose another set of 17 workers and sent them to work in room 2. Then he set the lighting of room 1 at the regular level and the lighting of room 2 to be brighter. Other than the lighting, work conditions in the two rooms were identical. He collected data on the daily productivity of each worker in the two rooms. The theoretical model to be estimated is as follows:

$$productivity_i = \beta_0 + \beta_1 room_i + \beta_2 age_i + \beta_3 experience_i + \varepsilon_i$$

where
$productivity_i$ = number of units produced by worker $i$ on that day
$room_i$ = 1 if worker $i$ is in room 2, 0 if worker $i$ is in room 1
$age_i$ = age of worker $i$
$experience_i$ = years of experience of worker $i$ at the factory

The regression result is given as follows (standard errors in parentheses, listed in the order intercept, room, age, experience).

$$productivity_i = -4.57 + 5.94\, room_i + 1.87\, age_i + 0.79\, experience_i + e_i$$

(9.68) (1.63) (0.51) (0.50)

$n = 34$, $R^2 = 0.8584$

(a) [4 points] What is the set of hypotheses that the researcher would like to test? [A set of hypotheses]

$H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$

(b) [4 points] Conduct the test you stated in (a) by the rejection region method. For full marks, you have to clearly state the rejection region, the test statistic, and the decision. Based on the result of the test, report and interpret the result of the research. [Analysis, 3 items, 2-3 sentences]

Given significance level $\alpha = 0.05$ and degrees of freedom $n - k - 1 = 30$, the corresponding critical value is 2.042. Thus, the rejection region is $t > 2.042$ or $t < -2.042$. Based on the regression result, the test statistic is

$$t = \frac{b_1}{SE(b_1)} = \frac{5.94}{1.63} = 3.644$$

Since $t = 3.644 > 2.042$, it is in the rejection region. Thus we reject the null hypothesis in favor of the alternative. There is sufficient evidence to suggest that $\beta_1$ is statistically significantly different from zero. This estimate suggests that the lighting in a room changes the productivity of workers.

(c) [4 points] Conduct the test of overall significance for this model by the rejection region method. Use the significance level $\alpha = 0.05$. For full marks, you have to clearly state the rejection region, the test statistic, and the decision. [Analysis, 3 items & 1-2 sentences]

With significance level $\alpha = 0.05$ and degrees of freedom $(k, n - k - 1) = (3, 30)$, the rejection region is $F > 2.92$. The F statistic is obtained as follows.

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} = \frac{0.8584/3}{(1 - 0.8584)/30} = 60.62$$

Since $F = 60.62 > 2.92$, we reject the null hypothesis in favor of the alternative. There is enough evidence to suggest that at least one beta is statistically significantly different from zero.
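Both test statistics in (b) and (c) follow directly from the reported regression output, as the sketch below shows. It assumes the critical values 2.042 (t with 30 degrees of freedom, two-sided 5%) and 2.92 (F with 3 and 30 degrees of freedom, 5%) are read from tables and entered by hand; the variable names are illustrative only.

```python
# Reported regression output for question (3)
b1, se_b1 = 5.94, 1.63     # coefficient on room and its standard error
n, k, r2 = 34, 3, 0.8584   # observations, number of regressors, R-squared

# Table values entered by hand (assumed, not computed here)
t_crit = 2.042             # two-sided 5%, n - k - 1 = 30 degrees of freedom
f_crit = 2.92              # 5%, (3, 30) degrees of freedom

# t test for H0: beta_1 = 0
t_stat = b1 / se_b1
reject_t = abs(t_stat) > t_crit

# F test of overall significance, computed from R-squared
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
reject_f = f_stat > f_crit

print(f"t = {t_stat:.3f}, reject H0: {reject_t}")
print(f"F = {f_stat:.2f}, reject H0: {reject_f}")
```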

(d) [6 points] Suppose that the researcher lets each worker choose whether to work in the room with the brighter lighting or the one with the regular lighting. Then, which assumptions, if any, of the multiple regression model will be violated? Can the coefficient estimates obtained from this sample be reliable? Explain. [4-5 sentences]

The exogeneity assumption for $x$ is violated (i.e., $E(x_{ji}\varepsilon_i) = 0$ for all $i$ and $j$ no longer holds). If workers choose the room they work in, this is likely to create endogeneity. For example, workers who care about producing more may tend to choose the brighter room. This means that workers' morale may be a lurking variable, positively correlated with both $x_i$ and $y_i$. In this case, the coefficient estimate for $room_i$ is biased upward and is not reliable as an estimate of the effect of brighter lighting on productivity.

(4) [15 points] A sales manager is interested in determining if there is a relationship between college GPA and sales performance among salespeople hired within the last year. He selected a sample of recently hired salespeople and recorded the number of units each salesperson sold in the last month. Variables obtained were:

$ID_i$ = identification number of salesperson $i$
$unitssold_i$ = the number of units sold last month by salesperson $i$
$GPA_i$ = college GPA of salesperson $i$

The mean of $unitssold_i$ was 22.4 units and the mean of $GPA_i$ was 3.09. The table below shows the regression result.

. reg unitssold GPA

      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  1,    13) =   46.28
       Model |  123.041121     1  123.041121           Prob > F      =  0.0000
    Residual |   34.558879    13   2.6583753           R-squared     =  0.7807
-------------+------------------------------           Adj R-squared =  0.7638
       Total |       157.6    14  11.2571429           Root MSE      =  1.6305

------------------------------------------------------------------------------
   unitssold |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         GPA |   7.396965   1.087268     6.80   0.000     5.048066    9.745865
       _cons |  -.4270344   3.381615    -0.13   0.901     -7.73257    6.878501
------------------------------------------------------------------------------

Mean of GPA = 3.09

(a) [4 points] Write down the theoretical model that is being estimated. [An equation]

$$unitssold_i = \beta_0 + \beta_1 GPA_i + \varepsilon_i$$

(b) [5 points] Interpret the coefficient estimate for $GPA_i$. [Interpretation, 1 sentence]

An increase in GPA of 1 point is associated with an increase of 7.4 units, on average, in the number of units sold last month by a salesperson.

(c) [6 points] Obtain the 0.9 prediction band of sales performance for a salesperson with a GPA of 3.0. [Analysis & a pair of values]

$$\hat{y} \pm t_{\alpha/2} \sqrt{SE(b_1)^2 (x - \bar{x})^2 + \frac{s_e^2}{n} + s_e^2}$$

$$= (-0.427 + 7.397 \times 3.0) \pm 1.771 \sqrt{(1.087)^2 (3 - 3.09)^2 + \frac{(1.631)^2}{15} + (1.631)^2} = (18.78, 24.75)$$
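The prediction interval in (c) can be reproduced from the regression output with a few lines of Python. This is a sketch under stated assumptions: the coefficients, standard error, and root MSE are copied from the Stata table at full precision, the mean GPA of 3.09 is the value given in the question, and the critical value 1.771 (t with 13 degrees of freedom, 5% in each tail) is a table value entered by hand. The function name `prediction_interval` is hypothetical.

```python
import math

def prediction_interval(x, b0, b1, se_b1, s_e, n, x_bar, t_crit):
    """Prediction interval for one new observation at x in simple regression.

    The standard error combines the uncertainty in the slope, the uncertainty
    in the mean response, and the residual variance of a single observation.
    """
    y_hat = b0 + b1 * x
    se_pred = math.sqrt(se_b1 ** 2 * (x - x_bar) ** 2 + s_e ** 2 / n + s_e ** 2)
    half_width = t_crit * se_pred
    return y_hat - half_width, y_hat + half_width

# Values taken from the regression output for question (4)
lower, upper = prediction_interval(
    x=3.0,
    b0=-0.4270344,   # _cons
    b1=7.396965,     # coefficient on GPA
    se_b1=1.087268,  # standard error of the GPA coefficient
    s_e=1.6305,      # Root MSE
    n=15,
    x_bar=3.09,      # mean GPA stated in the question
    t_crit=1.771,    # table value, 13 df, 0.9 confidence (assumed)
)

print(f"0.9 prediction interval: ({lower:.2f}, {upper:.2f})")
```

Because the new GPA of 3.0 is close to the sample mean, the slope term contributes very little; almost all of the interval width comes from the residual variance $s_e^2$.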