
Statistics II Final Exam - January 2012

Use the University stationery to give your answers to the following questions. Do not forget to write down your name and class group on each page. Indicate clearly the beginning and end of each question.

Exercises

1. (2 points) In a certain game, a good player is assumed to be one who scores more than 4 points per match. You have been following player A, who scored an average of 5 points per match in a large series of 100 matches, with a sample (quasi)variance of 3.94.

a) (0.5 points) Would you consider player A to be a good player at a 95 % confidence level?

b) (0.5 points) Suppose you also observed player B, whose p-value corresponding to the good-player test is 0.002. According to this evidence, whom would you consider a better player, A or B? Why?

c) (0.5 points) You have used Statgraphics to carry out a hypothesis test on the data for player A, with the following results:

    Hypothesis Tests
    Sample mean = 5,0
    Sample standard deviation = 1,98479
    Sample size = 100
    99,0% confidence interval for mean: 5,0 +/- 0,52129 [4,47871;5,52129]
    Null Hypothesis: mean = 4,5
    Alternative: not equal
    Computed t statistic = 2,51916
    P-Value = 0,[...]

Indicate the null and alternative hypotheses for this test. Would you reject the null hypothesis for a significance level of 1 %? Why?

d) (0.5 points) Suppose that the sample average and variance values for player A have been obtained from a small series of 5 matches (instead of 100). Can you reach any meaningful conclusion for the test about the goodness of A?
1) Yes, without any further assumptions.
2) Yes, but we need to make some distributional assumption about the scores.
3) No.

Let $X_i$ denote the points scored by the player in the $i$-th match, and $\bar X = (X_1 + \dots + X_n)/n$.

a) We have to test the null hypothesis $H_0: \mu \le 4$ vs. $H_1: \mu > 4$,

where $\mu$ denotes player A's average score per match. As $n = 100$, from the Central Limit Theorem we have that the test statistic satisfies, approximately,

$$Z = \frac{\bar X - \mu_0}{S/\sqrt{n}} \sim N(0, 1),$$

and we reject $H_0$ if $z_{obs} > z_{0.05} = 1.645$. In this case

$$z_{obs} = \frac{5 - 4}{\sqrt{3.94/100}} = 5.038,$$

so we reject the null hypothesis and conclude that player A can be considered a good one.

b) To compare the two players, we consider the corresponding p-values. For player A,

$$\text{p-value} = \Pr(Z > z_{obs}) = \Pr(Z > 5.038) \ll 0.002,$$

so the probability of obtaining scores at least as extreme as those of A under the null hypothesis is lower than that for B, and consequently player A seems to be much better than B.

c) The test carried out in this case is $H_0: \mu = 4.5$ vs. $H_1: \mu \ne 4.5$. We reject the null hypothesis for all significance levels larger than the p-value reported in the output, which lies between 0.01 and 0.05. In particular, for a significance level of 1 % we would not reject $H_0$, but we would reject it for 5 %.

d) Option 2. If the average score per match and its corresponding sample variance were obtained by observing only $n = 5$ matches, this sample size would not be enough to apply the Central Limit Theorem and we could not reach a meaningful decision, unless we compensated for the lack of information in such a small sample with an assumption on the probability distribution of the scores, such as considering that they follow a Normal distribution.

2. (2 points) You are conducting a study on the seasonal variations in the sales of shellfish in one of Madrid's districts. You have collected sales data from 20 fish markets in the district, corresponding to two days in two different periods of interest: December 20th (Christmastime) and April 17th (Spring); both days are Wednesdays.

The following table presents a summary of the shellfish sales income on each of the two days, as well as the value for the difference in sales income between both periods:

                               December     April        December - April
    Average sales              300 euros    180 euros    120 euros
    Quasi standard deviation   44 euros     29 euros     44 euros

Answer the following questions, indicating in each case any sample or population assumptions that you might need to make:

a) (1 point) Compute two confidence intervals for the average of the sales income in each of the two periods, for a confidence level of 99 %.

b) (1 point) For a significance level of 5 %, conduct a hypothesis test to determine if the average daily sales in December are at least 100 euros greater than the sales in April. Indicate the null and alternative hypotheses and justify your conclusion.

We define the variables of interest as $X$ = "shellfish sales on December 20" and $Y$ = "shellfish sales on April 17". As we only have information for the 20 fish markets in the district, we cannot assume that we have a large sample; as a consequence, we will need to assume that the population follows a normal distribution. We will also assume that the observations corresponding to $X$ and $Y$ for the 20 markets are simple random samples. These samples $(X, Y)$ are paired, as they have been obtained for the same markets on two different dates.
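As a rough check on the computations that follow from these assumptions, the sketch below evaluates the Student-t intervals and the paired-sample statistic in Python from the summary table of this exercise. The function names are hypothetical, and the quantiles $t_{19;0.005} = 2.861$ and $t_{19;0.05} = 1.73$ are hard-coded from a Student-t table rather than computed.

```python
import math


def t_confidence_interval(mean, quasi_sd, n, t_quantile):
    """Two-sided interval for a mean: mean +/- t * s / sqrt(n)."""
    half_width = t_quantile * quasi_sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def paired_t_statistic(mean_diff, sd_diff, n, d0):
    """t statistic for H0: mu_D <= d0 against H1: mu_D > d0."""
    return (mean_diff - d0) / (sd_diff / math.sqrt(n))


n = 20
T_19_0005 = 2.861  # assumed table value of t_{19;0.005}
T_19_005 = 1.73    # assumed table value of t_{19;0.05}

# 99 % intervals for the December and April average sales (euros)
ci_december = t_confidence_interval(300, 44, n, T_19_0005)
ci_april = t_confidence_interval(180, 29, n, T_19_0005)

# One-sided paired test on D = X - Y with d0 = 100 euros
t_obs = paired_t_statistic(120, 44, n, d0=100)
reject_h0 = t_obs > T_19_005

print(ci_december, ci_april, round(t_obs, 3), reject_h0)
```

Because the samples are paired, the test uses only the mean and quasi standard deviation of the differences, not the two separate standard deviations.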

a) The confidence intervals are given by

$$CI_{\mu_X}(99\,\%) = \bar x \pm t_{19;0.005}\,\frac{s_x}{\sqrt{n}} = 300 \pm 2.861 \cdot \frac{44}{\sqrt{20}} = (271.85;\ 328.15) \text{ in euros};$$

$$CI_{\mu_Y}(99\,\%) = \bar y \pm t_{19;0.005}\,\frac{s_y}{\sqrt{n}} = 180 \pm 2.861 \cdot \frac{29}{\sqrt{20}} = (161.45;\ 198.55) \text{ in euros}.$$

b) The null and alternative hypotheses for the test are

$$H_0: \mu_X - \mu_Y \le 100 \quad \text{vs.} \quad H_1: \mu_X - \mu_Y > 100,$$

and if we define $D = X - Y$,

$$H_0: \mu_D \le 100 \quad \text{vs.} \quad H_1: \mu_D > 100.$$

The value of the test statistic is

$$t = \frac{\bar d - d_0}{s_d/\sqrt{n}} = \frac{120 - 100}{44/\sqrt{20}} \approx 2.03.$$

As this statistic follows a Student-t distribution with $n - 1$ degrees of freedom, the rejection region is defined as those samples that have a value of the statistic larger than the quantile of the Student-t, $t_{19;0.05} = 1.73$, that is, $CR = \{t > 1.73\}$. As this condition is satisfied for our samples, we reject the null hypothesis for a significance level of 5 %; that is, we accept that the average difference in sales income between December and April in this district is larger than 100 euros.

3. (3 points) The sales department of a clothing company is conducting a study on the company's catalog sales. Their goal is to determine if there is a meaningful relationship between the number of phone lines open to receive orders ("Phone lines", L) and the volume of catalog sales ("Sales", S) (measured in hundreds of euros). The department has the following data on the values of these variables for the last 20 days (all sums run over $i = 1, \dots, 20$):

$$\sum l_i = 599, \quad \sum s_i = 2835, \quad \sum l_i s_i = 92000, \quad \sum l_i^2 = 19195, \quad \sum s_i^2 = [\ldots], \quad \sum e_i^2 = [\ldots],$$

where $e_i$ denotes the residuals of the regression model explaining the variable S as a function of L.

a) (0.5 points) Compute the ANOVA table for S.

b) (0.5 points) Test if the variable "Phone lines" has no impact on the values of the variable "Sales", for a significance level of 5 %.

c) (0.5 points) Compute the value of the coefficient of determination and interpret it.

d) (0.5 points) Obtain the least-squares estimates for the parameters of the regression line explaining the variable "Sales" (S) as a function of the values of the variable "Phone lines" (L).

e) (0.5 points) Obtain an estimate for the sales forecast corresponding to a day on which you have 12 open phone lines. Compute also a confidence interval at a 95 % level for this forecast.

f) (0.5 points) Additionally, you have information on the number of catalogs that have been distributed each day ("Number catalogs", C). You fit a multiple regression model including this new variable, and you obtain the following Statgraphics output:

    Multiple Regression - Sales
    Dependent variable: Sales
    Independent variables:
        Phone_lines
        Number_catalogs

                                      Standard     T
    Parameter          Estimate       Error        Statistic    P-Value
    CONSTANT           -99,269        69,8328      -1,[...]     0,1733
    Phone_lines        5,[...]        [...]        [...]        0,0001
    Number_catalogs    0,[...]        [...]        [...]        0,2822

Identify the values of the estimates for the parameters of the multiple linear regression model, and interpret the value of the coefficient of the variable "Phone lines" (L).

a) From the data we have been given we obtain $SSR = \sum e_i^2 = [\ldots]$, and also

$$SST = (n - 1)\,s_s^2 = \sum (s_i - \bar s)^2 = \sum s_i^2 - 20\,\bar s^2 = \sum s_i^2 - \Big(\sum s_i\Big)^2 / 20 = [\ldots].$$

Based on this information, the ANOVA table is given by:

    Source       Sum of squares    D.F.    Mean Squares    F-ratio
    Model        [...]             1       [...]           [...]
    Residuals    [...]             18      [...]
    Total        [...]             19

b) From the information in the ANOVA table, and in particular from the value of the F-ratio, we conduct a significance test for the model with critical region given by

$$CR_{0.05} = \{F > F_{1,18;0.05}\} = \{F > 4.41\}.$$

As the value of the ratio is in the critical region, we reject the null hypothesis and conclude that the value of the variable "Phone lines" is linearly related to that of the variable "Sales".

c) The coefficient of determination is given by

$$R^2 = \frac{SSE}{SST} = 0.704.$$

The value of the variable "Phone lines" explains 70.4 % of the variability in the variable "Sales".

d) We first compute some required values:

$$\bar l = \sum l_i / 20 = 29.95, \qquad \bar s = \sum s_i / 20 = 141.75,$$

$$s_l^2 = \Big(\sum l_i^2 - 20\,\bar l^2\Big)/19 = 66.05, \qquad s_s^2 = \Big(\sum s_i^2 - 20\,\bar s^2\Big)/19 = [\ldots],$$

$$\operatorname{cov}(l, s) = \Big(\sum l_i s_i - 20\,\bar l\,\bar s\Big)/19 = 373.25.$$
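The summary values of part d) can be reproduced from the sums given in the problem statement. A minimal Python sketch follows; the helper name is hypothetical, and it assumes the quasi-variance and covariance use the divisor $n - 1$, as in the formulas of this solution.

```python
def simple_regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Least-squares line y = b0 + b1 * x computed from raw sums."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    # Quasi-variance of x and quasi-covariance of (x, y), divisor n - 1
    var_x = (sum_x2 - n * mean_x ** 2) / (n - 1)
    cov_xy = (sum_xy - n * mean_x * mean_y) / (n - 1)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": var_x,
        "cov_xy": cov_xy,
        "slope": slope,
        "intercept": intercept,
    }


# Sums for Phone lines (l) and Sales (s) over the 20 days
fit = simple_regression_from_sums(
    n=20, sum_x=599, sum_y=2835, sum_xy=92000, sum_x2=19195
)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

With the data of this exercise the slope comes out near 5.65 and the intercept near -27.50.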

From these values we obtain

$$\hat\beta_1 = \frac{\operatorname{cov}(l, s)}{s_l^2} = 5.65, \qquad \hat\beta_0 = \bar s - \hat\beta_1 \bar l = -27.50,$$

and the regression model is $\hat s = -27.50 + 5.65\,l$. We also have that the residual variance is (see the ANOVA table)

$$s_R^2 = \sum e_i^2 / (n - 2) = [\ldots].$$

e) The point estimate for the forecast corresponding to $l_0 = 12$ is $\hat s_0 = -27.50 + 5.65\,l_0 \approx 40.3$. To obtain the confidence interval we use the formula

$$CI_{0.05} = \hat s_0 \pm t_{18;0.025} \sqrt{s_R^2 \left(1 + \frac{1}{n} + \frac{(l_0 - \bar l)^2}{(n - 1)\,s_l^2}\right)} = (-33.12;\ [\ldots]).$$

f) The multiple linear regression model of interest is

$$\hat s_i = \hat\beta_0 + \hat\beta_1 l_i + \hat\beta_2 c_i,$$

and the values of the parameters from the Statgraphics output are $\hat\beta_0 = -99.269$, $\hat\beta_1 = 5.[\ldots]$ and $\hat\beta_2 = 0.[\ldots]$, which are substituted into the model above. If we increase the number of open lines by one unit, while keeping constant the value of the variable "Number catalogs", the value of the sales increases by [...] euros on average.

Questions

1. (1 point) Determine if the following statements are true or false. Provide a brief justification for your answer.

a) (0.5 points) As a response to the current economic crisis, 15 countries have decided to apply a policy based on austerity measures, while another group of 15 countries have chosen to follow a policy based on the use of stimulus packages. You wish to use a statistical testing procedure to evaluate if the growth rates associated with each set of policies are significantly different. An appropriate hypothesis test is a two-sided test for paired samples.

b) (0.5 points) We are interested in studying if there is a significant difference between the salaries of men and women in the communications and services sectors. We have selected 100 companies in the communications sector and 100 companies in the services sector. For each company we collect information on a standardized indicator for the difference in salaries between men and women. An appropriate hypothesis test is a two-sided test for independent samples.

a) FALSE. We have no information to think that the countries included in both samples can be paired in any meaningful way for this study. It would be more reasonable in this case to consider the samples as independent.

b) TRUE. As in the preceding case, we do not have any information that might indicate that the companies included in both samples have any relationship. Thus, it seems reasonable in this case to treat both samples as independent.

2. (1 point) For a simple linear regression model $y = \beta_0 + \beta_1 x + u$, determine if the following statements are true or false. Provide a brief justification for your answer.

a) (0.5 points) If the variance of the errors is equal to 0, the coefficient of determination is also equal to 0.

b) (0.5 points) For the estimated linear regression model $\hat y_i = [\ldots] + 0.5\,x_i$, each additional unit of variable X implies a decrease of 3 units in the value of variable Y.

a) FALSE. If the variance of the errors is equal to 0, then the coefficient of determination is equal to 1: in that case $SSR = 0$ and

$$R^2 = \frac{SST - SSR}{SST} = \frac{SST}{SST} = 1.$$

b) FALSE. For each additional unit of X the variable Y has an increase equal to $\hat\beta_1$, that is, 0.5 units.

3. (1 point) Answer the following questions, using the information provided in the Statgraphics output.

    Simple Regression - Y vs. X
    Dependent variable: Y
    Independent variable: X
    Linear model: Y = a + b*X

    Coefficients
                 Least Squares    Standard     T
    Parameter    Estimate         Error        Statistic    P-Value
    Intercept    21,5885          2,[...]      [...]        0,0001
    Slope        -2,[...]         [...]        [...]        0,0150

    Analysis of Variance
    Source           Sum of Squares    Df    Mean Square    F-Ratio    P-Value
    Model            561,472           1     561,472        10,25      0,0150
    Residual         383,553           7     54,7933
    Total (Corr.)    945,025           8

    Correlation Coefficient = -0,[...]
    R-squared = 59,4134 percent
    R-squared (adjusted for d.f.) = 53,6154 percent
    Standard Error of Est. = 7,40225
    Mean absolute error = 4,99915
    Durbin-Watson statistic = 2,71064 (P=0,8750)
    Lag 1 residual autocorrelation = -0,[...]

a) (0.5 points) Specify the values of the estimates for the three parameters in the model.

b) (0.5 points) Is the independent variable significant to explain the values of the response variable? Why?

a) The estimated model is given by $\hat y_i = 21.5885 - 2.[\ldots]\,x_i$, with a residual variance $s_R^2$ equal to 54.7933 (from the ANOVA table).

b) To carry out this test we look at the p-value associated with the slope of the regression line, equal to 0.0150 (this same p-value is associated with the F-ratio in the ANOVA table). We conclude that for any significance level larger than this p-value ($\alpha > 0.0150$) we reject the null hypothesis $H_0: \beta_1 = 0$, and the independent variable x is significant to explain the values of the response variable y.
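The ANOVA quantities used in this answer are tied together by simple identities, which the following Python sketch checks using figures that survive in the Statgraphics output (total sum of squares, R-squared and the standard error of the estimate). The function name is hypothetical, and $n = 9$ is inferred from the 8 total degrees of freedom.

```python
import math


def anova_from_summary(sst, r_squared, n):
    """Rebuild the simple-regression ANOVA entries from SST, R^2 and n."""
    ss_model = r_squared * sst           # explained sum of squares
    ss_resid = sst - ss_model            # residual sum of squares
    residual_variance = ss_resid / (n - 2)
    f_ratio = ss_model / residual_variance  # model has 1 degree of freedom
    return ss_model, ss_resid, residual_variance, f_ratio


ss_model, ss_resid, residual_variance, f_ratio = anova_from_summary(
    sst=945.025, r_squared=0.594134, n=9
)

# The standard error of the estimate is the square root of s_R^2
standard_error = math.sqrt(residual_variance)

print(round(ss_model, 3), round(ss_resid, 3))
print(round(residual_variance, 4), round(standard_error, 4), round(f_ratio, 2))
```

In simple regression the F-ratio equals the square of the slope's t statistic, which is why both tests carry the same p-value.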

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Regression, least squares

Regression, least squares Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

More information

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

More information

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

STA 4163 Lecture 10: Practice Problems

STA 4163 Lecture 10: Practice Problems STA 463 Lecture 0: Practice Problems Problem.0: A study was conducted to determine whether a student's final grade in STA406 is linearly related to his or her performance on the MATH ability test before

More information

Statistics - Written Examination MEC Students - BOVISA

Statistics - Written Examination MEC Students - BOVISA Statistics - Written Examination MEC Students - BOVISA Prof.ssa A. Guglielmi 26.0.2 All rights reserved. Legal action will be taken against infringement. Reproduction is prohibited without prior consent.

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Construct a scatterplot for the given data. 2) x Answer:

Construct a scatterplot for the given data. 2) x Answer: Review for Test 5 STA 2023 spr 2014 Name Given the linear correlation coefficient r and the sample size n, determine the critical values of r and use your finding to state whether or not the given r represents

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

More information

Simple Regression and Correlation

Simple Regression and Correlation Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

Lecture 2: Simple Linear Regression

Lecture 2: Simple Linear Regression DMBA: Statistics Lecture 2: Simple Linear Regression Least Squares, SLR properties, Inference, and Forecasting Carlos Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching

More information

Null Hypothesis H 0. The null hypothesis (denoted by H 0

Null Hypothesis H 0. The null hypothesis (denoted by H 0 Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

STAT 350 Practice Final Exam Solution (Spring 2015)

STAT 350 Practice Final Exam Solution (Spring 2015) PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Econ 424/Amath 462 Hypothesis Testing in the CER Model

Econ 424/Amath 462 Hypothesis Testing in the CER Model Econ 424/Amath 462 Hypothesis Testing in the CER Model Eric Zivot July 23, 2013 Hypothesis Testing 1. Specify hypothesis to be tested 0 : null hypothesis versus. 1 : alternative hypothesis 2. Specify significance

More information

Section 1: Simple Linear Regression

Section 1: Simple Linear Regression Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480 1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

12: Analysis of Variance. Introduction

12: Analysis of Variance. Introduction 1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

More information

Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

More information

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Null Hypothesis Significance Testing Signifcance Level, Power, t-tests Spring 2014 Jeremy Orloff and Jonathan Bloom

Null Hypothesis Significance Testing Signifcance Level, Power, t-tests Spring 2014 Jeremy Orloff and Jonathan Bloom Null Hypothesis Significance Testing Signifcance Level, Power, t-tests 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom Simple and composite hypotheses Simple hypothesis: the sampling distribution is

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Power and Sample Size Determination

Power and Sample Size Determination Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 Power 1 / 31 Experimental Design To this point in the semester,

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

Hypothesis testing S2

Hypothesis testing S2 Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

DEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests

DEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also

Chapter 9. Section Correlation

Chapter 9. Section Correlation Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

17.0 Linear Regression

17.0 Linear Regression 17.0 Linear Regression 1 Answer Questions Lines Correlation Regression 17.1 Lines The algebraic equation for a line is Y = β 0 + β 1 X 2 The use of coordinate axes to show functional relationships was

AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

t-tests and F-tests in regression

t-tests and F-tests in regression t-tests and F-tests in regression Johan A. Elkink University College Dublin 5 April 2012 Johan A. Elkink (UCD) t and F-tests 5 April 2012 1 / 25 Outline 1 Simple linear regression Model Variance and R

Point Biserial Correlation Tests

Point Biserial Correlation Tests Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

ANOVA Analysis of Variance

ANOVA Analysis of Variance ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Open book and note Calculator OK Multiple Choice 1 point each MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Find the mean for the given sample data.

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2

Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2 Bivariate Analysis Variable 2 LEVELS >2 LEVELS CONTINUOUS Correlation Used when you measure two continuous variables. Variable 2 2 LEVELS X 2 >2 LEVELS X 2 CONTINUOUS t-test X 2 X 2 ANOVA (F-test) t-test ANOVA

A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

Yiming Peng, Department of Statistics. February 12, 2013

Yiming Peng, Department of Statistics. February 12, 2013 Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

Simple Methods and Procedures Used in Forecasting

Simple Methods and Procedures Used in Forecasting Simple Methods and Procedures Used in Forecasting The project prepared by : Sven Gingelmaier Michael Richter Under direction of the Maria Jadamus-Hacura What Is Forecasting? Prediction of future events

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Global Leadership MBA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

Instrumental Variables & 2SLS

Instrumental Variables & 2SLS Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20 - Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental

, has mean A) 0.3. B) the smaller of 0.8 and 0.5. C) 0.15. D) which cannot be determined without knowing the sample results.

, has mean A) 0.3. B) the smaller of 0.8 and 0.5. C) 0.15. D) which cannot be determined without knowing the sample results. BA 275 Review Problems - Week 9 (11/20/06-11/24/06) CD Lessons: 69, 70, 16-20 Textbook: pp. 520-528, 111-124, 133-141 An SRS of size 100 is taken from a population having proportion 0.8 of successes. An

Introduction to Stata

Introduction to Stata Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

Hypothesis Testing. Bluman Chapter 8

Hypothesis Testing. Bluman Chapter 8 CHAPTER 8 Learning Objectives CHAPTER EIGHT Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 χ² Test for

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

AP Statistics 2011 Scoring Guidelines

AP Statistics 2011 Scoring Guidelines AP Statistics 2011 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

Regression Analysis. Data Calculations Output

Regression Analysis. Data Calculations Output Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

Causal Forecasting Models

Causal Forecasting Models CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline. Correlation & Regression, III. Review. Relationship between r and regression Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

Randomized Block Analysis of Variance

Randomized Block Analysis of Variance Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of

Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year's sales records in Virginia Beach, Virginia. The following data were obtained,

Multiple Regression Analysis (ANCOVA)

Multiple Regression Analysis (ANCOVA) Chapter 16 Multiple Regression Analysis (ANCOVA) In many cases biologists are interested in comparing regression equations of two or more sets of regression data. In these cases, the interest is in whether

15.075 Exam 4. Instructor: Cynthia Rudin TA: Dimitrios Bisias. December 21, 2011

15.075 Exam 4. Instructor: Cynthia Rudin TA: Dimitrios Bisias. December 21, 2011 15.075 Exam 4 Instructor: Cynthia Rudin TA: Dimitrios Bisias December 21, 2011 Grading is based on demonstration of conceptual understanding, so you need to show all of your work. Problem 1 Choose Y or

Structure of the Data. Paired Samples. Overview. The data from a paired design can be tabulated in this form. Individual Y 1 Y 2 d i = Y 1 Y

Structure of the Data. Paired Samples. Overview. The data from a paired design can be tabulated in this form. Individual Y 1 Y 2 d i = Y 1 Y Structure of the Data Paired Samples Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 11th November 2005 The data from a paired design can be tabulated

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p. Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

Testing for serial correlation in linear panel-data models

Testing for serial correlation in linear panel-data models The Stata Journal (2003) 3, Number 2, pp. 168–177 Testing for serial correlation in linear panel-data models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear panel-data

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we've covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
