Lecture 2: Simple Linear Regression


1 DMBA: Statistics Lecture 2: Simple Linear Regression Least Squares, SLR properties, Inference, and Forecasting Carlos Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching 1
2 Today's Plan 1. The Least Squares Criterion 2. The Simple Linear Regression Model 3. Estimation for the SLR Model: sampling distributions, confidence intervals, hypothesis testing 2
3 Linear Prediction Ŷ i = b 0 + b 1 X i b 0 is the intercept and b 1 is the slope We find b 0 and b 1 using Least Squares 3
4 The Least Squares Criterion The formulas for b0 and b1 that minimize the least squares criterion are: b1 = corr(X, Y) · (sY / sX) and b0 = Ȳ − b1 X̄, where sY = sqrt( Σi (Yi − Ȳ)² / (n − 1) ) and sX = sqrt( Σi (Xi − X̄)² / (n − 1) ) are the sample standard deviations. 4
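As a quick numerical check of these formulas, here is a minimal Python sketch (the course itself works in Excel/R; the house-style data below are made up for illustration):

```python
import numpy as np

# Hypothetical data: house size (1000 sqft) and price ($1000s).
X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])

# b1 = corr(X, Y) * sY / sX  and  b0 = Ybar - b1 * Xbar
r = np.corrcoef(X, Y)[0, 1]
b1 = r * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
print(b0, b1)
```

The same (b0, b1) pair comes out of any standard least squares routine, e.g. `np.polyfit(X, Y, 1)`.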
5 Correlation and Covariance Review: covariance and correlation measure the direction and strength of the linear relationship between the variables Y and X: Cov(Y, X) = Σi (Yi − Ȳ)(Xi − X̄) / (n − 1) (Applied Regression Analysis, Carlos M. Carvalho) 5
6 Correlation and Covariance Correlation is the standardized covariance: corr(X, Y) = cov(X, Y) / sqrt( var(X) var(Y) ) = cov(X, Y) / (sd(X) sd(Y)) The correlation is scale invariant and the units of measurement don't matter: it is always true that −1 ≤ corr(X, Y) ≤ 1. This gives the direction (− or +) and strength (0 to 1) of the linear relationship between X and Y. 6
7 Correlation [Figure: four scatterplots illustrating linear relationships of varying correlation] 7
8 Correlation Correlation only measures linear relationships: corr(X, Y) = 0 does not mean the variables are not related! [Figure: scatterplots with strong nonlinear relationships; one has corr = 0.01] Also be careful with influential observations. 8
9 Back to Least Squares 1. Intercept: b0 = Ȳ − b1 X̄ ⇔ Ȳ = b0 + b1 X̄ The point (X̄, Ȳ) is on the regression line! Least squares finds the point of means and rotates the line through that point until it gets the right slope. 2. Slope: b1 = corr(X, Y) · (sY / sX) So, the right slope is the correlation coefficient times a scaling factor that ensures the proper units for b1. 9
10 More on Least Squares From now on, the terms fitted values (Ŷi) and residuals (ei) refer to those obtained from the least squares line. The fitted values and residuals have some special properties. Let's look at the housing data analysis to figure out what these properties are... 10
11 The Fitted Values and X [Figure: fitted values plotted against X; corr(ŷ, x) = 1] 11
12 The Residuals and X [Figure: residuals plotted against X; corr(e, x) = 0 and mean(e) = 0] 12
13 Why? What is the intuition for the relationship between Ŷ, e, and X? Let's consider some crazy alternative line: [Figure: scatterplot of Y vs. X showing the crazy line and the LS line] 13
14 Fitted Values and Residuals This is a bad fit! We are underestimating the value of small houses and overestimating the value of big houses. [Figure: residuals from the crazy line plotted against X; corr(e, x) = −0.7] Clearly, we have left some predictive ability on the table! 14
15 Fitted Values and Residuals As long as the correlation between e and X is nonzero, we could always adjust our prediction rule to do better. We need to exploit all of the predictive power in the X values and put this into Ŷ, leaving no "X-ness" in the residuals. In Summary: Y = Ŷ + e, where: Ŷ is made from X; corr(X, Ŷ) = 1. e is unrelated to X; corr(X, e) = 0. 15
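These two residual properties are easy to verify numerically. A minimal Python sketch (same made-up house data as used for illustration earlier):

```python
import numpy as np

# Hypothetical data: house size (1000 sqft) and price ($1000s).
X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()

Yhat = b0 + b1 * X   # fitted values: all of the "X-ness"
e = Y - Yhat         # residuals: what is left over

# Least squares guarantees mean(e) = 0 and corr(e, X) = 0,
# while the fitted values are perfectly correlated with X.
print(e.mean(), np.corrcoef(e, X)[0, 1], np.corrcoef(Yhat, X)[0, 1])
```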
16 Another way to derive things The intercept: (1/n) Σi ei = 0 ⇒ (1/n) Σi (Yi − b0 − b1 Xi) = 0 ⇒ Ȳ − b0 − b1 X̄ = 0 ⇒ b0 = Ȳ − b1 X̄ 16
17 Another way to derive things The slope: corr(e, X) = 0 ⇒ Σi ei (Xi − X̄) = 0 ⇒ Σi (Yi − b0 − b1 Xi)(Xi − X̄) = 0 ⇒ Σi (Yi − Ȳ − b1(Xi − X̄))(Xi − X̄) = 0 ⇒ b1 = Σi (Xi − X̄)(Yi − Ȳ) / Σi (Xi − X̄)² = r_xy · (sy / sx) 17
18 Decomposing the Variance How well does the least squares line explain variation in Y? Since Ŷ and e are uncorrelated (i.e., cov(Ŷ, e) = 0), var(Y) = var(Ŷ + e) = var(Ŷ) + var(e) This leads to Σi (Yi − Ȳ)² = Σi (Ŷi − Ȳ)² + Σi ei² 18
19 Decomposing the Variance: ANOVA Tables SST: Total variation in Y. SSR: Variation in Y explained by the regression line. SSE: Variation in Y that is left unexplained. SST = SSR + SSE, so SSR = SST means a perfect fit. Be careful with similar acronyms; some books use SSR for the residual sum of squares. 19
20 Decomposing the Variance: The ANOVA Tables [Figure: ANOVA table from the regression output] 20
21 A Goodness of Fit Measure: R² The coefficient of determination, denoted by R², measures goodness of fit: R² = SSR / SST = 1 − SSE / SST 0 ≤ R² ≤ 1. The closer R² is to 1, the better the fit. 21
22 A Goodness of Fit Measure: R² An interesting fact: R² = r²_xy (i.e., R² is the squared sample correlation). R² = Σi (Ŷi − Ȳ)² / Σi (Yi − Ȳ)² = Σi (b0 + b1 Xi − b0 − b1 X̄)² / Σi (Yi − Ȳ)² = b1² Σi (Xi − X̄)² / Σi (Yi − Ȳ)² = b1² s²x / s²y = r²_xy No surprise: the higher the sample correlation between X and Y, the better you are doing in your regression. 22
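Both the variance decomposition and the R² = r²_xy identity can be checked in a few lines. A sketch in Python with the same illustrative (made-up) data:

```python
import numpy as np

X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
Yhat = b0 + b1 * X
e = Y - Yhat

SST = ((Y - Y.mean()) ** 2).sum()     # total variation
SSR = ((Yhat - Y.mean()) ** 2).sum()  # explained by the line
SSE = (e ** 2).sum()                  # left unexplained

R2 = SSR / SST
print(R2)  # equals corr(X, Y) squared
```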
23 Back to the House Data [Figure: regression output for the house data] 23
24 Prediction and the Modelling Goal A prediction rule is any function where you input X and it outputs Ŷ as a predicted response at X. The least squares line is a prediction rule: Ŷ = f (X ) = b 0 + b 1 X 24
25 Prediction and the Modelling Goal Ŷ is not going to be a perfect prediction. We need to devise a notion of forecast accuracy. 25
26 Prediction and the Modelling Goal There are two things that we want to know: What value of Y can we expect for a given X? How sure are we about this forecast? Or, how different could Y be from what we expect? Our goal is to measure the accuracy of our forecasts, or how much uncertainty there is in the forecast. One method is to specify a range of Y values that are likely, given an X value. Prediction Interval: a probable range for Y values given X. 26
27 Prediction and the Modelling Goal Key Insight: To construct a prediction interval, we will have to assess the likely range of residual values corresponding to a Y value that has not yet been observed! We will build a probability model (e.g., normal distribution). Then we can say something like "with 95% probability the residuals will be no less than −$28,000 nor larger than $28,000." We must also acknowledge that the fitted line may be fooled by particular realizations of the residuals. 27
28 The Simple Linear Regression Model The power of statistical inference comes from the ability to make precise statements about the accuracy of the forecasts. In order to do this we must invest in a probability model. Simple Linear Regression Model: Y = β0 + β1 X + ε, ε ~ N(0, σ²) The error term ε is independent, idiosyncratic noise. 28
29 Independent Normal Additive Error Why do we have ε ~ N(0, σ²)? E[ε] = 0 ⇔ E[Y | X] = β0 + β1 X (E[Y | X] is the conditional expectation of Y given X). Many things are close to Normal (central limit theorem). MLE estimates for the β's are the same as the LS b's. It works! This is a very robust model for the world. We can think of β0 + β1 X as the "true" regression line. 29
30 The Regression Model and our House Data Think of E[Y | X] as the average price of houses with size X: some houses could have a higher than expected value, some lower, and the true line tells us what to expect on average. The error term represents the influence of factors other than X. 30
31 Conditional Distributions The conditional distribution for Y given X is Normal: Y | X ~ N(β0 + β1 X, σ²). [Figure: conditional densities around the line; σ controls dispersion] 31
32 Conditional vs. Marginal Distributions More on the conditional distribution: Y | X ~ N(E[Y | X], var(Y | X)). Mean is E[Y | X] = E[β0 + β1 X + ε] = β0 + β1 X. Variance is var(β0 + β1 X + ε) = var(ε) = σ². Remember our sliced boxplots: σ² < var(Y) if X and Y are related. 32
33 Prediction Intervals with the True Model You are told (without looking at the data) that β0 = 40, β1 = 45, σ = 10, and you are asked to predict the price of a 1500 square foot house. What do you know about Y from the model? Y = 40 + 45(1.5) + ε = 107.5 + ε Thus our prediction for price is Y ~ N(107.5, 10²) 33
34 Prediction Intervals with the True Model The model says that the mean value of a 1500 sq. ft. house is $107,500 and that the deviation from the mean is within about $20,000. We are 95% sure that −20 < ε < 20, so $87,500 < Y < $127,500 In general, the 95% Prediction Interval is PI = β0 + β1 X ± 2σ. 34
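This true-model interval is just arithmetic, so it can be sketched directly (using the slide's stated values β0 = 40, β1 = 45, σ = 10; price in $1000s, size in 1000 sqft):

```python
# True-model 95% prediction interval: beta0 + beta1 * Xf +/- 2 * sigma.
beta0, beta1, sigma = 40.0, 45.0, 10.0
Xf = 1.5  # a 1500 sq. ft. house

mean = beta0 + beta1 * Xf          # 107.5, i.e. $107,500
lo, hi = mean - 2 * sigma, mean + 2 * sigma
print(mean, lo, hi)  # 107.5 87.5 127.5
```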
35 Summary of Simple Linear Regression Assume that all observations are drawn from our regression model and that errors on those observations are independent. The model is Y i = β 0 + β 1 X i + ε i where ε is independent and identically distributed N(0, σ 2 ). The SLR has 3 basic parameters: β 0, β 1 (linear pattern) σ (variation around the line). 35
36 Key Characteristics of Linear Regression Model Mean of Y is linear in X. Error terms (deviations from line) are normally distributed (very few deviations are more than 2 sd away from the regression mean). Error terms have constant variance. 36
37 Break Back in 15 minutes... 37
38 Recall: Estimation for the SLR Model SLR assumes every observation in the dataset was generated by the model: Yi = β0 + β1 Xi + εi This is a model for the conditional distribution of Y given X. We use Least Squares to estimate β0 and β1: β̂1 = b1 = Σi (Xi − X̄)(Yi − Ȳ) / Σi (Xi − X̄)² β̂0 = b0 = Ȳ − b1 X̄ 38
39 Estimation for the SLR Model 39
40 Estimation of Error Variance Recall that εi ~ iid N(0, σ²), and that σ drives the width of the prediction intervals: σ² = var(εi) = E[(εi − E[εi])²] = E[εi²] A sensible strategy would be to estimate the average squared error with the sample average of the squared residuals: σ̂² = (1/n) Σi ei² 40
41 Estimation of Error Variance However, this is not an unbiased estimator of σ². We have to alter the denominator slightly: s² = (1/(n − 2)) Σi ei² = SSE / (n − 2) (2 is the number of regression coefficients, i.e., one each for β0 and β1). We have n − 2 degrees of freedom because 2 have been used up in the estimation of b0 and b1. We usually use s = sqrt( SSE / (n − 2) ), which is in the same units as Y. 41
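A short Python sketch of this estimator, again on the made-up illustrative data:

```python
import numpy as np

X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])
n = len(Y)

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)

SSE = (e ** 2).sum()
s2 = SSE / (n - 2)   # divide by n - 2, not n: two df used up by b0 and b1
s = np.sqrt(s2)
print(s)             # in the same units as Y
```

Note that s² is always larger than the naive estimate SSE / n, which is exactly the bias correction at work.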
42 Degrees of Freedom Degrees of Freedom is the number of times you get to observe useful information about the variance you're trying to estimate. For example, consider SST = Σi (Yi − Ȳ)²: If n = 1, Ȳ = Y1 and SST = 0: since Y1 is used up estimating the mean, we haven't observed any variability! For n > 1, we've only had n − 1 chances for deviation from the mean, and we estimate s²y = SST / (n − 1). In regression with p coefficients (e.g., p = 2 in SLR), you only get n − p real observations of variability ⇒ DoF = n − p. 42
43 Estimation of Error Variance Where is s in the Excel output? [Figure: Excel regression output with s highlighted] Remember that whenever you see "standard error," read it as estimated standard deviation: σ is the standard deviation. 43
44 Sampling Distribution of Least Squares Estimates How much do our estimates depend on the particular random sample that we happen to observe? Imagine: Randomly draw different samples of the same size. For each sample, compute the estimates b0, b1, and s. If the estimates don't vary much from sample to sample, then it doesn't matter which sample you happen to observe. If the estimates do vary a lot, then it matters which sample you happen to observe. 44
45 Sampling Distribution of Least Squares Estimates 45
46 Sampling Distribution of Least Squares Estimates 46
47 Sampling Distribution of Least Squares Estimates LS lines are much closer to the true line when n = 50. For n = 5, some lines are close, others aren't: we need to get lucky. 47
48 Review: Sampling Distribution of the Sample Mean Step back for a moment and consider the mean for an iid sample of n observations of a random variable {X1, ..., Xn}. Suppose that E(Xi) = µ and var(Xi) = σ². Then: E(X̄) = (1/n) Σi E(Xi) = µ var(X̄) = var( (1/n) Σi Xi ) = (1/n²) Σi var(Xi) = σ²/n If X is normal, then X̄ ~ N(µ, σ²/n). If X is not normal, we have the central limit theorem (more in a minute)! 48
49 Oracle vs SAP Example (understanding variation) 49
50 Oracle vs SAP 50
51 Oracle vs SAP Do you really believe that SAP affects ROE? How else could we look at this question? 51
52 Central Limit Theorem The simple CLT states that for iid random variables X with mean µ and variance σ², the distribution of the sample mean becomes normal as the number of observations n gets large. That is, X̄ ~ N(µ, σ²/n) approximately, and sample averages tend to be normally distributed in large samples. 52
53 Central Limit Theorem Exponential random variables don't look very normal: [Figure: exponential density f(x)] Here E[X] = 1 and var(X) = 1. 53
54 Central Limit Theorem [Figure: histogram of 1000 means from n = 2 samples; R: apply(matrix(rexp(2 * 1000), ncol = 1000), 2, mean)] 54
55 Central Limit Theorem [Figure: histogram of 1000 means from n = 5 samples; R: apply(matrix(rexp(5 * 1000), ncol = 1000), 2, mean)] 55
56 Central Limit Theorem [Figure: histogram of 1000 means from n = 10 samples; R: apply(matrix(rexp(10 * 1000), ncol = 1000), 2, mean)] 56
57 Central Limit Theorem [Figure: histogram of 1000 means from n = 100 samples; R: apply(matrix(rexp(100 * 1000), ncol = 1000), 2, mean)] 57
58 Central Limit Theorem [Figure: histogram of 1000 means from n = 1000 samples; R: apply(matrix(rexp(1000 * 1000), ncol = 1000), 2, mean)] 58
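The R one-liners behind these histograms can be mirrored in a short Python sketch (seed chosen arbitrarily for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 sample means of exponential(1) draws, for increasing sample sizes n.
for n in (2, 5, 10, 100, 1000):
    means = rng.exponential(1.0, size=(1000, n)).mean(axis=1)
    # The means stay centered at E[X] = 1, their spread shrinks like
    # 1/sqrt(n), and a histogram of `means` looks increasingly normal.
    print(n, round(means.mean(), 3), round(means.std(ddof=1), 3))
```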
59 Sampling Distribution of b1 The sampling distribution of b1 describes how the estimator b1 = β̂1 varies over different samples with the X values fixed. It turns out that b1 is normally distributed: b1 ~ N(β1, σ²_b1). b1 is unbiased: E[b1] = β1. The sampling sd σ_b1 determines the precision of b1. 59
60 Sampling Distribution of b1 Can we intuit what should be in the formula for σ_b1? How should σ figure in the formula? What about n? Anything else? var(b1) = σ² / Σi (Xi − X̄)² = σ² / ((n − 1) s²x) Three factors: sample size (n), error variance (σ² = σ²_ε), and X spread (sx). 60
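This formula is easy to check by simulation: hold the X design fixed, redraw the errors many times, and compare the spread of the resulting b1's with σ / sqrt( Σ (Xi − X̄)² ). A sketch with illustrative true values (the β's and σ below are assumptions, not estimates from any data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed "true" model for the simulation; X design held fixed.
beta0, beta1, sigma = 40.0, 45.0, 10.0
X = np.linspace(0.5, 3.0, 50)
Sxx = ((X - X.mean()) ** 2).sum()

b1_draws = np.empty(2000)
for i in range(2000):
    Y = beta0 + beta1 * X + rng.normal(0.0, sigma, X.size)
    b1_draws[i] = ((X - X.mean()) * (Y - Y.mean())).sum() / Sxx

theory_sd = sigma / np.sqrt(Sxx)  # sd(b1) = sigma / sqrt(sum (Xi - Xbar)^2)
print(b1_draws.mean(), b1_draws.std(ddof=1), theory_sd)
```

The simulated mean of the draws sits at β1 (unbiasedness) and their standard deviation matches the formula.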
61 Sampling Distribution of b0 The intercept is also normal and unbiased: b0 ~ N(β0, σ²_b0), where σ²_b0 = var(b0) = σ² (1/n + X̄² / ((n − 1) s²x)) What is the intuition here? 61
62 The Importance of Understanding Variation When estimating a quantity, it is vital to develop a notion of the precision of the estimation; for example: estimate the slope of the regression line estimate the value of a house given its size estimate the expected return on a portfolio estimate the value of a brand name estimate the damages from patent infringement Why is this important? We are making decisions based on estimates, and these may be very sensitive to the accuracy of the estimates! 62
63 The Importance of Understanding Variation Example from everyday life: when building a house, an estimate for a required piece of wood may only need to be accurate to 1/4 inch; when building a fine cabinet, the estimates may have to be accurate to 1/16 or even 1/32 of an inch. The standard deviations of the least squares estimators of the slope and intercept give a precise measurement of the accuracy of the estimator. However, these formulas aren't especially practical since they involve the unknown quantity σ. 63
64 Estimated Variance We estimate variation with "sample standard deviations": s_b1 = sqrt( s² / ((n − 1) s²x) ) s_b0 = sqrt( s² (1/n + X̄² / ((n − 1) s²x)) ) Recall that s = sqrt( Σi ei² / (n − 2) ) is the estimator for σ = σ_ε. Hence, s_b1 = σ̂_b1 and s_b0 = σ̂_b0 are the estimated coefficient sd's. A high level of info/precision/accuracy means small s_b values. 64
65 Normal and Student's t Recall what Student discovered: if θ̂ ~ N(µ, σ²), but you estimate σ² ≈ s² based on n − p degrees of freedom, then θ̂ ~ t_{n−p}(µ, s²). For example: Ȳ ~ t_{n−1}(µ, s²y / n) b0 ~ t_{n−2}(β0, s²_b0) and b1 ~ t_{n−2}(β1, s²_b1) The t distribution is just a fat-tailed version of the normal. As n − p → ∞, the tails get skinny and the t becomes normal. 65
66 Standardized Normal and Student's t We'll also usually standardize things: (bj − βj) / σ_bj ~ N(0, 1) ⇒ (bj − βj) / s_bj ~ t_{n−2}(0, 1) We use Z ~ N(0, 1) and Z_{n−p} ~ t_{n−p}(0, 1) to represent standard random variables. Notice that the t and normal distributions depend upon assumed values for βj: this forms the basis for confidence intervals, hypothesis testing, and p-values. 66
67 Testing and Confidence Intervals (in 3 slides) Suppose Z_{n−p} is distributed t_{n−p}(0, 1). A centered interval is P(−t_{n−p,α/2} < Z_{n−p} < t_{n−p,α/2}) = 1 − α 67
68 Confidence Intervals Since bj ~ t_{n−p}(βj, s²_bj), 1 − α = P(−t_{n−p,α/2} < (bj − βj)/s_bj < t_{n−p,α/2}) = P(bj − t_{n−p,α/2} s_bj < βj < bj + t_{n−p,α/2} s_bj) Thus (1 − α)·100% of the time, βj is within the Confidence Interval: bj ± t_{n−p,α/2} s_bj 68
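Putting the pieces together, a 95% CI for the slope can be sketched as follows (Python, on the made-up illustrative data; the critical value 2.571 is t_{5, 0.025} taken from a t table since n − 2 = 5 here):

```python
import numpy as np

X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])
n = len(Y)

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)

s = np.sqrt((e ** 2).sum() / (n - 2))             # residual standard error
s_b1 = np.sqrt(s ** 2 / ((n - 1) * X.var(ddof=1)))  # standard error of b1

t_crit = 2.571  # t_{n-2, 0.025} for n - 2 = 5 df, from a t table
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
print(ci)
```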
69 Testing Similarly, suppose that assuming bj ~ t_{n−p}(βj, s²_bj) for our sample bj leads to (recall Z_{n−p} ~ t_{n−p}(0, 1)) P(Z_{n−p} < −|bj − βj|/s_bj) + P(Z_{n−p} > |bj − βj|/s_bj) = ϕ Then the p-value is ϕ = 2P(Z_{n−p} > |bj − βj|/s_bj). You do this calculation for βj = βj⁰, an assumed null/safe value, and only reject βj⁰ if ϕ is too small (e.g., ϕ < 1/20). In regression, βj⁰ = 0 almost always. 69
70 More Detail... Confidence Intervals Why should we care about Confidence Intervals? The confidence interval captures the amount of information in the data about the parameter. The center of the interval tells you what your estimate is. The length of the interval tells you how sure you are about your estimate. 70
71 More Detail... Testing Suppose that we are interested in the slope parameter, β1. For example, is there any evidence in the data to support the existence of a relationship between X and Y? We can rephrase this in terms of competing hypotheses: H0: β1 = 0. Null/safe; implies no effect and we ignore X. H1: β1 ≠ 0. Alternative; leads us to our best guess β1 = b1. 71
72 Hypothesis Testing If we want statistical support for a certain claim about the data, we want that claim to be the alternative hypothesis. Our hypothesis test will either reject or not reject the null hypothesis (the default if our claim is not true). If the hypothesis test rejects the null hypothesis, we have statistical support for our claim! 72
73 Hypothesis Testing We use bj for our test about βj: Reject H0 when bj is far from βj⁰ (usually 0). Assume H0 when bj is close to βj⁰. An obvious tactic is to look at the difference bj − βj⁰, but this measure doesn't take into account the uncertainty in estimating bj. What we really care about is how many standard deviations bj is away from βj⁰. 73
74 Hypothesis Testing The t-statistic for this test is z_bj = (bj − βj⁰)/s_bj (= bj/s_bj for βj⁰ = 0). If H0 is true, this should be distributed z_bj ~ t_{n−p}(0, 1). Small |z_bj| leaves us happy with the null βj⁰. Large |z_bj| (i.e., bigger than about 2) should get us worried! 74
75 Hypothesis Testing We assess the size of z_bj with the p-value: ϕ = P(|Z_{n−p}| > |z_bj|) = 2P(Z_{n−p} > |z_bj|) (once again, Z_{n−p} ~ t_{n−p}(0, 1)). [Figure: t density with 8 df; the two shaded tails beyond ±z sum to a p-value of 0.05] 75
76 Hypothesis Testing The p-value is the probability, assuming that the null hypothesis is true, of seeing something more extreme (further from the null) than what we have observed. You can think of 1 − ϕ (the inverse p-value) as a measure of the distance between the data and the null hypothesis. In other words, 1 − ϕ is the strength of evidence against the null. 76
77 Hypothesis Testing The formal 2-step approach to hypothesis testing: Pick the significance level α (often 1/20 = 0.05), our acceptable risk (probability) of rejecting a true null hypothesis (we call this a type 1 error). This α plays the same role as α in CI's. Calculate the p-value, and reject H0 if ϕ < α (in favor of our best alternative guess, e.g., βj = bj). If ϕ > α, continue working under the null assumptions. This is equivalent to having the rejection region |z_bj| > t_{n−p,α/2}. 77
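The rejection-region form of the test (|z_bj| > t_{n−p,α/2}) is the easiest to sketch without a p-value routine. Here is the test of H0: β1 = 0 at α = 0.05 on the made-up illustrative data (2.571 is t_{5, 0.025} from a t table):

```python
import numpy as np

X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])
n = len(Y)

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)
s = np.sqrt((e ** 2).sum() / (n - 2))
s_b1 = np.sqrt(s ** 2 / ((n - 1) * X.var(ddof=1)))

# Test H0: beta1 = 0 against H1: beta1 != 0 at alpha = 0.05.
z_b1 = (b1 - 0.0) / s_b1       # t-statistic under the null
t_crit = 2.571                 # t_{n-2, alpha/2} with n - 2 = 5 df
reject = abs(z_b1) > t_crit
print(z_b1, reject)
```

For these strongly related toy data the statistic is far beyond the critical value, so the null of "no relationship" is rejected.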
78 Example: Hypothesis Testing Consider again a CAPM regression for the Windsor fund. Does Windsor have a nonzero intercept? (i.e., does it make/lose money independent of the market?) H0: β0 = 0, and there is no "free money." H1: β0 ≠ 0, and Windsor makes (or loses) money regardless of the market. 78
79 Hypothesis Testing: Windsor Fund Example [Figure: regression output for the Windsor regression, showing b0, s_b0, b1, and s_b1] It turns out that we reject the null at α = .05 (ϕ = .0105). Thus Windsor does have an "alpha" over the market. 79
80 Example: Hypothesis Testing Looking at the slope, this is a very rare case where the null hypothesis is not zero: H0: β1 = 1. Windsor is just the market (+ alpha). H1: β1 ≠ 1, and Windsor softens or exaggerates market moves. We are asking whether or not Windsor moves in a different way than the market (e.g., is it more conservative?). Now compute t = (b1 − 1)/s_b1 and compare it with t_{n−2,α/2} = t_{178,0.025} = 1.96: we reject H0 at the 5% level. 80
81 Forecasting The conditional forecasting problem: given a covariate value Xf and sample data {Xi, Yi}, i = 1, ..., n, predict the "future" observation Yf. The solution is to use our LS fitted value: Ŷf = b0 + b1 Xf. This is the easy bit. The hard (and very important!) part of forecasting is assessing uncertainty about our predictions. 81
82 Forecasting If we use Ŷf, our prediction error is ef = Yf − Ŷf = Yf − b0 − b1 Xf 82
83 Forecasting This can get quite complicated! A simple strategy is to build the following (1 − α)·100% prediction interval: b0 + b1 Xf ± t_{n−2,α/2} s A large predictive error variance (high uncertainty) comes from: Large s (i.e., large ε's). Small n (not enough data). Small sx (not enough observed spread in covariates). A large difference between Xf and X̄. Just remember that you are uncertain about b0 and b1! Reasonably inflating the uncertainty in the interval above is always a good idea... as always, this is problem dependent. 83
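The simple interval above can be sketched directly (Python, made-up illustrative data; Xf is an arbitrary new size and 2.571 is t_{5, 0.025} from a table). Note this deliberately ignores the extra uncertainty from estimating b0 and b1, which is exactly the caveat on the slide:

```python
import numpy as np

X = np.array([0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4])
Y = np.array([70.0, 83.0, 96.0, 110.0, 119.0, 133.0, 150.0])
n = len(Y)

b1 = np.corrcoef(X, Y)[0, 1] * Y.std(ddof=1) / X.std(ddof=1)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)
s = np.sqrt((e ** 2).sum() / (n - 2))

# Simple 95% prediction interval at a new covariate value X_f:
# b0 + b1 * Xf +/- t_{n-2, alpha/2} * s (estimation error in b0, b1 ignored).
Xf = 1.6
Yf_hat = b0 + b1 * Xf
t_crit = 2.571  # t_{5, 0.025}
pi = (Yf_hat - t_crit * s, Yf_hat + t_crit * s)
print(Yf_hat, pi)
```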
84 Forecasting For Xf far from our X̄, the space between the lines is magnified... [Figure: prediction bands widening as Xf moves away from X̄] 84
85 Glossary and Equations Ŷi = b0 + b1 Xi is the ith fitted value. ei = Yi − Ŷi is the ith residual. s: standard error of the regression residuals (estimates σ = σ_ε). s² = (1/(n − 2)) Σi ei² s_bj: standard error of the regression coefficients. s_b1 = sqrt( s² / ((n − 1) s²x) ) s_b0 = sqrt( s² (1/n + X̄² / ((n − 1) s²x)) ) 85
86 Glossary and Equations α is the significance level (probability of a type 1 error). t_{n−p,α/2} is the value such that for Z_{n−p} ~ t_{n−p}(0, 1), P(Z_{n−p} > t_{n−p,α/2}) = P(Z_{n−p} < −t_{n−p,α/2}) = α/2. z_bj ~ t_{n−p}(0, 1) is the standardized coefficient t-value: z_bj = (bj − βj⁰)/s_bj (= bj/s_bj most often) The (1 − α)·100% confidence interval for βj is bj ± t_{n−p,α/2} s_bj. 86
Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.
More informationOLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique  ie the estimator has the smallest variance
Lecture 5: Hypothesis Testing What we know now: OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique  ie the estimator has the smallest variance (if the GaussMarkov
More informationSection 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
More informationDEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests
DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationLecture 13 More on hypothesis testing
Lecture 13 More on hypothesis testing Thais Paiva STA 111  Summer 2013 Term II July 22, 2013 1 / 27 Thais Paiva STA 111  Summer 2013 Term II Lecture 13, 07/22/2013 Lecture Plan 1 Type I and type II error
More informationTwosample inference: Continuous data
Twosample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with twosample inference for continuous data As
More informationBivariate Regression Analysis. The beginning of many types of regression
Bivariate Regression Analysis The beginning of many types of regression TOPICS Beyond Correlation Forecasting Two points to estimate the slope Meeting the BLUE criterion The OLS method Purpose of Regression
More information, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (
Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationInferential Statistics
Inferential Statistics Sampling and the normal distribution Zscores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are
More informationLinear and Piecewise Linear Regressions
Tarigan Statistical Consulting & Coaching statisticalcoaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Handson Data Analysis
More informationfifty Fathoms Statistics Demonstrations for Deeper Understanding Tim Erickson
fifty Fathoms Statistics Demonstrations for Deeper Understanding Tim Erickson Contents What Are These Demos About? How to Use These Demos If This Is Your First Time Using Fathom Tutorial: An Extended Example
More informationSimple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating
More informationAnalysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More information17. SIMPLE LINEAR REGRESSION II
17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.
More informationHypothesis testing  Steps
Hypothesis testing  Steps Steps to do a twotailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =
More information2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
More informationWooldridge, Introductory Econometrics, 4th ed. Multiple regression analysis:
Wooldridge, Introductory Econometrics, 4th ed. Chapter 4: Inference Multiple regression analysis: We have discussed the conditions under which OLS estimators are unbiased, and derived the variances of
More information0.1 Multiple Regression Models
0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different
More information2. What are the theoretical and practical consequences of autocorrelation?
Lecture 10 Serial Correlation In this lecture, you will learn the following: 1. What is the nature of autocorrelation? 2. What are the theoretical and practical consequences of autocorrelation? 3. Since
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More informationLinear Regression with One Regressor
Linear Regression with One Regressor Michael Ash Lecture 10 Analogy to the Mean True parameter µ Y β 0 and β 1 Meaning Central tendency Intercept and slope E(Y ) E(Y X ) = β 0 + β 1 X Data Y i (X i, Y
More informationCovariance and Correlation
Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a databased relative frequency distribution by measures of location and spread, such
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 4448935 email:
More informationJoint Exam 1/P Sample Exam 1
Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, TTESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, TTESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationRegression Analysis: Basic Concepts
The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance
More informationThe Simple Linear Regression Model: Specification and Estimation
Chapter 3 The Simple Linear Regression Model: Specification and Estimation 3.1 An Economic Model Suppose that we are interested in studying the relationship between household income and expenditure on
More informationIn Chapter 2, we used linear regression to describe linear relationships. The setting for this is a
Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects
More informationStudy Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationThe Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information
Chapter 8 The Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information An important new development that we encounter in this chapter is using the F distribution to simultaneously
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3 Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two Means
Lesson : Comparison of Population Means Part c: Comparison of Two Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationChapter 8: Interval Estimates and Hypothesis Testing
Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8 Outline Clint s Assignment: Taking Stock Estimate Reliability: Interval Estimate Question o Normal Distribution versus the Student tdistribution:
More information17.0 Linear Regression
17.0 Linear Regression 1 Answer Questions Lines Correlation Regression 17.1 Lines The algebraic equation for a line is Y = β 0 + β 1 X 2 The use of coordinate axes to show functional relationships was
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationEstimation and Confidence Intervals
Estimation and Confidence Intervals Fall 2001 Professor Paul Glasserman B6014: Managerial Statistics 403 Uris Hall Properties of Point Estimates 1 We have already encountered two point estimators: th e
More informationBasic Statistcs Formula Sheet
Basic Statistcs Formula Sheet Steven W. ydick May 5, 0 This document is only intended to review basic concepts/formulas from an introduction to statistics course. Only meanbased procedures are reviewed,
More informationHow to Conduct a Hypothesis Test
How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some
More informationBasic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
More informationAn example ANOVA situation. 1Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters)
An example ANOVA situation Example (Treating Blisters) 1Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Subjects: 25 patients with blisters Treatments: Treatment A, Treatment
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationHypothesis Testing  II
3σ 2σ +σ +2σ +3σ Hypothesis Testing  II Lecture 9 0909.400.01 / 0909.400.02 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S 3σ 2σ +σ +2σ +3σ Review: Hypothesis
More information4. Introduction to Statistics
Statistics for Engineers 41 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation
More informationWeek TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480
1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500
More informationVariances and covariances
Chapter 4 Variances and covariances 4.1 Overview The expected value of a random variable gives a crude measure for the center of location of the distribution of that random variable. For instance, if the
More informationInference for Regression
Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about
More informationSolutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012.
Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG8). December 15, 12. 1. (3p) The joint distribution of the discrete random variables X and
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationSIMPLE REGRESSION ANALYSIS
SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two
More informationSTAT 350 Practice Final Exam Solution (Spring 2015)
PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects
More informationHeteroskedasticity and Weighted Least Squares
Econ 507. Econometric Analysis. Spring 2009 April 14, 2009 The Classical Linear Model: 1 Linearity: Y = Xβ + u. 2 Strict exogeneity: E(u) = 0 3 No Multicollinearity: ρ(x) = K. 4 No heteroskedasticity/
More informationThe Assumption(s) of Normality
The Assumption(s) of Normality Copyright 2000, 2011, J. Toby Mordkoff This is very complicated, so I ll provide two versions. At a minimum, you should know the short one. It would be great if you knew
More informationThe Philosophy of Hypothesis Testing, Questions and Answers 2006 Samuel L. Baker
HYPOTHESIS TESTING PHILOSOPHY 1 The Philosophy of Hypothesis Testing, Questions and Answers 2006 Samuel L. Baker Question: So I'm hypothesis testing. What's the hypothesis I'm testing? Answer: When you're
More informationInferences About Differences Between Means Edpsy 580
Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at UrbanaChampaign Inferences About Differences Between Means Slide
More information