Chapter 12. Simple Linear Regression and Correlation
1 Chapter 12. Simple Linear Regression and Correlation
12.1 The Simple Linear Regression Model
12.2 Fitting the Regression Line
12.3 Inferences on the Slope Parameter β1
12.4 Inferences on the Regression Line
12.5 Prediction Intervals for Future Response Values
12.6 The Analysis of Variance Table
12.7 Residual Analysis
12.8 Variable Transformations
12.9 Correlation Analysis
Supplementary Problems
NIPRL 1
2 12.1 The Simple Linear Regression Model: Model Definition and Assumptions (1/5). With the simple linear regression model y_i = β0 + β1 x_i + ε_i, the observed value of the dependent variable y_i is composed of a linear function β0 + β1 x_i of the explanatory variable x_i, together with an error term ε_i. The error terms ε_1, …, ε_n are generally taken to be independent observations from a N(0, σ²) distribution, for some error variance σ². This implies that the values y_1, …, y_n are observations from the independent random variables Y_i ~ N(β0 + β1 x_i, σ²), as illustrated in Figure 12.1.
3 Model Definition and Assumptions (2/5)
4 Model Definition and Assumptions (3/5): The parameter β0 is known as the intercept parameter, and the parameter β1 is known as the slope parameter. A third unknown parameter, the error variance σ², can also be estimated from the data set. As illustrated in Figure 12.2, the data values (x_i, y_i) lie closer to the line y = β0 + β1 x as the error variance σ² decreases.
5 Model Definition and Assumptions (4/5): The slope parameter β1 is of particular interest, since it indicates how the expected value of the dependent variable depends upon the explanatory variable x, as shown in Figure 12.3. The data set shown in Figure 12.4 exhibits a quadratic (or at least nonlinear) relationship between the two variables, and it would make no sense to fit a straight line to that data set.
6 Model Definition and Assumptions (5/5): Simple Linear Regression Model. The simple linear regression model y_i = β0 + β1 x_i + ε_i fits a straight line through a set of paired data observations (x_1, y_1), …, (x_n, y_n). The error terms ε_1, …, ε_n are taken to be independent observations from a N(0, σ²) distribution. The three unknown parameters, the intercept parameter β0, the slope parameter β1, and the error variance σ², are estimated from the data set.
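The assumptions above can be made concrete by simulating data from the model. This is a minimal Python sketch; the parameter values, the x grid, and the function name are invented for illustration:

```python
import random

# Draw one observation y_i = beta0 + beta1*x_i + eps_i for each x_i,
# with independent errors eps_i ~ N(0, sigma^2), as the model assumes.
def simulate_slr(beta0, beta1, sigma, xs, seed=0):
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = simulate_slr(0.4, 0.5, 0.3, xs)
print(len(ys))  # one simulated response per x value
```

Plotting ys against xs for several values of sigma reproduces the effect described above: the points cluster more tightly around the line y = β0 + β1 x as σ² decreases.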
7 Examples (1/2): Example 3: Car Plant Electricity Usage. The manager of a car plant wishes to investigate how the plant's electricity usage depends upon the plant's production. The linear model y = β0 + β1 x will allow a month's electrical usage to be estimated as a function of the month's production.
8 Examples (2/2)
9 12.2 Fitting the Regression Line: Parameter Estimation (1/4). The regression line y = β0 + β1 x is fitted to the data points (x_1, y_1), …, (x_n, y_n) by finding the line that is "closest" to the data points in some sense. As the figure illustrates, the fitted line is chosen to be the line that minimizes the sum of the squares of the vertical deviations

Q = Σ_{i=1}^{n} (y_i − (β0 + β1 x_i))²

and this is referred to as the least squares fit.
10 Parameter Estimation (2/4): With normally distributed error terms, β̂0 and β̂1 are maximum likelihood estimates. The joint density of the error terms ε_1, …, ε_n is

(1 / (2πσ²)^{n/2}) exp( −(1/(2σ²)) Σ_{i=1}^{n} ε_i² )

This likelihood is maximized by minimizing Σ ε_i² = Σ (y_i − (β0 + β1 x_i))² = Q. Setting the partial derivatives

∂Q/∂β0 = −2 Σ_{i=1}^{n} (y_i − (β0 + β1 x_i))  and  ∂Q/∂β1 = −2 Σ_{i=1}^{n} x_i (y_i − (β0 + β1 x_i))

equal to zero gives the normal equations

Σ_{i=1}^{n} y_i = n β̂0 + β̂1 Σ_{i=1}^{n} x_i  and  Σ_{i=1}^{n} x_i y_i = β̂0 Σ_{i=1}^{n} x_i + β̂1 Σ_{i=1}^{n} x_i²
11 Parameter Estimation (3/4):

β̂1 = (n Σ x_i y_i − (Σ x_i)(Σ y_i)) / (n Σ x_i² − (Σ x_i)²) = S_XY / S_XX

and then

β̂0 = Σ y_i / n − β̂1 Σ x_i / n = ȳ − β̂1 x̄

where

S_XX = Σ (x_i − x̄)² = Σ x_i² − n x̄² = Σ x_i² − (Σ x_i)² / n

and

S_XY = Σ (x_i − x̄)(y_i − ȳ) = Σ x_i y_i − n x̄ ȳ = Σ x_i y_i − (Σ x_i)(Σ y_i) / n

For a specific value x* of the explanatory variable, this equation provides a fitted value ŷ|x* = β̂0 + β̂1 x* for the dependent variable y, as illustrated in the accompanying figure.
12 Parameter Estimation (4/4): The error variance σ² can be estimated by considering the deviations between the observed data values y_i and their fitted values ŷ_i. Specifically, the sum of squares for error SSE is defined to be the sum of the squares of these deviations

SSE = Σ (y_i − ŷ_i)² = Σ (y_i − (β̂0 + β̂1 x_i))² = Σ y_i² − β̂0 Σ y_i − β̂1 Σ x_i y_i

and the estimate of the error variance is

σ̂² = SSE / (n − 2)
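The estimation formulas above translate directly into code. A minimal Python sketch (the function name and the noiseless test data are my own):

```python
# Least squares fit of y = b0 + b1*x via the S_XY / S_XX formulas,
# plus the error variance estimate SSE / (n - 2).
def fit_line(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx            # slope estimate
    b0 = ybar - b1 * xbar     # intercept estimate
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return b0, b1, sse / (n - 2)

# Noiseless data on the line y = 2x - 1 recovers the line exactly:
b0, b1, s2 = fit_line([1, 2, 3, 4], [1.0, 3.0, 5.0, 7.0])
print(b0, b1, s2)  # -1.0 2.0 0.0
```

With real (noisy) data s2 is positive, and it is the σ̂² used by all the inference procedures in the following sections.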
13 Examples (1/5): Example 3: Car Plant Electricity Usage. For this example n = 12 and

Σ x_i = x_1 + ⋯ + x_12 = 58.62,  Σ y_i = y_1 + ⋯ + y_12 = …,  Σ x_i² = …,  Σ y_i² = …,  Σ x_i y_i = (x_1 y_1) + ⋯ + (x_12 y_12) = …
14 Examples (2/5)
15 Examples (3/5): The estimates of the slope parameter and the intercept parameter are

β̂1 = (n Σ x_i y_i − (Σ x_i)(Σ y_i)) / (n Σ x_i² − (Σ x_i)²) ≈ 0.499

and

β̂0 = ȳ − β̂1 x̄ ≈ 0.409

The fitted regression line is y = β̂0 + β̂1 x = 0.409 + 0.499 x. For example, the fitted value at x = 5 is ŷ = 0.409 + (0.499 × 5) ≈ 2.90.
16 Examples (4/5): Using the model for production values x outside the range of the observed data is known as extrapolation and may give inaccurate results.
17 Examples (5/5): The estimate of the error variance is

σ̂² = (Σ y_i² − β̂0 Σ y_i − β̂1 Σ x_i y_i) / (n − 2) = … / 10 = …

so that σ̂ = …
18 12.3 Inferences on the Slope Parameter β1: Inference Procedures (1/4). Since

β̂1 ~ N(β1, σ² / S_XX)

a two-sided confidence interval with confidence level 1 − α for the slope parameter in a simple linear regression model is

β1 ∈ (β̂1 − t_{α/2, n−2} s.e.(β̂1), β̂1 + t_{α/2, n−2} s.e.(β̂1))

which is

β1 ∈ (β̂1 − σ̂ t_{α/2, n−2} / √S_XX, β̂1 + σ̂ t_{α/2, n−2} / √S_XX)

One-sided confidence intervals with confidence level 1 − α are

β1 ∈ (−∞, β̂1 + σ̂ t_{α, n−2} / √S_XX)  and  β1 ∈ (β̂1 − σ̂ t_{α, n−2} / √S_XX, ∞)
19 Inference Procedures (2/4): The two-sided hypotheses

H0: β1 = b1 versus HA: β1 ≠ b1

for a fixed value b1 of interest are tested with the t-statistic

t = (β̂1 − b1) / s.e.(β̂1) = (β̂1 − b1) √S_XX / σ̂

The p-value is p-value = 2 × P(X > |t|), where the random variable X has a t-distribution with n − 2 degrees of freedom. A size α test rejects the null hypothesis if |t| > t_{α/2, n−2}.
20 Inference Procedures (3/4): The one-sided hypotheses

H0: β1 ≥ b1 versus HA: β1 < b1

have a p-value p-value = P(X < t), and a size α test rejects the null hypothesis if t < −t_{α, n−2}. The one-sided hypotheses

H0: β1 ≤ b1 versus HA: β1 > b1

have a p-value p-value = P(X > t), and a size α test rejects the null hypothesis if t > t_{α, n−2}.
21 Inference Procedures (4/4): An interesting point to notice is that, for a fixed value of the error variance σ², the variance of the slope parameter estimate decreases as the value of S_XX increases. This happens as the values of the explanatory variable x_i become more spread out, as the accompanying figure illustrates. This result is intuitively reasonable, since a greater spread in the values x_i provides greater leverage for fitting the regression line, and therefore the slope parameter estimate β̂1 should be more accurate.
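The slope inference procedures above can be sketched as follows. The critical point t_crit must be supplied from a t-table; the 2.776 below is t_{0.025,4}, matching the invented 6-point data set:

```python
import math

# Confidence interval and t-statistic for the slope,
# using s.e.(b1) = sigma_hat / sqrt(S_XX).
def slope_inference(xs, ys, b_null, t_crit):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(sse / (n - 2))
    se_b1 = sigma_hat / math.sqrt(sxx)
    t_stat = (b1 - b_null) / se_b1      # tests H0: beta1 = b_null
    return (b1 - t_crit * se_b1, b1 + t_crit * se_b1), t_stat

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]   # invented: roughly y = 2x - 1
ci, t_stat = slope_inference(xs, ys, 0.0, 2.776)  # t_{0.025,4} = 2.776
```

A size 0.05 test of H0: β1 = 0 rejects when |t_stat| exceeds 2.776, which for this near-linear data it does by a wide margin.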
22 Examples (1/2): Example 3: Car Plant Electricity Usage. Here

S_XX = Σ x_i² − (Σ x_i)² / 12 = …

and

s.e.(β̂1) = σ̂ / √S_XX = …

The t-statistic for testing H0: β1 = 0 is

t = β̂1 / s.e.(β̂1) = 6.37

and the two-sided p-value is p-value = 2 × P(X > 6.37) ≈ 0.
23 Examples (2/2): With t_{0.005,10} = 3.169, a 99% two-sided confidence interval for the slope parameter is

β1 ∈ (β̂1 − 3.169 × s.e.(β̂1), β̂1 + 3.169 × s.e.(β̂1)) = (0.251, 0.747)
24 12.4 Inferences on the Regression Line: Inference Procedures (1/2). Inferences on the expected value of the dependent variable: a 1 − α confidence level two-sided confidence interval for β0 + β1 x*, the expected value of the dependent variable for a particular value x* of the explanatory variable, is

β0 + β1 x* ∈ (β̂0 + β̂1 x* − t_{α/2, n−2} s.e.(β̂0 + β̂1 x*), β̂0 + β̂1 x* + t_{α/2, n−2} s.e.(β̂0 + β̂1 x*))

where

s.e.(β̂0 + β̂1 x*) = σ̂ √(1/n + (x* − x̄)² / S_XX)
25 Inference Procedures (2/2): One-sided confidence intervals are

β0 + β1 x* ∈ (−∞, β̂0 + β̂1 x* + t_{α, n−2} s.e.(β̂0 + β̂1 x*))

and

β0 + β1 x* ∈ (β̂0 + β̂1 x* − t_{α, n−2} s.e.(β̂0 + β̂1 x*), ∞)

Hypothesis tests on β0 + β1 x* can be performed by comparing the t-statistic

t = ((β̂0 + β̂1 x*) − (β0 + β1 x*)) / s.e.(β̂0 + β̂1 x*)

with a t-distribution with n − 2 degrees of freedom.
26 Examples (1/2): Example 3: Car Plant Electricity Usage. Here

s.e.(β̂0 + β̂1 x*) = σ̂ √(1/12 + (x* − 4.885)² / S_XX)

With t_{0.025,10} = 2.228, a 95% confidence interval for β0 + β1 x* is

β̂0 + β̂1 x* ± 2.228 × s.e.(β̂0 + β̂1 x*)

At x* = 5 this gives

β0 + 5β1 ∈ (2.905 − 0.113, 2.905 + 0.113) = (2.79, 3.02)
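The standard error formula above can be sketched in Python. The data are invented, and the sketch also illustrates the leverage effect: the interval is narrowest at x̄ and widens as x* moves away from it.

```python
import math

# Confidence interval for the mean response b0 + b1*x_star, with
# s.e. = sigma_hat * sqrt(1/n + (x_star - xbar)^2 / S_XX).
def mean_response_ci(xs, ys, x_star, t_crit):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(sse / (n - 2))
    se = sigma_hat * math.sqrt(1.0 / n + (x_star - xbar) ** 2 / sxx)
    fit = b0 + b1 * x_star
    return fit - t_crit * se, fit + t_crit * se

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]                     # invented data
lo_mid, hi_mid = mean_response_ci(xs, ys, 3.5, 2.776)    # at xbar
lo_edge, hi_edge = mean_response_ci(xs, ys, 6.0, 2.776)  # away from xbar
# The interval at xbar is narrower than the interval at the edge.
```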
27 Examples (2/2)
28 12.5 Prediction Intervals for Future Response Values: Inference Procedures (1/2). A 1 − α confidence level two-sided prediction interval for y|x*, a future value of the dependent variable for a particular value x* of the explanatory variable, is

y|x* ∈ (β̂0 + β̂1 x* − t_{α/2, n−2} σ̂ √(1 + 1/n + (x* − x̄)² / S_XX), β̂0 + β̂1 x* + t_{α/2, n−2} σ̂ √(1 + 1/n + (x* − x̄)² / S_XX))
29 Inference Procedures (2/2): One-sided prediction intervals are

y|x* ∈ (−∞, β̂0 + β̂1 x* + t_{α, n−2} σ̂ √(1 + 1/n + (x* − x̄)² / S_XX))

and

y|x* ∈ (β̂0 + β̂1 x* − t_{α, n−2} σ̂ √(1 + 1/n + (x* − x̄)² / S_XX), ∞)
30 Examples (1/2): Example 3: Car Plant Electricity Usage. With t_{0.025,10} = 2.228, a 95% prediction interval at x* = 5 is

y|5 ∈ (2.905 − 0.401, 2.905 + 0.401) = (2.50, 3.30)
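A sketch of the prediction interval, using the same invented data as before; note the extra "1 +" term under the square root compared with the confidence interval for the mean response, which accounts for the new observation's own error:

```python
import math

# Prediction interval for a future response at x_star.
def prediction_interval(xs, ys, x_star, t_crit):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(sse / (n - 2))
    # "1 +" distinguishes this from the mean-response standard error
    se = sigma_hat * math.sqrt(1.0 + 1.0 / n + (x_star - xbar) ** 2 / sxx)
    fit = b0 + b1 * x_star
    return fit - t_crit * se, fit + t_crit * se

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]              # invented data
lo, hi = prediction_interval(xs, ys, 3.5, 2.776)  # t_{0.025,4} = 2.776
```

For any x* the prediction interval is strictly wider than the confidence interval for β0 + β1 x* at the same x*, exactly as in the (2.50, 3.30) versus (2.79, 3.02) comparison in the example.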
31 Examples (2/2)
32 12.6 The Analysis of Variance Table: Sum of Squares Decomposition (1/5)
33 Sum of Squares Decomposition (2/5)
34 Sum of Squares Decomposition (3/5):

Source       Degrees of freedom   Sum of squares   Mean squares          F-statistic   p-value
Regression   1                    SSR              MSR = SSR             F = MSR/MSE   P(F_{1,n−2} > F)
Error        n − 2                SSE              MSE = SSE/(n − 2)
Total        n − 1                SST

Figure: Analysis of variance table for simple linear regression analysis
35 Sum of Squares Decomposition (4/5)
36 Sum of Squares Decomposition (5/5): Coefficient of Determination R². The total variability in the dependent variable, the total sum of squares

SST = Σ_{i=1}^{n} (y_i − ȳ)²

can be partitioned into the variability explained by the regression line, the regression sum of squares

SSR = Σ_{i=1}^{n} (ŷ_i − ȳ)²

and the variability about the regression line, the error sum of squares

SSE = Σ_{i=1}^{n} (y_i − ŷ_i)²

The proportion of the total variability accounted for by the regression line is the coefficient of determination

R² = SSR/SST = 1 − SSE/SST = 1 / (1 + SSE/SSR)

which takes a value between zero and one.
37 Examples (1/1): Example 3: Car Plant Electricity Usage.

F = MSR/MSE = …  and  R² = SSR/SST = …
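The sum of squares decomposition and the statistics of the ANOVA table can be computed as follows (again with invented data; function and variable names are my own):

```python
# Sum of squares decomposition SST = SSR + SSE, the F-statistic,
# and the coefficient of determination R^2.
def anova_table(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    b0 = ybar - b1 * xbar
    fits = [b0 + b1 * x for x in xs]
    sst = sum((y - ybar) ** 2 for y in ys)           # total
    ssr = sum((f - ybar) ** 2 for f in fits)         # regression
    sse = sum((y - f) ** 2 for y, f in zip(ys, fits))  # error
    mse = sse / (n - 2)
    f_stat = ssr / mse        # MSR = SSR / 1
    r2 = ssr / sst
    return sst, ssr, sse, f_stat, r2

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]  # invented data
sst, ssr, sse, f_stat, r2 = anova_table(xs, ys)
```

For simple linear regression this F-statistic equals the square of the t-statistic for testing H0: β1 = 0, so the ANOVA F-test and the two-sided slope t-test are equivalent.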
38 12.7 Residual Analysis: Residual Analysis Methods (1/7). The residuals are defined to be

e_i = y_i − ŷ_i,  1 ≤ i ≤ n

so that they are the differences between the observed values of the dependent variable and the corresponding fitted values. A property of the residuals is

Σ_{i=1}^{n} e_i = 0

Residual analysis can be used to (i) identify data points that are outliers, (ii) check whether the fitted model is appropriate, (iii) check whether the error variance is constant, and (iv) check whether the error terms are normally distributed.
39 Residual Analysis Methods (2/7): A nice random scatter plot of the residuals, such as the one in the accompanying figure, indicates that there are no problems with the regression analysis. Any patterns in the residual plot, or any residuals with a large absolute value, alert the experimenter to possible problems with the fitted regression model.
40 Residual Analysis Methods (3/7): A data point (x_i, y_i) can be considered to be an outlier if it is not predicted well by the fitted model. Residuals of outliers have a large absolute value, as indicated in the accompanying figure. Note in the figure that e_i is used instead of ê_i. [For your interest only]

Var(e_i) = (1 − 1/n − (x_i − x̄)² / S_XX) σ²
41 Residual Analysis Methods (4/7): If the residual plot shows positive and negative residuals grouped together, as in Figure 12.47, then a linear model is not appropriate. As the figure indicates, a nonlinear model is needed for such a data set.
42 Residual Analysis Methods (5/7): If the residual plot shows a funnel shape, as in Figure 12.48, so that the size of the residuals depends upon the value of the explanatory variable x, then the assumption of a constant error variance σ² is not valid.
43 Residual Analysis Methods (6/7): A normal probability plot (a normal score plot) of the residuals checks whether the error terms ε_i appear to be normally distributed. The normal score of the i-th smallest residual is

Φ⁻¹( (i − 3/8) / (n + 1/4) )

If the main body of the points in the normal probability plot lies approximately on a straight line, the normality assumption is reasonable; a plot of a markedly different form indicates that the distribution is not normal.
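The residuals and the normal scores above can be computed as a sketch like this (invented data; Python 3.8+ for `statistics.NormalDist`):

```python
import statistics

# Residuals e_i = y_i - yhat_i from the least squares fit.
def residuals(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    b0 = ybar - b1 * xbar
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Normal score of the i-th smallest residual: Phi^{-1}((i - 3/8)/(n + 1/4)).
def normal_scores(n):
    phi_inv = statistics.NormalDist().inv_cdf
    return [phi_inv((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]  # invented data
res = residuals(xs, ys)
scores = normal_scores(len(res))
# Plotting sorted(res) against scores gives the normal probability plot.
```

The residuals sum to zero (a property stated above), and the scores are symmetric about zero, so a straight-line plot supports the normality assumption.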
44 Residual Analysis Methods (7/7)
45 Example: Nile River Flowrate. Examples (1/2)
46 Examples (2/2): At x = 3.88,

ŷ = … = 2.77,  e_i = y_i − ŷ_i = … − 2.77 = 1.24,  e_i / σ̂ = 1.24 / … = 3.75

At x = 6.13,

e_i = y_i − ŷ_i = 5.67 − (…) = 1.02,  e_i / σ̂ = 1.02 / … = 3.07
47 12.8 Variable Transformations: Intrinsically Linear Models (1/4)
48 Intrinsically Linear Models (2/4)
49 Intrinsically Linear Models (3/4)
50 Intrinsically Linear Models (4/4)
51 Examples (1/5): Example: Roadway Base Aggregates
52 Examples (2/5)
53 Examples (3/5)
54 Examples (4/5)
55 Examples (5/5)
56 12.9 Correlation Analysis: The Sample Correlation Coefficient. The sample correlation coefficient r for a set of paired data observations (x_i, y_i) is

r = S_XY / √(S_XX S_YY) = Σ (x_i − x̄)(y_i − ȳ) / √( Σ (x_i − x̄)² Σ (y_i − ȳ)² ) = (Σ x_i y_i − n x̄ ȳ) / √( (Σ x_i² − n x̄²)(Σ y_i² − n ȳ²) )

It measures the strength of linear association between two variables and can be thought of as an estimate of the correlation ρ between the two associated random variables X and Y.
57 Under the assumption that the X and Y random variables have a bivariate normal distribution, a test of the null hypothesis H0: ρ = 0 can be performed by comparing the t-statistic

t = r √(n − 2) / √(1 − r²)

with a t-distribution with n − 2 degrees of freedom. In a regression framework, this test is equivalent to testing H0: β1 = 0.
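A sketch of the correlation test (invented data again). The resulting t-statistic equals the slope t-statistic from Section 12.3, reflecting the equivalence just noted:

```python
import math

# Sample correlation r = S_XY / sqrt(S_XX * S_YY) and the t-statistic
# t = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0.
def correlation_test(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return r, t

xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]  # invented data
r, t = correlation_test(xs, ys)
```

Comparing |t| with t_{α/2, n−2} gives a size α two-sided test; note also that r² is exactly the coefficient of determination R² from Section 12.6.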
60 Example: Nile River Flowrate. Examples (1/1):

r = √R² = …
More informationScatter Plot, Correlation, and Regression on the TI-83/84
Scatter Plot, Correlation, and Regression on the TI-83/84 Summary: When you have a set of (x,y) data points and want to find the best equation to describe them, you are performing a regression. This page
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationtable to see that the probability is 0.8413. (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: 60 38 = 1.
Review Problems for Exam 3 Math 1040 1 1. Find the probability that a standard normal random variable is less than 2.37. Looking up 2.37 on the normal table, we see that the probability is 0.9911. 2. Find
More informationCorrelation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers
Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables
More informationSimple Linear Regression
STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationVertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th
Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th Standard 3: Data Analysis, Statistics, and Probability 6 th Prepared Graduates: 1. Solve problems and make decisions that depend on un
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationApplying Statistics Recommended by Regulatory Documents
Applying Statistics Recommended by Regulatory Documents Steven Walfish President, Statistical Outsourcing Services steven@statisticaloutsourcingservices.com 301-325 325-31293129 About the Speaker Mr. Steven
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationSection 3 Part 1. Relationships between two numerical variables
Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.
More informationSouth Carolina College- and Career-Ready (SCCCR) Probability and Statistics
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)
More informationSTT 200 LECTURE 1, SECTION 2,4 RECITATION 7 (10/16/2012)
STT 200 LECTURE 1, SECTION 2,4 RECITATION 7 (10/16/2012) TA: Zhen (Alan) Zhang zhangz19@stt.msu.edu Office hour: (C500 WH) 1:45 2:45PM Tuesday (office tel.: 432-3342) Help-room: (A102 WH) 11:20AM-12:30PM,
More informationBusiness Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
More informationOne-Way Analysis of Variance (ANOVA) Example Problem
One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means
More informationDEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More informationCase Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?
Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationThe correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
More informationLinear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples
More informationCommon Core Unit Summary Grades 6 to 8
Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations
More informationUsing R for Linear Regression
Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional
More informationSection A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More information