Simple Linear Regression
|
|
- Linette Warren
- 7 years ago
- Views:
Transcription
1 Simple Linear Regression Simple linear regression models the relationship between an independent variable (x) and a dependent variable (y) using an equation that expresses y as a linear function of x, plus an error term: y = a + bx + e y (dependent) a error: ε b x (independent) x is the independent variable y is the dependent variable b is the slope of the fitted line a is the intercept of the fitted line e is the error term
2 Fitting a Line to a Set of Points When we have a data set consisting of an independent and a dependent variable, and we plot these using a scatterplot, to construct our model between the relationship between the variables, we need to select a line that represents the relationship: y (dependent) x (independent) We can choose a line that fits best using a least squares method The least squares line is the line that minimizes the vertical distances between the points and the line, i.e. it minimizes the error term ε when it is considered for all points in the data set
3 Sampling and Regression II We usually operate using sampled data, and while we are building a model of the form: y = a + bx + e from our sample, in doing so we are attempting to estimate a true regression line, describing the relationship between independent variable (x) and dependent variable (y) for the entire population: y = α + βx + ε Multiple samples would yield several similar regression lines, which should approximate the population regression line
4 Least Squares Method The least squares method operates mathematically, minimizing the error term e for all points We can describe the line of best fit we will find using the equation ŷ = a + bx, and you ll recall that from a previous slide that the formula for our linear model was expressed using y = a + bx + e y ŷ We use the value ŷ on the line to estimate the true value, y (y - ŷ) The difference between the two is (y - ŷ) = e ŷ = a + bx This difference is positive for points above the line, and negative for points below it
5 Estimates and Residuals Our simple linear regression models take the form: y = a + bx + e which can alternatively be expressed as: ŷ = a + bx where ŷ is the estimate of y produced by the regression We can rearrange these equations to show: e = y ŷ The errors in the estimation of y using the regression equation are known are residuals, and express for any given value in the data set to what extent the regression line is either underestimating or overestimating the true value of y
6 Minimizing the Error Term In a linear model, the error in estimating the true value of the dependent variable y is expressed by the difference between the true value and the estimated value ŷ, e = (y - ŷ) (i.e. the residuals) Sometimes this difference will be positive (when the line underestimates the value of y) and sometimes it will be negative (when the line overestimates the value of y), because there will be points above and below the line If we were to simply sum these error terms, the positive and negative values would cancel out Instead, we can square the differences and then sum them up to create a useful estimate of the overall error
7 Error Sum of Squares By squaring the differences between y and ŷ, and summing these values for all points in the data set, we calculate the error sum of squares (usually denoted by SSE): n SSE = Σ (y - ŷ) 2 i = 1 The least squares method of selecting a line of best fit functions by finding the parameters of a line (intercept a and slope b) that minimizes the error sum of squares, i.e. it is known as the least squares method because it finds the line that makes the SSE as small as it can possibly be, minimizing the vertical distances between the line and the points
8 Finding Regression Coefficients The equations used to find the values for the slope (b) and intercept (a) of the line of best fit using the least squares method are: b = n Σ (x i - x) (y i -y) i = 1 a = y - bx n Σ (x i -x) 2 i = 1 Where: x i is the i th independent variable value y i is the i th dependent variable value x is the mean value of all the x i values y is the mean value of all the y i values
9 Coefficient of Determination (r 2 ) y ŷ y If we use y to estimate y, the error is (y - y) If we use ŷ to estimate y, the error is (y - ŷ) Thus, (ŷ - y) is the improvement in our model To account for the total improvement for the model, we can calculate this distance and sum it for all points in the data set, first taking the square of the difference (ŷ -y)
10 Coefficient of Determination (r 2 ) The regression sum of squares (SSR) expresses the improvement made in estimating y by using the regression line: n y ŷ y SSR = Σ (ŷ i -y) 2 i = 1 The total sum of squares (SST) expresses the overall variation between the values of y and their mean y: n SST = Σ (y i -y) 2 i = 1 The coefficient of determination (R 2 ) expresses the amount of variation in y explained by the regression line (the strength of the relationship): r 2 = SSR SST
11 Partitioning the Total Sum of Squares We can also think of regression as a way to partition the variation in the values of the dependent variable y We can take the total variation, and divide it into two components: The component explained by the regression line The component that remains unexplained We can characterize the total variability in y using the sum of the squared deviations of the y i values from their mean The total variability is expressed by the total sum of squares: n SST = Σ (y i -y) 2 i = 1
12 Partitioning the Total Sum of Squares We can decompose the total sum of squares into those two components: n n n SST = Σ (y i -y) 2 i = 1 In other words: SST = SSR + SSE and the coefficient of determination expresses the portion of the total variation in y explained by the regression line = Σ (ŷ i -y) 2 i = 1 SST + Σ (y i - ŷ) 2 i = 1 SSE y SSR ŷ y
13 Regression ANOVA Table We can create an analysis of variance table that allows us to display the sums of squares, their degrees of freedom, mean square values (for the regression and error sums of squares), and an F-statistic: Component Regression (SSR) Error (SSE) Total (SST) Sum of Squares n Σ (ŷ i -y) 2 i = 1 n Σ (y i - ŷ) 2 i = 1 n Σ (y i -y) 2 i = 1 df 1 n - 2 n - 1 Mean Square SSR / 1 SSE / (n - 2) F MSSR MSSE
14 Assumptions of Regression With any parametric statistical technique, there are certain assumptions that must be satisfied before we can properly apply simple linear regression: 1. The relationship between the dependent variable (y) and the independent variable (x) is linear. Stated more formally, what this means is an equation of the form y = α + βx + ε exists that describes the relationship between the two variables for the population we have sampled. If the relationship is not linear, we may have to transform one or both of the variables before we can safely use simple linear regression
15 Assumptions of Regression 2. The errors have a mean of zero and a constant variance, i.e. the errors need to distributed evenly on either side of the regression line, and the magnitude of their dispersion about the regression line has to be reasonably constant for all values of x. This is equivalent to stating that we can assume equal variances (homoscedasticity) for all values of x. If the variation in the errors is larger for some values of x than others (e.g. linearly increasing or decreasing as x increases), we may have a situation where a linear model is not appropriate because of some non-linearity in the data
16 Assumptions of Regression 3. The values of residuals must be independent of one another, meaning that the value of one error must not be affected by the value of another error. In a practical sense, this means that you do not want to find a pattern in the distribution of the residuals from a simple linear model, because such a pattern would indicate that the model is not effectively capturing some systematic aspect of the relationship between dependent variable (y) and independent variable (x), and this is often the result of another factor that (of course) cannot be accounted for by this sort of simple model
17 Assumptions of Regression 4. For each value of x, the errors should have a normal distribution centered about the regression: For smaller samples with few values of y for each value of x, this is difficult to demonstrate or check
18 Significance Tests for Regression Parameters In addition to evaluating the overall significance of a regression model by testing the r 2 value using an F-test, we can also test the significance of individual regression parameters using t-tests These t-tests have the regression parameter in some form in the numerator, and the standard error of the regression parameter in the denominator First, we must calculate the standard error of the estimate, also known as the standard deviation of the residuals (s e ): s e = n Σi = 1 (y i - ŷ) 2 (n - 2)
19 Significance Test for Regression Slope We can formulate a t-test to test the significance of the regression slope (b) We will be testing the null hypothesis that the true value of the slope is equal to zero, e.g. H 0 : β = 0, using the following t-test: s b = t test = where s b is the standard deviation of the slope parameter: b s b s e 2 (n - 1) s x 2 and degrees of freedom = (n - 2)
20 Hypothesis Testing - Significance Test for Regression Slope Example Research question: Is the true regression slope significantly different from zero? 1. H 0 : β = 0 (Population regression line slope is zero) 2. H A : β 0 (Pop. regression line slope is not zero) 3. Select α = 0.05, two-tailed because of the wording of H 0 4. In order to compute the t-test statistic, we need to first calculate the standard error of the estimate, which is conveniently equal to the square root of the mean sum of squares for the error term (MSSE): n (y i - ŷ) s 2 e = (n - 2) = MSSE = ~ 0.06 Σi = 1
21 Hypothesis Testing - Significance Test for Regression Slope Example 4. Cont. Using the standard error of the estimate and the standard deviation of the x values, we can now calculate the standard deviation of the slope, the quantity we need for the denominator of the t-test: s e s b = = = (n - 1) s 2 (9)( ) 2 x We can now calculate the t-test statistic: t test = b s b = =
22 Hypothesis Testing - Significance Test for Regression Slope Example 5. We now need to find the critical t-value, first calculating the degrees of freedom: df = (n - 2) = 8 We can now look up the t crit value for our α (0.05 in two tails) and df = 8, t crit = t test > t crit, therefore we reject H 0, and accept H A, finding that the true value of the regression slope β is not equal to zero, thus accumulating some evidence that the variable x is important in understanding y
23 Significance Test for Regression Intercept We can formulate a similar t-test to test the significance of the regression intercept (a) We will be testing the null hypothesis that the true value of the intercept is equal to zero, e.g. H 0 : α = 0, using the following t-test: t test = where s a is the standard deviation of the intercept: a s a s a = s e 2 Σx i 2 and degrees of freedom = (n - 2) nσ(x i -x) 2
24 Hypothesis Testing - Significance Test for Regression Intercept Example Research question: Is the true regression intercept significantly different from zero? 1. H 0 : α = 0 (Population regression line intercept is zero) 2. H A : α 0 (Pop. regression line intercept is not zero) 3. Select α = 0.05, two-tailed because of the wording of H 0 4. In order to compute the t-test statistic, we need to first calculate the standard error of the estimate, which is conveniently equal to the square root of the mean sum of squares for the error term (MSSE): n (y i - ŷ) s 2 e = (n - 2) = MSSE = ~ 0.06 Σi = 1
25 Hypothesis Testing - Significance Test for Regression Intercept Example 4. Cont. In addition to the standard error of the estimate, we must first calculate the sum of squared values of x, and the sum of the squared statistical distances of x before we can calculate the standard deviation of the intercept: TVDI (x) x^2 (x - xbar) (x - xbar)^ Mean Sum Sum 0.259
26 Hypothesis Testing - Significance Test for Regression Intercept Example 4. Cont. We can now compute the standard deviation of the intercept for the denominator of the t-test: s a = s e 2 Σx i 2 nσ(x i -x) 2 = (10)(0.259) = We can now calculate the t-test statistic: t test = a s a = = 9.734
27 Hypothesis Testing - Significance Test for Regression Intercept Example 5. We now need to find the critical t-value, first calculating the degrees of freedom: df = (n - 2) = 8 We can now look up the t crit value for our α (0.05 in two tails) and df = 8, t crit = t test > t crit, therefore we reject H 0, and accept H A, finding that the true value of the regression intercept α is not equal to zero
28 Simple Linear Regression in Excel Excel can calculate regression parameters in two ways: There are built-in functions that can be entered into a cell to specify the calculation of a regression slope or regression intercept: SLOPE(array1, array2) can be used to calculate the slope of the least squares regression line, specifying the y values in array1 and the x values in array2 INTERCEPT(array1, array2) can be used to calculate the intercept of the least squares regression line, specifying the y values in array1 and the x values in array2 There is also a Data Analysis Tool that can be used to calculate the regression parameters
29 Regression Analysis Tool In the Data Analysis window, select the appropriate tool: After clicking OK, you ll be presented with the tool window:
30 Regression Analysis Tool Of course, when specifying the input ranges, you must distinguish between the dependent variable (y) and the independent variable(s) (x); this tool can also be used for multiple linear regression, so more than one x variable can be used Checking boxes in the Residuals portion of the tool will produce other output including calculating the residuals for each value, calculating standardized residuals, and plotting residuals versus independent variables, and line fit plots as well The tool will automatically test the significance of the parameters at the 95% confidence level, but if you check the checkbox and specify another confidence level, it will test the significance of the regression parameters at that level of confidence as well
31 Regression Analysis Tool The basic output the tool produces includes: The coefficient of determination (r 2 ) The standard error of the estimate (e.g. the standard deviation of the residuals), s e An ANOVA table, including the minimum α where F would be significant The regression coefficients produced by the least squares optimization (in the simple case, like this one, the intercept and the slope) The standard error associated with each parameter (e.g. for the regression slope parameter, this is s b, the standard deviation of the slope) The t-statistic and the minimum α where each parameter would be significant
32 Regression Analysis Tool Checking the Residuals checkbox will produce a table of the regression estimates (the ŷ i values) and residuals: Residual Plots creates a scatter plot of the residuals versus x (this is useful for checking assumptions about the residuals): Line Fit Plots creates a scatter plot of the actual and predicted values versus x (this is useful for getting a visual sense of the accuracy of the estimates):
2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationSection 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
More informationOne-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups
One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The
More informationCHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationOne-Way Analysis of Variance (ANOVA) Example Problem
One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationSPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More information2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationPOLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.
Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationMultiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
More informationHow To Run Statistical Tests in Excel
How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationWEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6
WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in
More informationMean = (sum of the values / the number of the value) if probabilities are equal
Population Mean Mean = (sum of the values / the number of the value) if probabilities are equal Compute the population mean Population/Sample mean: 1. Collect the data 2. sum all the values in the population/sample.
More informationUsing R for Linear Regression
Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional
More informationUNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)
UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More informationSimple Linear Regression, Scatterplots, and Bivariate Correlation
1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.
More informationExercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More informationDATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
More informationCourse Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed
More informationInternational Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
More informationCausal Forecasting Models
CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More information2013 MBA Jump Start Program. Statistics Module Part 3
2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just
More informationBusiness Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
More informationCourse Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics
Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This
More informationRegression and Correlation
Regression and Correlation Topics Covered: Dependent and independent variables. Scatter diagram. Correlation coefficient. Linear Regression line. by Dr.I.Namestnikova 1 Introduction Regression analysis
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More information17. SIMPLE LINEAR REGRESSION II
17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.
More informationINTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationChapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:
More informationOne-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate
1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,
More informationLinear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples
More informationCOMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk
COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution
More informationSection 1: Simple Linear Regression
Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationWeek TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480
1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500
More informationComparing Nested Models
Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationChapter 23. Inferences for Regression
Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily
More informationUsing Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
More informationIntroduction to Linear Regression
14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression
More informationBiostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY
Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationChapter 9 Descriptive Statistics for Bivariate Data
9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)
More informationThis unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
More information12: Analysis of Variance. Introduction
1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationSection 1.1 Linear Equations: Slope and Equations of Lines
Section. Linear Equations: Slope and Equations of Lines Slope The measure of the steepness of a line is called the slope of the line. It is the amount of change in y, the rise, divided by the amount of
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationFactors affecting online sales
Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationIntroduction to Linear Regression
14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression
More informationPoint Biserial Correlation Tests
Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationThis chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
More informationCopyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5
Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression
More informationMULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM. R, analysis of variance, Student test, multivariate analysis
Journal of tourism [No. 8] MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM Assistant Ph.D. Erika KULCSÁR Babeş Bolyai University of Cluj Napoca, Romania Abstract This paper analysis
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationThe correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
More informationMULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)
MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part
More informationABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
More informationHomework 11. Part 1. Name: Score: / null
Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is
More informationDealing with Data in Excel 2010
Dealing with Data in Excel 2010 Excel provides the ability to do computations and graphing of data. Here we provide the basics and some advanced capabilities available in Excel that are useful for dealing
More informationKSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
More informationRegression step-by-step using Microsoft Excel
Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
More informationBill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1
Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce
More informationExample: Boats and Manatees
Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More informationLin s Concordance Correlation Coefficient
NSS Statistical Software NSS.com hapter 30 Lin s oncordance orrelation oefficient Introduction This procedure calculates Lin s concordance correlation coefficient ( ) from a set of bivariate data. The
More informationSTAT 350 Practice Final Exam Solution (Spring 2015)
PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects
More informationClass 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More informationThe Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Module 7 Test Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. You are given information about a straight line. Use two points to graph the equation.
More informationLeast Squares Regression. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.
Least Squares Regression Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.edu Spring 2006 R Notes 1 Copyright c 2006 Alan T. Arnholt 2 Least Squares
More informationTwo-Sample T-Tests Assuming Equal Variance (Enter Means)
Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of
More informationCase Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?
Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention
More information2 Sample t-test (unequal sample sizes and unequal variances)
Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing
More informationRandomized Block Analysis of Variance
Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationIntroduction to Analysis of Variance (ANOVA) Limitations of the t-test
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only
More information