Chapter 8. Simple Linear Regression Analysis (Part-I)

In Chapter 7, we looked at whether two categorical variables were dependent on each other. You can think of dependence as being similar to a relationship, i.e., if one variable changes, the other also changes. In this chapter we will look at relationships between two numeric variables instead of two categorical variables. For example, we will be concerned with questions like these:

Is there a relationship between mortgage interest rates and demand for housing? You might guess that as interest rates decrease, the affordability of a house increases and therefore demand for housing increases.

Is there a relationship between new housing demand and demand for furniture? You might guess that as more new houses are built, there will be need for more furniture and hence demand for furniture will rise.

Is there a relationship between advertising expense and sales? Clearly, the more you advertise, the more likely you are to sell.

In all these examples, we noticed that as one variable increased (or decreased), the other either increased or decreased.

In business decision making, it helps if the decision maker knows the relationships between some variables over which he or she has control. For example, if I am the marketing manager, it will help me in my decisions if I know by how much sales revenue will increase if I spend an extra $1,000 in advertising. After all, if sales only increase by $500 when I spend an extra $1,000 in advertising, then I will be better off not spending the extra money in advertising. In regression analysis we try to learn underlying relationships between two (or more) numerical variables.

Before we get into regression analysis, let us first learn two very important terms that have to do with relationships between two numeric variables. The two terms are Covariance and Correlation. Both these are essentially measures of relationship between two numeric variables.
Just like the mean is a measure of central tendency and standard deviation is a measure of variation, covariance and correlation are measures of relationships between two variables. If we have some data involving two variables, we can easily calculate these measures. I say we can easily calculate these measures because Excel makes it easy. In the old days (when I was first learning Statistics), things were different. Covariance and correlation actually involve huge formulas and my Statistics teacher could not claim that they could be easily calculated. But as far as we are concerned we just need to know the name of the Excel function to calculate these two measures. The Excel function for covariance is =COVAR() and for correlation, it is =CORREL(). Computing these measures becomes as simple as computing the mean or the standard deviation.

Let's look at an example in Excel. Let's say we have the following data on advertising expense (in thousands of dollars) and sales (in millions of dollars).

Figure 1: Data on Advertising vs. Sales (columns: Advertising Expense, in thousands; Sales, in millions)

Using the =COVAR() and =CORREL() functions we can easily calculate the covariance and correlation. Please see Figure 2.
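
For readers who want to see what the two Excel functions compute, here is a minimal Python sketch of the formulas behind them. The data values are hypothetical (they are not the values from Figure 1), and the function names are my own. The covariance here divides by n, which is the population version that =COVAR() returns; the correlation is the same whichever version of covariance is used.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: the average product of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Correlation: covariance divided by the product of the standard deviations."""
    sx = sqrt(covariance(xs, xs))  # the covariance of a variable with itself is its variance
    sy = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sx * sy)

# Hypothetical data, NOT the values from Figure 1.
advertising = [10, 20, 30, 40, 50]  # thousands of dollars
sales = [11, 17, 20, 28, 30]        # millions of dollars

cov = covariance(advertising, sales)
r = correlation(advertising, sales)
print("covariance:", cov)
print("correlation:", round(r, 4))
```

Both numbers come out positive for this made-up data, which is the situation discussed next.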

Figure 2: the COVAR and CORREL functions of Excel

In Figure 2, you can see the =COVAR() and =CORREL() functions in action. Let's discuss these measures now. Note that both covariance and correlation in Figure 2 are positive. Remember this - whenever covariance is positive, correlation is also positive. Similarly, whenever covariance is negative, correlation is also negative. A positive value of covariance (and hence of correlation) signifies that the relationship is positive, which means that if the value of one variable increases, the value of the other also increases. Similarly, a negative value of covariance (and hence of correlation) signifies a negative relationship, i.e., when one variable increases, the other decreases, or vice versa. Note that if one value decreases and the other also decreases, we still have a positive relationship. To have a negative relationship, the two must move in opposite directions: if one value increases, the other must decrease.

What about the magnitudes of covariance and correlation? In our example, the covariance is about 151.94 and the correlation is 0.96. What information do these magnitudes convey? The value of covariance really depends on the units of measurement chosen. In our example, for instance, we are displaying advertising expense in thousands and sales in millions. Suppose we were displaying both advertising expense and sales in thousands; then the Sales column would have values like 11,000, 17,000 and so on. Consequently, the covariance value will become much larger. I want you to try this out. Try replacing the values in the Sales column so they are displayed in thousands. You will find that the covariance becomes 151,944.44, but the correlation stays the same at 0.96.

So the lesson to be learnt is this: the value of covariance depends on the units of measurement chosen for the variables, but the value of correlation does not. The magnitude of covariance doesn't tell us much. But the magnitude of correlation tells us a lot. How do we interpret the magnitude of the correlation coefficient?
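
The unit experiment described above can also be tried outside Excel. The following is a small sketch using hypothetical data (not the Figure 1 values) and helper names of my own choosing; it converts sales from millions to thousands and compares the two measures before and after.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance (divides by n)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data, NOT the values from Figure 1.
advertising = [10, 20, 30, 40, 50]                     # thousands of dollars
sales_millions = [11, 17, 20, 28, 30]                  # millions of dollars
sales_thousands = [s * 1000 for s in sales_millions]   # same sales, in thousands

cov_millions = covariance(advertising, sales_millions)
cov_thousands = covariance(advertising, sales_thousands)
r_millions = correlation(advertising, sales_millions)
r_thousands = correlation(advertising, sales_thousands)

print("covariance, sales in millions: ", cov_millions)
print("covariance, sales in thousands:", cov_thousands)
print("correlation, sales in millions: ", round(r_millions, 4))
print("correlation, sales in thousands:", round(r_thousands, 4))
```

The covariance grows by the same factor of 1,000 as the sales figures, while the correlation is unchanged. This is the same pattern as the jump from about 151.94 to 151,944.44 in the text.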
Note that I used the term correlation coefficient this time. Some people like to say correlation coefficient while others just like to say correlation. It is just as OK to say that the correlation between these two variables is 0.96 as it is to say that the correlation coefficient between these two variables is 0.96.

So what does a value of 0.96 tell us? I will tell you that the correlation coefficient varies between -1 and +1. If your correlation coefficient ever comes out to be either less than -1 or greater than +1, please know that you made a mistake in your calculations. Making mistakes in calculations used to be fairly common in the pre-Excel days. With Excel, as long as you specify your ranges correctly, there is no chance of making a calculation error. At any rate, you should know that the correlation coefficient can never exceed +1 and can never be less than -1.

A correlation coefficient of +1 or -1 implies a perfect correlation. It is rare to find two variables in business or social sciences that give a perfect correlation. In the physical sciences you can find examples of perfect correlation. For example, the area of a square of some width and the area of a circle whose radius equals the width of the square have a perfect correlation of +1.

A correlation close to +1, such as 0.96 in our example, is considered a very strong correlation. It's a step below perfect correlation, but any correlation coefficient value of 0.9 or above is considered very strong. A correlation coefficient of 0.7 to 0.9 is considered strong. A correlation value close to 0, such as less than 0.1, is considered a very weak correlation.

What about a correlation value of -0.9 or below? Any value close to -1 is also considered very strong. The relationship just happens to be negative, i.e., when one variable decreases, the other increases, or vice versa. A negative relationship is not the same as a weak relationship. A correlation coefficient close to -1 implies a very strong negative relationship. Note that these cutoffs of 0.1 or 0.7 or 0.9 that I am using are fairly arbitrary. You really have to look at the context of the problem to evaluate whether the relationship is strong or weak.

Now that we understand covariance and correlation, we are almost ready to learn about regression analysis. But before we jump into regression analysis, let me talk about scatter plots.

Scatter plots:

A scatter plot is nothing but a graph of the data points of the two numerical variables that we are interested in. Excel makes it very easy to draw scatter plots. For the data in Figure 1 (or 2), the scatter plot is shown in Figure 3.

Figure 3: A scatter plot of Advertising expense vs. Sales

A scatter plot is basically a visual aid to give us an insight into the relationship between two variables.
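
The square-and-circle example of perfect correlation mentioned earlier on this page can be checked numerically. This is only a sketch: the list of widths is an arbitrary choice of mine, and the second series (100 minus three times the square's area) is a made-up quantity included just to show a perfect negative correlation.

```python
from math import pi, sqrt

def covariance(xs, ys):
    """Population covariance (divides by n)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

widths = [1, 2, 3, 4, 5]                         # arbitrary widths
square_areas = [w ** 2 for w in widths]          # area of a square of width w
circle_areas = [pi * w ** 2 for w in widths]     # area of a circle of radius w

# The circle's area is always pi times the square's area, so the points
# (square area, circle area) fall exactly on a rising straight line.
r_up = correlation(square_areas, circle_areas)

# A made-up quantity that falls exactly on a descending straight line.
shrinking = [100 - 3 * a for a in square_areas]
r_down = correlation(square_areas, shrinking)

print(round(r_up, 6), round(r_down, 6))
```

Apart from tiny rounding effects in the last decimal places, the two results are +1 and -1, the two ends of the range the correlation coefficient can take.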
Just by looking at this graph, you can tell that there is a strong positive relationship between advertising expense and sales because the points show an upward trend. Had the points been tending downwards, you could tell that the relationship was negative. Also, you can imagine a line running through these points. An imaginary line running through these points is called a trendline. In Excel, you can draw a trendline in a snap. Just right click on one of the points on the scatter plot and, amongst the options that show up, choose Add Trendline. Figure 4 shows a trendline added to the scatter plot of Figure 3.

Figure 4: Trendline on a scatter plot

If the points on the scatter plot are very close around the trendline, we can say that the relationship is strong. The closer the points get to the trendline, the stronger the relationship gets and the higher the magnitude of the correlation coefficient. In Figure 4, you can see that all the points are very close to the trendline, which is why the correlation coefficient is very high (0.96). If all the points lie exactly on an upward-sloping trendline, then you will get a perfect correlation of 1.0 (and -1.0 if the trendline slopes downward).

So, the slope of the trendline tells us if the relationship is positive or negative. If the slope is positive, the relationship is positive. If the slope is negative, the relationship is negative. If the slope is zero, i.e., the trendline is horizontal, there is no relationship. Think about the last statement about no relationship. When will you have a horizontal trendline in our example? When the sales are constant no matter what the advertising expense is, we will have a horizontal trendline. Clearly, if sales are not going to change as we change advertising expense, it would be an indication of no relationship.

How close the points are to the trendline gives us an indication of how strong the relationship is. The slope of the line does not have to be very high for the relationship to be considered strong. You can have a small slope, but as long as the points on the scatter plot are gathered tightly around the trendline, you will get a high value of correlation. You may have a high value of slope, but if the points are spread far from the trendline, the correlation will be small and the relationship will be considered weak. For example, look at Figure 5. Notice that the slopes of the trendlines in Figures 4 and 5 are about the same, but in Figure 5, the points are more scattered around the trendline than in Figure 4.
Notice that as a result of this scattering, the correlation coefficient has fallen to 0.71 from 0.96. So now we know why this is called a scatter plot.
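
The point that slope and strength are separate things can be illustrated with a small sketch. Everything in it is made up for illustration: the x values, the up-and-down pattern used to push points off the line, and the three spreads. The pattern is chosen so that it does not change the fitted slope, only how far the points sit from the line.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance (divides by n)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

def slope(xs, ys):
    """Slope of the least squares trendline: covariance of x and y over variance of x."""
    return covariance(xs, ys) / covariance(xs, xs)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Made-up pattern of pushes above (+1) and below (-1) the line. It averages
# to zero and is uncorrelated with xs, so it leaves the fitted slope alone.
pattern = [1, -1, -1, 1, 1, -1, -1, 1]

def make_ys(intercept, slope_value, spread):
    return [intercept + slope_value * x + spread * e for x, e in zip(xs, pattern)]

tight = make_ys(2.0, 0.5, 0.2)     # points hug the line
loose = make_ys(2.0, 0.5, 2.0)     # same slope, points far from the line
shallow = make_ys(2.0, 0.1, 0.05)  # much smaller slope, points hug the line

for name, ys in [("tight", tight), ("loose", loose), ("shallow", shallow)]:
    print(name, "slope:", round(slope(xs, ys), 3), "r:", round(correlation(xs, ys), 3))
```

The tight and loose sets share the same slope but have very different correlations, while the shallow set has a small slope and a high correlation. This mirrors the comparison among Figures 4, 5 and 6.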

Figure 5: A weaker relationship

For an example of a smaller slope but a strong relationship, please look at Figure 6. The slope of the trendline is much smaller compared to that in Figures 4 and 5, but because the points are very close to the trendline, the correlation coefficient is very high (0.97).

Figure 6: A strong relationship but smaller slope

A summary of what has been discussed so far:

1. Covariance and Correlation are two measures of relationship between two numeric variables.
2. Whenever covariance is positive, correlation is also positive, and whenever covariance is negative, correlation is also negative.
3. The sign of covariance and correlation tells us the direction of the relationship, i.e., a positive covariance and correlation tell us that the relationship is positive, and vice versa.
4. The magnitude of covariance doesn't tell us much because covariance varies depending on the units chosen for the variables.
5. The magnitude of correlation tells us a lot about the strength of the relationship because a correlation coefficient can only range from -1 to +1. A value of -1 implies a perfect negative relationship and a value of +1 implies a perfect positive relationship. A value close to either +1 or -1, such as 0.9 or -0.9, implies a very strong relationship. A value close to 0 implies a very weak relationship.
6. A scatter plot gives a good visual cue about the direction and the strength of a relationship.
7. A positive slope of the trendline on a scatter plot implies a positive relationship, and vice versa.
8. The strength of a relationship can be determined visually by looking at how tightly the points are gathered around the trendline. The tighter the points around the trendline, the stronger the relationship, and vice versa.

Now we are ready to discuss Regression Analysis. We have done enough groundwork to understand it. Usually, when we talk about Regression Analysis, we first talk about simple regression and then we talk about multiple regression. Also, we first talk about linear regression and then nonlinear regression. So really, we first talk about simple linear regression, then multiple linear regression, then simple nonlinear regression and finally multiple nonlinear regression. But in this course, we will not talk about nonlinear regression. So we will first talk about simple linear regression and then multiple linear regression.

Simple vs. Multiple Regression

In simple regression we are interested in relationships between two variables, an independent variable and a dependent variable. In our example of advertising expense and sales, please convince yourself that advertising expense is the independent variable because it happens first, and sales is the dependent variable because its value depends on the amount of advertising. In multiple regression, we deal with several independent variables and one dependent variable. For example, sales revenue, which can be considered a dependent variable, does not merely depend on the advertising expense. It depends on many other factors, such as the product price, product quality, price of competing products, quality of competing products, brand loyalty, etc.
In multiple regression we study the effect of several independent variables on the dependent variable. Please note that in multiple regression, there are multiple independent variables but only one dependent variable. Intuitively, there are many causes for a single effect.

Linear vs. Non-Linear Regression

In linear regression, the trendline is a straight line. In nonlinear regression, the trendline may be nonlinear, such as U-shaped or S-shaped or some other non-linear shape. There are many variables for which the relationship may be non-linear. For instance, in our example of advertising expense and sales revenue, you can imagine that beyond a certain expenditure in advertising, the sales will not grow any further. So while for a certain range of advertising expense the sales may be linearly related, beyond a certain point the linearity of the relationship may not hold any more. But since we will not be covering any non-linear regression, we will not discuss it any more. I just wanted to give you a feel for what is involved in non-linear regression.

Simple Linear Regression

So let's finally talk about Simple Linear Regression. Whenever we consider two variables of interest that are related to each other, we identify one of them as being independent and the other as being dependent. This is important. It is impossible to do regression analysis unless we first establish our independent and dependent variables. Once we establish this, we are interested in quantifying their relationship. So we obtain a sample of data points and, using this sample, we try to estimate or infer the relationship between the two variables. Please note that the relationship we are interested in is for the population of all possible data points for the two variables. But since it is virtually impossible to obtain all possible data points for the two variables (that would be the whole possible population), we work with a sample, obtain the relationship for the sample data and infer the relationship for the population.

Note that this is similar to what we learnt earlier about making inferences about a population using sample data. For example, when we wanted to estimate the mean of a population, we obtained a sample and then we computed a point estimate and an interval estimate for the population mean. Similarly, in regression analysis, we will first obtain the sample data and calculate the point and interval estimates of the slope. The difference is that now our data items will be pairs of numbers, (x, y).

Regression and Trendline:

We have already seen the concept of a scatter plot and a trendline. They give us a visual clue about the linear relationship between two variables. In simple linear regression we basically try to find the mathematical equation of the trendline. Before we talk any further, let me give you a quick review of the mathematical equation of a line.

A review of the mathematical equation of a line:

Figure 7: A straight line between X and Y variables

Let's say our independent variable is X and the dependent variable is Y. We always show the independent variable on the horizontal or X-axis and the dependent variable on the vertical or Y-axis. Suppose the red line in Figure 7 is the line for which we want the mathematical equation. This line has two characteristics. The first characteristic is that it has a slope, i.e., if x increases by one unit, then y increases (or decreases) by a certain amount. In Figure 7, we can tell that if x increases by one unit, y increases by about half a unit. So we say that the slope of this line is roughly ½ (or half). The other characteristic is that it crosses (or intercepts) the y-axis at a certain point. In Figure 7, we can see that the red line intercepts the y-axis at about 2.5. Once we know these two characteristics of a line, we can completely describe the line. The equation of the red line in Figure 7 is:

Y = 2.5 + 0.5X

The equation of a straight line is of the form Y = a + bX, where a is the y-intercept and b is the slope. Knowing a and b about any line will completely describe the line. In other words, if I gave you the a and b values for a line that I have in mind, you can uniquely draw the line that I have in mind. There is only one way that a line with given a and b values can be drawn. Let's suppose I give you a = 4 and b = -1. The line for these two parameters is shown in Figure 8.

Figure 8: A straight line between X and Y variables where the y-intercept is 4 and slope is -1 (Y = 4 - X)

Trendlines and regression line:

In Figures 4, 5 and 6, we looked at trendlines. These trendlines were drawn using Excel. Please note that if I have a scatter plot and I ask you to draw a line through the scatter points such that the line represents the trend of the scatter plot, you will draw one of many possible lines. If I asked two people to draw a trendline (without using Excel of course), it is very unlikely that the y-intercept and the slope of both these lines would be exactly the same. They will be close, but not exactly the same. If I asked ten different people to draw a trendline, I would get ten different lines. Theoretically there are an infinite number of such lines, one for every possible combination of y-intercept and slope. But given any two such lines drawn by two random people, one will be better than the other. So if I have ten lines, one of them will be better than the other nine. When Excel draws the trendline, it draws a line that cannot be beaten by anyone. It uses a complex algorithm to come up with a trendline which is the best amongst an infinite number of possible lines.

How do we define best? For any line, there are deviations, measured vertically, between the line and each of the points. Some deviations are positive and some are negative. If we square these deviations, the squared deviations become positive. The sum of all the squared deviations can be considered a measure of fit of the trendline to the scatter plot. Using this measure, we can compare two lines. The smaller this measure, the better the fit. For any line, the sum of the squared deviations can be computed using a formula.
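
To make the measure of fit concrete, here is a short sketch that computes the sum of squared deviations for two candidate lines of the form Y = a + bX. The data and the two "hand-drawn" lines are hypothetical, and the function name is my own.

```python
def sum_squared_deviations(a, b, xs, ys):
    """Sum of squared vertical deviations of the points from the line Y = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, NOT the values from Figure 1.
advertising = [10, 20, 30, 40, 50]  # thousands of dollars
sales = [11, 17, 20, 28, 30]        # millions of dollars

# Two lines that two different people might draw by eye (made-up a and b values).
sse_first = sum_squared_deviations(6.0, 0.5, advertising, sales)
sse_second = sum_squared_deviations(10.0, 0.4, advertising, sales)

print("first line: ", round(sse_first, 4))
print("second line:", round(sse_second, 4))
print("better line:", "first" if sse_first < sse_second else "second")
```

With these made-up numbers the first line has the smaller sum of squared deviations, so by this measure it fits the points better than the second.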
Definition of "best-fit" for a trendline:

Given two lines, whichever line has the smaller sum of squared deviations is considered better than the other. If, for a line, the sum of the squared deviations is the least of all possible lines, that line is the best of all possible lines. Such a line is called the least squared deviations line or simply the least squares line. When Excel draws a trendline on a scatter plot, it draws the least squares line. The least squares line is also called the Regression line.
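
The least squares line does not have to be found by trial and error. The sketch below uses the standard textbook formulas for simple linear regression (slope = sum of products of deviations divided by sum of squared x deviations; intercept = mean of y minus slope times mean of x). I am not claiming this is how Excel computes its trendline internally; the data, the function names and the sizes of the nudges are all my own assumptions.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line Y = a + b*X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx        # slope
    a = my - b * mx      # y-intercept
    return a, b

def sum_squared_deviations(a, b, xs, ys):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, NOT the values from Figure 1.
advertising = [10, 20, 30, 40, 50]  # thousands of dollars
sales = [11, 17, 20, 28, 30]        # millions of dollars

a, b = least_squares_line(advertising, sales)
best = sum_squared_deviations(a, b, advertising, sales)

# Nudge the intercept and slope a little in every direction and confirm
# that each nudged line has a larger sum of squared deviations.
nudges = [(da, db) for da in (-0.5, 0.0, 0.5) for db in (-0.05, 0.0, 0.05)
          if (da, db) != (0.0, 0.0)]
all_worse = all(
    sum_squared_deviations(a + da, b + db, advertising, sales) > best
    for da, db in nudges
)

print("intercept a:", round(a, 4), "slope b:", round(b, 4))
print("sum of squared deviations:", round(best, 4))
print("every nudged line is worse:", all_worse)
```

Checking a handful of nudged lines is only a spot check, not a proof, but it illustrates what "cannot be beaten" means in the definition above.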

In fact, in Excel, after you add a trendline to a scatter plot, you can even request Excel to display the equation of the trendline. Just double click on the trendline and a new dialog box shows up; towards the bottom of this dialog box you will see the option to Display Equation on Chart. Figure 9 shows the same graph as in Figure 3, except that the trendline and the equation are also displayed.

Figure 9: Equation of trendline displayed on the chart

Note that when Excel displays the equation of the trendline, it shows the slope term first and the y-intercept as the second term. In Figure 9, the slope is 0.0823. Since the trendline drawn by Excel does not extend to the y-axis, I have superimposed a line over the trendline so that you can see that it intercepts the y-axis at about 3.6. Also, I have superimposed an oval around the equation to highlight it. Excel does not draw this oval.

The slope of 0.0823 implies that if we spend one extra unit of advertising expense (in other words, 1 thousand dollars), the sales revenue will increase by 0.0823 units, or 82,300 dollars. Recall that one unit of sales is a million dollars, so 0.0823 million dollars is 82,300 dollars.

A summary of what has been discussed since the last summary:

1. Given a scatter plot of points, many lines can be drawn through them.
2. For each such line, the sum of the squared deviations from each point can be calculated. The line with the least sum of squared deviations is called the least squares line or the regression line.
3. The least squares line can be described by its y-intercept and slope.
4. The trendline created by Excel is the least squares line.
5. Excel can display the equation of the trendline, which also happens to be the regression or least squares line.
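
The unit conversion behind the 82,300-dollar figure can be written out as a few lines of arithmetic. This is only a sketch; the variable names are mine, and the slope of 0.0823 (million dollars of sales per thousand dollars of advertising) is the value implied by the 82,300-dollar figure in the text.

```python
slope = 0.0823                  # millions of sales dollars per thousand advertising dollars
dollars_per_sales_unit = 1_000_000
dollars_per_ad_unit = 1_000

extra_ad_spend = 1_000          # one extra unit of advertising, in dollars
extra_ad_units = extra_ad_spend / dollars_per_ad_unit

# Change in sales, first in sales units (millions), then in dollars.
extra_sales_units = slope * extra_ad_units
extra_sales_dollars = extra_sales_units * dollars_per_sales_unit

print("extra sales in dollars:", round(extra_sales_dollars))
```

Keeping the units of each variable next to the slope like this helps avoid reading the slope as 0.0823 dollars of sales per dollar of advertising, which would be off by a factor of one thousand.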

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Relationships Between Two Variables: Scatterplots and Correlation

Relationships Between Two Variables: Scatterplots and Correlation Relationships Between Two Variables: Scatterplots and Correlation Example: Consider the population of cars manufactured in the U.S. What is the relationship (1) between engine size and horsepower? (2)

More information

Elements of a graph. Click on the links below to jump directly to the relevant section

Elements of a graph. Click on the links below to jump directly to the relevant section Click on the links below to jump directly to the relevant section Elements of a graph Linear equations and their graphs What is slope? Slope and y-intercept in the equation of a line Comparing lines on

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables

More information

Unit 9 Describing Relationships in Scatter Plots and Line Graphs

Unit 9 Describing Relationships in Scatter Plots and Line Graphs Unit 9 Describing Relationships in Scatter Plots and Line Graphs Objectives: To construct and interpret a scatter plot or line graph for two quantitative variables To recognize linear relationships, non-linear

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

CORRELATION ANALYSIS

CORRELATION ANALYSIS CORRELATION ANALYSIS Learning Objectives Understand how correlation can be used to demonstrate a relationship between two factors. Know how to perform a correlation analysis and calculate the coefficient

More information

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6 WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in

More information

The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces

The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces Or: How I Learned to Stop Worrying and Love the Ball Comment [DP1]: Titles, headings, and figure/table captions

More information

Elasticity. I. What is Elasticity?

Elasticity. I. What is Elasticity? Elasticity I. What is Elasticity? The purpose of this section is to develop some general rules about elasticity, which may them be applied to the four different specific types of elasticity discussed in

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage

More information

The Big Picture. Correlation. Scatter Plots. Data

The Big Picture. Correlation. Scatter Plots. Data The Big Picture Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, We have just completed a length series of lectures on ANOVA where we considered

More information

Linear functions Increasing Linear Functions. Decreasing Linear Functions

Linear functions Increasing Linear Functions. Decreasing Linear Functions 3.5 Increasing, Decreasing, Max, and Min So far we have been describing graphs using quantitative information. That s just a fancy way to say that we ve been using numbers. Specifically, we have described

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information
