Causal Forecasting Models


1 CTL.SC1x - Supply Chain & Logistics Fundamentals
Causal Forecasting Models
MIT Center for Transportation & Logistics

2 Causal Models
Used when demand is correlated with some known and measurable environmental factor.
Demand (y) is a function of some variables (x1, x2, ..., xk):
- Dependent variable: y
- Independent variables: x1, x2, ..., xk
Examples:
- Disposable diapers ~ f(births, household income)
- Car repair parts ~ f(weather/snow)
- Promoted item ~ f(discount, placement, advertisements)

3 Agenda
- Simple Linear Regression
- Regression in Spreadsheets
- Multiple Linear Regression
- Model Transformations
- Model Fit and Validity

4 Simple Linear Regression

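5 Example: Simple Linear Regression
Recall from the earlier lecture on exponential smoothing:
- Estimating initial parameters for Holt-Winter (level, trend, seasonality)
- Removed seasonality in order to estimate the initial level and trend
Linear relationship: $y_i = \beta_0 + \beta_1 x_i$
Model: $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for $i = 1, 2, \ldots, n$
- Observed: $Y_i$ and $x_i$
- Unknown: $\beta_0$, $\beta_1$, and $\varepsilon_i$
$E(Y \mid x) = \beta_0 + \beta_1 x$
$\mathrm{StdDev}(Y \mid x) = \sigma$
[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), with fitted trend line]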

6 Simple Linear Regression
- The relationship is described in terms of a linear model.
- The data $(x_i, y_i)$ are the observed pairs from which we try to estimate the beta coefficients to find the best fit.
- The error term, $\varepsilon$, is the unaccounted or unexplained portion.
- The error terms are assumed to be iid ~ $N(0, \sigma)$.
Example for period 97:
- Observed demand: $y_{97} = 204$
- Estimated demand: $\hat{y}_{97} = b_0 + b_1 (97)$
- Error (residual): $\varepsilon_{97} = y_{97} - \hat{y}_{97} = 204 - 188.4 = 15.6$
[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), with fitted trend line]

7 Simple Linear Regression: Residuals or Error Terms
- Residuals, $e_i$, are the difference of actual minus predicted values.
- Find the b's that minimize the residuals.
$\hat{y}_i = b_0 + b_1 x_i$ for $i = 1, 2, \ldots, n$
$e_i = y_i - \hat{y}_i = y_i - b_0 - b_1 x_i$ for $i = 1, 2, \ldots, n$
How should we minimize the residuals?
- Min sum of errors: shows bias, but not accurate
- Min sum of absolute errors: accurate and shows bias, but intractable
- Min sum of squares of errors: shows bias and is accurate
$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$
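A small sketch can make the comparison of the three criteria concrete. The residual values below are made up for illustration and do not come from the bagel data; the function names are likewise hypothetical.

```python
# Hypothetical residuals from two candidate lines (illustrative values only).
good_fit = [1.0, -1.0, 2.0, -2.0]
poor_fit = [10.0, -10.0, 20.0, -20.0]


def sum_errors(residuals):
    """Plain sum: positive and negative errors cancel, so it only shows bias."""
    return sum(residuals)


def sum_abs_errors(residuals):
    """Sum of absolute errors: measures size, but is not differentiable at 0."""
    return sum(abs(e) for e in residuals)


def sum_sq_errors(residuals):
    """Sum of squared errors: measures size and is smooth, so calculus applies."""
    return sum(e * e for e in residuals)


for name, res in [("good fit", good_fit), ("poor fit", poor_fit)]:
    print(name, sum_errors(res), sum_abs_errors(res), sum_sq_errors(res))
```

Both candidate lines have a plain error sum of zero, so that criterion cannot tell them apart, while the absolute and squared sums clearly separate the two.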

8 Simple Linear Regression: Ordinary Least Squares (OLS) Regression
- Finds the coefficients ($b_0$ and $b_1$) that minimize the sum of the squared error terms.
- We can use partial derivatives to find the first-order optimality condition with respect to each variable.
$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$
$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
$b_0 = \bar{y} - b_1 \bar{x}$
We know from the data: $\bar{x} = 89.5$ and $\bar{y} = 157.4$
[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), with fitted trend line]
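The closed-form OLS expressions for the slope and intercept can be sketched in a few lines of Python. The data below are hypothetical (they are not the bagel series), and `ols_simple` is an illustrative name rather than anything from the lecture.

```python
def ols_simple(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

b0, b1 = ols_simple(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

Because the fitted line always passes through the point of means, `b0 + b1 * x_bar` equals `y_bar`, which is a quick sanity check on any hand calculation.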

9 OLS Regression in Spreadsheet

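10 Regression By Hand
Original data (y in column A, x in column B)
$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
$b_0 = \bar{y} - b_1 \bar{x}$
Spreadsheet formulas:
- =AVERAGE(A2:A21) and =AVERAGE(B2:B21): means of y and x (row 22)
- =B7-$B$22: deviation of x from its mean
- =A7-$A$22: deviation of y from its mean
- =C7*D7: product of the two deviations
- =C7^2: squared deviation of x
- =SUM(C2:C21), =SUM(D2:D21), =SUM(E2:E21), =SUM(F2:F21): column totals (row 22)
- =F22/E22: slope $b_1$ (summed products of deviations divided by summed squared x deviations)
- =A22-B24*B22: intercept $b_0$
Regression equation: $y = b_0 + b_1 x$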

11 Regression Using the LINEST Function
Original data (y in column A, x in column B)
=LINEST(A2:A21,B2:B21,1,1)
LINEST(known_y's, known_x's, constant, statistics)
Output layout (2 columns by 5 rows):
b1 | b0
s_b1 | s_b0
R^2 | s_e
F | df
SSR | SSE
LINEST is an array function:
- Receives and returns data to multiple cells
- The equation will be bookended by {} brackets when active
While the function is the same in both LibreOffice and Excel, activating it differs slightly.
- LibreOffice: Type the formula into cell D2 and press Ctrl+Shift+Enter (Windows and Linux) or Command+Shift+Return (Mac OS X).
- Excel: Select a range of 2 columns by 5 rows, in this case D2:E6. Then, in the 'Insert Function' area, type the formula and press Ctrl+Shift+Enter (Windows and Linux) or Command+Shift+Return (Mac OS X).

12 Regression Using LINEST
- n = number of observations
- k = number of explanatory variables (NOT counting the intercept)
- df = degrees of freedom (n - k - 1)
- $b_0$ = estimate of the intercept: $b_0 = \bar{y} - b_1 \bar{x}$
- $b_1$ = estimate of the slope (explanatory variable 1): $b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
- $R^2$ = goodness of fit of the model: the proportion of the variation in Y which is explained by X
Total Sum of Squares (SST):
$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
- Explained portion: Sum of Squares of Regression (SSR), $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
- Unexplained portion: Sum of Squares of the Error (SSE), $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
$R^2$ = coefficient of determination: the ratio of explained to total sum of squares, where $0 \le R^2 \le 1$
$R^2 = \frac{SSR}{SST} = \frac{SSR}{SSR + SSE}$
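The sum-of-squares decomposition behind R² can be checked numerically. This is a minimal sketch on hypothetical data; `sums_of_squares` is an illustrative helper, not a spreadsheet or library function.

```python
def sums_of_squares(x, y):
    """Fit y = b0 + b1 * x by OLS and return (SST, SSR, SSE, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                  # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)              # explained portion
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # unexplained portion
    return sst, ssr, sse, ssr / sst


# Hypothetical observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

sst, ssr, sse, r2 = sums_of_squares(x, y)
print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}, R^2 = {r2:.4f}")
```

For an OLS fit with an intercept, SSR and SSE add up to SST (up to floating-point rounding), which is why R² can be written either as SSR/SST or as SSR/(SSR + SSE).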

13 Regression Using LINEST
$s_e$ = standard error of estimate: an estimate of the standard deviation of the error term around the regression line.
$s_e = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - k - 1}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - k - 1}}$
$s_{b_0}$ = standard error of the intercept:
$s_{b_0} = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$
$s_{b_1}$ = standard error of the slope:
$s_{b_1} = s_e \sqrt{\frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$
How significant is the explanatory variable? Is it different from zero?
- Test the null hypothesis $H_0: b_1 = 0$ against the alternate hypothesis $H_A: b_1 \neq 0$
- Use a two-tailed t-test: =TDIST(t_statistic, df, number_tails); always use the 2-tail test
- Accepted thresholds for the p-value are 0.01, 0.05, or 0.10 (meaning we can reject $H_0$ at the 99%, 95%, and 90% confidence levels, respectively)
$t_{b_1} = \frac{b_1}{s_{b_1}} = 15.33$
p-value: =TDIST(15.33, 18, 2) = $8.92 \times 10^{-12}$, well below all three thresholds
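The standard errors and the two-tailed t-test can be sketched without a statistics package. The version below is an assumption-laden illustration: it uses hypothetical data, and it approximates the two-tailed p-value (the quantity TDIST returns with 2 tails) by integrating the Student t density numerically with Simpson's rule rather than calling any spreadsheet function.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10000):
    """Approximate two-tailed p-value via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1.0 - central)


def slope_t_test(x, y):
    """Return s_e, s_b0, s_b1, the t statistic for b1, and its p-value."""
    n, k = len(x), 1
    df = n - k - 1
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / df)
    s_b1 = s_e / math.sqrt(sxx)
    s_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / sxx)
    t_b1 = b1 / s_b1
    return {"s_e": s_e, "s_b0": s_b0, "s_b1": s_b1,
            "t_b1": t_b1, "p_b1": two_tailed_p(t_b1, df)}


# Hypothetical observations, for illustration only.
result = slope_t_test([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0])
print(result)
```

The numerical integration is only a stand-in for a proper t-distribution routine; for very small p-values its absolute accuracy is limited by floating-point rounding, so it is best read as "below the threshold" rather than as an exact figure.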

14 Regression Using LINEST
Original data (y in column A, x in column B)
=LINEST(A2:A21,B2:B21,1,1)
LINEST(known_y's, known_x's, constant, statistics)
Output layout (2 columns by 5 rows):
b1 | b0
s_b1 | s_b0
R^2 | s_e
F | df
SSR | SSE
Significance formulas:
- =D2/D3: t-statistic (coefficient divided by its standard error)
- =TDIST(D9,$E$5,2): two-tailed p-value for that t-statistic
1. How is the overall fit of the model?
- Look at the coefficient of determination, $R^2$
- No hard rules, but 0.70 or higher is preferred
2. Are the individual variables statistically significant?
- Use a t-test for each explanatory variable
- Lower p-value is better
- Generally used threshold values include 0.10, 0.05, and 0.01

15 Multiple Linear Regression

16 Example: Monthly Iced Coffee Sales
Develop Forecasting Model #1: level, trend, and average historical temperature
Develop OLS regression model:
$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp)
Using the LINEST function:
- Follow the earlier directions: {=LINEST(A2:A25,B2:C25,1,1)}
- When activating, expand the area to five (5) rows by k+1 columns
- Output shifts for new variables
  - Top right is always $b_0$
  - Bottom left six cells don't change
Output layout (3 columns by 5 rows):
b2 | b1 | b0
s_b2 | s_b1 | s_b0
R^2 | s_e | #N/A
F | df | #N/A
SSR | SSE | #N/A
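For readers working outside a spreadsheet, the same two-variable fit can be sketched by solving the OLS normal equations directly. This is one possible approach, not what LINEST does internally; the period and temperature values are hypothetical, and the demand series is generated from known coefficients so the recovered values can be checked.

```python
def solve_linear(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def ols_multiple(rows, y):
    """rows: one [x1, x2, ...] list per observation; returns [b0, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Hypothetical data: period number and average temperature per month.
periods = [1, 2, 3, 4, 5, 6]
temps = [60, 65, 72, 80, 75, 68]
demand = [100 + 5 * p + 2 * t for p, t in zip(periods, temps)]

b0, b1, b2 = ols_multiple([[p, t] for p, t in zip(periods, temps)], demand)
print(f"level = {b0:.2f}, trend = {b1:.2f}, temp effect = {b2:.2f}")
```

Note that this sketch returns the coefficients in the order b0, b1, b2, whereas the LINEST output lists them right to left (b2, b1, b0).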

17 Example: Monthly Iced Coffee Sales
How is the overall fit of the model?
- $R^2 = 0.78$, or 78%
Are the individual variables statistically significant?
- Run t-tests for each variable and the intercept
- n = 24 observations, k = 2 variables, df = n - k - 1 = 24 - 2 - 1 = 21
- Intercept: $t_{b_0} = b_0 / s_{b_0} = 18.60$; p-value: =TDIST(18.6, 21, 2)
- Trend: $t_{b_1} = b_1 / s_{b_1} = 8.44$; p-value: =TDIST(8.44, 21, 2)
- Temperature effect: $t_{b_2} = b_2 / s_{b_2}$; p-value: =TDIST(0.27, 21, 2)
Both the intercept and trend coefficients are significant. The temperature effect is not, so we cannot reject $H_0$.
What next? Try the model without the temperature effect.

18 Example: Monthly Iced Coffee Sales
Develop Forecasting Model #2: level and trend
Develop OLS regression model:
$Y_i = \beta_0 + \beta_1 x_{1i} + \varepsilon_i$
DEMAND = LEVEL + TREND(period)
Using the LINEST function:
- Follow the earlier directions: {=LINEST(A2:A25,B2:B25,1,1)}
Output layout (2 columns by 5 rows):
b1 | b0
s_b1 | s_b0
R^2 | s_e
F | df
SSR | SSE
Model fit? $R^2 = 0.78$
Variables? The p-values for $b_0$ and $b_1$ are both below the significance thresholds.

19 Example: Monthly Iced Coffee Sales
Compare the goodness of fit between models:
- Model 1: DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp)
- Model 2: DEMAND = LEVEL + TREND(period)
If Model #2 is better, why is its $R^2$ lower?
- $R^2$ will never get worse (and will usually improve) by adding more variables, even bad ones!
- Need to modify the metric: adjusted $R^2$
$\text{adj } R^2 = 1 - (1 - R^2) \left( \frac{n - 1}{n - k - 1} \right)$
- Model 1: adj $R^2 = 1 - (1 - R^2)(23/21)$
- Model 2: adj $R^2 = 1 - (1 - R^2)(23/22)$
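The adjustment is a one-line calculation. In the sketch below, `adjusted_r2` is an illustrative helper, and the R² inputs of 0.78 are only the rounded figures quoted for the two models, so the results are indicative rather than the exact values from the original output.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 24  # monthly observations in the iced coffee example

# Rounded R^2 values (assumed equal here purely for illustration).
model_1 = adjusted_r2(0.78, n, k=2)  # level + trend + temperature
model_2 = adjusted_r2(0.78, n, k=1)  # level + trend

print(f"Model 1 adj R^2 = {model_1:.3f}")
print(f"Model 2 adj R^2 = {model_2:.3f}")
```

With nearly identical R², the model with fewer explanatory variables ends up with the higher adjusted R², because the penalty factor (n - 1)/(n - k - 1) grows with k.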

20 Transforming Variables

21 Example: Monthly Iced Coffee Sales
Develop Forecasting Model #3: level, trend, and school being open
Develop OLS regression model:
$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_3 x_{3i} + \varepsilon_i$
DEMAND = LEVEL + TREND(period) + OPEN_EFFECT(open)
Need to create a dummy variable:
- $x_{3i} = 1$ if school is in session, $= 0$ otherwise
- Interpret $\beta_3$ as the increase (decrease) in demand when school is in session
Using the LINEST function: {=LINEST(A2:A25,B2:C25,1,1)}
Output layout (3 columns by 5 rows):
b3 | b1 | b0
s_b3 | s_b1 | s_b0
R^2 | s_e | #N/A
F | df | #N/A
SSR | SSE | #N/A

22 Example: Monthly Iced Coffee Sales (#3)
How is the overall fit of the model?
- $R^2$ and adjusted $R^2$ are both better than for #1 or #2
Are the individual variables statistically significant?
- Run t-tests for each variable and the intercept
- n = 24 observations, k = 2 variables, df = n - k - 1 = 24 - 2 - 1 = 21
- Intercept: $t_{b_0} = b_0 / s_{b_0} = 32.77$; p-value: =TDIST(32.77, 21, 2)
- Trend: $t_{b_1} = b_1 / s_{b_1} = 8.87$; p-value: =TDIST(8.87, 21, 2)
- School in session: $t_{b_3} = b_3 / s_{b_3} = 1.69$; p-value: =TDIST(1.69, 21, 2)
- Intercept and trend coefficients are strongly significant; the school flag is borderline
Let's interpret this:
Demand = 3156 + 51.5(t) + 145(if in session)
We are forecasting a monthly demand level of 3,156 iced coffees, with a trend of ~52 additional cups each month and an increase of ~145 cups whenever school is in session.
My forecast for sales:
- January year 3 (t = 25, school not in session) = 3156 + 51.5(25) + 145(0) ≈ 4444
- February year 3 (t = 26, school in session) = 3156 + 51.5(26) + 145(1) = 4640
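The forecast arithmetic for Model #3 can be written as a small function. The coefficients below are the rounded values quoted in the interpretation (3,156; 51.5; 145), so results may differ slightly from a forecast made with the unrounded regression output; `forecast_demand` is a hypothetical name.

```python
# Rounded Model #3 coefficients as quoted in the slide text.
LEVEL = 3156.0        # b0: base monthly demand
TREND = 51.5          # b1: additional cups per period
OPEN_EFFECT = 145.0   # b3: extra cups when school is in session


def forecast_demand(period, school_in_session):
    """Forecast monthly iced coffee demand for a given period number."""
    open_flag = 1 if school_in_session else 0  # dummy variable x3
    return LEVEL + TREND * period + OPEN_EFFECT * open_flag


# Year 3 starts at period 25 because the history covers 24 months.
january = forecast_demand(25, school_in_session=False)
february = forecast_demand(26, school_in_session=True)
print(f"January year 3: {january:.1f}")
print(f"February year 3: {february:.1f}")
```

The dummy variable simply switches the 145-cup effect on or off, which is why the February forecast exceeds January's by one month of trend plus the school effect.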

23 Model & Variable Transformations
We are using linear regression, so how can we use dummy variables?
- The model just needs to be linear in the parameters
- For model #3: $y = \beta_0 + \beta_1(\text{period}) + \beta_3(\text{open\_flag})$
Many transformations can be used:
- $y = \beta_0 + \beta_1 x_1$
- $y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$
- $y = \beta_0 + \beta_1 \ln(x_1)$
- $y = a x^b$, which becomes $\ln(y) = \ln(a) + b \ln(x)$
- $y = a x_1^{b_1} x_2^{b_2}$, which becomes $\ln(y) = \ln(a) + b_1 \ln(x_1) + b_2 \ln(x_2)$
Transformations and dummy variables allow for many models. For example:
- $x_{4i} = (x_{3i}) \cdot (\text{temperature})$, if sales increase with temperature when school is in session
- $x_{5i} = 1$ if a competing store runs a sale, $= 0$ otherwise
- $x_{6i} = x_{1i}^2$, so that we can capture a tapering effect to the linear trend
But be careful with the interpretation of results.
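As a sketch of the log transformation, the power-law model y = a x^b can be fitted by running ordinary least squares on ln(x) and ln(y). The data below are generated from a known curve purely for illustration, and `fit_power_law` is a hypothetical helper name.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by OLS on the log-transformed data; returns (a, b)."""
    lx = [math.log(v) for v in x]  # requires x > 0
    ly = [math.log(v) for v in y]  # requires y > 0
    n = len(lx)
    lx_bar = sum(lx) / n
    ly_bar = sum(ly) / n
    sxx = sum((u - lx_bar) ** 2 for u in lx)
    b = sum((u - lx_bar) * (v - ly_bar) for u, v in zip(lx, ly)) / sxx
    ln_a = ly_bar - b * lx_bar   # intercept of the transformed model is ln(a)
    return math.exp(ln_a), b


# Synthetic data from y = 3 * x**0.5 (illustrative only).
x = [1, 2, 4, 8, 16]
y = [3 * v ** 0.5 for v in x]

a, b = fit_power_law(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

The caution about interpretation applies here: the regression minimizes squared errors in ln(y), not in y, so the fitted curve is not the least-squares fit on the original scale, and b is read as a percentage (elasticity-style) effect rather than a unit change.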

24 Model Fit & Validation

25 Model Validation
Basic checks:
- Goodness of fit: look at the $R^2$ values
- Individual coefficients: t-tests for p-value
Additional assumption checks:
- Normality of residuals: look at a histogram
- Heteroscedasticity: look at a scatter plot of residuals
  - Does the standard deviation of the error terms differ for different values of the independent variables?
- Autocorrelation: is there a pattern over time?
  - Are the residuals not independent?
- Multicollinearity: look at correlations
  - Are the independent variables correlated?
  - Make sure dummy variables were not over-specified
Statistics software:
- Most packages check for all of these
- More sophisticated tests and remedies
[Figures: histogram of residuals (Number of Months vs. Residuals); residuals vs. Temperature (F); residuals vs. Time Periods]
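Two of these checks can be approximated with a plain correlation coefficient. The sketch below is one simple possibility and not the method used in the lecture: it computes the lag-1 correlation of the residuals as a rough autocorrelation check, and the correlation between two explanatory variables as a rough multicollinearity check. All data and function names are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient (undefined if either series is constant)."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    cov = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    var_u = sum((a - u_bar) ** 2 for a in u)
    var_v = sum((b - v_bar) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one that follows it."""
    return pearson(residuals[:-1], residuals[1:])


# Hypothetical residuals in time order, and two hypothetical regressors.
residuals = [12.0, -8.0, 15.0, -11.0, 9.0, -14.0, 10.0, -7.0]
period = [1, 2, 3, 4, 5, 6, 7, 8]
temperature = [40, 45, 55, 62, 70, 78, 83, 81]

print(f"lag-1 autocorrelation of residuals: {lag1_autocorrelation(residuals):.2f}")
print(f"correlation(period, temperature):   {pearson(period, temperature):.2f}")
```

Values near zero are reassuring in both cases; a lag-1 correlation far from zero suggests the residuals are not independent, and a regressor correlation near +1 or -1 suggests the two variables carry overlapping information.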

26 Modeling Results: which is best?
[Figures: Sales vs. Time fitted with three models (Linear Equation, Quadratic Equation, Cubic Equation), each annotated with its fitted equation and $R^2$]
Avoid over-fitting. The objective is to forecast demand for planning purposes.

27 Key Points from Lesson

28 Key Points
$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
Regression finds correlations between:
- A single dependent variable (y)
- One or more independent variables (x1, x2, ...)
Coefficients are estimated by minimizing the sum of the squares of the errors.
Always test your model:
- Goodness of fit ($R^2$)
- Statistical significance of coefficients (p-value)
Some warnings:
- Correlation is not causation
- Avoid over-fitting of data
Why not use this instead of exponential smoothing?
- All data treated the same
- Amount of data required to store

29 Questions, Comments, Suggestions? Use the Discussion!
Casey: photo courtesy of Yankee Golden Retriever Rescue

30 Image Sources
- "Pampers packages ( )" by Elizabeth from Lansing, MI, USA - Pampers packages, uploaded by Dolovis. Licensed under Creative Commons Attribution 2.0 via Wikimedia Commons - File:Pampers_packages_( ).jpg
- "Kraft Coupon" by Julie & Heidi from West Linn & Gillette, USA - Grocery Coupons - Tearpad shelf display of coupons for Kraft Macaroni & Cheese. Licensed under Creative Commons Attribution-Share Alike 2.0 via Wikimedia Commons
- "Damaged car door" by Garitzko - Own work. Licensed under Public domain via Wikimedia Commons


More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

More information

Forecasting Methods. What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes?

Forecasting Methods. What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes? Forecasting Methods What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes? Prod - Forecasting Methods Contents. FRAMEWORK OF PLANNING DECISIONS....

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Comparing Nested Models

Comparing Nested Models Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller

More information

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu The Statistics menu is accessed from the ORANGE shifted function of the 5 key by pressing Ù. When pressed, a CHOOSE

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

We extended the additive model in two variables to the interaction model by adding a third term to the equation. Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

More information

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay. Solutions to Midterm

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay. Solutions to Midterm Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (30 pts) Answer briefly the following questions. Each question has

More information

Statistics and Data Analysis

Statistics and Data Analysis Statistics and Data Analysis In this guide I will make use of Microsoft Excel in the examples and explanations. This should not be taken as an endorsement of Microsoft or its products. In fact, there are

More information

Mgmt 469. Regression Basics. You have all had some training in statistics and regression analysis. Still, it is useful to review

Mgmt 469. Regression Basics. You have all had some training in statistics and regression analysis. Still, it is useful to review Mgmt 469 Regression Basics You have all had some training in statistics and regression analysis. Still, it is useful to review some basic stuff. In this note I cover the following material: What is a regression

More information

16 : Demand Forecasting

16 : Demand Forecasting 16 : Demand Forecasting 1 Session Outline Demand Forecasting Subjective methods can be used only when past data is not available. When past data is available, it is advisable that firms should use statistical

More information

Introducing the Linear Model

Introducing the Linear Model Introducing the Linear Model What is Correlational Research? Correlational designs are when many variables are measured simultaneously but unlike in an experiment none of them are manipulated. When we

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

Tutorial on Using Excel Solver to Analyze Spin-Lattice Relaxation Time Data

Tutorial on Using Excel Solver to Analyze Spin-Lattice Relaxation Time Data Tutorial on Using Excel Solver to Analyze Spin-Lattice Relaxation Time Data In the measurement of the Spin-Lattice Relaxation time T 1, a 180 o pulse is followed after a delay time of t with a 90 o pulse,

More information

Introduction to Linear Regression

Introduction to Linear Regression 14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

More information

Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information. Excel Tutorial Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information. Working with Data Entering and Formatting Data Before entering data

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

TIME SERIES ANALYSIS & FORECASTING

TIME SERIES ANALYSIS & FORECASTING CHAPTER 19 TIME SERIES ANALYSIS & FORECASTING Basic Concepts 1. Time Series Analysis BASIC CONCEPTS AND FORMULA The term Time Series means a set of observations concurring any activity against different

More information

3. Regression & Exponential Smoothing

3. Regression & Exponential Smoothing 3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a

More information

Regression Analysis (Spring, 2000)

Regression Analysis (Spring, 2000) Regression Analysis (Spring, 2000) By Wonjae Purposes: a. Explaining the relationship between Y and X variables with a model (Explain a variable Y in terms of Xs) b. Estimating and testing the intensity

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Using Excel for Statistics Tips and Warnings

Using Excel for Statistics Tips and Warnings Using Excel for Statistics Tips and Warnings November 2000 University of Reading Statistical Services Centre Biometrics Advisory and Support Service to DFID Contents 1. Introduction 3 1.1 Data Entry and

More information

Module 5: Statistical Analysis

Module 5: Statistical Analysis Module 5: Statistical Analysis To answer more complex questions using your data, or in statistical terms, to test your hypothesis, you need to use more advanced statistical tests. This module reviews the

More information

Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015

Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015 Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

8. Time Series and Prediction

8. Time Series and Prediction 8. Time Series and Prediction Definition: A time series is given by a sequence of the values of a variable observed at sequential points in time. e.g. daily maximum temperature, end of day share prices,

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

MICROSOFT EXCEL 2007-2010 FORECASTING AND DATA ANALYSIS

MICROSOFT EXCEL 2007-2010 FORECASTING AND DATA ANALYSIS MICROSOFT EXCEL 2007-2010 FORECASTING AND DATA ANALYSIS Contents NOTE Unless otherwise stated, screenshots in this book were taken using Excel 2007 with a blue colour scheme and running on Windows Vista.

More information