Causal Forecasting Models




CTL.SC1x - Supply Chain & Logistics Fundamentals: Causal Forecasting Models. MIT Center for Transportation & Logistics

Causal Models
Used when demand is correlated with some known and measurable environmental factor: demand (y) is a function of some variables (x1, x2, ..., xk). Examples of dependent ~ f(independent) variables:
- Disposable diapers ~ f(births, household income)
- Car repair parts ~ f(weather/snow)
- Promoted item ~ f(discount, placement, advertisements)

Agenda
- Simple Linear Regression
- Regression in Spreadsheets
- Multiple Linear Regression
- Model Transformations
- Model Fit and Validity

Simple Linear Regression

Example: Simple Linear Regression
Recall from the earlier lecture on exponential smoothing: to estimate the initial parameters for Holt-Winter (level, trend, seasonality), we removed seasonality in order to estimate the initial level and trend.

The model is:

  Y_i = β0 + β1·x_i + ε_i  for i = 1, 2, ..., n

where the pairs (x_i, y_i) are observed and the β coefficients are unknown, so that

  E(Y|x) = β0 + β1·x    and    StdDev(Y|x) = σ

[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), with fitted line y = -213.16 + 4.14x]

Simple Linear Regression
- The relationship is described in terms of a linear model.
- The data (x_i, y_i) are the observed pairs from which we try to estimate the beta coefficients to find the best fit.
- The error term, ε, is the unaccounted or unexplained portion.
- The error terms are assumed to be iid ~ N(0, σ).

Example for period 97:
- Observed demand: y_97 = 204
- Estimated demand: ŷ_97 = -213.16 + 4.14(97) ≈ 188.4
- Error (residual): ε_97 = y_97 - ŷ_97 = 204 - 188.4 = 15.6

[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), fitted line y = -213.16 + 4.14x]

Simple Linear Regression: Residuals or Error Terms
Residuals, e_i, are the differences of actual minus predicted values. We find the b's that minimize the residuals:

  ŷ_i = b0 + b1·x_i              for i = 1, 2, ..., n
  e_i = y_i - ŷ_i = y_i - b0 - b1·x_i  for i = 1, 2, ..., n

How should we minimize the residuals?
- Minimize the sum of errors: shows bias, but not accurate.
- Minimize the sum of absolute errors: accurate and shows bias, but intractable.
- Minimize the sum of squared errors: shows bias, is accurate, and is tractable:

  Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_i - ŷ_i)² = Σ_{i=1}^{n} (y_i - b0 - b1·x_i)²

Simple Linear Regression: Ordinary Least Squares (OLS) Regression
OLS finds the coefficients (b0 and b1) that minimize the sum of the squared error terms, Σ e_i² = Σ (y_i - ŷ_i)². We can use partial derivatives to find the first-order optimality condition with respect to each variable, which gives:

  b1 = Σ_{i=1}^{n} (x_i - x̄)(y_i - ȳ) / Σ_{i=1}^{n} (x_i - x̄)²
  b0 = ȳ - b1·x̄

We know from the data: x̄ = 89.5 and ȳ = 157.4.

[Figure: Deseasoned Daily Bagel Demand vs. Time Period (Days), fitted line y = -213.16 + 4.14x]
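As a cross-check on the arithmetic, the two OLS formulas above can be sketched in a few lines of Python. The (x, y) series below is illustrative only; the actual bagel data are not reproduced in the slides.

```python
# Closed-form OLS estimates for simple linear regression:
#   b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
#   b0 = y_bar - b1 * x_bar
# Illustrative data (not the bagel series from the slides).
x = [85, 87, 89, 91, 93, 95]
y = [140, 148, 155, 165, 172, 180]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sxy / sxx           # slope
b0 = y_bar - b1 * x_bar  # intercept
```

The same two formulas are what the spreadsheet columns on the next slides compute cell by cell.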

OLS Regression in Spreadsheets

Regression By Hand
Original data (y in column A, x in column B). Build helper columns for the deviations and their products, then apply:

  b1 = Σ_{i=1}^{n} (x_i - x̄)(y_i - ȳ) / Σ_{i=1}^{n} (x_i - x̄)²
  b0 = ȳ - b1·x̄

Spreadsheet formulas:
- Deviations: =B7-$B$22 and =A7-$A$22
- Products: =C7*D7 and =C7^2
- Column totals: =SUM(C2:C21), =SUM(D2:D21), =SUM(E2:E21), =SUM(F2:F21)
- Means: =AVERAGE(A2:A21) and =AVERAGE(B2:B21)
- Slope: =F22/E22; intercept: =A22-B24*B22

Regression equation: y = b0 + b1·x, giving y = -213.16 + 4.14x

Regression Using the LINEST Function
Original data (y in column A, x in column B): =LINEST(A2:A21, B2:B21, 1, 1)
Syntax: LINEST(known_y's, known_x's, constant, statistics)

The output is a 2-column by 5-row array:

  b1    b0
  s_b1  s_b0
  R²    s_e
  F     df
  SSR   SSE

LINEST is an array function: it receives and returns data to multiple cells, and the equation will be bookended by {} brackets when active. While the function is the same in both LibreOffice and Excel, activating it differs slightly:
- LibreOffice: type the formula into cell D2 and press Ctrl+Shift+Enter (Windows & Linux) or Command+Shift+Return (Mac OS X).
- Excel: select a range of 2 columns by 5 rows, in this case D2:E6. Then type the formula and press Ctrl+Shift+Enter (Windows & Linux) or Command+Shift+Return (Mac OS X).

Regression Using LINEST: Interpreting the Output
- n = number of observations
- k = number of explanatory variables (NOT counting the intercept)
- df = degrees of freedom = n - k - 1
- b0 = estimate of the intercept: b0 = ȳ - b1·x̄
- b1 = estimate of the slope (explanatory variable 1): b1 = Σ(x_i - x̄)(y_i - ȳ) / Σ(x_i - x̄)²

Goodness of fit of the model is the proportion of the variation in Y which is explained by X. The total sum of squares (SST) partitions into an explained portion, the sum of squares of the regression (SSR), and an unexplained portion, the sum of squares of the error (SSE):

  Σ_{i=1}^{n} (y_i - ȳ)² = Σ_{i=1}^{n} (ŷ_i - ȳ)² + Σ_{i=1}^{n} (y_i - ŷ_i)²
  SST                    = SSR (explained)          + SSE (unexplained)

R² = coefficient of determination: the ratio of explained to total sum of squares, where 0 ≤ R² ≤ 1:

  R² = SSR / SST = SSR / (SSR + SSE)
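The SST = SSR + SSE identity and the R² ratio can be verified numerically. This sketch fits a simple OLS line on illustrative data (not the slide data) and checks the partition:

```python
# Verify the sum-of-squares partition SST = SSR + SSE and R² = SSR / SST.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]  # roughly y = 2x, illustrative only

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                    # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))       # unexplained
r2 = ssr / sst
```

The partition holds exactly for OLS with an intercept, which is why LINEST can report R², SSR, and SSE consistently from the same fit.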

Regression Using LINEST: Standard Errors and Significance
- s_e = standard error of the estimate: an estimate of the standard deviation of the error term around the regression line:

  s_e = sqrt( Σ_{i=1}^{n} e_i² / (n - k - 1) ) = sqrt( Σ_{i=1}^{n} (y_i - ŷ_i)² / (n - k - 1) )

- s_b0 = standard error of the intercept:

  s_b0 = s_e · sqrt( 1/n + x̄² / Σ_{i=1}^{n} (x_i - x̄)² )

- s_b1 = standard error of the slope:

  s_b1 = s_e / sqrt( Σ_{i=1}^{n} (x_i - x̄)² )

How significant is the explanatory variable? Is it different from zero?
- Test the null hypothesis H0: β1 = 0 against the alternate hypothesis HA: β1 ≠ 0.
- Use a two-tailed t-test: =TDIST(t_statistic, df, number_of_tails); always use the 2-tail test.
- Accepted thresholds for the p-value are 0.01, 0.05, or 0.10 (meaning we can reject H0 at the 99%, 95%, and 90% confidence levels respectively).

For the bagel example:

  t_b1 = b1 / s_b1 = 4.14 / 0.27 = 15.33
  p-value = TDIST(15.33, 18, 2) = 8.92×10⁻¹² < 0.01
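The standard-error and t-statistic formulas above can be sketched directly with the standard library. The p-value step, which the slides do with TDIST, needs a t-distribution CDF (e.g. scipy.stats.t) and is left as a comment; the data are illustrative.

```python
import math

# Standard error of the estimate, intercept/slope standard errors, and the
# slope t-statistic for a simple linear regression, per the slide formulas.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.1, 4.8, 7.2, 9.1, 10.8, 13.2, 15.1, 16.9]  # illustrative data

n, k = len(x), 1
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
se = math.sqrt(sum(e ** 2 for e in residuals) / (n - k - 1))  # s_e
s_b0 = se * math.sqrt(1 / n + x_bar ** 2 / sxx)               # s_b0
s_b1 = se / math.sqrt(sxx)                                    # s_b1
t_b1 = b1 / s_b1
# Two-sided p-value: 2 * P(T_{n-k-1} > |t_b1|),
# e.g. scipy.stats.t.sf(abs(t_b1), n - k - 1) * 2
```

A large |t_b1| (well above ~2 for typical df) is what lets us reject H0: β1 = 0.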

Regression Using LINEST: Assessing the Model
Original data (y in column A, x in column B): =LINEST(A2:A21, B2:B21, 1, 1). The t-statistic is =D2/D3 and the p-value is =TDIST(D9, $E$5, 2).

1. How is the overall fit of the model?
   - Look at the coefficient of determination, R².
   - No hard rules, but 0.70 or higher is preferred.
2. Are the individual variables statistically significant?
   - Use the t-test for each explanatory variable.
   - A lower p-value is better; generally used threshold values include 0.10, 0.05, and 0.01.

Multiple Linear Regression

Example: Monthly Iced Coffee Sales
Develop Forecasting Model #1: level, trend, & average historical temperature. The OLS regression model is:

  Y_i = β0 + β1·x_1i + β2·x_2i + ε_i
  DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp)

Using the LINEST function:
- Follow the earlier directions: {=LINEST(A2:A25, B2:C25, 1, 1)}
- When activating, expand the area to five (5) rows by k+1 columns.
- The output shifts for new variables: the top right is always b0, and the bottom-left six cells don't change.

Output:

  b2 = -0.27       b1 = 52.65     b0 = 3,254.81
  s_b2 = 2.75      s_b1 = 6.24    s_b0 = 174.85
  R² = 0.78        s_e = 208.97
  F = 36.36        df = 21
  SSR = 3,175,996  SSE = 917,074
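Behind the scenes, LINEST's multiple-regression coefficients come from solving the normal equations (XᵀX)b = Xᵀy. A minimal pure-Python sketch on synthetic data built from a known relationship (not the iced-coffee data), so the recovered coefficients are checkable:

```python
# Multiple linear regression via the normal equations (X'X) b = X'y.
# Synthetic data generated from y = 3000 + 50*period - 0.3*temp exactly,
# so OLS should recover those coefficients.
periods = list(range(1, 13))
temps = [35, 38, 45, 55, 65, 75, 85, 83, 72, 60, 48, 37]
y = [3000 + 50 * p - 0.3 * t for p, t in zip(periods, temps)]

# Design matrix with an intercept column.
X = [[1.0, float(p), float(t)] for p, t in zip(periods, temps)]

# Build X'X (3x3) and X'y (3x1).
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(3)]

# Solve the 3x3 system by Gauss-Jordan elimination with partial pivoting.
A = [r[:] + [v] for r, v in zip(xtx, xty)]
for col in range(3):
    pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
    A[col], A[pivot] = A[pivot], A[col]
    for r in range(3):
        if r != col:
            f = A[r][col] / A[col][col]
            A[r] = [a - f * b for a, b in zip(A[r], A[col])]
b = [A[i][3] / A[i][i] for i in range(3)]  # [b0, b1, b2]
```

In practice a library solver (LINEST, numpy.linalg.lstsq) does this more robustly, but the estimates are the same idea.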

Example: Monthly Iced Coffee Sales
How is the overall fit of the model? R² = 0.78, or 78%.
Are the individual variables statistically significant? Run t-tests for each variable and the intercept (n = 24 observations, k = 2 variables, df = n - k - 1 = 24 - 2 - 1 = 21):

- Intercept: t_b0 = b0 / s_b0 = 3255 / 175 = 18.60; p-value = TDIST(18.6, 21, 2) < 0.0001
- Trend: t_b1 = b1 / s_b1 = 52.65 / 6.24 = 8.44; p-value = TDIST(8.44, 21, 2) < 0.0001
- Temperature effect: t_b2 = b2 / s_b2 = -0.27 / 2.75 = -0.098; p-value = TDIST(0.098, 21, 2) = 0.9223

Both the intercept and trend coefficients are significant. The temperature effect is not: we cannot reject H0.
What next? Try the model without the temperature effect.

Example: Monthly Iced Coffee Sales
Develop Forecasting Model #2: level and trend. The OLS regression model is:

  Y_i = β0 + β1·x_1i + ε_i
  DEMAND = LEVEL + TREND(period)

Using the LINEST function, follow the earlier directions: {=LINEST(A2:A25, B2:B25, 1, 1)}

Output:

  b1 = 52.55       b0 = 3,239.89
  s_b1 = 6.02      s_b0 = 86.05
  R² = 0.78        s_e = 204.22
  F = 76.14        df = 22
  SSR = 3,175,570  SSE = 917,500

Model fit? R² = 0.78. Variables? The p-values for b0 and b1 are both < 0.0001.

Example: Monthly Iced Coffee Sales
Compare the goodness of fit between the models:
- Model 1: DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp), R² = 0.77594
- Model 2: DEMAND = LEVEL + TREND(period), R² = 0.77584

If Model #2 is better, why is its R² lower? R² will never get worse (and will usually improve) by adding more variables, even bad ones! We need to modify the metric: the adjusted R².

  adj R² = 1 - (1 - R²) · (n - 1) / (n - k - 1)

- Model 1: adj R² = 1 - (1 - 0.77594)(23/21) = 0.754600
- Model 2: adj R² = 1 - (1 - 0.77584)(23/22) = 0.765651
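The adjusted-R² penalty is easy to reproduce; the sketch below recomputes the two model comparisons from the slide.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Slide values: 24 monthly observations.
adj_model1 = adjusted_r2(0.77594, n=24, k=2)  # level + trend + temperature
adj_model2 = adjusted_r2(0.77584, n=24, k=1)  # level + trend
# Model 2 wins on adjusted R² despite its slightly lower raw R².
```

Because the penalty grows with k, adding a useless variable (like the temperature effect here) lowers adjusted R² even while raw R² ticks up.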

Transforming Variables

Example: Monthly Iced Coffee Sales
Develop Forecasting Model #3: level, trend, & school being open. The OLS regression model is:

  Y_i = β0 + β1·x_1i + β3·x_3i + ε_i
  DEMAND = LEVEL + TREND(period) + OPEN_EFFECT(open)

We need to create a dummy variable: x_3i = 1 if school is in session, 0 otherwise. Interpret β3 as the increase (or decrease) in demand when school is in session.

Using the LINEST function: {=LINEST(A2:A25, B2:C25, 1, 1)}

Output:

  b3 = 144.50      b1 = 51.54     b0 = 3,156.12
  s_b3 = 85.35     s_b1 = 5.81    s_b0 = 96.30
  R² = 0.80276     s_e = 196.07
  F = 42.74        df = 21
  SSR = 3,285,765  SSE = 807,305

Example: Monthly Iced Coffee Sales (#3)
How is the overall fit of the model? R² = 0.80276 with adj R² = 0.78398 (better than #1 or #2).
Are the individual variables statistically significant? Run t-tests for each variable and the intercept (n = 24 observations, k = 2 variables, df = n - k - 1 = 21):

- Intercept: t_b0 = b0 / s_b0 = 3156 / 96.3 = 32.77; p-value = TDIST(32.77, 21, 2) < 0.0001
- Trend: t_b1 = b1 / s_b1 = 51.54 / 5.81 = 8.87; p-value = TDIST(8.87, 21, 2) < 0.0001
- School in session: t_b3 = b3 / s_b3 = 144.50 / 85.35 = 1.69; p-value = TDIST(1.69, 21, 2) = 0.105

The intercept and trend coefficients are strongly significant; the school flag is borderline.

Let's interpret this: Demand = 3156 + 52(t) + 145(if in session). We are forecasting a monthly demand level of 3,156 iced coffees, with a monthly trend of ~52 additional cups each month and an increase of ~145 cups whenever school is in session.

My forecast for sales:
- January, year 3 = 3156 + 25(51.5) + 144.5(0) ≈ 4444
- February, year 3 = 3156 + 26(51.5) + 144.5(1) ≈ 4640
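The two forecasts at the bottom of the slide are just the fitted equation evaluated at future periods; a small sketch using the rounded slide coefficients:

```python
# Forecast from Model #3 using the (rounded) slide coefficients:
#   Demand = b0 + b1 * period + b3 * (1 if school in session else 0)
b0, b1, b3 = 3156.0, 51.5, 144.5

def forecast(period, school_in_session):
    dummy = 1 if school_in_session else 0
    return b0 + b1 * period + b3 * dummy

jan_y3 = forecast(25, False)  # January of year 3: school closed
feb_y3 = forecast(26, True)   # February of year 3: school in session
```

With 24 months of history, period 25 is the first month of year 3, which is why the forecasts use periods 25 and 26.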

Model & Variable Transformations
We are using linear regression, so how can we use dummy variables? The model just needs to be linear in the parameters. For model #3:

  y = β0 + β1(period) + β3(open_flag)

Many transformations can be used:

  y = β0 + β1·x1
  y = β0 + β1·x1 + β2·x1²
  y = β0 + β1·ln(x1)
  y = a·x^b          →  ln(y) = ln(a) + b·ln(x)
  y = a·x1^b1·x2^b2  →  ln(y) = ln(a) + b1·ln(x1) + b2·ln(x2)

Transformations and dummy variables allow for many models. For example:
- x_4i = (x_3i)·(temperature), if sales increase with temperature when school is in session
- x_5i = 1 if a competing store runs a sale, 0 otherwise
- x_6i = x_1i², so that we can capture a tapering effect to the linear trend

But be careful in interpreting the results.
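The power-law transformation above (y = a·x^b becomes linear after taking logs) can be demonstrated: fit OLS on (ln x, ln y), then transform the intercept back. The data below are generated from a known power law, so the fit recovers a and b.

```python
import math

# Fit y = a * x^b by linear regression on the log-transformed data:
#   ln(y) = ln(a) + b * ln(x)
a_true, b_true = 2.0, 1.5
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [a_true * x ** b_true for x in xs]  # exact power-law data

lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]

# Ordinary OLS on the transformed pairs.
n = len(lx)
mx = sum(lx) / n
my = sum(ly) / n
b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / \
    sum((u - mx) ** 2 for u in lx)
a = math.exp(my - b * mx)  # transform the intercept back: a = exp(ln a)
```

Note the caution from the slide applies: the OLS assumptions (and the error term) now live in log space, so interpret the back-transformed fit carefully.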

Model Fit & Validation

Model Validation
Basic checks:
- Goodness of fit: look at the R² values.
- Individual coefficients: t-tests for the p-values.

Additional assumption checks:
- Normality of residuals: look at a histogram.
- Heteroscedasticity: look at a scatter plot of the residuals. Does the standard deviation of the error terms differ for different values of the independent variables?
- Autocorrelation: is there a pattern over time? Are the residuals not independent?
- Multi-collinearity: look at correlations. Are the independent variables correlated? Make sure dummy variables were not over-specified.

Statistics software: most packages check for all of these and offer more sophisticated tests and remedies.

[Figures: histogram of residuals (number of months per bin, residuals from -250 to 250); scatter plot of residuals vs. Temperature (F); scatter plot of residuals vs. Time Periods 0-24]
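Two of the checks above, zero-mean residuals and lag-1 autocorrelation, are simple to compute. A sketch on a hand-made residual series chosen to show strong negative autocorrelation:

```python
# Residual diagnostics: mean residual and lag-1 autocorrelation.
# An alternating series is used so the autocorrelation is strongly negative.
residuals = [10.0, -10.0, 10.0, -10.0, 10.0, -10.0]

mean_resid = sum(residuals) / len(residuals)

# Lag-1 autocorrelation: sum(e_t * e_{t-1}) / sum(e_t^2).
num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
den = sum(e ** 2 for e in residuals)
lag1_autocorr = num / den
# A value near -1 or +1 flags a pattern over time; near 0 is what we want.
```

Statistics packages formalize this with tests such as Durbin-Watson, but eyeballing the residual plots plus a quick autocorrelation number catches most problems.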

Modeling Results: Which Is Best?
Fitting the same sales-vs-time data with increasingly flexible models:
- Linear equation: y = 0.4x + 4.5, R² = 0.16
- Quadratic equation: y = 0.5x² - 2.1x + 7, R² = 0.36
- Cubic equation: y = 1.3333x³ - 9.5x² + 20.167x - 7, R² = 1

Avoid over-fitting. The objective is to forecast demand for planning purposes.

[Figures: scatter plots of Sales vs. Time with each fitted curve]

Key Points from Lesson

Key Points

  Y_i = β0 + β1·x_1i + β2·x_2i + ε_i

Regression finds correlations between:
- a single dependent variable (y), and
- one or more independent variables (x1, x2, ...).

The coefficients are estimated by minimizing the sum of the squares of the errors.

Always test your model:
- goodness of fit (R²)
- statistical significance of the coefficients (p-values)

Some warnings:
- Correlation is not causation.
- Avoid over-fitting the data.

Why not use this instead of exponential smoothing?
- All data are treated the same (no weighting of recent observations).
- The amount of data required to be stored.

CTL.SC1x - Supply Chain & Logistics Fundamentals
Questions, Comments, Suggestions? Use the Discussion!
Casey (photo courtesy Yankee Golden Retriever Rescue, www.ygrr.org)
MIT Center for Transportation & Logistics, caplice@mit.edu

Image Sources
- "Pampers packages (2718297564)" by Elizabeth from Lansing, MI, USA; uploaded by Dolovis. Licensed under Creative Commons Attribution 2.0 via Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Pampers_packages_(2718297564).jpg
- "Kraft Coupon" by Julie & Heidi from West Linn & Gillette, USA; tearpad shelf display of coupons for Kraft Macaroni & Cheese. Licensed under Creative Commons Attribution-Share Alike 2.0 via Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Kraft_Coupon.jpg
- "Damaged car door" by Garitzko; own work. Licensed under public domain via Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Damaged_car_door.jpg