Forecasting, introduction

Sales and operations planning (S&OP)
- Forecasting: capture demand data to balance supply with demand and to synchronize all operational plans
- Balance supply, demand, and budgets: determine the best product mix, optimal inventory targets, postponement strategies, and supply plans
- Adapt and adjust to changing business conditions: simulate multiple business scenarios; react quickly to changing conditions through automated exception management
- Integrated solution: drive continuous improvement through integrated performance management
- Sub-tasks: Demand Planning, Advanced Supply Chain Planning, Collaborative Planning, Inventory Optimization, Manufacturing Scheduling, and Global Order Promising

Forecasting: predict demands and resources to schedule production, e.g.
- Sales of computers
- Energy production
- Seats in an airplane
- Motivation: the Beer game

Time series
- A time series measures a quantity of interest over time: X_1, X_2, ..., X_t
- We want to forecast the future values X_{t+1}, X_{t+2}, ...; the predictions are denoted X̂_{t+1}, X̂_{t+2}, ...
- Assume the measurements in the time series are correlated

Motivation: recall the Australian red wine sales.
[Figure: Australian wine sales]

Components of a time series
- Trend (global heating brings more rain)
- Seasonal variation (swim suits sold every spring)
- Cyclical variation (beer consumption increases during the Soccer Championship)
- Irregular variation
[Figure: trend, seasonal variation, and irregular variation]

Seasonal and cyclical variation
- Multiplicative: dependent on current sales (beer consumption increases on hot days)
- Additive: independent of current sales (beer consumption during the Roskilde Festival)
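To make the additive vs. multiplicative distinction concrete, here is a small Python sketch that generates one series of each kind from the same trend and seasonal pattern; all series lengths and parameter values below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(48)                                   # 4 years of monthly data (arbitrary)
trend = 100 + 2.0 * t                               # linear upward trend
season = 1 + 0.3 * np.sin(2 * np.pi * t / 12)       # repeating yearly pattern

# Additive seasonality: the seasonal swing has constant size,
# independent of the current level of the series.
additive = trend + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)

# Multiplicative seasonality: the seasonal swing grows with the level,
# i.e. it depends on current sales.
multiplicative = trend * season + rng.normal(0, 5, t.size)
```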
Quality of a forecast
- Error: ε_t = X̂_t - X_t
- Total error: Σ_{t=1..n} ε_t
- Mean squared error: (1/n) Σ_{t=1..n} ε_t²
- Root mean squared error: sqrt((1/n) Σ_{t=1..n} ε_t²)
- Squared errors punish large errors more than small errors

Models
- Additive model with trend: X_t = L + T_t + S_t + I_t
- Multiplicative model with trend: X_t = (L + T_t)·S_t + I_t
  where L is the level of the series, T_t the trend at time t, S_t the seasonal variation at time t, and I_t the irregular variation at time t

Model selection
- Analyze the problem, use experience and common sense
- Automatic model selection: for all models and all parameters, evaluate the error of the forecast up to time t; use the best model for forecasting future times
- Forecasting is model fitting to time series
[Figure: an example series where exponential smoothing would work well]

Overview
- Moving average, weighted moving average
- First order exponential smoothing
- Second order exponential smoothing
- Trends and seasonal patterns
- Croston's method
- Hyndman's unified framework

Moving average
[Figure: typical behavior of exponential smoothing]
- Given observations X_1, X_2, ..., X_t, the level at time t is L_t = (1/m) Σ_{i=0..m-1} X_{t-i}
- Forecast: X̂_{t+i} = L_t for i = 1, 2, ...
- Advantage of a large m: more averaging, stable forecasts; advantage of a small m: quick reaction to changes
- Average age of the data: about m/2
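A minimal sketch of the moving-average forecast and the RMSE measure defined above; the demand series and the window size m are illustrative assumptions.

```python
import numpy as np

def moving_average_forecast(x, m):
    """One-step-ahead forecasts: the forecast for time t is the mean of the last m observations."""
    x = np.asarray(x, dtype=float)
    forecasts = np.full(x.size, np.nan)
    for t in range(m, x.size):
        forecasts[t] = x[t - m:t].mean()      # L_{t-1} used as the forecast for time t
    return forecasts

def rmse(actual, forecast):
    """Root mean squared error over the points where a forecast exists."""
    e = np.asarray(forecast) - np.asarray(actual, dtype=float)
    e = e[~np.isnan(e)]
    return float(np.sqrt(np.mean(e ** 2)))

demand = [12, 15, 11, 14, 16, 13, 15, 17, 14, 16]   # made-up demand series
f = moving_average_forecast(demand, m=3)
print(f, rmse(demand, f))
```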
Weighted moving average
- Level at time t: L_t = Σ_{i=1..t} W_i·X_i, where W_i is the weight attached to each historic point and W_1 + W_2 + ... + W_t = 1
- Forecast: X̂_{t+i} = L_t for i = 1, 2, ...
- New data get more weight than old data
- All the forecasting schemes in this overview are variants of the weighted moving average

Exponential smoothing
- For data with no trend or seasonal pattern
- L_t = L_{t-1} + α(X_t - L_{t-1}) (correct the mistake of the last forecast)
- Equivalently L_t = αX_t + (1-α)L_{t-1} (weighted average of the last observation and the last forecast)
- Forecast: X̂_{t+i} = L_t for i = 1, 2, ...

Weights on past data
- Substituting L_{t-1} = αX_{t-1} + (1-α)L_{t-2} we get L_t = αX_t + α(1-α)X_{t-1} + (1-α)²L_{t-2}
- Repeating the substitution: L_t = αX_t + α(1-α)X_{t-1} + α(1-α)²X_{t-2} + α(1-α)³X_{t-3} + ...
- The weights decrease exponentially (hence "exponential smoothing")
[Figure: exponential smoothing vs. a moving average with m = 5]
- Appropriate for a mean plus noise, or when the mean is wandering around; quite stable processes
- Note: some authors use 1-α in place of α

[Figures: exponential smoothing applied to zero-mean white noise and to a series with a shifting mean]

Choice of the smoothing parameter α
[Figure: RMSE vs. α]
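A corresponding sketch of first-order exponential smoothing in the error-correction form L_t = L_{t-1} + α(X_t - L_{t-1}); initializing the level with the first observation is a common convention, not the only one, and the series is made up.

```python
import numpy as np

def exponential_smoothing(x, alpha, level0=None):
    """First-order exponential smoothing.
    Returns the level L_t after each observation; the forecast for any
    future period is simply the last level."""
    x = np.asarray(x, dtype=float)
    level = x[0] if level0 is None else level0      # common initialisation: first observation
    levels = []
    for obs in x:
        level = level + alpha * (obs - level)       # correct the mistake of the last forecast
        levels.append(level)
    return np.array(levels)

demand = [12, 15, 11, 14, 16, 13, 15, 17, 14, 16]   # made-up demand series
levels = exponential_smoothing(demand, alpha=0.3)
print("forecast for all future periods:", levels[-1])
```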
Exponential smoothing
[Figures: actual vs. forecast for various α; the series and its forecast using α = 0.9]
- Smoothing parameter α, 0 < α < 1
- Large α: adjusts more quickly to changes
- Small α: more averaging, more stable
- Typically α should be kept fairly small; if the RMSE analysis favours a large α, smoothing is not appropriate

Exponential smoothing on a trend
[Figures: RMSE vs. α for trend data; a series with trend and its single-smoothing forecast]
- Exponential smoothing will lag behind a trend
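The "RMSE vs. α" analysis can be reproduced by scoring one-step-ahead forecasts over a grid of α values and picking the smallest error; the series and the grid below are made up for illustration.

```python
import numpy as np

def ses_one_step_forecasts(x, alpha):
    """One-step-ahead forecasts from first-order exponential smoothing:
    the forecast for X_t is the level after observation X_{t-1}."""
    x = np.asarray(x, dtype=float)
    level = x[0]
    forecasts = np.full(x.size, np.nan)
    for t in range(1, x.size):
        forecasts[t] = level                        # X̂_t = L_{t-1}
        level = level + alpha * (x[t] - level)
    return forecasts

def rmse(actual, forecast):
    e = np.asarray(forecast) - np.asarray(actual, dtype=float)
    e = e[~np.isnan(e)]
    return float(np.sqrt(np.mean(e ** 2)))

demand = [12, 15, 11, 14, 16, 13, 15, 17, 14, 16]   # made-up series
alphas = np.arange(0.1, 1.0, 0.1)
errors = {round(a, 1): rmse(demand, ses_one_step_forecasts(demand, a)) for a in alphas}
best_alpha = min(errors, key=errors.get)
print(errors, "best alpha:", best_alpha)
```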
Exponential smoothing on a trend
- Data have a trend but no seasonal pattern
- Ordinary exponential smoothing, L_t = αX_t + (1-α)L_{t-1}, lags behind the trend

Double exponential smoothing (Holt 1957)
- Level: L_t = αX_t + (1-α)(L_{t-1} + T_{t-1})
- Trend: T_t = β(L_t - L_{t-1}) + (1-β)T_{t-1} (weighted average of the new trend and the trend last time)
- Starting values (at t = 2): T_2 = X_2 - X_1, L_2 = X_2
- Forecast: X̂_{t+i} = L_t + i·T_t
- Smoothing parameters 0 < α < 1 and 0 < β < 1
[Figure: trend data with single and double smoothing]
- Double smoothing can overshoot (dot-com, house prices?)

Exponential smoothing, seasonal pattern
- Data have a trend and a seasonal pattern
[Figures: a series with linear trend and multiplicative seasonal factors; the Australian wine sales]

Multiplicative seasonal series (Winters 1960)
- S_t is the multiplicative seasonal factor at time t; the season length is s
- Model: X_t = (L + T_t)·S_t + I_t
- Level: L_t = α·X_t/S_{t-s} + (1-α)(L_{t-1} + T_{t-1})
- Trend: T_t = β(L_t - L_{t-1}) + (1-β)T_{t-1}
- Seasonal factor: S_t = γ·X_t/(L_{t-1} + T_{t-1}) + (1-γ)S_{t-s}
- Forecast: X̂_{t+i} = (L_t + i·T_t)·S_{t+i-s} for i = 1, ..., s; (L_t + i·T_t)·S_{t+i-2s} for i = s+1, ..., 2s; and so on
- Smoothing parameters 0 < α < 1, 0 < β < 1, and 0 < γ < 1
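A sketch of Winters' multiplicative method as written above (the seasonal update divides by L_{t-1} + T_{t-1}; some texts divide by the current level instead). Dropping the seasonal factors and the division by S_{t-s} reduces this to Holt's double exponential smoothing. The initialisation and the quarterly example series are assumptions for illustration.

```python
import numpy as np

def holt_winters_multiplicative(x, s, alpha, beta, gamma, horizon):
    """Winters' multiplicative method: level L, trend T, seasonal factors S,
    season length s. Start values are one simple choice among many."""
    x = np.asarray(x, dtype=float)
    level = x[:s].mean()                              # initial level: first-season average
    trend = (x[s:2 * s].mean() - x[:s].mean()) / s    # initial trend: growth per period
    seasonal = list(x[:s] / x[:s].mean())             # initial seasonal factors

    for t in range(s, x.size):
        prev_level, prev_trend = level, trend
        level = alpha * x[t] / seasonal[t - s] + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal.append(gamma * x[t] / (prev_level + prev_trend)
                        + (1 - gamma) * seasonal[t - s])

    # X̂_{t+i} = (L_t + i*T_t) * S_{t+i-s}, reusing the latest factor for that season.
    return [(level + i * trend) * seasonal[len(seasonal) - s + (i - 1) % s]
            for i in range(1, horizon + 1)]

# Made-up quarterly series with trend and seasonality (s = 4).
x = [10, 14, 8, 12, 12, 17, 10, 14, 14, 20, 12, 17]
print(holt_winters_multiplicative(x, s=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4))
```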
Exponential smoothing, seasonal pattern (additive)
- Additive seasonal series: X_t = L + T_t + S_t + I_t
- Level: L_t = α(X_t - S_{t-s}) + (1-α)(L_{t-1} + T_{t-1})
- Trend: T_t = β(L_t - L_{t-1}) + (1-β)T_{t-1}
- Seasonal factor: S_t = γ(X_t - L_{t-1} - T_{t-1}) + (1-γ)S_{t-s}
- Forecast: X̂_{t+i} = L_t + i·T_t + S_{t+i-s} for i = 1, ..., s; L_t + i·T_t + S_{t+i-2s} for i = s+1, ..., 2s; and so on

Croston's method
- For intermittent demand, which arises with:
  - small quantities (e.g. sales of cars)
  - fine-grained time series (e.g. automatic data collection)
  - orders in huge quantities (e.g. containers of beer)
[Figures: a demand distribution with many zeros; an intermittent series; exponential smoothing applied to it]

Croston's method
- Keep track of the time between non-zero demands and of the demand size when it is non-zero
- Smooth both the time between demands and the demand size, and combine them for the forecast
- Definitions:
  X_t: demand at time t
  X̂_t: predicted demand at time t
  Z_t: estimate of the demand size when it is not zero
  T_t: estimate of the time between non-zero demands
  q: time since the last non-zero demand
- With plain exponential smoothing the forecast is highest right after a non-zero demand and lowest right before the next non-zero demand
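To see the behaviour just described, the sketch below applies plain exponential smoothing to a made-up intermittent series: the smoothed forecast jumps right after every non-zero demand and then decays until the next one. Series, probabilities, and α are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up intermittent demand: mostly zeros, occasional demands of size 2-4.
demand = np.where(rng.random(40) < 0.2, rng.integers(2, 5, 40), 0).astype(float)

alpha, level = 0.3, demand.mean()
forecast = []
for x in demand:
    forecast.append(level)                     # forecast made before seeing x
    level = alpha * x + (1 - alpha) * level    # plain exponential smoothing

for x, f in zip(demand, forecast):
    print(f"demand {x:4.1f}   forecast {f:5.2f}")
```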
Croston's method
- Update (if zero demand at time t):
    Z_t = Z_{t-1}
    T_t = T_{t-1}
    q = q + 1
- Update (if non-zero demand at time t):
    Z_t = αX_t + (1-α)Z_{t-1}
    T_t = αq + (1-α)T_{t-1}
    q = 1
- Forecast: X̂_t = Z_t / T_t
[Figure: the intermittent series again, with Croston's method applied]
(A short code sketch of Croston's method follows after the Hyndman overview below.)

Hyndman et al. (2002): a classification of exponential smoothing methods

  Trend component        Seasonal component
                         N (none)   A (additive)   M (multiplicative)
  N (none)               NN         NA             NM
  A (additive)           AN         AA             AM
  M (multiplicative)     MN         MA             MM
  D (damped)             DN         DA             DM

- Damped: the trend is damped over long horizons
- NN: simple exponential smoothing; AN: Holt's linear method; AA: Holt-Winters (additive); AM: Holt-Winters (multiplicative)

Terminology
- Y_t: observed value at time t (was X_t)
- Ŷ_t(h): forecast h steps ahead made at time t (Ŷ_t(1) was X̂_{t+1})
- Level l_t (was L_t), trend b_t (was T_t), season s_t (was S_t), season length m (was s)

General exponential smoothing recursions
- Level: l_t = αP_t + (1-α)Q_t
- Trend: b_t = βR_t + (φ-β)b_{t-1}
- Seasonal: s_t = γT_t + (1-γ)s_{t-m}
- The quantities P_t, Q_t, R_t, T_t vary with the method (Table 1 below); smoothing parameters α, β, γ; damping parameter φ
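Returning to Croston's method: below is a minimal sketch of the update rules given above. The initial values of Z and T (first non-zero demand and the average inter-demand interval) are pragmatic assumptions, not prescribed by the slides.

```python
import numpy as np

def croston(demand, alpha=0.2):
    """Croston's method: smooth the non-zero demand size Z and the
    interval T between non-zero demands; the forecast per period is Z/T."""
    demand = np.asarray(demand, dtype=float)
    nonzero = np.flatnonzero(demand)
    if nonzero.size == 0:
        return np.zeros(demand.size)

    # Pragmatic initial estimates (an assumption, not from the slides).
    Z = demand[nonzero[0]]
    T = max(1.0, demand.size / nonzero.size)
    q = 1
    forecast = np.zeros(demand.size)
    for t, x in enumerate(demand):
        if x == 0:
            q += 1                                   # nothing happened, just count time
        else:
            Z = alpha * x + (1 - alpha) * Z          # smooth the demand size
            T = alpha * q + (1 - alpha) * T          # smooth the inter-demand time
            q = 1
        forecast[t] = Z / T                          # expected demand per period
    return forecast

demand = [0, 0, 3, 0, 0, 0, 4, 0, 2, 0, 0, 5, 0, 0]  # made-up intermittent series
print(np.round(croston(demand), 2))
```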
Hyndman: forecasting based on state space models for exponential smoothing

Table 1: Formulae for recursive calculations and point forecasts (h is the forecast horizon).
The seasonal component determines P_t and T_t:
- N (none): P_t = Y_t
- A (additive): P_t = Y_t - s_{t-m}, T_t = Y_t - Q_t
- M (multiplicative): P_t = Y_t / s_{t-m}, T_t = Y_t / Q_t
The trend component determines Q_t, R_t, φ and the point forecast Ŷ_t(h); the seasonal term s_{t+h-m} is added for additive seasonality and multiplied in for multiplicative seasonality:
- N (none): Q_t = l_{t-1}; φ = 1; Ŷ_t(h) = l_t
- A (additive): Q_t = l_{t-1} + b_{t-1}; R_t = l_t - l_{t-1}; φ = 1; Ŷ_t(h) = l_t + h·b_t
- M (multiplicative): Q_t = l_{t-1}·b_{t-1}; R_t = l_t / l_{t-1}; φ = 1; Ŷ_t(h) = l_t·b_t^h
- D (damped): Q_t = l_{t-1} + b_{t-1}; R_t = l_t - l_{t-1}; 0 < β < φ < 1; Ŷ_t(h) = l_t + (1 + φ + ... + φ^{h-1})·b_t

Table 2: State space equations for each additive error model in the classification (multiplicative error models are obtained by replacing ε_t by μ_t·ε_t).
Trend N (none):
- seasonal N: μ_t = l_{t-1}; l_t = l_{t-1} + αε_t
- seasonal A: μ_t = l_{t-1} + s_{t-m}; l_t = l_{t-1} + αε_t; s_t = s_{t-m} + γε_t
- seasonal M: μ_t = l_{t-1}·s_{t-m}; l_t = l_{t-1} + αε_t/s_{t-m}; s_t = s_{t-m} + γε_t/l_{t-1}
Trend A (additive):
- seasonal N: μ_t = l_{t-1} + b_{t-1}; l_t = l_{t-1} + b_{t-1} + αε_t; b_t = b_{t-1} + αβε_t
- seasonal A: μ_t = l_{t-1} + b_{t-1} + s_{t-m}; l_t = l_{t-1} + b_{t-1} + αε_t; b_t = b_{t-1} + αβε_t; s_t = s_{t-m} + γε_t
- seasonal M: μ_t = (l_{t-1} + b_{t-1})·s_{t-m}; l_t = l_{t-1} + b_{t-1} + αε_t/s_{t-m}; b_t = b_{t-1} + αβε_t/s_{t-m}; s_t = s_{t-m} + γε_t/(l_{t-1} + b_{t-1})
Trend M (multiplicative):
- seasonal N: μ_t = l_{t-1}·b_{t-1}; l_t = l_{t-1}·b_{t-1} + αε_t; b_t = b_{t-1} + αβε_t/l_{t-1}
- seasonal A: μ_t = l_{t-1}·b_{t-1} + s_{t-m}; l_t = l_{t-1}·b_{t-1} + αε_t; b_t = b_{t-1} + αβε_t/l_{t-1}; s_t = s_{t-m} + γε_t
- seasonal M: μ_t = l_{t-1}·b_{t-1}·s_{t-m}; l_t = l_{t-1}·b_{t-1} + αε_t/s_{t-m}; b_t = b_{t-1} + αβε_t/(s_{t-m}·l_{t-1}); s_t = s_{t-m} + γε_t/(l_{t-1}·b_{t-1})
Trend D (damped):
- seasonal N: μ_t = l_{t-1} + b_{t-1}; l_t = l_{t-1} + b_{t-1} + αε_t; b_t = φb_{t-1} + αβε_t
- seasonal A: μ_t = l_{t-1} + b_{t-1} + s_{t-m}; l_t = l_{t-1} + b_{t-1} + αε_t; b_t = φb_{t-1} + αβε_t; s_t = s_{t-m} + γε_t
- seasonal M: μ_t = (l_{t-1} + b_{t-1})·s_{t-m}; l_t = l_{t-1} + b_{t-1} + αε_t/s_{t-m}; b_t = φb_{t-1} + αβε_t/s_{t-m}; s_t = s_{t-m} + γε_t/(l_{t-1} + b_{t-1})

State space models
Writing the recursions in their error-correction form we obtain
  l_t = Q_t + α(P_t - Q_t)
  b_t = φb_{t-1} + β(R_t - b_{t-1})
  s_t = s_{t-m} + γ(T_t - s_{t-m})
The method with fixed level (constant over time) is obtained by setting α = 0, the method with fixed trend (drift) by setting β = 0, and the method with fixed seasonal pattern by setting γ = 0. Note also that the additive trend methods are obtained by letting φ = 1 in the damped trend methods.

Two variants of the state space model (additive error, multiplicative error)
HKSG describe the state space models that underlie the exponential smoothing methods. For each method there are two models: a model with additive errors and a model with multiplicative errors. The point forecasts of the two models are identical, but the prediction intervals differ.

Likelihood functions
The general model involves a state vector x_t = (l_t, b_t, s_t, s_{t-1}, ..., s_{t-(m-1)}) and state space equations of the form
  Y_t = μ_t + k(x_{t-1})·ε_t
  x_t = f(x_{t-1}) + g(x_{t-1})·ε_t
where {ε_t} is a Gaussian white noise process with mean zero and variance σ², and μ_t = Ŷ_{t-1}(1).

Model selection
- Automatic model selection, parameter optimization
- Applied to the M competition data
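As a toy illustration of automatic model selection: fit several candidate methods to the history, score each by its one-step-ahead RMSE, and keep the best. The real HKSG procedure optimizes the smoothing parameters and selects among the fitted state space models with an information criterion rather than the raw RMSE used here; the candidates, fixed parameters, and series below are assumptions for illustration.

```python
import numpy as np

def one_step_rmse(x, forecast_fn):
    """RMSE of one-step-ahead forecasts produced by forecast_fn(history)."""
    x = np.asarray(x, dtype=float)
    errors = [forecast_fn(x[:t]) - x[t] for t in range(2, x.size)]
    return float(np.sqrt(np.mean(np.square(errors))))

# Candidate methods, each mapping a history to a one-step forecast.
def naive(h):
    return h[-1]

def mean5(h):
    return h[-5:].mean()

def ses(h, alpha=0.3):
    level = h[0]
    for obs in h[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

def holt(h, alpha=0.3, beta=0.1):
    level, trend = h[0], h[1] - h[0]
    for obs in h[1:]:
        prev = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return level + trend

series = np.array([10, 12, 13, 15, 14, 17, 19, 20, 22, 21, 24, 26], dtype=float)
candidates = {"naive": naive, "moving average": mean5, "SES": ses, "Holt": holt}
scores = {name: one_step_rmse(series, fn) for name, fn in candidates.items()}
print(scores, "-> selected:", min(scores, key=scores.get))
```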
The model with additive errors has k(x_{t-1}) = 1, so that Y_t = μ_t + ε_t. The model with multiplicative errors has k(x_{t-1}) = μ_t, so that Y_t = μ_t(1 + ε_t). Thus ε_t = (Y_t - μ_t)/μ_t is a relative error for the multiplicative model. All the methods in Table 1 can be written in the form of the two state space equations above; the underlying equations are given in Table 2. The models are not unique: any value of k(x_{t-1}) gives the same point forecasts.

Forecasting is model fitting to time series
- Linear trend (regression)
- Linear trend and additive seasonality
- Linear trend and multiplicative seasonality
- Polynomial, logarithmic, exponential trend

Frequency identification
- All forecasting methods assume the season length is known
- Identify the seasonality of the time series

Advanced methods
- Stochastic models
- Likelihood calculations
- Prediction intervals
- Procedures for model selection

Neural networks
- Trained for model selection
- Smoothing parameters

De-noising
- Removing noise from a time series improves the forecast (a small sketch follows below)
- Haar wavelet transformation
- Daubechies wavelet transformation
- Fourier transformation?
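A small illustration of the de-noising idea using a Fourier low-pass filter; the wavelet versions follow the same pattern (transform, shrink or zero the small high-frequency coefficients, transform back). The synthetic series and the cut-off are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(96)
signal = 50 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)   # trend + seasonality
noisy = signal + rng.normal(0, 4, t.size)

# De-noise with a Fourier low-pass filter: keep only the lowest frequencies.
coeffs = np.fft.rfft(noisy)
cutoff = 10                        # number of low-frequency coefficients kept (arbitrary)
coeffs[cutoff:] = 0
denoised = np.fft.irfft(coeffs, n=t.size)

print("RMS noise before:", np.sqrt(np.mean((noisy - signal) ** 2)))
print("RMS noise after: ", np.sqrt(np.mean((denoised - signal) ** 2)))
```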
Forecasting is important
- Forecasts of sales
- Sales prices of houses and flats
- Water level in lakes
- Share prices

Discussion
- Which forecast method should we use in the Beer game? X_t = {4, 4, 4, 4, 8, 8, 8, ...}
- Stochastic programming vs. deterministic optimization (using forecasts)

Data sets
- The famous M, M2, and M3 competitions: http://mktg-sun.wharton.upenn.edu/forecast/m-competition.html
- The M3 data contain 645 annual series, 756 quarterly series, 1428 monthly series, and 174 other series

Acknowledgements
- Figures from the Holt-Winters file archive: http://www.barbich.net/holt/