APPLICATION OF THE VARMA MODEL FOR SALES FORECAST: CASE OF URMIA GRAY CEMENT FACTORY

DOI: 10.2478/tjeb-2014-0005

Ramin Bashir KHODAPARASTI 1, Samad MOSLEHI 2

To forecast sales as reliably as possible is one of the most important issues in every business trade. Therefore, in recent years different models have been suggested to deal with this issue. One efficient model is the time series model. This study applies a multivariate time series model to forecast Urmia Gray Cement Factory's sales volume and more importantly, to propose an effective model to be used by other cement factories to predict their sales volume. The two independent variables of costs and revenues and the dependent variable of sales were used in the present study. Results of the study indicated the two independent variables had a positive and direct relationship with sales volume forecast.

Keywords: Forecasting Models, Autoregressive Process.
JEL Classification: F17, M11.

1 Assistant Professor, Urmia University, Iran.
2 MA Student, Payam Noor University, Iran.

1. Introduction

Increasing sales volume has always been one of the most important issues in every market and economy, because the economic growth of a society or country depends directly on its production activities. As a result, the needs of society on the one hand and the right amount of production on the other have always been of great interest to both producers and economic policy-makers. To balance supply and demand, many planning decisions have been made in different countries. One of these concerns the prediction of sales volume, which is an essential means for putting new products on the market, planning production volume, determining the necessary stock, and developing a favourable distribution system.

Sales forecasting is one of the challenging issues in production and marketing. Overestimation or underestimation of sales volume can prove extremely costly. On the one hand, overestimation of sales volume can lead the firm to overinvest in production and stock, which ties up capital and reduces profit. On the other hand, underestimation of demand and sales volume can also cause large losses: it leads the company to limit its dynamism, its investment, and its plans for future activities. When this happens, the business may lose potential opportunities and customers, and may dissatisfy existing customers because it cannot meet their demand. Such an outcome can be catastrophic. For this reason, sales forecasting requires due care and attention, and it can be carried out in different ways.

The first step in sales volume forecasting is an accurate estimation of the market stock. The market stock depends on the existing customers.
The existing customers share three characteristics: first, they are highly interested in the goods produced by the target company; second, they have adequate income; and third, they have easy access to the target market. The second step is the estimation of total market demand. Total market demand is not a fixed number; rather, it is a function of certain conditions, such as the level of marketing activity and the general state of the market. In general, there are two main approaches to sales forecasting: a priori (theory-driven) and a posteriori (data-driven). The sales forecasting in the present study was based on a data-driven approach.

2. Literature Review

There are various existing studies on the application of time series models to forecast sales. For instance, Mallin et al. (2010) studied the relationship between trust and sales control in order to develop and test an argument linking informational uncertainty to the development of managerial trust in the salesperson. They hypothesized that shared goals and length of attachment reduce uncertainty, which in turn promotes managerial trust in the salesperson. They also hypothesized that sales control would have a negative moderating effect on those uncertainty-trust relationships. They collected data from 100
sales managers to measure their sales control strategies, degree of trust, goal congruence, and relationship tenure with three of their salespeople, and analyzed the data using ordinary least squares regression. The results of their study indicated a direct and positive relationship between lower uncertainty and managerial trust in the salesperson. Furthermore, the results confirmed that sales control had a negative moderating effect on those relationships.

In another study, Curtis et al. (2013) proposed a sales forecasting model and tested it on a sample of firms in the retail industry. Their model distinguished between sales growth due to an increase in the number of sales-generating units (e.g. opening new stores) and growth due to an increase in the sales rate at existing units (e.g. the comparable store growth rate). The model accommodated different trends in sales, allowing new stores to earn more or less than existing stores, perhaps because new stores either take a long time to reach maturity or enjoy an early fad status. The model used only a few years of firm-specific, publicly available information, yet it generated in-sample forecast errors of less than two percent of sales and out-of-sample forecasts that were almost as accurate as analyst revenue forecasts. When used together with analyst forecasts, it produced a modified forecast that was significantly more accurate than the analyst forecast alone.

Frees and Miller (2004) showed how to forecast sales using a class of linear mixed longitudinal, or panel data, models. They argued that the forecasts are derived as special cases of best linear unbiased predictors (BLUPs) and hence are optimal predictors of future realizations of the response.
Their study showed that the BLUP forecast arises from three components: (1) a predictor based on the conditional mean of the response, (2) a component due to time-varying coefficients, and (3) a serial correlation term. The forecasting techniques, they argued, are applicable in a wide variety of settings. They discussed forecasting in the context of marketing and sales. In particular, they considered a data set from the Wisconsin State Lottery in which 40 weeks of sales were available for each of 50 postal codes. Using sales data as well as economic and demographic characteristics of each postal code, they forecast sales for each postal code.

Box and Jenkins (1976) formalized the ARIMA modeling framework by defining three steps to be carried out in the analysis: identify the model, estimate the coefficients, and verify the model. In the identification stage, one uses the identify statement to specify the response series and identify candidate ARIMA models for it. The identify statement reads the time series that are to be used in later statements, possibly differencing them, and computes autocorrelations, inverse autocorrelations, partial autocorrelations, and cross-correlations. Stationarity tests can be performed to determine whether differencing is necessary. In the estimation and diagnostic checking stage, one uses the estimate statement to specify the ARIMA model to fit to the variable specified in the previous identify statement and to
estimate the parameters of that model. In the forecasting stage, one uses the forecast statement to forecast future values of the time series and to generate confidence intervals for these forecasts from the ARIMA model produced by the preceding estimate statement.

Furthermore, Musazadehgan and Shahrabi (2008) studied the distribution network for clothing and apparel products in Iran. They argued that this network needs an efficient management system in order to minimize costs and maximize customer satisfaction. To respond to such a fluctuating market, distributors are forced to use sales forecasting models. However, applying such models to clothing and apparel products is very complex because of several limitations, including the very large number of such products on the market, especially new ones, and their short life cycles. There are various sales forecasting models, whose efficiency depends on the goals of their application and on the users' experience and horizons. In their study, the authors suggested two forecasting models with two different approaches, both of which combined clustering and classification to improve the sales mechanism. Both models were built on the same data set. In the first model, the K-means clustering algorithm and the C4.5 decision-tree algorithm were used together. In the second, because of the uncertainty of the relationship between sales and the descriptive characteristics of products, and because of its complexity, neural networks were used to cluster and classify the data. Finally, the sales forecasting profile was developed using these two models. It indicated that, contrary to expectations, the decision-tree algorithm provided better results.
In comparison with the models discussed above, the VARMA model has the advantage of handling the three variables of costs, revenues, and sales simultaneously. The model can estimate not only the impact of each variable on another individually, but also the joint impact of costs and revenues on the third variable, i.e. sales. Since time was a key factor in the present study, a strictly stationary autoregressive process was used to estimate the trend of all three variables over a given period of time.

3. Methods

To collect the necessary data for the present study, Urmia Gray Cement Factory was randomly selected from among the existing factories in Urmia, Iran. Data on the factory's sales quantities, revenues, and costs were collected at three-month intervals for the years 1997-2012. The sales quantity of the factory depends on the two variables of revenues and costs, whose increase or decrease can affect it. All of the data collected from the factory were non-negative integers. On the basis of the collected data, an estimate of the dependent variable of sales for the following four years
was forecast. In this estimate, the two independent variables of costs and revenues were also forecast. To analyze the collected data, a multivariate time series forecasting model, which includes the three vectors of sales, costs, and revenues, was used. Appropriate transformations were applied in an attempt to minimize the measurement error of the model as far as possible. The parameters were estimated and substituted into the time series model, and sales were forecast for the subsequent time spans using the time series equation (Hannan, 1970).

4. Research Model and Results

In this section, the autoregressive model is explained first, followed by the moving average model. Finally, it is shown how these two models are combined into the autoregressive moving average model.

Yule (1927) originally proposed the stationary autoregressive process of order $p$, AR($p$), which later became known as the Markov process and is used to describe phenomena such as the number of sunspots or the behaviour of a simple pendulum. In the autoregressive representation of a process, if only a finite number of $\pi$ weights are nonzero, i.e., $\pi_1 = \phi_1$, $\pi_2 = \phi_2$, ..., $\pi_p = \phi_p$, and $\pi_k = 0$ for $k > p$, then the resulting process is said to be an autoregressive process (model) of order $p$, which is denoted as AR($p$). It is given by:

$$Z_t = \phi_1 Z_{t-1} + \cdots + \phi_p Z_{t-p} + a_t \tag{1}$$

or

$$\phi_p(B) Z_t = a_t \tag{2}$$

where

$$\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p \tag{3}$$

Here $B$ is the backshift operator, $B Z_t = Z_{t-1}$, and $a_t$ is a random shock. Since $\sum_{j=1}^{\infty} |\pi_j| = \sum_{j=1}^{p} |\phi_j| < \infty$, the process is always invertible. To be stationary, the roots of $\phi_p(B) = 0$ must lie outside the unit circle. AR processes are useful in describing situations in which the present value of a time series depends on its preceding values plus a random shock.

Slutsky (1927) developed the random moving average process of order $q$, MA($q$). In the moving average representation of a process, if only a finite number of $\psi$ weights are nonzero, i.e., $\psi_1 = -\theta_1$, $\psi_2 = -\theta_2$, ..., $\psi_q = -\theta_q$, and $\psi_k = 0$ for $k > q$, then the resulting process is
said to be a moving average process or model of order $q$ and is denoted as MA($q$). It is given by:

$$Z_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \tag{4}$$

or

$$Z_t = \theta_q(B) a_t \tag{5}$$

where

$$\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q \tag{6}$$

Since $1 + \theta_1^2 + \cdots + \theta_q^2 < \infty$, a finite moving average process is always stationary. This moving average process is invertible if the roots of $\theta_q(B) = 0$ lie outside the unit circle. Moving average processes are useful in describing phenomena in which events produce an immediate effect that lasts only for a short period of time.

Specifically, a stationary and invertible process can be represented as an autoregressive moving average (ARMA) model. The problem shared by the pure autoregressive and pure moving average representations is that they may require many parameters. A high-order model may be needed to obtain a good approximation, but in general a high order reduces the efficiency of estimation. Hence, when a model is built, it may be necessary to include both autoregressive and moving average terms, which leads to the mixed autoregressive moving average (ARMA) process (see equation 7).

$$\phi_p(B) Z_t = \theta_q(B) a_t \tag{7}$$

where the coefficients $\phi_i$ and $\theta_j$ are the parameters of the model, which must be estimated, and $a_t$ is a purely random process that is normally distributed with mean zero and variance $\sigma_a^2$.

In many studies the time series data are multivariate; for instance, when the sales process is studied, costs, prices, revenues, and other variables that can influence sales are taken into consideration. Therefore, the vector form of the autoregressive moving average (ARMA) model is applied to the analysis of multivariate time series data, especially for forecasting purposes. Let $Z_t = [Z_{1,t}, Z_{2,t}, \ldots, Z_{m,t}]'$ be an $m$-dimensional stationary real-valued vector process; then the stationary vector autoregressive moving average model, VARMA($p$, $q$), can be written as the following equation:
$$\Phi_p(B) Z_t = \Theta_q(B) a_t \tag{8}$$

$$\Phi_p(B) = \Phi_0 - \Phi_1 B - \cdots - \Phi_p B^p \tag{9}$$

$$\Theta_q(B) = \Theta_0 - \Theta_1 B - \cdots - \Theta_q B^q \tag{10}$$

where $\Phi_i$ and $\Theta_j$ are the parameter matrices of the vector model, which are estimated by least squares (Brockwell and Davis, 1987), and $a_t$ is a purely random vector process that is normally distributed with mean vector zero and covariance matrix $\Sigma$.

5. Identifying VARMA (p,q) Model

To identify the vector model, the series are suitably transformed and differenced, and the autocorrelation (ACF) and partial autocorrelation (PACF) matrix functions are used (Tiao and Box, 1981). If the ACF decays exponentially or sinusoidally while the PACF cuts off after lag $p$, the series indicates an autoregressive model of order $p$. Conversely, if the ACF cuts off after lag $q$ while the PACF decays exponentially or sinusoidally, the series indicates a moving average process of order $q$. To study time series models and their effects, the Box-Tiao approach to the analysis and modeling of time series can be used. In this study, this approach was used to analyze the costs and sales data collected for three-month time spans during 1997-2012. Figure 1 shows that the level of the series follows a systematic trend over time; that is, the series is non-stationary.

Figure 1. Sales, Income, Costs Raw Data
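The identification rule described in this section (for an AR($p$) process the ACF tails off while the PACF cuts off after lag $p$) can be illustrated with a small simulation. The following is a minimal sketch and not the authors' procedure: it assumes a univariate AR(1) series with a hypothetical coefficient of 0.6, and the helper names `sample_acf` and `sample_pacf` are illustrative. The partial autocorrelations are obtained with the Durbin-Levinson recursion.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)  # coefficients of the order-(k-1) autoregression
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = np.array([phi_kk])
        else:
            num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, r[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
        phi_prev = phi
    return pacf


# Hypothetical AR(1) series: Z_t = 0.6 * Z_{t-1} + a_t (not the factory data)
rng = np.random.default_rng(0)
n, phi1 = 5000, 0.6
a = rng.standard_normal(n)
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi1 * z[t - 1] + a[t]

acf = sample_acf(z, 5)
pacf = sample_pacf(z, 5)
print("ACF :", np.round(acf, 2))   # decays gradually with the lag
print("PACF:", np.round(pacf, 2))  # close to zero beyond lag 1
```

For a moving average series the roles are reversed: the ACF would cut off and the PACF would tail off.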

Using the AR(1) model, the ACF and PACF of both the dependent and independent variables for 16 lags are shown in Figure 2. Since these series are all non-stationary, differencing should be applied.

Figure 2. ACF and PACF of the series
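As a rough illustration of why differencing is applied to a trending series, the sketch below builds a hypothetical quarterly series (a random walk with drift, not the factory's data) and compares the lag-1 autocorrelation before and after first differencing, $(1 - B)Z_t = Z_t - Z_{t-1}$. The series length of 64 assumes 16 years of quarterly observations; the helper name `lag1_autocorr` is illustrative.

```python
import numpy as np


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


rng = np.random.default_rng(1)
n_quarters = 64  # assumption: 16 years x 4 quarters
shocks = rng.standard_normal(n_quarters)

# Hypothetical trending series: random walk with a drift of 2 per quarter
level = 100.0 + np.cumsum(2.0 + shocks)

# First difference, (1 - B) Z_t; one observation is lost
first_diff = np.diff(level)

r1_level = lag1_autocorr(level)
r1_diff = lag1_autocorr(first_diff)
print(f"lag-1 autocorrelation, levels:      {r1_level:.2f}")
print(f"lag-1 autocorrelation, differences: {r1_diff:.2f}")
```

A lag-1 autocorrelation close to one in the levels, dropping towards zero after differencing, is the pattern that motivates working with the first-differenced series.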

Figure 3 shows that after first-order differencing the series are stationary, so that all three variables can be used in the VARMA($p$, $q$) model.

Figure 3. Sales, Income, Costs First Difference

Figure 4 shows that, for the sales series over 16 lags, the autocorrelation at lags 2, 4, 6, ... exceeds the confidence limits of ±0.277. Furthermore, in the PACF, the partial autocorrelation at lags 2 and 8 exceeds the confidence limits of ±0.277. These graphs therefore suggest an autoregressive model of order one or two, which needs to be re-examined after modeling. In addition, the cross-correlation function (CCF) of the costs and revenues series is shown in the graph, from which the correlation of the sales-costs and sales-revenues relationships can be estimated; these estimates can be used for the time series parameters. The costs-revenues ACF and PACF graphs show that the values at certain lags exceed the confidence limits. The model whose ACF and PACF values stay between the upper and lower confidence limits can be considered the most reliable model for the series.
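The comparison of cross-correlations with a confidence limit can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the two series are simulated (one is a noisy copy of the other delayed by one period), and the limit is taken to be the common large-sample approximation $\pm 2/\sqrt{n}$. The source does not state how its ±0.277 was derived; arithmetically, $2/\sqrt{52} \approx 0.277$, but whether that sample size applies here is an assumption. The names `cross_corr` and `approx_bound` are illustrative.

```python
import numpy as np


def cross_corr(x, y, lag):
    """Sample cross-correlation between x_t and y_{t+lag}, for lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    cov = np.dot(xc[: n - lag], yc[lag:]) / n
    return float(cov / (x.std() * y.std()))


def approx_bound(n):
    """Approximate two-standard-error limit, 2 / sqrt(n)."""
    return 2.0 / np.sqrt(n)


# Simulated example: y follows x with a delay of one period plus noise
rng = np.random.default_rng(2)
n = 200
x = rng.standard_normal(n)
noise = rng.standard_normal(n)
y = np.empty(n)
y[0] = 0.5 * noise[0]
y[1:] = x[:-1] + 0.5 * noise[1:]

bound = approx_bound(n)
ccf = {k: cross_corr(x, y, k) for k in range(5)}
significant = [k for k, r in ccf.items() if abs(r) > bound]

print(f"limit: +/-{bound:.3f}")
print("lags outside the limits:", significant)
```

Lags whose cross-correlation falls outside the limits are candidates for inclusion in the model; lags inside the limits are treated as indistinguishable from zero.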

Figure 4. ACF, PACF, and CCF of sales with costs and revenues

Considering the behaviour of the ACF, PACF, and CCF and the parameter estimates presented in Table 1, the sales model for each of the costs and revenues series is obtained.
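Parameter estimates and standard errors of the kind reported for the vector model can be obtained by least squares. The sketch below is a minimal illustration and not the authors' code: it assumes a first-order vector autoregression without intercept, $W_t = \Phi_1 W_{t-1} + a_t$, fitted to a simulated three-variable series whose true coefficient matrix `phi_true` is hypothetical. The function name `fit_var1` is illustrative.

```python
import numpy as np


def fit_var1(w):
    """Least-squares fit of W_t = Phi W_{t-1} + a_t (no intercept).

    Returns the estimated coefficient matrix and a matrix of standard
    errors with the same layout (row = equation, column = lagged variable).
    """
    w = np.asarray(w, dtype=float)
    x, y = w[:-1], w[1:]                           # regressors and responses
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)   # solves x @ coef ~ y
    phi_hat = coef.T
    resid = y - x @ coef
    dof = x.shape[0] - x.shape[1]
    sigma = resid.T @ resid / dof                  # residual covariance matrix
    xtx_inv = np.linalg.inv(x.T @ x)
    std_err = np.sqrt(np.outer(np.diag(sigma), np.diag(xtx_inv)))
    return phi_hat, std_err


# Simulated data with a hypothetical, stable coefficient matrix
rng = np.random.default_rng(3)
phi_true = np.array([
    [0.5, 0.1, 0.0],
    [0.0, 0.3, 0.2],
    [0.0, 0.1, 0.4],
])
n = 2000
w = np.zeros((n, 3))
for t in range(1, n):
    w[t] = phi_true @ w[t - 1] + rng.standard_normal(3)

phi_hat, std_err = fit_var1(w)
print(np.round(phi_hat, 3))
print(np.round(std_err, 3))
```

In practice the fit would be applied to the differenced sales, revenues, and costs series, and a constant term could be added if the differenced series do not have zero mean.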

Table 1. Equation for general process

Parameter      Estimate   Std. Err.
$\phi_{11}$     0.101      0.023
$\phi_{12}$    -0.041      0.098
$\phi_{13}$     0.034      0.107
$\phi_{21}$    -0.298      0.213
$\phi_{22}$     0.092      0.084
$\phi_{23}$     0.077      0.086
$\phi_{31}$     0.314      0.541
$\phi_{32}$     0.069      0.860
$\phi_{33}$     0.012      0.071
$\alpha = 0.05$

In the first stage, a full model with all parameters is formed, as presented in Table 1. In the second stage, the parameters whose estimates are less than twice their estimated standard error (SE), i.e. $2\,\mathrm{SE} = \pm 0.277$, are set to zero in order to obtain the best possible model. Table 2 presents the resulting parameter estimates and their SEs.

Table 2. Equation for end process

Parameter      Estimate   Std. Err.
$\phi_{11}$     0.101      0.090
$\phi_{12}$    -0.037      0.072
$\phi_{13}$     0.041      0.099
$\phi_{21}$     0          0
$\phi_{22}$     0.091      0.074
$\phi_{23}$     0.073      0.090
$\phi_{31}$     0          0
$\phi_{32}$     0.067      0.880
$\phi_{33}$     0.018      0.064
$\alpha = 0.05$

In equation 11, $Z_{1,t}$, $Z_{2,t}$, and $Z_{3,t}$ stand for sales, revenues, and costs, respectively. Using the vector model represented in equation 11, sales quantities for the subsequent years for Urmia Gray Cement Factory can be forecast. Figure 5 presents the ACF and PACF results for the revenues and costs series, which are similar to those that had already been forecast. These results were obtained after the modeling.
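How forecasts follow from the fitted vector model can be sketched as below. This is an illustration under explicit assumptions: equation 11 is read as a first-order vector autoregression on the first-differenced series, $(I - \Phi_1 B)(1 - B) Z_t = a_t$, with $\Phi_1$ filled row by row from the Table 2 estimates; the last two observed levels are hypothetical numbers, not factory data; and the function name `forecast_levels` is illustrative. Future shocks are replaced by their expected value of zero.

```python
import numpy as np

# Lag-1 coefficient matrix, filled row by row from the Table 2 estimates
# (assumed ordering of variables: sales, revenues, costs)
PHI = np.array([
    [0.101, -0.037, 0.041],
    [0.000,  0.091, 0.073],
    [0.000,  0.067, 0.018],
])


def forecast_levels(last_two_levels, phi, steps):
    """Recursive forecasts of the levels from a VAR(1) on first differences."""
    prev, last = (np.asarray(v, dtype=float) for v in last_two_levels)
    diff = last - prev            # most recent observed difference, W_t
    level = last.copy()
    out = []
    for _ in range(steps):
        diff = phi @ diff         # E[W_{t+1}] = Phi W_t, future shocks set to zero
        level = level + diff      # integrate the forecast difference back to a level
        out.append(level.copy())
    return np.array(out)


# Hypothetical last two quarterly observations (sales, revenues, costs)
history = ([1000.0, 500.0, 400.0], [1040.0, 520.0, 410.0])
forecasts = forecast_levels(history, PHI, steps=16)  # 16 quarters = four years

# The recursion is only meaningful if all eigenvalues of PHI lie inside the unit circle
max_modulus = np.max(np.abs(np.linalg.eigvals(PHI)))
print(f"largest eigenvalue modulus: {max_modulus:.3f}")
print(np.round(forecasts[:4], 2))
```

Because the coefficients are small, the forecast differences shrink quickly towards zero, so the forecast levels flatten out after a few quarters unless a drift term is included.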

$$\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.101 & -0.037 & 0.041 \\ 0 & 0.091 & 0.073 \\ 0 & 0.067 & 0.018 \end{bmatrix} B \right) \begin{bmatrix} (1 - B) Z_{1,t} \\ (1 - B) Z_{2,t} \\ (1 - B) Z_{3,t} \end{bmatrix} = \begin{bmatrix} a_{1,t} \\ a_{2,t} \\ a_{3,t} \end{bmatrix} \tag{11}$$

However, it should be noted that this sales forecast model is not highly reliable, because modeling is a difficult process that requires an accurate and thorough understanding of the topic at issue.

6. Conclusions

Forecasting earnings is one of the major tasks of financial statement analysts, and the sales forecast is the most important step in the process of predicting earnings. This study applies a multivariate time series model to forecast Urmia Gray Cement Factory's sales volume and, more importantly, to propose an effective model that other cement factories can use to predict their sales volume. The analysis of the time series data collected from Urmia Gray Cement Factory for 1997-2012 indicated that the two variables of costs and revenues have a direct and positive effect on sales volume. That is to say, there is a contemporaneous relationship between sales quantity on the one hand and costs and revenues on the other. On the basis of this relationship, the sales forecasting model, which captures the effect of two independent variables on one dependent variable, was designed. Using the autocorrelation, partial autocorrelation, and cross-correlation of each independent variable with the dependent variable, this model is suggested for estimating the parameters with a maximum standard error measurement of 0.277 ($\alpha = 0.05$). The model can forecast sales for the following years on the basis of these three variables and their relationships. However, further research could consider other factors that influence sales, such as the population growth rate, the unemployment rate, and inflation.

References

Antoniou, A., Guney, Y., & Paudyal, K. (2002). Determinants of Corporate Capital Structure: Evidence from European Countries. Working Paper, University of Durham.
Booth, L., Aivazian, V., Demirgüç-Kunt, A., & Maksimovic, V. (2001). Capital Structure in Developing Countries. Journal of Finance, 56(1), 87-130.
Chen, J. J. (2004). Determinants of Capital Structure of Chinese-Listed Companies. Journal of Business Research, 57(12), 1341-1351.
Cheng, S., & Shiu, C. (2007). Investor Protection and Capital Structure: International Evidence. Journal of Multinational Financial Management, 17(1), 30-44.
Box, G. E. P., & Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. 2nd ed., Holden Day, San Francisco.
Brockwell, P. J., & Davis, R. A. (1987). Time Series: Theory and Methods. Springer, New York.
Curtis, A., & Lundholm, R. J. (2013). Forecasting Sales: A Model and Some Evidence from the Retail Industry. Contemporary Accounting Research, 31(2), 581-608.
Frees, E. W., & Miller, T. W. (2004). Sales Forecasting Using Longitudinal Data Models. International Journal of Forecasting, 20, 99-114.
Hannan, E. (1970). Multiple Time Series. John Wiley, New York.
Mallin, M. L., O'Donnell, E. A., & Hu, M. Y. (2010). The Role of Uncertainty and Sales Control in the Development of Sales Manager Trust. Journal of Business and Industrial Marketing, 25(1), 30-42.
Musazadehgan, F., & Shahrabi, J. (2008). Mogayeseyeh sisteme pishbiniye foroosh pooshak va mansojat ba do roykarde khoshehbandi-derakhthayeh tasmim va shabakeyeh asabi. Civilica, Iran.
Slutsky, E. (1937). The Summation of Random Causes as the Source of Cyclic Processes. Econometrica, 5, 105-146.
Tiao, G. C., & Box, G. E. P. (1981). Modeling Multiple Time Series with Applications. Journal of the American Statistical Association, 76, 802-816.
Yule, G. U. (1927). On a Method of Investigating Periodicities in Disturbed Series with Special Reference to Wolfer's Sunspot Numbers. Transactions of the Royal Society, London, A, 226, 267-298.