Kuzhda T. Retail sales forecasting with application the multiple regression [Electronic resource] / T. Kuzhda // Socio-Economic Problems and the State. 2012. Issue 1 (6). Access mode: hp://sepd.nu.edu.ua/images/sories/pdf/01/1kibrm.pdf.

ISSN 2223-3822. Socio-Economic Problems and the State, Vol. 6, No. 1, 2012.

JEL Classification: C53

Tetyana Kuzhda
Ternopil Ivan Puluj National Technical University

FORECASTING PRODUCT SALES VOLUMES ON THE BASIS OF A MULTIPLE REGRESSION MODEL

Abstract. The article describes the method of multiple regression modeling, the theoretical approach to building regression models, and the procedure for calculating a quantitative forecast of a dependent variable under the influence of several independent variables. The theoretical material is applied to forecasting product sales volumes under the influence of expected consumer income and advertising costs. The resulting multiple regression model is checked for goodness of fit and statistical significance, and the sales forecast for the next period is calculated.

Keywords: regression analysis, dependent and independent variables, multiple regression model, goodness of fit and statistical significance, trend extrapolation, product sales forecast.

Tetyana Kuzhda

RETAIL SALES FORECASTING WITH APPLICATION THE MULTIPLE REGRESSION

Abstract. The article begins with a formulation for predictive learning called the multiple regression model. The theoretical approach to constructing regression models is described. The key information of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. The calculation of the quantitative forecast of the dependent variable under the influence of independent variables is explained. The paper presents retail sales forecasting with multiple model estimation. Some of the most important decisions a retailer makes can be informed by multiple regression. Recently, the retail environment has been changing under the influence of expected consumer income and advertising costs. Checks of the model's goodness of fit and statistical significance are explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

Keywords: regression analysis, dependent and independent variables, multiple regression model, goodness of fit and statistical significance, trend extrapolation, retail sales forecast.

Introduction. Regression analysis includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. It is also used to understand which of the independent variables are related to the dependent variable and to explore the forms of these relationships. In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. Variables that are used to explain other variables are called explanatory (or independent) variables. The dependent variable is what is measured in the forecast: it responds to the independent variables and is called dependent because it depends on them [1].

Regression modeling is the process of constructing forecasting models based on the relationship between a dependent variable and independent variables in order to forecast future values. It is a kind of multifactor forecasting, and its basis is the construction of regression models.
Regression models are used to predict one variable from one or more other variables. They provide the scientist with a powerful tool, allowing predictions about future events to be made with information about past or present events. To construct a regression model, both the information that will be used to make the prediction and the information that is to be predicted must be obtained from a sample of objects or individuals. The relationship between the two pieces of information is then modeled with a linear transformation. In the future, only the first piece of information is necessary, and the regression model is used to transform it into the predicted value. In other words, information on both variables is needed before the model can be constructed [1, 6].

Regression models are among the best-known economic and statistical models used in forecasting socio-economic processes. Construction of a regression model includes the following stages:

1) Selection of an object to forecast. The objects of socio-economic forecasting are economic processes (for example, inflation, demand, supply, exchange rate, etc.), any indicator describing company activity (for example, production, price, profit, income, sales, costs, etc.), any indicator describing the national economy (for example, gross domestic product, gross investment, national income, government spending, exports, imports, external debt, etc.), and any indicator describing social processes (for example, wages, bonus fund, incentive fund, overtime payments, employment and unemployment, emigration and immigration, etc.). The object of forecasting is the dependent variable.

2) Selection of the factors (independent variables) that explain the changes in the socio-economic processes. The factors should be causally linked to the object of forecasting, and all factors must be quantitatively measurable and significant. For example, a company's retail sales depend on expected consumer's income and advertising costs.
In this example, the company's retail sales are the object of forecasting (the dependent variable), and expected consumer's income and advertising costs are the factors (independent variables).

3) Data collection is the process of obtaining useful information on the key quantitative characteristics of socio-economic processes. The statistical information needed for forecasting can be obtained from primary and secondary data sources. Data processing is any process that summarizes, analyzes or otherwise converts data into usable information. The information base for forecasting with regression models is a set of several interrelated time series with a feedback relationship.

4) Selection of the mathematical dependence between the independent variables (factors) and the dependent variable. Regression models can be described by the following types of dependencies: linear, power, logarithmic, etc. In linear regression, data are modeled using linear functions, and the unknown model parameters are estimated from the data. Linear regression is an approach to modeling the relationship between one or more independent variables (X) and a single dependent variable (Y). The case of one explanatory variable is called a simple regression model; with more than one explanatory variable it is a multiple regression model. In practice, the more general multiple regression model is widely used.

The multiple regression model is a flexible method of data analysis that may be appropriate whenever a quantitative variable (the dependent variable) is to be examined in relationship to any other factors (expressed as independent variables). For example, a multiple regression model might examine average salaries (dependent variable) as a function of age, education, gender and experience (independent variables). Multiple regression requires a large number of observations: the number of periods must substantially exceed the number of independent variables used in the regression, and the absolute minimum is five periods [1, 6].

The linear forecast equation that estimates the multiple regression model is (1):

$$Y_t = b_0 + b_1 x_{1t} + b_2 x_{2t} + \dots + b_m x_{mt}, \qquad (1)$$

where $Y_t$ is called the endogenous variable, response variable, measured variable, or dependent variable.
The decision as to which variable in a data set is modeled as the dependent variable and which are modeled as the independent variables may be based on a presumption that the value of one of the variables is caused by, or directly influenced by, the other variables; $x_{1t}, x_{2t}, \dots, x_{mt}$ are called exogenous variables, explanatory variables, input variables, predictor variables, or independent variables at period $t$; $b_0, b_1, b_2, \dots, b_m$ are the regression coefficients: $b_0$ is the intercept, which absorbs the average effect on $Y_t$ of factors that are not included in the regression model; $b_1$ measures the change in $Y_t$ with respect to $x_{1t}$; $b_2$ measures the change in $Y_t$ with respect to $x_{2t}$; $b_m$ measures the change in $Y_t$ with respect to $x_{mt}$.

To find the regression coefficients ($b_0, b_1, b_2, \dots, b_m$) we need to solve the system of normal equations. The calculation formulas are complex, so for multiple regression it is almost imperative to use computer software (Data Analysis) to obtain the prediction equation. Given the multiple regression equation, the software finds a forecast equation by estimating the model parameters from sample data.

5) Checking the model for goodness of fit and statistical significance on the basis of statistical coefficients. If a model is reliable and statistically significant, the forecast is more likely to be accurate.

6) Calculation of the forecasts of the independent variables, that is, predicting the independent variables as functions of time. To find the quantitative values of these forecasts we can use forecasting based on trend extrapolation.

7) Calculation of the forecast based on regression modeling, that is, predicting the quantitative value of the dependent variable under the influence of the independent variables.

Multiple Model Estimation in Practice. The application of the above theoretical information to forecasting based on multiple regression is described in the example below. Statistical data on retail sales, expected consumer's income and advertising costs over 10 months are given in Table 1. We want to explain how to calculate the retail sales forecast for January based on the multiple regression model.
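The system of normal equations mentioned in stage 4 can be illustrated with a minimal sketch. It assumes the standard least-squares form $(X^{T}X)\,b = X^{T}y$ and solves it by Gaussian elimination in plain Python; the function names and the synthetic data are hypothetical and are not taken from the article, which relies on Excel's Data Analysis tool instead.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear factors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple_regression(factors, y):
    """Return [b0, b1, ..., bm] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(obs) for obs in factors]  # leading 1 carries b0
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic check (assumed data): y is generated exactly as 1 + 2*x1 + 3*x2.
factors = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in factors]
coefficients = fit_multiple_regression(factors, y)
print([round(b, 6) for b in coefficients])
```

Because the synthetic response is an exact linear function of the two factors, the recovered coefficients should match the generating values up to floating-point error, which makes the sketch easy to check before applying it to real data.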
Table 1
Statistics on retail sales, expected consumer's income and advertising costs

| Months | Retail sales, thousand dollars | Expected consumer's income, thousand dollars | Advertising costs, thousand dollars |
|---|---|---|---|
| March | 125 | 20,0 | 12,5 |
| April | 126 | 20,2 | 12,7 |
| May | 128 | 20,5 | 12,8 |
| June | 130 | 20,7 | 13,0 |
| July | 131 | 20,9 | 13,2 |
| August | 133 | 21,2 | 13,5 |
| September | 139 | 21,5 | 13,7 |
| October | 142 | 22,1 | 13,8 |
| November | 145 | 22,7 | 14,0 |
| December | 150 | 23,5 | 14,4 |

In this example, the company's retail sales are the dependent variable $Y_t$; expected consumer's income and advertising costs are the factors (independent variables). To find the retail sales forecast based on regression modeling we need to use the multiple regression model (2):

$$Y_t = b_0 + b_1 x_{1t} + b_2 x_{2t}, \qquad (2)$$

where $Y_t$ is the forecast of the company's retail sales, thousand dollars; $x_{1t}$ is the expected consumer's income at period $t$; $x_{2t}$ is the advertising costs at period $t$; $b_0, b_1, b_2$ are the regression coefficients.

The calculation of the coefficients $b_0, b_1, b_2$ by hand is a long and laborious process. Microsoft Excel provides many possibilities for forecasting based on regression modeling. The statistical data on retail sales, expected consumer's income and advertising costs for the 10 months should be entered on an Excel spreadsheet. First, select the Data menu / Data Analysis / Regression (Figure 1).

Fig. 1. Data menu / Data Analysis / Regression
The following window appears (Figure 2). The first box is the Input Y Range. Here we tell Excel about our dependent variable (retail sales). The dependent variable must be a single column. To fill the Input Y Range, click in the box and enter the cell reference for the range of data on retail sales. The next stage is to input the independent variables. They must form a block of data if there are several independent variables, or a single column if there is only one. In the dataset we are using there are two independent variables: expected consumer's income and advertising costs. To fill the Input X Range, click in the box and enter the cell reference for the block of data on expected consumer's income and advertising costs. The Confidence Level of 95% sets the confidence intervals that Excel reports for the regression coefficients. Next we tell Excel where we want the results to be written. To fill the Output Range, enter the reference for the first cell (B13) of the output table. Finally, we click OK.

Fig. 2. Regression window

We get a lot of output. The regression output has three components: the Regression statistics table, the ANOVA table and the Regression coefficients table (Figure 3). Figure 3 contains the information needed to obtain the multiple regression model. The quantitative values of the coefficients are read from the Coefficients column: $b_0$ is opposite Intercept ($b_0 = -33{,}926$); $b_1$ is opposite X Variable 1 ($b_1 = 5{,}204$); $b_2$ is opposite X Variable 2 ($b_2 = 4{,}328$). The multiple regression model needed to forecast the retail sales ($Y_t$) for January is:

$$Y_t = -33{,}926 + 5{,}204\,x_{1t} + 4{,}328\,x_{2t}. \qquad (3)$$

Data Analysis gives us the multiple regression model (3) needed to forecast the retail sales, but not the quantitative value of the forecast itself. The next stage is checking the multiple regression model (3) for goodness of fit and statistical significance. After checking the model, we can calculate the quantitative value of the retail sales forecast.
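As a cross-check on the Excel output, the coefficients of a model with two independent variables can also be computed by hand. The sketch below is an illustration rather than the article's procedure: it assumes the ordinary least-squares solution written in terms of sums of deviations from the means, uses the Table 1 data, and the helper names are hypothetical.

```python
# Table 1 data (March to December), in thousand dollars.
sales = [125, 126, 128, 130, 131, 133, 139, 142, 145, 150]
income = [20.0, 20.2, 20.5, 20.7, 20.9, 21.2, 21.5, 22.1, 22.7, 23.5]
advertising = [12.5, 12.7, 12.8, 13.0, 13.2, 13.5, 13.7, 13.8, 14.0, 14.4]


def mean(values):
    return sum(values) / len(values)


def centered_sum(u, v):
    """Sum of products of the deviations of u and v from their means."""
    mean_u, mean_v = mean(u), mean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


s11 = centered_sum(income, income)
s22 = centered_sum(advertising, advertising)
s12 = centered_sum(income, advertising)
s1y = centered_sum(income, sales)
s2y = centered_sum(advertising, sales)

# Cramer's rule on the 2x2 system for the two slope coefficients.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
# The fitted plane passes through the point of means, which fixes b0.
b0 = mean(sales) - b1 * mean(income) - b2 * mean(advertising)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

The printed values should land close to the Intercept, X Variable 1 and X Variable 2 entries used in model (3). The small determinant `det` reflects how closely income and advertising costs move together in this sample.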
Statistical goodness of fit of the multiple regression model can be determined by the following statistical coefficients: the correlation coefficient ($r$), the coefficient of determination ($R^2$) and the adjusted coefficient of determination ($AR^2$).
Fig. 3. The regression output: Regression statistics table, ANOVA table, Regression coefficients table

The coefficient of determination ($R^2$) is a measure of how well the multiple regression model explains the data and predicts future outcomes. It is expressed as a value between 0 and 1. A value of one indicates a perfect fit and, therefore, a very reliable multiple regression model for future forecasts. A value of zero, on the other hand, indicates that the multiple regression model fails to forecast the dataset accurately [3, 5]. The following are accepted guidelines for interpreting the coefficient of determination:
values between 0 and 0,3 indicate a weak linear relationship;
values between 0,3 and 0,7 indicate a moderate linear relationship;
values between 0,7 and 1 indicate a strong linear relationship.

The correlation coefficient ($r$) is a measure of the strength of the relationship between one or more independent variables (X) and a single dependent variable (Y). One way to find this coefficient is to take the square root of the coefficient of determination (4):

$$r = \sqrt{R^2}. \qquad (4)$$

The correlation coefficient takes on values ranging between +1 and −1. The following are accepted guidelines for interpreting the correlation coefficient:
0 indicates no linear relationship;
+1 indicates a perfect positive linear relationship;
−1 indicates a perfect negative linear relationship;
values between 0 and 0,3 (0 and −0,3) indicate a weak positive (negative) linear relationship;
values between 0,3 and 0,7 (−0,3 and −0,7) indicate a moderate positive (negative) linear relationship;
values between 0,7 and 1 (−0,7 and −1) indicate a strong positive (negative) linear relationship [4, 5].

In a multiple linear regression model, the adjusted coefficient of determination ($AR^2$) measures the share of the variation in the dependent variable accounted for by the explanatory variables, corrected for the number of explanatory variables.
The adjusted coefficient of determination is generally considered to be a more accurate goodness-of-fit measure than the coefficient of determination. The adjusted $R^2$ is always less than or equal to the coefficient of determination ($R^2$). It is particularly useful in the feature selection stage of model building [4, 5].
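The interpretation guidelines for the correlation coefficient can be written as a small lookup, shown here as a sketch. The function name is hypothetical, and the treatment of the boundary values 0,3 and 0,7 (assigned to the stronger band) is an assumption, since the guidelines do not say which band they belong to.

```python
def relationship_strength(r):
    """Describe a correlation coefficient using the 0.3 / 0.7 guideline bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        band = "perfect"
    elif size < 0.3:
        band = "weak"
    elif size < 0.7:      # assumption: 0.3 itself counts as moderate
        band = "moderate"
    else:                 # assumption: 0.7 itself counts as strong
        band = "strong"
    return f"{band} {direction} linear relationship"


print(relationship_strength(0.99))
print(relationship_strength(-0.5))
```

Since formula (4) takes a square root, a correlation coefficient obtained that way is never negative, so only the positive bands apply to the multiple regression model.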
The adjusted coefficient of determination (Adjusted R Square) is computed using the following formula (5):

$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}, \qquad (5)$$

where $R^2$ is the coefficient of determination; $n$ is the number of observations (or periods); $k$ is the number of independent variables.

To find the correlation coefficient and the coefficient of determination we need to interpret the Regression statistics table (Figure 3).

Table 2

| Regression statistics | Value | Explanation |
|---|---|---|
| Multiple R | 0,… | Correlation coefficient |
| R Square | 0,… | Coefficient of determination |
| Adjusted R Square | 0,… | Adjusted coefficient of determination |
| Standard Error | 1,… | A measure of error in prediction |
| Observations | 10 | Number of observations used in the regression |

The correlation coefficient can be calculated by formula (4):

$$r = \sqrt{R^2} \approx 0{,}99.$$

The correlation coefficient $r = 0{,}99$ lies between 0,7 and 1 and therefore indicates a strong positive linear relationship between retail sales and the independent variables of the multiple regression model (3).

If the coefficient of determination is greater than 0,7, as it is in this case, there is a good fit to the data. The coefficient of determination 0,985 means that approximately 98,5% (0,985 · 100%) of the variation in the dependent variable (retail sales) can be explained by the independent variables (expected consumer's income and advertising costs).

The adjusted coefficient of determination by formula (5), with $n = 10$ and $k = 2$:

$$\text{Adjusted } R^2 = 1 - \frac{(1 - 0{,}985)(10 - 1)}{10 - 2 - 1} \approx 0{,}981.$$

The adjusted coefficient of determination 0,981 means that approximately 98,1% (0,981 · 100%) of the variation in the dependent variable (retail sales) can be explained by the independent variables (expected consumer's income and advertising costs) once the number of variables is taken into account.

The check of the model for statistical significance is based on the ANOVA table (Figure 3), where SS is the sum of squares (the numerator of the variance); DF is the number of degrees of freedom (the denominator of the variance); MS is the mean square (SS divided by DF); and Significance F is the statistical significance of the multiple regression model.
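The goodness-of-fit coefficients and the ANOVA quantities described above can be recomputed from the Table 1 data as a sanity check. The sketch below is an illustration under stated assumptions: it takes the coefficients of model (3) rounded to three decimals, so the decomposition SST = SSR + SSE holds only approximately, and it uses a closed-form upper tail of the F distribution that is valid only when there are exactly two independent variables. Variable names are hypothetical.

```python
import math

# Table 1 data (March to December), in thousand dollars.
sales = [125, 126, 128, 130, 131, 133, 139, 142, 145, 150]
income = [20.0, 20.2, 20.5, 20.7, 20.9, 21.2, 21.5, 22.1, 22.7, 23.5]
advertising = [12.5, 12.7, 12.8, 13.0, 13.2, 13.5, 13.7, 13.8, 14.0, 14.4]

# Coefficients of model (3), rounded to three decimals.
b0, b1, b2 = -33.926, 5.204, 4.328
n, k = len(sales), 2  # observations and independent variables

fitted = [b0 + b1 * x1 + b2 * x2 for x1, x2 in zip(income, advertising)]
mean_sales = sum(sales) / n

sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))   # residual SS
sst = sum((y - mean_sales) ** 2 for y in sales)          # total SS
ssr = sst - sse                                          # regression SS

r_squared = 1 - sse / sst
r = math.sqrt(r_squared)                                            # formula (4)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)    # formula (5)

# F statistic: ratio of the regression and residual mean squares.
f_statistic = (ssr / k) / (sse / (n - k - 1))
# Upper-tail probability of F(2, n - 3); this closed form holds only for k = 2.
significance_f = (1 + k * f_statistic / (n - k - 1)) ** (-(n - k - 1) / 2)

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
print(f"F = {f_statistic:.1f}, Significance F = {significance_f:.2e}")
```

For models with a different number of independent variables the Significance F line would need a general F-distribution routine from a statistics library; the other lines carry over unchanged.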
ANOVA means analysis of variance, which consists of calculations that provide information about the levels of variability within a regression model and form a basis for tests of significance. Significance F is the statistical significance of the multiple regression model. In this example (Figure 3), the value of Significance F is lower than 0,05 ($3{,}98 \cdot 10^{-7} < 0{,}05$), so we can say that the multiple regression model is generally acceptable and statistically significant for forecasting the retail sales.

The check of each coefficient for statistical significance is based on the Regression coefficients table (Figure 3), where the Coefficients column gives the quantitative values of the regression coefficients $b_0, b_1, b_2$; the Standard Error column gives the standard errors (i.e. the estimated standard deviations) of the regression coefficients; the t Stat column gives the computed t-statistic (the ratio of the departure of an estimated parameter from its notional value to its standard error); the P-value column gives the probability value for each regression coefficient. If the P-value is less than 0,05 (5% error probability), the coefficient is statistically significant at the 95% confidence level; if the P-value is more than 0,05, the coefficient is statistically insignificant.

In this example, the P-value for coefficient $b_0$ is 0,008 (lower than 0,05) and the P-value for coefficient $b_1$ is 0,01 (lower than 0,05), while the P-value for coefficient $b_2$ is 0,17 (higher than 0,05), so $b_2$ taken alone is not statistically significant at the 5% level. Since $b_0$ and $b_1$ are significant and Significance F is far below 0,05, we can say that the multiple regression model is, in general, statistically significant. Thus, the multiple regression model (3) is statistically significant, and the model is useful and reliable for forecasting.

To find the forecast of the retail sales for January, we first need to calculate the quantitative values of the expected consumer's income forecast and the advertising costs forecast for January. These can be calculated using forecasting based on trend extrapolation. To do this we need to find the forecast of expected consumer's income as a function of time ($t$) and the forecast of advertising costs as a function of time ($t$).

First, we calculate the expected consumer's income forecast based on trend extrapolation (using a linear equation). The linear equation is (6):

$$x_t = a + bt, \qquad (6)$$

where $x_t$ is the expected consumer's income forecast based on trend extrapolation (or the advertising costs forecast based on trend extrapolation); $a$ and $b$ are the coefficients to be determined; $t$ is the time unit.
Coefficient $b$ can be calculated by formula (7):

$$b = \frac{\sum x t - n\,\bar{t}\,\bar{x}}{\sum t^2 - n\,\bar{t}^{\,2}}, \qquad (7)$$

where $n$ is the number of periods; $\bar{t}$ is the average value of variable $t$ (time, the independent variable); $\bar{x}$ is the average value of the dependent variable $x$ (the average value of expected consumer's income or the average value of advertising costs).

The average value of variable $t$ can be calculated by formula (8):

$$\bar{t} = \frac{\sum t}{n}, \qquad (8)$$

where $n$ is the number of periods; $\sum t$ is the sum of the numbers from 1 to $n$.

The average value of variable $x$ can be calculated by formula (9):

$$\bar{x} = \frac{\sum x}{n}, \qquad (9)$$

where $n$ is the number of periods; $\sum x$ is the sum of the statistical data for $n$ periods.

Coefficient $a$ can be calculated by formula (10):

$$a = \bar{x} - b\,\bar{t}. \qquad (10)$$
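Formulas (7) to (10) translate directly into a short function. The following is a minimal sketch with a hypothetical function name and a made-up four-period series chosen so that the result is easy to verify by hand; periods are assumed to be numbered 1 to n.

```python
def linear_trend(series):
    """Return (a, b) of the trend x_t = a + b*t for periods t = 1..n."""
    n = len(series)
    periods = range(1, n + 1)
    mean_t = sum(periods) / n                                  # formula (8)
    mean_x = sum(series) / n                                   # formula (9)
    sum_xt = sum(x * t for x, t in zip(series, periods))
    sum_t2 = sum(t * t for t in periods)
    b = (sum_xt - n * mean_t * mean_x) / (sum_t2 - n * mean_t ** 2)  # formula (7)
    a = mean_x - b * mean_t                                    # formula (10)
    return a, b


# Made-up series that grows by exactly 2 per period starting from 3.
a, b = linear_trend([3, 5, 7, 9])
next_period_forecast = a + b * 5   # extrapolate to period t = 5
print(a, b, next_period_forecast)
```

For this series the sums are $\sum xt = 70$, $\sum t^2 = 30$, $\bar{t} = 2{,}5$ and $\bar{x} = 6$, so formula (7) gives $b = (70 - 60)/(30 - 25) = 2$ and formula (10) gives $a = 6 - 2 \cdot 2{,}5 = 1$.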
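To write down the linear equation $x_{1t} = a + bt$ (where $x_{1t}$ is the expected consumer's income forecast) and calculate the coefficients $b$ and $a$, we need to find $t^2$ and $x_1 \cdot t$ (Table 3).

Table 3
Results of calculations

| Months | Expected consumer's income ($x_1$), thousand dollars | $t$ | $t^2$ | $x_1 \cdot t$ |
|---|---|---|---|---|
| March | 20,0 | 1 | 1 | 20,0 |
| April | 20,2 | 2 | 4 | 40,4 |
| May | 20,5 | 3 | 9 | 61,5 |
| June | 20,7 | 4 | 16 | 82,8 |
| July | 20,9 | 5 | 25 | 104,5 |
| August | 21,2 | 6 | 36 | 127,2 |
| September | 21,5 | 7 | 49 | 150,5 |
| October | 22,1 | 8 | 64 | 176,8 |
| November | 22,7 | 9 | 81 | 204,3 |
| December | 23,5 | 10 | 100 | 235,0 |
| Total | 213,3 | 55 | 385 | 1203,0 |

Average value of time ($t$) by formula (8):

$$\bar{t} = \frac{\sum t}{n} = \frac{55}{10} = 5{,}5.$$

Average expected consumer's income ($x_1$) by formula (9):

$$\bar{x}_1 = \frac{213{,}3}{10} = 21{,}33.$$

Coefficient $b$ by formula (7):

$$b = \frac{1203 - 10 \cdot 5{,}5 \cdot 21{,}33}{385 - 10 \cdot (5{,}5)^2} \approx 0{,}361.$$

Coefficient $a$ by formula (10):

$$a = 21{,}33 - 0{,}361 \cdot 5{,}5 \approx 19{,}34.$$

The linear equation is:

$$x_{1t} = a + bt = 19{,}34 + 0{,}361\,t.$$

Forecast of expected consumer's income for January (period $t = 11$) based on trend extrapolation:

$$x_{1,11} = 19{,}34 + 0{,}361 \cdot 11 = 23{,}311 \text{ thousand dollars.}$$

To write down the linear equation $x_{2t} = a + bt$ (where $x_{2t}$ is the advertising costs forecast) and calculate the coefficients $b$ and $a$, we need to find $t^2$ and $x_2 \cdot t$ (Table 4).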
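Table 4
Results of calculations

| Months | Advertising costs ($x_2$), thousand dollars | $t$ | $t^2$ | $x_2 \cdot t$ |
|---|---|---|---|---|
| March | 12,5 | 1 | 1 | 12,5 |
| April | 12,7 | 2 | 4 | 25,4 |
| May | 12,8 | 3 | 9 | 38,4 |
| June | 13,0 | 4 | 16 | 52,0 |
| July | 13,2 | 5 | 25 | 66,0 |
| August | 13,5 | 6 | 36 | 81,0 |
| September | 13,7 | 7 | 49 | 95,9 |
| October | 13,8 | 8 | 64 | 110,4 |
| November | 14,0 | 9 | 81 | 126,0 |
| December | 14,4 | 10 | 100 | 144,0 |
| Total | 133,6 | 55 | 385 | 751,6 |

Average value of time ($t$) by formula (8):

$$\bar{t} = \frac{\sum t}{n} = \frac{55}{10} = 5{,}5.$$

Average advertising costs ($x_2$) by formula (9):

$$\bar{x}_2 = \frac{133{,}6}{10} = 13{,}36.$$

Coefficient $b$ by formula (7):

$$b = \frac{751{,}6 - 10 \cdot 5{,}5 \cdot 13{,}36}{385 - 10 \cdot (5{,}5)^2} \approx 0{,}203.$$

Coefficient $a$ by formula (10):

$$a = 13{,}36 - 0{,}203 \cdot 5{,}5 \approx 12{,}24.$$

The linear equation is:

$$x_{2t} = a + bt = 12{,}24 + 0{,}203\,t.$$

Forecast of advertising costs for January (period $t = 11$) based on trend extrapolation:

$$x_{2,11} = 12{,}24 + 0{,}203 \cdot 11 = 14{,}473 \text{ thousand dollars.}$$

Retail sales forecast for January based on the multiple regression model (formula 3):

$$Y_{11} = -33{,}926 + 5{,}204\,x_{1,11} + 4{,}328\,x_{2,11} = -33{,}926 + 5{,}204 \cdot 23{,}311 + 4{,}328 \cdot 14{,}473 \approx 150{,}024 \text{ thousand dollars.}$$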
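The last step chains the two trend equations into the regression model, and the arithmetic can be checked with a few lines of code. This sketch simply reuses the rounded coefficients quoted in the calculations above; the function names are hypothetical.

```python
def income_trend(t):
    """Expected consumer's income trend: x1 = 19.34 + 0.361*t (thousand dollars)."""
    return 19.34 + 0.361 * t


def advertising_trend(t):
    """Advertising costs trend: x2 = 12.24 + 0.203*t (thousand dollars)."""
    return 12.24 + 0.203 * t


def retail_sales(x1, x2):
    """Multiple regression model (3), coefficients rounded to three decimals."""
    return -33.926 + 5.204 * x1 + 4.328 * x2


january = 11  # March is period 1, so January of the next year is period 11
x1_forecast = income_trend(january)
x2_forecast = advertising_trend(january)
sales_forecast = retail_sales(x1_forecast, x2_forecast)

print(f"x1 = {x1_forecast:.3f}, x2 = {x2_forecast:.3f}, Y = {sales_forecast:.3f}")
```

Using unrounded trend slopes (about 0,3618 and 0,2036) instead of the three-decimal values would shift the forecast by less than 0,1 thousand dollars, so the rounding does not change the conclusion.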
Thus, the retail sales forecast for January based on the multiple regression model equals 150,024 thousand dollars.

Conclusion. The multiple regression model was effective for forecasting retail sales under the influence of expected consumer's income and advertising costs. It can be applied to forecasting other business data. Using such models to forecast retail sales can help company managers plan and make decisions more effectively.

References:
1. Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences, 3rd Ed. Mahwah, NJ: Lawrence Erlbaum Associates.
2. Rogers, David S. A Review of Sales Forecasting Models, International Journal of Retail and Distribution Management, MCB University Press, Vol. 20, Issue 4.
3. Ming-Long Lee & R. Kelley Pace. Spatial Distribution of Retail Sales, The Journal of Real Estate Finance and Economics, Springer, Vol. 31(1), pages 53-69, August.
4. Lundholm, Russell J. and McVay, Sarah E. Forecasting Sales: A model and some evidence from the retail industry (January, 2004).
5. Wassana Suwanvijit, Chamnein Choonpradub, Nittaya McNeil. Statistical Model For Short-Term Forecasting Sparkling Beverage Sales In Southern Thailand, International Business & Economics Research Journal, Vol. 8, No. 9, September.
6. Samawi, H. M., Ababneh, F. M. On regression analysis using ranked set sample, Journal of Statistical Research, 35 (2001).

Reviewed by: Doctor of Economics, Prof. N. B. Kyrych.
Received: March, 2012. 1st Revision: April, 2012. Accepted: May,