Journal of Software Engineering and Applications, 2015, 8, 133-142
Published Online March 2015 in SciRes. http://www.scirp.org/journal/jsea
http://dx.doi.org/10.4236/jsea.2015.83014

Optimizing Software Effort Estimation Models Using Firefly Algorithm

Nazeeh Ghatasheh 1, Hossam Faris 2, Ibrahim Aljarah 2, Rizik M. H. Al-Sayyed 2
1 Department of Business Information Technology, The University of Jordan, Aqaba, Jordan
2 Department of Business Information Technology, The University of Jordan, Amman, Jordan
Email: n.ghatasheh@ju.edu.jo, hossam.faris@ju.edu.jo, i.aljarah@ju.edu.jo, r.alsayyed@ju.edu.jo

Received 27 February 2015; accepted 6 March 2015; published 8 March 2015

Copyright © 2015 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract
Software development effort estimation is considered a fundamental task in the software development life cycle as well as for managing project cost, time and quality. Accurate estimation is therefore a substantial factor in project success and in reducing risks. In recent years, software effort estimation has received a considerable amount of attention from researchers and has become a challenge for the software industry. In the last two decades, many researchers and practitioners have proposed statistical and machine learning-based models for software effort estimation. In this work, the Firefly Algorithm is proposed as a metaheuristic optimization method for optimizing the parameters of three COCOMO-based models. These models include the basic COCOMO model and two other models proposed in the literature as extensions of the basic COCOMO model. The developed estimation models are evaluated using different evaluation metrics. Experimental results show high accuracy and significant error minimization of the Firefly Algorithm over other metaheuristic optimization algorithms, including Genetic Algorithms and Particle Swarm Optimization.

Keywords
Software Quality, Effort Estimation, Metaheuristic Optimization, Firefly Algorithm

1. Introduction
Effort estimation of software development has been a crucial task for the software engineering community.
Reliable effort estimation makes it easier to schedule project activities, allocate resources, estimate costs, and reduce the probability of project failures or delays. According to the survey [1], most projects face overruns of effort or schedule. The survey also claimed that the lack of accurate estimation models is a main reason for project overruns.

How to cite this paper: Ghatasheh, N., Faris, H., Aljarah, I. and Al-Sayyed, R.M.H. (2015) Optimizing Software Effort Estimation Models Using Firefly Algorithm. Journal of Software Engineering and Applications, 8, 133-142. http://dx.doi.org/10.4236/jsea.2015.83014

Usually, projects seem vague at the beginning and become less vague as they progress. At the same time, each project has its special nature that makes it much harder to estimate the effort required for completion. Due to the uncertain nature of projects, the authors of [2] [3] suggested developing models that can adapt to a wide range of projects. Because software project data sets are typically small and the underlying relations are inaccurate or missing, the task of prediction becomes more challenging.

Several effort estimation models have been developed and improved over time for better prediction accuracy and thus better development quality [1] [4]-[8]. Such models range from complex calculations and statistical analysis of project parameters to advanced machine learning approaches.

Heuristic optimization [9] is a method that relies on several attempts to find an optimal solution. Heuristic optimizers have been used in software effort estimation [10], for example genetic programming [11] for model optimization. Another example is the use of Particle Swarm Optimization [12] as a heuristic optimizer. Moreover, hybrid approaches combine heuristic algorithms, such as the use of Genetic Algorithm and Ant Colony [13].

Despite a large number of experiments on finding the best prediction model, there is no clear evidence of a highly accurate or efficient approach. At the same time, it is important to develop a prediction method that is less complex and much more useful. For instance, in some prediction models, many of the variables used to construct the model do not reflect or improve the accuracy of the prediction model. Thus, collecting extra or unrelated variables is time-consuming and brings no significant benefit. It would be more efficient to build a model with a minimum number of variables, hopefully finding the most important and common variables for generic project development efforts.

This work presents a study of how the Firefly Algorithm improves software effort estimation. The main contributions are:
• Showing the suitability of the Firefly Algorithm as a predictor towards a generic prediction model for software effort estimation.
• A significant improvement in performance over previously reported methods.
• Showing the suitability of machine learning approaches for effort prediction using a small number of input variables and data set instances.

2. Related Work
Many Machine Learning (ML) approaches in the literature have been applied to improve software effort estimation [2]. Nature-inspired ML optimization algorithms have received much attention as a way to find more accurate software effort estimates. Nature-inspired ML algorithms include Cuckoo Search [14], Particle Swarm Optimization (PSO) [15], Bat Algorithm [16], Firefly Algorithm [17], and many others.

In [18], the authors compared the performance of different soft computing techniques, such as PSO-tuned COCOMO and Fuzzy Logic, with traditional effort estimation structures. Their results showed that the proposed model outperformed traditional effort estimation structures for NASA's software effort data set.

In [7], a decision-tree-based algorithm was used to perform the software effort estimation. In addition, the authors presented an empirical proof of performance variations for several approaches that include Linear Regression, Artificial Neural Networks (ANN), and Support Vector Machines (SVM). The authors also pointed to the suitability of the tested ML approaches in the area of effort estimation. In their performance comparison, the proposed approach achieved a lower error rate than the other techniques.

A hybrid approach was adopted in [19] for parameter selection and model optimization. The authors used Genetic Algorithms (GA) for optimizing a Support Vector Regression (SVR) model. The authors clarified the impact of using GA in feature selection and parameter optimization of the effort estimation model. The results of their approach showed that GA is applicable to improve the performance of the SVR model compared to other approaches.

A generic framework is proposed in [20] for software effort estimation. The framework tries to simulate the human way of thinking to resolve the effort estimation by adopting fuzzy rules modeling.
Therefore, the generated models take advantage of experts' knowledge, are interoperable, and could be applied to various problems such as risk analysis or software quality prediction.

ANNs have gained noticeable attention from researchers for effort estimation, as illustrated by the review [21], but the evidence is insufficient to generalize the applicability of ANN in effort estimation. The authors stated that further thorough investigation is required.

The authors of [22] relied on seven evaluation measures to assess the stability of 90 software effort predictors over 20 data sets. According to the empirical results, analogy-based methods and regression trees performed best in terms of stability. Such conclusions open the door for extensive research towards a superior and generic prediction approach for software effort estimation.

3. Firefly Algorithm
Firefly Algorithm (FA) is a multimodal optimization algorithm that belongs to the nature-inspired field and is inspired by the behavior of fireflies or lightning bugs [17]. FA was first introduced by Xin-She Yang at Cambridge University in 2007 [17]. FA is empirically proven to tackle problems more naturally and has the potential to outperform other metaheuristic algorithms.

FA relies on three basic rules. The first implies that all fireflies are attracted to each other regardless of gender. The second rule states that attractiveness is correlated with brightness or light emission, such that bright flies attract less bright ones, and in the absence of brighter flies the movement becomes random. The last rule implies that the landscape of the objective function determines or affects the light emission of the fly, such that brightness is proportional to the objective function.

Algorithm 1. Pseudo-code of firefly algorithm.

    Objective function f(x), x = (x_1, ..., x_d)^T
    Generate initial population of fireflies x_i (i = 1, 2, ..., n)
    Light intensity I_i at x_i is determined by f(x_i)
    Define light absorption coefficient γ
    while (t < MaxGeneration) do
        for i = 1 : n all n fireflies do
            for j = 1 : n all n fireflies do
                if (I_j > I_i) then
                    Move firefly i towards j in d-dimension;
                end if
                Attractiveness varies with distance r via exp[−γ r]
                Evaluate new solutions and update light intensity
            end for
        end for
        Rank the fireflies and find the current best
    end while
    Post-process results and visualization

The attractiveness among the flies in FA involves two main issues: the modeling of attractiveness and the variation of light intensity. For a specific firefly at location X, brightness I is formulated as I(X) ∝ f(X). Attractiveness β, in contrast, is relative to the observing flies and is related to the distance R_{i,j} between fireflies i and j. Equation (1) describes how the light intensity I(r) decays with distance r, where I_0 represents the light intensity at the source.
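$$ I(r) = I_0 e^{-\gamma r^2} \tag{1} $$

Assuming an absorption coefficient of the environment $\gamma$, intensity is represented in Equation (2), in which $I_0$ is the original intensity.

$$ I(r) = \frac{I_0}{1 + \gamma r^2} \tag{2} $$

Generally, the Euclidean distance is illustrated in Equation (3), which represents the distance between a firefly $i$ at location $X_i$ and another firefly $j$ at location $X_j$, in which $x_{i,k}$ is the $k$th component of the spatial coordinate $X_i$.

$$ R_{i,j} = \lVert x_i - x_j \rVert = \sqrt{\sum_{k=1}^{d} (x_{i,k} - x_{j,k})^2} \tag{3} $$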
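As a minimal sketch of Equations (1)-(3), the two intensity forms and the Euclidean distance can be written as small Python functions. The function names and the sample values in the demo are illustrative assumptions and are not taken from the original implementation.

```python
import math


def light_intensity(i0, gamma, r):
    """Equation (1): I(r) = I0 * exp(-gamma * r^2)."""
    return i0 * math.exp(-gamma * r ** 2)


def light_intensity_rational(i0, gamma, r):
    """Equation (2): I(r) = I0 / (1 + gamma * r^2)."""
    return i0 / (1.0 + gamma * r ** 2)


def distance(x_i, x_j):
    """Equation (3): Euclidean distance between two firefly positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


if __name__ == "__main__":
    # Hypothetical positions and parameters, chosen only for illustration.
    r = distance([0.0, 0.0], [3.0, 4.0])
    print(r, light_intensity(1.0, 0.4, r), light_intensity_rational(1.0, 0.4, r))
```

Both forms equal I_0 at r = 0 and decrease monotonically with distance; the rational form of Equation (2) decays more slowly for large r.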
A firefly $i$ is attracted to a brighter one $j$ as illustrated in Equation (4), where the attraction is represented by $\beta_0 e^{-\gamma r_{ij}^2}(x_j - x_i)$, with $\beta_0$ the attractiveness at zero distance and $r_{ij}$ the distance between the two fireflies, and the last term represents the randomness according to the randomization parameter $\alpha$.

$$ x_i = x_i + \beta_0 e^{-\gamma r_{ij}^2}(x_j - x_i) + \alpha\left(\mathrm{rand} - \tfrac{1}{2}\right) \tag{4} $$

Furthermore, variations of attractiveness are determined by $\gamma$, which in turn affects the behavior and convergence speed of FA.

4. Effort Estimation Models
One of the famous and widely used effort estimation models is the Constructive Cost Model (COCOMO) and its extension COCOMO II. COCOMO, also known as COCOMO 81, is used as a cost, effort, and schedule estimation model in the process of planning a new software development activity. COCOMO was defined between the late 1970s and early 1980s [23], whereas COCOMO II is a later extension of the previously defined model. This research work tries to optimize the parameters of three variations of the COCOMO model. The first is the basic COCOMO model, which is represented in Equation (5).

$$ E = a (KLOC)^b \tag{5} $$

E is the effort in person-months, and KLOC represents the thousands (K) of lines of code included in a software project. Typically, the coefficient a and the exponent b are chosen based on COCOMO pre-set parameters that depend on the software project details.

The other two models are extensions of the basic COCOMO model which are proposed by A. Sheta [24]. Both models consider the effect of methodologies (ME), which is assumed to be linearly related to the software effort. These models are represented in Equations (6) and (7) and named Model I and Model II respectively.

$$ E = a (KLOC)^b + c (ME) \tag{6} $$

$$ E = a (KLOC)^b + c (ME) + d \tag{7} $$

This work tries empirically to optimize the constants a, b, c and d using FA, GA and PSO.

5. Data Set and Evaluation Measures
This research considers a famous and public data set in order to produce comparable results, namely the NASA projects' effort data set. The data set is challenging due to the small number of instances and the limited number of analyzed variables. However, regarding the objectives of this research, the data set is considered to be adequate. The data set is split into two parts: a training set of about 70% and a testing set of about 30% of the instances.

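The three model forms of Equations (5)-(7) and the training/testing split can be sketched in a few lines of Python. The function names are illustrative, and the coefficients in the demo (a = 2.4, b = 1.05, the textbook COCOMO 81 organic-mode values, plus hypothetical c and d) are assumptions for illustration only, not the values fitted in this work.

```python
def basic_cocomo(kloc, a, b):
    """Equation (5): E = a * KLOC^b."""
    return a * kloc ** b


def model_1(kloc, me, a, b, c):
    """Equation (6), Model I: E = a * KLOC^b + c * ME."""
    return a * kloc ** b + c * me


def model_2(kloc, me, a, b, c, d):
    """Equation (7), Model II: E = a * KLOC^b + c * ME + d."""
    return a * kloc ** b + c * me + d


# Split by project number: projects 1-13 for training, 14-18 for testing.
project_ids = list(range(1, 19))
train_ids, test_ids = project_ids[:13], project_ids[13:]

if __name__ == "__main__":
    # Illustrative coefficients only; c and d are hypothetical.
    print(basic_cocomo(10.0, a=2.4, b=1.05))
    print(model_1(10.0, 30.0, a=2.4, b=1.05, c=-0.1))
    print(model_2(10.0, 30.0, a=2.4, b=1.05, c=-0.1, d=1.0))
    print(len(train_ids), len(test_ids))
```

Model II reduces to Model I when d = 0, and Model I reduces to the basic model when c = 0, so the three forms are nested and can share one optimizer with a different number of free parameters.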
The NASA data set [6] consists of 18 software projects, for which this research considers three main variables: the project size in thousands of Lines of Code (KLOC), Methodology (ME), and Actual Effort (AE). The training data set has 13 instances, and the records from 14 to 18 are used for testing the model. Table 1 shows the actual values of the training and testing data sets.

Table 1. NASA data set.

    Project No.   KDLOC    ME    Measured Effort
    1             90.2     30    115.8
    2             46.2     20    96
    3             46.5     19    79
    4             54.5     20    90.8
    5             31.1     35    39.6
    6             67.5     29    98.4
    7             12.8     26    18.9
    8             10.5     34    10.3
    9             21.5     31    28.5
    10            3.1      26    7
    11            4.2      19    9
    12            7.8      31    7.3
    13            2.1      28    5
    14            5        29    8.4
    15            78.6     35    98.7
    16            9.7      27    15.6
    17            12.5     27    23.9
    18            100.8    34    138.3

In order to check the performance of the developed models, the computed measures are the Correlation Coefficient ($R^2$), the Mean Squares Error (MSE), the Mean Absolute Error (MAE), the Mean Magnitude of Relative Error (MMRE), and the Variance-Accounted-For (VAF):

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{8} $$

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{9} $$

$$ MAE = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \tag{10} $$

$$ MMRE = \frac{1}{n} \sum_{i=1}^{n} \frac{\lvert y_i - \hat{y}_i \rvert}{y_i} \tag{11} $$

$$ VAF = \left[ 1 - \frac{\mathrm{var}\left(y(t) - \hat{y}(t)\right)}{\mathrm{var}\left(y(t)\right)} \right] \times 100\% \tag{12} $$

These performance criteria are used to measure how close the predicted effort is to the actual values, where $y$ is the actual value, $\hat{y}$ is the estimated target value, $\bar{y}$ is the mean of the actual values, and $n$ is the number of instances.

6. Experiments and Results
The experiments apply FA, GA and PSO for optimizing the coefficients of the basic COCOMO model, COCOMO Model I and COCOMO Model II based on the training part of the NASA data set. For FA, the Matlab implementation developed by X.-S. Yang [9] is applied. The number of fireflies, the number of particles and the population size are unified and set to 100 in all the algorithms. The number of iterations is set to 500. The rest of the parameters of FA, GA and PSO are set as listed in Tables 2-4. The MAE criterion, shown in Equation (10), is used as the objective function. In order to obtain meaningful evaluation results, each algorithm is applied 25 times, and the average of the evaluation results is reported. In each run, the optimized models are evaluated on the testing data using the VAF, MSE, MAE, MMRE, RMSE and $R^2$ evaluation metrics, where RMSE is the square root of MSE. The average convergence curves for FA, GA and PSO are shown in Figures 1-3 respectively for the three variations of the COCOMO model.

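The evaluation measures of Equations (8)-(12), together with RMSE, can be sketched with the Python standard library as follows. The function names are illustrative assumptions; the variance in VAF is computed as a population variance, which does not affect the ratio because both variances use the same number of instances.

```python
import math
from statistics import mean, pvariance


def r_squared(y, y_hat):
    """Equation (8): 1 - SSE / SST."""
    y_bar = mean(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    sst = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - sse / sst


def mse(y, y_hat):
    """Equation (9): mean of squared errors."""
    return mean([(a - p) ** 2 for a, p in zip(y, y_hat)])


def rmse(y, y_hat):
    """Square root of the MSE."""
    return math.sqrt(mse(y, y_hat))


def mae(y, y_hat):
    """Equation (10): mean of absolute errors."""
    return mean([abs(a - p) for a, p in zip(y, y_hat)])


def mmre(y, y_hat):
    """Equation (11): mean of absolute errors relative to the actual values."""
    return mean([abs(a - p) / a for a, p in zip(y, y_hat)])


def vaf(y, y_hat):
    """Equation (12): variance accounted for, in percent."""
    residuals = [a - p for a, p in zip(y, y_hat)]
    return (1.0 - pvariance(residuals) / pvariance(y)) * 100.0


if __name__ == "__main__":
    # Hypothetical actual and predicted efforts, for illustration only.
    actual = [10.0, 20.0, 30.0]
    predicted = [12.0, 18.0, 33.0]
    for name, fn in [("R2", r_squared), ("MSE", mse), ("RMSE", rmse),
                     ("MAE", mae), ("MMRE", mmre), ("VAF", vaf)]:
        print(name, fn(actual, predicted))
```

Because VAF subtracts the mean of the residuals before squaring, it ignores a constant bias in the predictions, whereas $R^2$ as defined in Equation (8) penalizes it; VAF/100 is therefore never smaller than $R^2$ on the same data.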
Table 2. Firefly algorithm parameter settings.

    Parameter             Value
    Maximum iterations    500
    Number of fireflies   100
    Alpha                 0.4
    Betamin
    Gamma                 0.4

Table 3. GA parameter settings.

    Parameter               Value
    Maximum iterations      500
    Population size         100
    Selection method        Tournament selection
    Crossover probability   80%
    Mutation probability    5%

Table 4. PSO parameter settings.

    Parameter               Value
    Maximum iterations      500
    Particles               100
    Acceleration constant   [2.1, 2.1]
    Inertia weight          [0.9, 0.6]
    Maximum velocity        100

Figure 1. Convergence of FA, GA and PSO in optimizing the basic COCOMO model.

Figure 2. Convergence of FA, GA and PSO in optimizing Model I.

Figure 3. Convergence of FA, GA and PSO in optimizing Model II.

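The optimization loop behind such convergence curves can be sketched as follows. This is a minimal illustration of Algorithm 1 and Equation (4) applied to the basic COCOMO model with the MAE objective, and it is not the Matlab implementation used in the experiments. The search bounds, the value beta0 = 1.0, the clipping of positions to the bounds, the fixed random seed, and the reduced swarm size and iteration count (20 and 100 instead of 100 and 500) are all assumptions made to keep the example small. Since MAE is minimized, a lower cost plays the role of a higher brightness. The training rows are the KLOC and effort values of projects 1-13 in Table 1.

```python
import math
import random

# (KLOC, actual effort) of the 13 training projects, as read from Table 1.
TRAIN = [(90.2, 115.8), (46.2, 96.0), (46.5, 79.0), (54.5, 90.8),
         (31.1, 39.6), (67.5, 98.4), (12.8, 18.9), (10.5, 10.3),
         (21.5, 28.5), (3.1, 7.0), (4.2, 9.0), (7.8, 7.3), (2.1, 5.0)]


def mae_basic_cocomo(params, data=TRAIN):
    """MAE (Equation (10)) of the basic COCOMO model E = a * KLOC^b."""
    a, b = params
    return sum(abs(effort - a * kloc ** b) for kloc, effort in data) / len(data)


def firefly_minimize(cost, bounds, n_fireflies=20, n_iter=100,
                     alpha=0.4, beta0=1.0, gamma=0.4, seed=0):
    """Minimize `cost` inside the box `bounds` with a basic firefly loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds]
             for _ in range(n_fireflies)]
    costs = [cost(x) for x in swarm]
    history = []  # best cost after each iteration (convergence curve)
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # firefly j is brighter than i
                    r2 = sum((swarm[i][k] - swarm[j][k]) ** 2
                             for k in range(dim))
                    beta = beta0 * math.exp(-gamma * r2)
                    for k in range(dim):
                        lo, hi = bounds[k]
                        # Equation (4): attraction plus random perturbation.
                        step = (beta * (swarm[j][k] - swarm[i][k])
                                + alpha * (rng.random() - 0.5))
                        swarm[i][k] = min(max(swarm[i][k] + step, lo), hi)
                    costs[i] = cost(swarm[i])
        history.append(min(costs))
    best = min(range(n_fireflies), key=costs.__getitem__)
    return swarm[best], costs[best], history


if __name__ == "__main__":
    # Hypothetical search box for (a, b).
    (a, b), best_mae, history = firefly_minimize(
        mae_basic_cocomo, bounds=[(0.0, 10.0), (0.3, 2.0)])
    print(f"a={a:.3f}, b={b:.3f}, training MAE={best_mae:.2f}")
```

In this sketch the currently best firefly never moves, because no other firefly is brighter, so the recorded best cost cannot increase from one iteration to the next. Averaging `history` over several seeds would give a convergence curve comparable in kind to those in Figures 1-3.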
The evaluation results for the training and testing cases are shown in Tables 5-7. Based on Table 5 and Table 6, it can be noticed that Firefly outperforms GA and PSO on the testing data in optimizing the basic COCOMO model and Model I by nearly all evaluation metrics. For Model II, Firefly and PSO are very competitive and have very close results. On the other hand, GA has the lowest results and the slowest convergence. In summary, FA as a metaheuristic optimization algorithm outperforms GA and PSO in terms of higher estimation accuracy for the COCOMO-based software effort models.

Table 5. Basic COCOMO model.

            Training                        Testing
            Firefly   GA        PSO         Firefly   GA        PSO
    VAF     93.82%    93.72%    93.73%      98.16%    97.97%    97.98%
    MSE     104.88    107.28    107.15      59.14     63.96     63.68
    MAE     7.04      7.03      7.03        5.65      6.06      6.04
    MMRE    0.24      0.24      0.24        0.11      0.13      0.12
    RMSE    10.24     10.36     10.35       7.67      8.00      7.98
    R^2     0.9367    0.9352    0.9353      0.9781    0.9763    0.9765

Table 6. COCOMO Model I.

            Training                        Testing
            Firefly   GA        PSO         Firefly   GA        PSO
    VAF     96.78%    92.94%    96.96%      98.62%    97.97%    98.52%
    MSE     56.05     127.70    54.16       47.74     98.17     60.07
    MAE     5.42      8.94      5.6         5.56      7.70      5.63
    MMRE    0.41      0.53      0.39        0.24      0.29      0.23
    RMSE    7.48      10.95     7.36        6.82      9.39      7.72
    R^2     0.9662    0.9229    0.9673      0.9823    0.9637    0.9778

Table 7. COCOMO Model II.

            Training                        Testing
            Firefly   GA        PSO         Firefly   GA        PSO
    VAF     96.95%    92.42%    97.48%      98.63%    97.60%    98.70%
    MSE     53.74     129.37    45.28       45.02     114.79    52.85
    MAE     5.36      8.20      4.43        5.57      7.83      5.29
    MMRE    0.38      0.40      0.30        0.24      0.27      0.21
    RMSE    7.26      11.05     6.72        6.62      9.86      7.19
    R^2     0.9676    0.9219    0.9727      0.9833    0.9575    0.9805

7. Conclusion and Future Work
This work investigated the efficiency of applying the Firefly Algorithm as a metaheuristic optimization technique to optimize the parameters of different effort estimation models. These models are three variations of the Constructive Cost Model (COCOMO): the basic COCOMO model and two extensions of the basic model that were proposed previously in the literature. The optimized models are assessed according to different evaluation criteria and compared with models optimized using other metaheuristic algorithms, namely Genetic Algorithm and Particle Swarm Optimization. Evaluation results show that the models developed using the Firefly Algorithm have higher accuracy in estimating software effort.

Future work is intended to overcome the stability issues, to develop a more generic prediction model that is not highly affected by the size and the type of the data set, and preferably to enhance the Firefly Algorithm itself. Moreover, it would be important to work towards a hybrid approach that encompasses the best characteristics of different prediction schemes.

References
[1] Molokken, K. and Jorgensen, M. (2003) A Review of Software Surveys on Software Effort Estimation. 2003 International Symposium on Empirical Software Engineering, 30 September-1 October 2003, 223-230.
[2] Song, Q. and Shepperd, M. (2011) Predicting Software Project Effort: A Grey Relational Analysis Based Method. Expert Systems with Applications, 38, 7302-7316. http://dx.doi.org/10.1016/j.eswa.2010.12.005
[3] Khatibi, V. and Jawawi, D.N. (2011) Software Cost Estimation Methods: A Review. Journal of Emerging Trends in Computing and Information Sciences, 2, 21-29.
[4] Afzal, W. and Torkar, R. (2011) On the Application of Genetic Programming for Software Engineering Predictive Modeling: A Systematic Review. Expert Systems with Applications, 38, 11984-11997. http://dx.doi.org/10.1016/j.eswa.2011.03.041
[5] Albrecht, A. and Gaffney, J.E. (1983) Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation. IEEE Transactions on Software Engineering, SE-9, 639-648. http://dx.doi.org/10.1109/tse.1983.235271
[6] Bailey, J.W. and Basili, V.R. (1981) A Meta-Model for Software Development Resource Expenditures. Proceedings of the 5th International Conference on Software Engineering, Piscataway, 107-116.
[7] Ruchika Malhotra, A.J. (2011) Software Effort Prediction Using Statistical and Machine Learning Methods. International Journal of Advanced Computer Science and Applications (IJACSA), 2.
[8] Yadav, C.S. and Singh, R. (2014) Tuning of Cocomo Model Parameters for Estimating Software Development Effort Using GA for Promise Project Data Set. International Journal of Computer Applications, 90, 37-43. http://dx.doi.org/10.5120/15542-4367
[9] Wang, F.-S. and Chen, L.-H. (2013) Heuristic Optimization. In: Dubitzky, W., Wolkenhauer, O., Cho, K.-H. and Yokota, H., Eds., Encyclopedia of Systems Biology, Springer, New York, 885-885.
[10] Uysal, M. (2010) Estimation of the Effort Component of the Software Projects Using Heuristic Algorithms. INTECH Open Access Publisher, Croatia.
[11] Alaa, F. and Al-Afeef, A. (2010) A GP Effort Estimation Model Utilizing Line of Code and Methodology for NASA Software Projects. IEEE 10th International Conference on Intelligent Systems Design and Applications (ISDA), Cairo, 29 November-1 December 2010, 290-295.
[12] Bhattacharya, P., Srivastava, P. and Prasad, B. (2012) Software Test Effort Estimation Using Particle Swarm Optimization. In: Satapathy, S., Avadhani, P. and Abraham, A., Eds., Proceedings of the International Conference on Information Systems Design and Intelligent Applications 2012 (INDIA 2012), Visakhapatnam, January 2012, Vol. 132 of Advances in Intelligent and Soft Computing, 827-835. Springer, Berlin and Heidelberg.
[13] Maleki, I., Ghaffari, A. and Masdari, M. (2014) A New Approach for Software Cost Estimation with Hybrid Genetic Algorithm and Ant Colony Optimization. International Journal of Innovation and Applied Studies, 5, 72-81.
[14] Yang, X.-S. and Deb, S. (2009) Cuckoo Search via Levy Flights. World Congress on Nature Biologically Inspired Computing, NaBIC 2009, Coimbatore, 9-11 December 2009, 210-214.
[15] Kennedy, J. and Eberhart, R. (1995) Particle Swarm Optimization. IEEE International Conference on Neural Networks, 4, 1942-1948.
[16] Yang, X.-S. (2010) A New Metaheuristic Bat-Inspired Algorithm. In: González, J., Pelta, D., Cruz, C., Terrazas, G. and Krasnogor, N., Eds., Nature Inspired Cooperative Strategies for Optimization (NICSO 2010), Vol. 284 of Studies in Computational Intelligence, 65-74. Springer, Berlin and Heidelberg.
[17] Yang, X.-S. (2009) Firefly Algorithms for Multimodal Optimization. In: Watanabe, O. and Zeugmann, T., Eds., Stochastic Algorithms: Foundations and Applications, Vol. 5792 of Lecture Notes in Computer Science, 169-178. Springer, Berlin and Heidelberg.
[18] Sheta, A., Rine, D. and Ayesh, A. (2008) Development of Software Effort and Schedule Estimation Models Using Soft Computing Techniques. IEEE Congress on Evolutionary Computation, CEC 2008 (IEEE World Congress on Computational Intelligence), Hong Kong, 1-6 June 2008, 1283-1289.
[19] Oliveira, A.L.I., Braga, P.L., Lima, R.M.F. and Cornélio, M.L. (2010) GA-Based Method for Feature Selection and Parameters Optimization for Machine Learning Regression Applied to Software Effort Estimation. Information and Software Technology, 52, 1155-1166. http://dx.doi.org/10.1016/j.infsof.2010.05.009
[20] Huang, X., Ho, D., Ren, J. and Capretz, L. (2006) A Soft Computing Framework for Software Effort Estimation. Soft Computing, 10, 170-177. http://dx.doi.org/10.1007/s00500-004-0442-z
[21] Dave, V. and Dutta, K. (2014) Neural Network Based Models for Software Effort Estimation: A Review. Artificial Intelligence Review, 42, 295-307. http://dx.doi.org/10.1007/s10462-012-9339-x
[22] Keung, J., Kocaguneli, E. and Menzies, T. (2013) Finding Conclusion Stability for Selecting the Best Effort Predictor in Software Effort Estimation. Automated Software Engineering, 20, 543-567. http://dx.doi.org/10.1007/s10515-012-0108-5
[23] Boehm, B., Abts, C., Clark, B. and Devnani-Chulani, S. (1997) COCOMO II Model Definition Manual. University of Southern California, Los Angeles.
[24] Sheta, A.F. (2006) Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects. Journal of Computer Science, 2, 118-123. http://dx.doi.org/10.3844/jcssp.2006.118.123