The Journal of Systems and Software

The Journal of Systems and Software 82 (2009)
journal homepage: www.elsevier.com/locate/jss

A study of project selection and feature weighting for analogy based software cost estimation

Y.F. Li *, M. Xie, T.N. Goh
Department of Industrial and Systems Engineering, National University of Singapore, Singapore
* Corresponding author. E-mail address: lyanfu@nus.edu.sg (Y.F. Li).

Article history: Received 14 May 2007; received in revised form 4 June 2008; accepted 4 June 2008; available online 17 June 2008.

Keywords: Software cost estimation; Analogy based estimation; Feature weighting; Project selection; Genetic algorithm; Artificial datasets

Abstract

A number of software cost estimation methods have been presented in the literature over the past decades. Analogy based estimation (ABE), which is essentially a case based reasoning (CBR) approach, is one of the most popular techniques. In order to improve the performance of ABE, many previous studies proposed effective approaches to optimize the weights of the project features (feature weighting) in its similarity function. However, ABE is still criticized for its low prediction accuracy, large memory requirement, and expensive computation cost. To alleviate these drawbacks, in this paper we propose the project selection technique for ABE (PSABE), which reduces the whole project base into a small subset that consists only of representative projects. Moreover, PSABE is combined with feature weighting to form FWPSABE for a further improvement of ABE. The proposed methods are validated on four datasets (two real-world sets and two artificial sets) and compared with conventional ABE, feature weighted ABE (FWABE), and machine learning methods. The promising results indicate that the project selection technique could significantly improve analogy based models for software cost estimation.
© 2008 Elsevier Inc. All rights reserved.

1. Introduction

Software cost estimation is critical for the success of software project management.
It affects almost all management activities including resource allocation, project bidding, and project planning (Pendharkar et al., 2005; Auer et al., 2006; Jorgensen and Shepperd, 2007). The importance of accurate estimation has led to extensive research efforts on software cost estimation methods. From a comprehensive review (Boehm et al., 2000), these methods could be classified into the following six categories: parametric models including COCOMO (Boehm, 1981; Huang et al., 2007), SLIM (Putnam and Myers, 1992), and SEER-SEM (Jensen, 1983); expert judgment including the Delphi technique (Helmer, 1966) and work breakdown structure based methods (Tausworthe, 1980; Jorgensen, 2004); learning oriented techniques including machine learning methods (Heiat, 2002; Shin and Goel, 2000; Oliveira, 2006) and analogy based estimation (Shepperd and Schofield, 1997; Auer et al., 2006; Huang and Chiu, 2006); regression based methods including ordinary least squares regression (Mendes et al., 2005; Costagliola et al., 2005) and robust regression (Miyazaki et al., 1994); dynamics based models (Madachy, 1994); and composite methods (Chulani et al., 1999; MacDonell and Shepperd, 2003).

Analogy based estimation (ABE), which is essentially a case-based reasoning (CBR) approach (Shepperd and Schofield, 1997), was first proposed by Sternberg (1977). Due to its conceptual simplicity and empirical competitiveness, ABE has been extensively studied and applied (Shepperd and Schofield, 1997; Walkerden and Jeffery, 1999; Angelis and Stamelos, 2000; Mendes et al., 2003; Auer et al., 2006; Huang and Chiu, 2006; Chiu and Huang, 2007). The basic idea of ABE is simple: when provided with a new project for estimation, compare it with historical projects to retrieve the most similar projects, which are then used to predict the cost of the new project. Generally, ABE (or CBR) consists of four parts: a historical project dataset, a similarity function, a solution function and the associated retrieval rules (Kolodner, 1993).
One of the central parts of ABE is the similarity function, which measures the level of similarity between two different projects. Since each project feature (or cost driver) has one position in the similarity function and therefore largely determines which historical projects are retrieved for the final prediction, several approaches focus on searching for the appropriate weight of each feature, such as Shepperd and Schofield (1997), Walkerden and Jeffery (1999), Angelis and Stamelos (2000), Mendes et al. (2003), Auer et al. (2006), and Huang and Chiu (2006).

However, ABE methods still face some difficulties, such as the non-normal characteristics (including skewness, heteroscedasticity and excessive outliers) of software engineering datasets (Pickard et al., 2001) and the increasing sizes of the datasets (Shepperd and Kadoda, 2001). Large and non-normal datasets often lead ABE methods to low prediction accuracy and high computational expense (Huang et al., 2002). To alleviate these drawbacks, many research works in the CBR literature (Lipowezky; Babu and Murty, 2001; Huang et al., 2002) have been devoted to the case selection technique. The objective of case selection (CS) is to identify and remove redundant and noisy projects. By reducing the whole project base into a smaller subset that consists only of representative projects, CS could save the computing time spent searching for the most similar projects and produce quality prediction results. Moreover, the simultaneous optimization of feature weighting and case selection in CBR has been investigated in several studies (Kuncheva and Jain, 1999; Rozsypal and Kubat, 2003; Ahn et al., 2006), and significant improvements are reported in these studies.

From the discussion above, it is worthwhile to investigate the case selection technique in the context of analogy based software cost estimation. In this study, we propose a genetic algorithm for project selection for ABE (PSABE) and the simultaneous optimization of feature weights and project selection for ABE (FWPSABE). The two proposed techniques are compared against the feature weighting ABE (FWABE), the conventional ABE and other popular cost estimation methods including ANN, RBF, SVM and CART. For consistency of terminology, in the rest of this paper we refer to case selection as project selection for ABE.

To compare different estimation methods, empirical validation is crucial. This has led many studies to use various real datasets to compare different cost estimation methods. However, most published real datasets are relatively small (Mair et al., 2005), and a small real dataset could be problematic if we would like to show significant differences between the estimation methods. Another drawback of real-world datasets is that their true properties may not be fully known. Artificially generated datasets (Pickard et al., 2001; Shepperd and Kadoda, 2001; Foss et al., 2003; Myrtveit et al., 2005) with known characteristics provide a feasible way to address the above problems.
Thus, we generate two artificial datasets and select two well known real-world datasets for controlled experiments.

The rest of this paper is organized as follows: Section 2 presents a brief overview of the conventional ABE method. In Section 3, the general framework of the feature weighting and project selection system for ABE is described. Section 4 presents the real-world datasets and the experiment design. In Section 5, the results on the two real-world datasets are summarized and analyzed. In Section 6, two artificial datasets are generated, experiments are conducted on these two datasets, and the results are summarized and analyzed. The final section presents the conclusion and future work.

2. Overview on analogy based cost estimation

The analogy based method is a pure form of case based reasoning (CBR) with no expert used. Generally, an ABE model comprises four components: a historical dataset, a similarity function, a solution function and the associated retrieval rules (Kolodner, 1993). The ABE system process also consists of four stages:

1. Collect the past projects' information and prepare the historical dataset.
2. Select the new project's relevant features, such as function points (FP) and lines of source code (LOC), which are also collected for past projects.
3. Retrieve the past projects, estimate the similarities between the new project and the past projects, and find the most similar past projects. The commonly used similarities are functions of the weighted Euclidean distance and the weighted Manhattan distance.
4. Predict the cost of the new project from the chosen analogues by the solution function. Generally the un-weighted average is used as the solution function.

The historical dataset, which keeps all information of past projects, is a key component of an ABE system. However, it often contains noisy or redundant projects. By reducing the whole historical dataset into a smaller but more representative subset, the project selection technique positively affects conventional ABE systems.
First, it reduces the search space, so fewer computing resources are spent searching for the most similar projects. Secondly, it also produces quality predictions because it may eliminate noise in the historical dataset. In the following sections, the other components of an ABE system, including the similarity function, the number of most similar projects, and the solution function, are presented.

2.1. Similarity function

The similarity function measures the level of similarity between projects. Among different types of similarity functions, Euclidean similarity (ES) and Manhattan similarity (MS) based similarities are widely accepted (ES: Shepperd and Schofield, 1997; MS: Chiu and Huang, 2007). The Euclidean similarity is based on the Euclidean distance between two projects:

\[
\mathrm{Sim}(p, p') = \frac{1}{\sqrt{\sum_{i=1}^{n} w_i\, \mathrm{Dis}(f_i, f_i')} + \delta}, \qquad \delta = 0.0001
\]
\[
\mathrm{Dis}(f_i, f_i') =
\begin{cases}
(f_i - f_i')^2, & \text{if } f_i \text{ and } f_i' \text{ are numeric or ordinal} \\
1, & \text{if } f_i \text{ and } f_i' \text{ are nominal and } f_i = f_i' \\
0, & \text{if } f_i \text{ and } f_i' \text{ are nominal and } f_i \neq f_i'
\end{cases}
\tag{1}
\]

where $p$ and $p'$ denote the projects, $f_i$ and $f_i'$ denote the $i$th feature value of their corresponding projects, $w_i \in [0, 1]$ is the weight of the $i$th feature, $\delta = 0.0001$ is a small constant that prevents the denominator from being 0, and $n$ is the total number of features.

The Manhattan similarity is based on the Manhattan distance, which is the sum of the absolute distances for each pair of features:

\[
\mathrm{Sim}(p, p') = \frac{1}{\sum_{i=1}^{n} w_i\, \mathrm{Dis}(f_i, f_i') + \delta}, \qquad \delta = 0.0001
\]
\[
\mathrm{Dis}(f_i, f_i') =
\begin{cases}
|f_i - f_i'|, & \text{if } f_i \text{ and } f_i' \text{ are numeric or ordinal} \\
1, & \text{if } f_i \text{ and } f_i' \text{ are nominal and } f_i = f_i' \\
0, & \text{if } f_i \text{ and } f_i' \text{ are nominal and } f_i \neq f_i'
\end{cases}
\tag{2}
\]

An important issue in the similarity functions is how to assign an appropriate weight $w_i$ to each feature pair, because each feature may have a different relevance to the project cost. In the literature, several approaches focus on this topic: Shepperd and Schofield (1997) set each weight to be either 1 or 0 and then apply a brute-force approach to choose the optimal weights; Auer et al. (2006) extend Shepperd and Schofield's approach to the flexible extensive search method.
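As an illustration of Eqs. (1) and (2), the following Python sketch computes both similarities for two projects given as equal-length feature lists. The function names and the `nominal_flags` argument are hypothetical and not part of the paper. One assumption is stated loudly: the sketch treats matching nominal values as distance 0 and mismatching ones as distance 1, the usual convention for a distance, which reverses the two nominal cases as printed in Eqs. (1) and (2).

```python
import math

DELTA = 0.0001  # small constant from Eqs. (1)-(2); keeps the denominator non-zero


def feature_distance(a, b, nominal=False, squared=True):
    """Per-feature distance Dis(f_i, f'_i).

    Assumption (not the printed case order): nominal features contribute
    0 when equal and 1 when different.
    """
    if nominal:
        return 0.0 if a == b else 1.0
    diff = a - b
    return diff * diff if squared else abs(diff)


def euclidean_similarity(p, q, weights, nominal_flags):
    """Similarity based on the weighted Euclidean distance, as in Eq. (1)."""
    total = sum(
        w * feature_distance(a, b, nom, squared=True)
        for a, b, w, nom in zip(p, q, weights, nominal_flags)
    )
    return 1.0 / (math.sqrt(total) + DELTA)


def manhattan_similarity(p, q, weights, nominal_flags):
    """Similarity based on the weighted Manhattan distance, as in Eq. (2)."""
    total = sum(
        w * feature_distance(a, b, nom, squared=False)
        for a, b, w, nom in zip(p, q, weights, nominal_flags)
    )
    return 1.0 / (total + DELTA)
```

Under these assumptions, two identical projects reach the maximum similarity of roughly 1/δ, and a feature with weight 0 has no effect on the result, which is how 0/1 weighting schemes switch features off.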
Walkerden and Jeffery (1999) use human judgment to determine the feature weights; Angelis and Stamelos (2000) choose a value generated from statistical analysis as the feature weights. More recently, Huang and Chiu (2006) propose the genetic algorithm to optimize feature weights.

2.2. K number of similar projects

This parameter refers to the number K of most similar projects, that is, the projects closest to the project being estimated. Some studies suggested K = 1 (Walkerden and Jeffery, 1999; Auer et al., 2006; Chiu and Huang, 2007). However, we set K = {1, 2, 3, 4, 5}, since many studies recommend K equal to two or three (Shepperd and Schofield, 1997; Mendes et al., 2003; Jorgensen et al., 2003; Huang and Chiu, 2006) and K = {1, 2, 3, 4, 5} covers most of the suggested numbers.

2.3. Solution functions

After the K most similar projects are selected, the final prediction for the new project is determined by computing a certain statistic based on the selected projects. The solution functions used in this study are: the closest analogy (most similar project) (Walkerden and Jeffery, 1999), the mean of the most similar projects (Shepperd and Schofield, 1997), the median of the most similar projects (Angelis and Stamelos, 2000) and the inverse distance weighted mean (Kadoda et al., 2000).

The mean is the average of the costs of the K most similar projects, where K > 1. It is a classical measure of central tendency and treats all most similar projects as being equally influential on the cost estimates.

The median is the median of the costs of the K most similar projects, where K > 2. It is another measure of central tendency and a more robust statistic when the number of most similar projects increases (Angelis and Stamelos, 2000).

The inverse distance weighted mean (Kadoda et al., 2000) allows more similar projects to have more influence than less similar ones. The formula for the weighted mean is shown in (3):

\[
\hat{C}_p = \sum_{k=1}^{K} \frac{\mathrm{Sim}(p, p_k)}{\sum_{i=1}^{K} \mathrm{Sim}(p, p_i)}\, C_{p_k} \tag{3}
\]

where $p$ denotes the new project being estimated, $p_k$ represents the $k$th most similar project, $\mathrm{Sim}(p, p_k)$ is the similarity between projects $p_k$ and $p$, $C_{p_k}$ is the cost value of the $k$th most similar project $p_k$, and $K$ is the total number of most similar projects.

3. Project selection and feature weighting

In this section, we construct the FWPSABE system (feature weighting and project selection analogy based estimation), which can perform feature weighting analogy based estimation (FWABE) alone, project selection analogy based estimation (PSABE) alone, and the simultaneous optimization of feature weights and project selection (FWPSABE).
Genetic algorithm (Holland, 1975) is selected as the optimization tool for the FWPSABE system, since it is a robust global optimization technique and has been applied to optimize model parameters in several cost estimation papers (Dolado, 2000; Shukla, 2000; Dolado, 2001; Huang and Chiu, 2006). The framework and a detailed description of the FWPSABE system are presented in Section 3.2. In order to introduce the fitness function in the GA operation, performance metrics for model accuracy are first presented in Section 3.1.

3.1. Performance metrics

To measure the accuracies of cost estimation methods, three widely used performance metrics are considered: mean magnitude of relative error (MMRE), median magnitude of relative error (MdMRE) and PRED(0.25). The MMRE is defined as

\[
\mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MRE}_i, \qquad \mathrm{MRE}_i = \frac{|C_i - \hat{C}_i|}{C_i} \tag{4}
\]

where $n$ denotes the number of projects, $C_i$ denotes the actual cost of the $i$th project, and $\hat{C}_i$ denotes the estimated cost of the $i$th project. A small MMRE value indicates a low level of estimation error. However, this metric is unbalanced and penalizes overestimation more than underestimation.

The MdMRE is the median of all the MREs:

\[
\mathrm{MdMRE} = \mathrm{median}(\mathrm{MRE}_i) \tag{5}
\]

It exhibits a similar pattern to MMRE, but it is more likely to select the true model, especially in the underestimation cases, since it is less sensitive to extreme outliers (Foss et al., 2003).

The PRED(0.25) is the percentage of predictions that fall within 25% of the actual cost:

\[
\mathrm{PRED}(q) = \frac{k}{n} \tag{6}
\]

where $n$ denotes the total number of projects and $k$ represents the number of projects whose MRE is less than or equal to $q$. Normally, $q$ is set to 0.25. The PRED(0.25) identifies the cost estimations that are generally accurate, while MMRE is biased and not always reliable as a performance metric. However, MMRE has been the de facto standard in the software cost estimation literature. Thus, MMRE is selected for the fitness function in GA. More specifically, for each chromosome generated in GA, MMRE is computed across the training dataset.
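The three metrics in Eqs. (4) to (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names are hypothetical, and actual costs are assumed to be strictly positive so that the MRE is defined.

```python
from statistics import median


def mre(actual, estimated):
    """Magnitude of relative error for one project (actual cost must be > 0)."""
    return abs(actual - estimated) / actual


def mmre(actuals, estimates):
    """Mean magnitude of relative error, Eq. (4)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def mdmre(actuals, estimates):
    """Median magnitude of relative error, Eq. (5)."""
    return median(mre(a, e) for a, e in zip(actuals, estimates))


def pred(actuals, estimates, q=0.25):
    """Fraction of projects whose MRE is at most q, Eq. (6)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for err in errors if err <= q) / len(errors)
```

For example, with actual costs `[100, 200, 400]` and estimates `[110, 150, 800]` the individual MREs are 0.1, 0.25 and 1.0. The single large overestimate dominates the mean but not the median. For non-negative estimates the MRE of an underestimate can never exceed 1, while that of an overestimate is unbounded, which is one way to see the imbalance noted in the text.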
Then GA searches through the parameter space to minimize MMRE.

3.2. GA for project selection and feature weighting

The procedure of project selection and feature weighting via genetic algorithm is presented in this section. The system consists of two stages: the first one is the training stage (as shown in Fig. 2) and the second is the testing stage (as shown in Fig. 3). In the training stage, a set of training projects is presented to the system, the ABE model is configured by the candidate parameters (feature weights and selection codes) to produce the cost predictions, and GA explores the parameter space to minimize the error (in terms of MMRE) of ABE on the training projects by the following steps:

i. Encoding. To apply GA for optimization, the candidate parameters are coded as a binary code chromosome. As shown in Fig. 1, each individual chromosome consists of two parts. The first part is the codes for feature weights with a length of $14 \times n$, where $n$ is the number of features. Since the feature weights in the ABE model are decimal numbers, the binary codes have to be transformed into decimal values before entering the ABE model. As many authors (Michalewicz, 1996; Ahn et al., 2006) suggested, the precision of the feature weights is set to 1/10,000. Thus, 14 binary bits are required to express this precision level because $8192 = 2^{13} < 10{,}000 \le 2^{14} = 16{,}384$. After transformation, all decimal weight values are normalized into the interval [0, 1] by the following formula (Michalewicz, 1996):

\[
w_i = \frac{w_i'}{2^{14} - 1} = \frac{w_i'}{16{,}383} \tag{7}
\]

where $w_i'$ is the decimal conversion of the $i$th feature's binary weight. For example, the binary code for feature 1 of the sample chromosome in Fig. 1 is $(10000000000001)_2$. Its decimal value is $(8193)_{10}$ and its normalized value is $8193/16{,}383 \approx 0.5001$. The second part of the codes is for project selection. The value of each bit is set to be either 0 or 1: 0 means the corresponding project is not selected and 1 means it is selected. The length of the second part is $m$, where $m$ is the total number of projects in the historical project base.

ii. Population generation.
After the encoding of the individual chromosome, the algorithm generates a population of chromosomes. For the GA process, a larger population size often results in a higher chance of a good solution (Doval et al., 1999). Since GA is computationally expensive, a trade-off between the convergence time and the population size must be made. In general, the minimum effective population size grows with problem size. Based on previous works (Huang and Chiu, 2006; Chiu and Huang, 2007), the size of the population is set to 10V, where V is the total number of input variables of the GA search, which partially reflects the problem size.

iii. Fitness function. Each individual chromosome is evaluated by the fitness function in GA. As mentioned in Section 3.1, MMRE is chosen for the fitness function. Since GA is designed to maximize the fitness function, for the sake of simplicity we set the fitness function as the reciprocal of MMRE:

\[
f = \frac{1}{\mathrm{MMRE}} \tag{8}
\]

Fig. 1. Chromosome for FWPSABE (a feature weighting part covering Feature 1 to Feature n, followed by a project selection part covering projects 1 to m).

Fig. 2. The training stage of FWPSABE.

Fig. 3. The testing stage of FWPSABE.

iv. Fitness evaluation. After transforming the binary chromosomes into the feature weighting and project selection parameters (see step i), the procedures of ABE are executed as follows. Given one training project, the similarities between the training project and the historical projects are computed by applying the feature weights to the similarity functions in (1) or (2). Simultaneously, the project selection part of the chromosome is used to generate the reduced historical project base (reduced PB). Then, ABE uses 1-NN to 5-NN matching to search through the reduced PB for the 1 to 5 most similar historical projects. Finally, the ABE model assigns a prediction value to the training project by adopting the different solution functions. The error metrics MMRE, PRED(0.25), and MdMRE are applied to evaluate the prediction performance on the training project set. Then, the reciprocal of MMRE is used as the fitness value for each parameter combination (or chromosome).

v. Selection. The standard roulette wheel is used to select 10V chromosomes from the current population.

vi. Crossover. The selected chromosomes are consecutively paired. The 1-point crossover operator with a probability of 0.7 is used to produce new chromosomes in each pair. The newly created chromosomes constitute a new population.

vii. Mutation. Each bit of the chromosomes in the new population is chosen to change its value with a probability of 0.1, in such a way that a bit 1 is changed to 0 and a bit 0 is changed to 1.

viii. Elitist strategy. The elitist strategy is used to overcome the slow convergence rate of GA. The elitist strategy retains good chromosomes and ensures they are not eliminated through the mechanism of crossover and mutation.
Under this strategy, if the minimum fitness value of the new population is smaller than that of the old population, then the new chromosome with the minimum fitness value will be replaced with the old chromosome with the maximum fitness value.

ix. Stopping criteria. There are few theoretical guidelines for determining when to terminate the genetic search. Following the previous works (Huang and Chiu, 2006; Chiu and Huang, 2007) on combining GA with the ABE method, steps iv to viii are repeated until the number of generations equals or exceeds 1000V trials or the best fitness value does not change in the past 100V trials. After the stopping criteria are satisfied, the system moves to the second stage and the optimal parameters (or chromosome) are entered into the ABE model for testing.

In the above procedure, the population size, crossover rate, mutation rate and stopping condition are the controlling parameters of the GA search. However, there are few theories to guide the assignments of these values (Ahn et al., 2006). Hence, we determine the values of these parameters in the light of previous studies that combine ABE and GAs. Most prior studies use 10V chromosomes as the population size, their crossover rate ranges from 0.5 to 0.7, and the mutation rate ranges from 0.06 to 0.1 (Ahn et al., 2006; Huang and Chiu, 2006; Chiu and Huang, 2007). However, because the search space for our GA is larger than in these studies, we set the parameters to the higher bounds of those ranges. Thus, in this study the population size is 10V, the crossover rate is set at 0.7 and the mutation rate is 0.1.

The second stage is the testing stage. In this stage the system receives the optimized parameters from the training stage to configure the ABE model. The optimal ABE is then applied to the testing projects to evaluate the trained ABE.

4. Datasets and experiment designs

In this section, two real-world software engineering datasets are first utilized for empirical evaluation of our methods. Additionally, all the cost estimation methods included in our experiments are described in Section 4.2 and the detailed experiment procedure is presented in Section 4.3.

4.1. Dataset preparation

The Albrecht dataset (Albrecht and Gaffney, 1983) includes 24 projects developed by using third generation languages. 18 of the projects were written in COBOL, 4 were written in PL1, and 2 were written in DMS languages. The six independent features of this dataset are input count, output count, query count, file count, function points, and source lines of code. The dependent feature person hours is recorded in 1000 h. The descriptive statistics of all the features are shown in Table 1.

The Desharnais dataset (Desharnais, 1989) includes 81 projects and 11 features, 10 independent and one dependent. Since 4 out of the 81 projects contain missing feature values, they have been excluded from the dataset. This process results in 77 complete projects for our study.
The ten independent features of this dataset are TeamExp, ManagerExp, YearEnd, Length, Transactions, Entities, PointsAdjust, Envergure, PointsNonAjust, and Language. The dependent feature person hours is recorded in 1000 h. The descriptive statistics of all the features are shown in Table 2.

Before the experiments, all types of features are normalized into the interval [0, 1] in order to eliminate their different influences. In addition, the two real datasets (Albrecht and Desharnais) are randomly split into three nearly equal sized subsets for training and testing. The detailed partitions of each dataset are provided in Table 3. The historical dataset is utilized by the ABE model to retrieve similar past projects. The training set is treated as the target for the optimization of feature weights and project subsets. The testing set is exclusively used to evaluate the optimized ABE models.

Table 1. Descriptive statistics for the Albrecht dataset (minimum, maximum, mean and standard deviation of input count, output count, query count, file count, function points, SLOC and person hours).

Table 2. Descriptive statistics for the Desharnais dataset (minimum, maximum, mean and standard deviation of TeamExp, ManagerExp, YearEnd, Length, Language, Transactions, Entities, PointsAdjust, Envergure, PointsNonAjust and person hours).

Table 3. The partition of real datasets
Dataset      Sample size of Albrecht   Sample size of Desharnais
Historical   8                         25
Training     8                         25
Testing      8                         27
Total        24                        77

4.2. Cost estimation methods

Four ABE based models are included in our experiments. The first model is the conventional ABE. The second model is feature weighting analogy based estimation (FWABE), which assigns optimal feature weights via GA (Huang and Chiu, 2006). FWABE does not include the project selection technique. The third model, project selection analogy based estimation (PSABE), uses GA to optimize the historical project subsets. PSABE excludes feature weighting. The fourth model is FWPSABE, which uses GA for simultaneous optimization of feature weighting and project selection. The latter two are proposed in our study.
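To make the search spaces of the three GA based models concrete, the sketch below decodes a chromosome laid out as described in Section 3.2 (14 bits per feature weight, normalized as in Eq. (7), followed by one selection bit per historical project) and applies the 1-point crossover and bit-flip mutation operators with the rates stated there. It is a minimal sketch under stated assumptions, not the authors' code: the function names, the list-of-bits representation and the use of an explicit `rng` are hypothetical, and roulette wheel selection, the elitist strategy and the stopping criteria are left out.

```python
import random

BITS_PER_WEIGHT = 14                  # 2**13 < 10,000 <= 2**14
MAX_CODE = 2 ** BITS_PER_WEIGHT - 1   # 16,383, the denominator in Eq. (7)


def decode(chromosome, n_features, n_projects):
    """Split a list of 0/1 bits into feature weights in [0, 1] and a project mask."""
    expected = BITS_PER_WEIGHT * n_features + n_projects
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    weights = []
    for i in range(n_features):
        gene = chromosome[i * BITS_PER_WEIGHT:(i + 1) * BITS_PER_WEIGHT]
        code = int("".join(str(b) for b in gene), 2)  # binary to decimal
        weights.append(code / MAX_CODE)               # normalization, Eq. (7)
    selected = [bool(b) for b in chromosome[BITS_PER_WEIGHT * n_features:]]
    return weights, selected


def one_point_crossover(a, b, rng, rate=0.7):
    """With probability `rate`, swap the tails of two parents after a random cut."""
    if len(a) > 1 and rng.random() < rate:
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]


def mutate(chromosome, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]
```

In this layout, an FWABE-style chromosome corresponds to `n_projects = 0` (weights only), a PSABE-style chromosome to `n_features = 0` (selection bits only), and FWPSABE uses both parts. Passing a seeded `random.Random` instance as `rng` keeps runs repeatable.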
For a comprehensive evaluation of the proposed models, we compare them with other popular machine learning methods including artificial neural network ANN (Heiat, 2002), radial basis functions RBF (Shin and Goel, 2000), support vector machine regression SVR (Oliveira, 2006), and classification and regression trees CART (Pickard et al., 2001). The best variants of the machine learning methods are obtained by training these methods and tuning their parameters on the historical datasets and training datasets presented in Section 4.1, respectively.

In the ANN model, the number of hidden layers, the number of hidden nodes and the transfer functions are three predefined parameters, and they have a major impact on the prediction performance (Martin et al., 1997). Among these parameters, one hidden layer is often recommended, since multiple hidden layers may lead to an over-parameterized ANN structure. Thus, one hidden layer is utilized in this study. The search spaces for the number of hidden neurons and the hidden layer transfer functions are set to be {1, 3, 5, 7, 9, 10} and {linear, tan-sigmoid, log-sigmoid}, respectively. During the training process, the ANN models with different parameter configurations are first trained on the historical dataset. Then, all ANN structures are applied to the training set and the one producing the lowest MMRE value is selected for the comparisons against ABE models.

For the RBF network, the forward selection strategy is utilized, since forward selection has the advantages of a flexible number of hidden nodes, tractable model selection criteria and relatively low computational expense (Orr, 1996). In this case, the regularization parameter $\lambda$ is introduced. To determine $\lambda$, the search space is defined as $\lambda \in \{10^{i} \mid i = -10, -9, \ldots, 0, \ldots, 10\}$. Similar to the ANN training procedure, all RBFs with different $\lambda$ values are trained on the historical dataset and the one yielding the lowest MMRE on the training data is selected for comparisons.

For the SVR model, the common Gaussian function $K(x, y) = \exp\{-(x - y)^2 \delta^2\}$ is used as the kernel function.
The predefined parameters $\delta$, $C$ and $\varepsilon$ are selected from the same search space $\{10^{i} \mid i = -10, -9, \ldots, 0, \ldots, 10\}$. SVR models with all kinds of parameter combinations (1000 combinations) are trained on the historical dataset. The combination producing the minimal MMRE on the training set is chosen for comparisons.

To train the CART model, we first use the historical set to fit the model and obtain a decision tree T. The tree T is then applied to the training set and returns a vector of cost values computed for the training projects. The cost vector is then used to prune the tree T to a minimized size. The tree with the optimal size is adopted for comparisons.

4.3. Experiment procedure

For the purpose of validation and comparison, the following experiment procedure is conducted.

Firstly, the performance of FWPSABE is investigated by varying the ABE parameters other than feature weights and project subsets. As mentioned in Section 2, ABE has three components exclusive of the historical project base: the similarity function, the K number of most similar projects, and the solution function. In line with the common settings of these parameters, we define the search spaces for the similarity function as {Euclidean distance, Manhattan distance}, for the K number of similar projects as {1, 2, 3, 4, 5}, and for the solution function as {closest analogy, mean, median, inverse distance weighted mean}, respectively. All parameter combinations are executed on both the training dataset and the testing dataset. The best configuration on the training dataset is selected for the comparisons with other cost estimation methods.

Secondly, the other ABE based methods are trained by a procedure similar to the one described in the first step, and the best variants on the training set are selected as the candidates for comparisons. In addition, the optimizations of the machine learning methods are conducted on the training dataset by searching through their parameter spaces.

Thirdly, the training and testing results of the best variants of all estimation methods are summarized and compared. The experiment results and analysis are presented in the next section.

5. Experiment results

Table 4 presents the results of FWPSABE on the Albrecht dataset with the different parameter configurations mentioned in Section 2. The results show that in general the Euclidean distance achieves slightly more accurate performance than the Manhattan distance on both the training and testing datasets. As to the solution function, there is no clear indication of which function is preferable. The choice of the K value has some influence on the accuracies. The smaller errors mostly appear when K = 3 and K = 4. Among all configurations, the setting {Euclidean similarity, K = 4, and mean solution function} produces the best results on the training dataset and so it is selected for the comparisons with other cost estimation methods.

Table 4. Results of FWPSABE on the Albrecht dataset (training and testing MMRE, PRED(0.25) and MdMRE for each combination of similarity function, K value and solution function: CA, mean, IWM, median).

Table 5 summarizes the results of the best variants of all cost estimation methods on the Albrecht dataset. It is observed that FWPSABE achieves the best testing performance (0.30 for MMRE, 0.63 for PRED(0.25) and 0.27 for MdMRE) among all methods, followed by PSABE and FWABE. For a better illustration, the corresponding testing performances are presented in Fig. 4.

Table 5. The results and comparisons on the Albrecht dataset (training and testing MMRE, PRED(0.25) and MdMRE for ABE, FWABE, PSABE, FWPSABE, SVR, ANN, RBF and CART).

Fig. 4. The testing results on the Albrecht dataset.

The results of FWPSABE with different configurations on the Desharnais dataset are summarized in Table 6. The results show that on this dataset the choice of similarity function has little influence on both the training and testing performances. As to the solution functions, there is no clear conclusion as to which solution function is the best. The choice of the K value has a slight influence on the accuracies. The smaller errors are achieved by setting K = 3. Among all configurations, the setting {Euclidean similarity, K = 3, and mean solution function} produces the best results on the training dataset and so it is selected for the comparisons against other cost estimation methods.

Table 7 presents the results of the best variants of all cost estimation methods on the Desharnais dataset. It is shown that FWPSABE achieves the best testing performance (0.32 for MMRE, 0.44 for PRED(0.25) and 0.29 for MdMRE), followed by SVR and PSABE. Fig. 5 provides an illustrative version of the testing results in Table 7.

6. Artificial datasets and experiment results

To compare different cost estimation methods, empirical validation is crucial. This has led to the collection of various real-world datasets for experiments. Mair et al. (2005) conducted an extensive survey of the real datasets for cost estimation from 1980 onwards. As reported, most published real-world datasets are relatively small for tests of significance, and their true properties may not be fully known. For example, it might be difficult to distinguish different types of distribution in the presence of extreme outliers in a small dataset (Shepperd and Kadoda, 2001). Artificially generated datasets provide a feasible solution to the above two difficulties. Firstly, researchers can generate a reasonable amount of artificial data to investigate the significant differences among the competing techniques. Secondly, artificial data provide control over the characteristics of the dataset. In particular, researchers could design a systematic way to vary the properties for their research purposes (Pickard et al., 1999).

In order to evaluate the proposed methods in a more controlled way, we generate two artificial datasets for further experiments. From each of the two real datasets, we extract a set of characteristics describing its properties, or more specifically its non-normality.
The non-normalty consdered n our study ncludes Table 6 Results of FWPSABE on Desharnas dataset Smlarty K value Soluton Tranng Testng MMRE PRED(0.25) MdMRE MMRE PRED(0.25) MdMRE Eucldean K =1 CA K =2 Mean IWM K =3 Mean IWM Medan K =4 Mean IWM Medan K =5 Mean IWM Medan Manhattan K =1 CA K =2 Mean IWM K =3 Mean IWM Medan K =4 Mean IWM Medan K =5 Mean IWM Medan

9 Y.F. L et al. / The Journal of Systems and Software 82 (2009) Table 7 The results and comparsons on Desharnas dataset Models MMRE PRED(0.25) MdMRE Tranng Testng Tranng Testng Tranng Testng ABE FWABE PSABE FWPSABE SVR ANN RBF CART cost, hours 12 x sze, nonadjusted functon ponts Fg. 6. Cost versus sze of Albrecht dataset. 2.5 x Fg. 5. The testng results on Desharnas dataset. cost, hours skeweness, varance nstablty, and excessve outlers (Pckard et al., 2001). We then by usng the two sets of characterstcs generate two sets of artfcal data. Secton 6.1 presents the detals for artfcal datasets generaton Generaton of the artfcal datasets sze, non adjusted functon ponts To explore the non-normal characterstcs of the real world dataset, the cost-sze scatter plot for Albrecht dataset s drawn as Fg. 6. The scatter plot ndcates the slght skewness, moderate outlers, and slght varance nstablty of the Albrecht dataset. The cost-sze scatter plot of the Desharnas dataset s llustrated n Fg. 7 whch shows weak skewness, extreme outlers, and hghly varance nstablty of ths dataset. From the analyss above, software dataset often exhbts a mxture of several non-normal characterstcs such as skewness, varance nstablty, and excessve outlers (Pckard et al., 2001). These characterstcs do not always appear n the same degree. In some cases they are moderately non-normal such as the Albrecht dataset, whle n other cases they are severely non-normal such as the Desharnas dataset. Wthout loss of generalty, we adopted Pckard s way of modelng non-normalty n ths work. Other types of technques for artfcal dataset generaton are also avalable n recent lterature. For more detals, readers can refer to Shepperd and Kadoda (2001), Foss et al. (2003) and Myrtvet et al. (2005). 
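The non-normal characteristics discussed above are judged from the scatter plots. As a rough numerical complement, the same three properties can be summarized with a few statistics. The following is only a minimal Python sketch: the function names are hypothetical, and the 1.5 x IQR outlier rule and the split-half residual ratio are assumptions of this sketch, not measures used in this study or in Pickard's work.

```python
import numpy as np


def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (0 for symmetric data)."""
    v = np.asarray(values, dtype=float)
    centered = v - v.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return m3 / m2 ** 1.5


def iqr_outlier_fraction(values, k=1.5):
    """Fraction of points outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    v = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(v, [25, 75])
    iqr = q3 - q1
    outside = (v < q1 - k * iqr) | (v > q3 + k * iqr)
    return float(outside.mean())


def residual_spread_ratio(size, cost):
    """Residual spread of the larger projects relative to the smaller ones.

    A straight line cost = slope * size + intercept is fitted first; the
    residual standard deviation of the upper half (by size) is then divided
    by that of the lower half. Values well above 1 hint at variance that
    grows with size.
    """
    size = np.asarray(size, dtype=float)
    cost = np.asarray(cost, dtype=float)
    slope, intercept = np.polyfit(size, cost, 1)
    residuals = cost - (slope * size + intercept)
    order = np.argsort(size)
    half = len(size) // 2
    low = residuals[order[:half]]
    high = residuals[order[half:]]
    return float(high.std() / low.std())
```

Applied to the cost and size columns of a project dataset, these three numbers give a crude indication of whether the data sit closer to the moderate or to the severe case; they are not a substitute for inspecting the plots.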
Fig. 7. Cost (hours) versus size (non-adjusted function points) of Desharnais dataset.

Following Pickard's approach, we simulate the combination of non-normal characteristics (skewness, unstable variance and outliers) with Eq. (9):

$$y = 1000 + 6x_{1sk} + 3x_{2sk} + 2x_{3sk} + e_{het} \qquad (9)$$

The independent variables ($x_{1sk}$, $x_{2sk}$, $x_{3sk}$) are generated from Gamma distributed random variables $x'_1$, $x'_2$ and $x'_3$ with mean 4 and variance 8. The skewness is made explicit by the Gamma distributions. In order to vary the scale of the independent variables, we multiply $x'_1$ by 10 to create $x_{1sk}$, $x'_2$ by 3 to create $x_{2sk}$, and $x'_3$ by 20 to create $x_{3sk}$.

The last term $e_{het}$ in the formula simulates a special form of unstable variance: heteroscedasticity. Heteroscedasticity occurs when the error term is related to one of the variables in the model and either increases or decreases depending on the value of that independent variable. The error term $e_{het}$ is related to $x_{1sk}$ by $e_{het} = 0.1\,e\,x_{1sk}$ for moderate heteroscedasticity and by $e_{het} = 6\,e\,x_{1sk}$ for severe heteroscedasticity (Pickard et al., 2001).

The outliers are generated by multiplying or dividing the dependent variable y by a constant. We select 1% of the data to be outliers; half of them are obtained by multiplying and the other half by dividing. For moderate outliers the constant is set to 2, while for severe outliers it is set to 6.

The combination of moderate heteroscedasticity and moderate outliers is used to generate the moderate non-normality dataset (Fig. 8). The combination of severe heteroscedasticity and severe outliers is used to obtain the severe non-normality dataset (Fig. 9).
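The generation procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used in the study: the function name and parameters are hypothetical; the Gamma shape and scale are both set to 2 because a Gamma variable with shape k and scale θ has mean kθ and variance kθ², which gives mean 4 and variance 8; the base error e is assumed to be standard normal, since its distribution is not specified here; and the outlier projects are assumed to be picked uniformly at random.

```python
import numpy as np


def generate_artificial_dataset(n=500, het_coef=0.1, outlier_const=2.0,
                                outlier_frac=0.01, seed=0):
    """Generate (X, y, is_outlier) following the structure of Eq. (9).

    het_coef=0.1, outlier_const=2 -> moderate non-normality
    het_coef=6.0, outlier_const=6 -> severe non-normality
    """
    rng = np.random.default_rng(seed)

    # Skewed predictors: Gamma(shape=2, scale=2) has mean 4 and variance 8.
    x_prime = rng.gamma(shape=2.0, scale=2.0, size=(n, 3))
    # Rescale the three columns by 10, 3 and 20 to obtain x1sk, x2sk, x3sk.
    x_sk = x_prime * np.array([10.0, 3.0, 20.0])

    # Heteroscedastic error tied to x1sk; e ~ N(0, 1) is an assumption.
    e = rng.standard_normal(n)
    e_het = het_coef * e * x_sk[:, 0]

    y = 1000.0 + 6.0 * x_sk[:, 0] + 3.0 * x_sk[:, 1] + 2.0 * x_sk[:, 2] + e_het

    # Outliers: pick outlier_frac of the rows, multiply y for about half of
    # them and divide y for the rest (an odd count cannot be split evenly).
    n_out = int(round(outlier_frac * n))
    idx = rng.choice(n, size=n_out, replace=False)
    half = n_out // 2
    y[idx[:half]] *= outlier_const
    y[idx[half:]] /= outlier_const

    is_outlier = np.zeros(n, dtype=bool)
    is_outlier[idx] = True
    return x_sk, y, is_outlier


# Moderate and severe variants, 500 projects each.
X_mod, y_mod, out_mod = generate_artificial_dataset(het_coef=0.1, outlier_const=2.0)
X_sev, y_sev, out_sev = generate_artificial_dataset(het_coef=6.0, outlier_const=6.0, seed=1)
```

With 500 projects, 1% corresponds to five outliers, so the split between multiplied and divided values is two and three rather than exactly half and half.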

Fig. 8. Y versus $x_{1sk}$ of moderate non-normality dataset.

Fig. 9. Y versus $x_{1sk}$ of severe non-normality dataset.

6.2. Experiment results on artificial datasets

Using the equation presented in Section 6.1, we generate two artificial datasets, each with 500 projects. For a better assessment of accuracy, we make the testing data much larger by dividing each artificial dataset into a historical set with 50 projects, a training set with 50 projects, and a testing set with 400 projects (see Table 8). We apply all the methods to the two artificial datasets by following the same procedure presented in Section 4.3. The results and comparisons are summarized as follows.

Table 8
The partition of artificial datasets
Dataset | Historical | Training | Testing | Total
Sample size of artificial moderate non-normality data | 50 | 50 | 400 | 500
Sample size of artificial severe non-normality data | 50 | 50 | 400 | 500

The results on the artificial moderate non-normality dataset are given in Table 9. FWPSABE achieves the best MMRE, the best MdMRE (0.06) and the second-best PRED(0.25) value (0.98), while ANN obtains the highest PRED(0.25) value. Comparing the prediction error curves in Fig. 4 for the Albrecht dataset with those in Fig. 10 for the moderate non-normality set, all methods achieve much better performance on the artificial dataset, and the differences among the candidate methods are much smaller. These findings imply that the estimation methods in our study may converge to good prediction results on a moderately non-normal dataset of large size, and that FWPSABE is slightly better than the other methods because it eliminates the noise in the historical dataset.

Table 9
The results and comparisons on artificial moderate non-normality dataset
Models | MMRE (Training, Testing) | PRED(0.25) (Training, Testing) | MdMRE (Training, Testing)
Models compared: ABE, FWABE, PSABE, FWPSABE, SVR, ANN, RBF, CART

Fig. 10. The testing results on artificial moderate non-normality dataset.

Table 10 shows the results on the artificial severe non-normality dataset. FWPSABE achieves the best performance in MMRE (0.16) and MdMRE (0.11) and the second-best PRED(0.25) value (0.80), while CART obtains the highest PRED(0.25) value. Comparing Figs. 10 and 11, all methods perform worse on the severe non-normality dataset. This observation indicates that a high degree of non-normality has a negative impact on the performance of the estimation methods in our study.

Table 10
The results and comparisons on artificial severe non-normality dataset
Models | MMRE (Training, Testing) | PRED(0.25) (Training, Testing) | MdMRE (Training, Testing)
Models compared: ABE, FWABE, PSABE, FWPSABE, SVR, ANN, RBF, CART

Fig. 11. The testing results on artificial severe non-normality dataset.

7. Conclusions and future works

In this study, we introduce the project selection technique to refine the historical project database in the ABE model. In addition, the simultaneous optimization of feature weights and project selection (FWPSABE) is proposed to further improve the performance of ABE. To evaluate our methods, we apply them to two real-world datasets and two artificial datasets. The error indicators used for the evaluation are MMRE, PRED(0.25), and MdMRE. The promising results of the proposed FWPSABE system indicate that it can significantly improve the ABE model and enhance ABE as a successful method among software cost estimation techniques.

One major conclusion of this paper is that the FWPSABE system may produce more accurate predictions than other advanced machine learning techniques for software cost estimation. In the literature, ABE is already regarded as a benchmarking method for cost estimation (Shepperd and Schofield, 1997). First, it is not complex to implement and it is more transparent to users than most machine learning methods. Moreover, ABE's predictions can be updated in real time: once a project is completed, its information can easily be inserted into the historical project database. However, many studies reported that in practice ABE has been hindered by low prediction accuracy. According to the results in this study, FWPSABE may be useful in practical situations because it has the advantages of ABE and the ability to produce more accurate cost estimation results.

However, this study still has some limitations. For example, the two real-world datasets in our experiments are quite old, though they have been frequently used in many recent studies. Experiments on recent and large datasets such as the ISBSG database are essential for a more rigorous evaluation of our methods. In addition, our methods are only validated on projects developed with the traditional waterfall-based approach. Software projects developed with newer approaches such as agile methods have additional features indicating the characteristics of their development approaches. The accuracy of FWPSABE for projects developed under these newer approaches should be further investigated. Moreover, ABE-based methods are intolerant of missing features. If the information of some historical projects is incomplete, data imputation techniques should be used to process the missing values before the FWPSABE system starts. Furthermore, only MMRE is used as the optimization objective function, and there is no guarantee that other quality metrics such as PRED(0.25) and MdMRE are optimized while optimizing the single objective MMRE. Multi-objective optimization techniques can be investigated in future work.

References

Ahn, H., Kim, K., Han, I., Hybrid genetic algorithms and case-based reasoning systems for customer classification. Expert Systems 23 (3).
Albrecht, A.J., Gaffney, J., Software function, source lines of code, and development effort prediction. IEEE Transactions on Software Engineering 9 (6).
Angelis, L., Stamelos, I., A simulation tool for efficient analogy based cost estimation. Empirical Software Engineering 5.
Auer, M., Trendowicz, A., Graser, B., Haunschmid, E., Biffl, S., Optimal project feature weights in analogy-based cost estimation: improvement and limitations. IEEE Transactions on Software Engineering 32 (2).
Babu, T.R., Murty, M.N., Comparison of genetic algorithm based prototype selection schemes. Pattern Recognition 34.
Boehm, B., Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ.
Boehm, B., Abts, C., Chulani, S., Software development cost estimation approaches: a survey. Annals of Software Engineering 10.
Chiu, N.H., Huang, S.J., The adjusted analogy-based software effort estimation based on similarity distances. Journal of Systems and Software 80 (4).
Chulani, S., Boehm, B., Steece, B., Bayesian analysis of empirical software engineering cost models. IEEE Transactions on Software Engineering 25 (4).
Costagliola, G., Ferrucci, F., Tortora, G., Vitiello, G., Class point: an approach for the size estimation of object-oriented systems. IEEE Transactions on Software Engineering 31 (1).
Desharnais, J.M., Analyse statistique de la productivite des projets informatique a partie de la technique des point des fonction. University of Montreal, Masters thesis.
Dolado, J.J., A validation of the component-based method for software size estimation. IEEE Transactions on Software Engineering 26 (10).
Dolado, J.J., On the problem of the software cost function. Information and Software Technology 43.
Doval, D., Mancoridis, S., Mitchell, B.S., Automatic clustering of software systems using a genetic algorithm. Proceedings of the 9th International Workshop Software Technology and Engineering Practice.
Foss, T., Stensrud, E., Kitchenham, B., Myrtveit, I., A simulation study of the model evaluation criterion MMRE. IEEE Transactions on Software Engineering 29 (11).
Heiat, A., Comparison of artificial neural network and regression models for estimating software development effort. Information and Software Technology 44.
Helmer, O., Social Technology. Basic Books, NY.
Holland, J., Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, USA.
Huang, S.J., Chiu, N.H., Optimization of analogy weights by genetic algorithm for software effort estimation. Information and Software Technology 48.
Huang, Y.S., Chiang, C.C., Shieh, J.W., Grimson, E., Prototype optimization for nearest-neighbor classification. Pattern Recognition 35.
Huang, X.S., Ho, D., Ren, J., Capretz, L.F., Improving the COCOMO model using a neuro-fuzzy approach. Applied Soft Computing Journal 7 (1).
Jensen, R., An improved macrolevel software development resource estimation model. In: Proceedings of 5th Conference of International Society of Parametric Analysts.
Jorgensen, M., Top-down and bottom-up expert estimation of software development effort. Information and Software Technology 46.
Jorgensen, M., Indahl, U., Sjoberg, D., Software effort estimation by analogy and regression toward the mean. Journal of Systems and Software 68 (3).
Jorgensen, M., Shepperd, M., A systematic review of software development cost estimation studies. IEEE Transactions on Software Engineering 33 (1).
Kadoda, G., Cartwright, M., Chen, L., Shepperd, M., Experiences using case-based reasoning to predict software project effort. In: Proceedings EASE 2000 conferences 4th International Conference on Empirical Assessment and Evaluation in Software Engineering. Staffordshire, U.K.
Kolodner, J.L., Case-Based Reasoning. Morgan Kaufmann Publishers Inc.
Kuncheva, L.I., Jain, L.C., Nearest neighbor classifier: simultaneous editing and feature selection. Pattern Recognition Letters 20.
Lipowezky, U., Selection of the optimal prototype subset for 1-NN classification. Pattern Recognition Letters 19.
MacDonell, S.G., Shepperd, M.J., Combining techniques to optimize effort predictions in software project management. Journal of Systems and Software 66.

Madachy, R., A Software Project Dynamics Model for Process Cost, Schedule and Risk Assessment. Ph.D. Dissertation, University of Southern California.
Mair, C., Shepperd, M., Jorgensen, M., An analysis of data sets used to train and validate cost prediction systems. PROMISE 05.
Hagan, Martin T., Demuth, Howard B., Beale, Mark H., Neural Network Design. PWS Publishing Co., Boston, MA.
Mendes, E., Watson, I., Triggs, C., Mosley, N., Counsell, S., A comparative study of cost estimation models for Web hypermedia applications. Empirical Software Engineering 8.
Mendes, E., Mosley, N., Counsell, S., Investigating Web size metrics for early Web cost estimation. Journal of Systems and Software 77 (2).
Michalewicz, Z., Genetic Algorithms + Data Structures = Evolution Programs, third ed. Springer, Berlin.
Miyazaki, Y., Terakado, K., Ozaki, K., Nozaki, H., Robust regression for developing software estimation models. Journal of Systems and Software 27.
Myrtveit, I., Stensrud, E., Shepperd, M., Reliability and validity in comparative studies of software prediction models. IEEE Transactions on Software Engineering 31 (5).
Oliveira, A.L.I., Estimation of software project effort with support vector regression. Neurocomputing 69.
Orr, M.J.L., Introduction to Radial Basis Function Network. Technical Reports, Centre for Cognitive Science, University of Edinburgh, 2, Buccleuch Place, Edinburgh, Scotland.
Pendharkar, P.C., Subramanian, G.H., Rodger, J.A., A probabilistic model for predicting software development effort. IEEE Transactions on Software Engineering 31 (7).
Pickard, L., Kitchenham, B., Linkman, S., An investigation analysis techniques for software datasets. In: Proceeding of Sixth IEEE International Software Metrics Symposium.
Pickard, L., Kitchenham, B., Linkman, S., Using simulated data sets to compare data analysis techniques used for software cost modeling. IEE Proceeding of Software 148 (6).
Putnam, L., Myers, W., Measures for Excellence. Yourdon Press Computing Series.
Rozsypal, A., Kubat, M., Selecting representative examples and attributes by a genetic algorithm. Intelligent Data Analysis 7.
Shepperd, M., Kadoda, G., Comparing software prediction techniques using simulation. IEEE Transactions on Software Engineering 27 (11).
Shepperd, M., Schofield, C., Estimating software project effort using analogies. IEEE Transactions on Software Engineering 23 (12).
Shin, M., Goel, A.L., Empirical data modeling in software engineering using radial basis functions. IEEE Transactions on Software Engineering 26 (6).
Shukla, K.K., Neuro-genetic prediction of software development effort. Information and Software Technology 42.
Sternberg, R., Component processes in analogical reasoning. Psychological Review 84 (4).
Tausworthe, R.C., The work breakdown structure in software project management. Journal of Systems and Software 1 (3).
Walkerden, F., Jeffery, R., An empirical study of analogy-based software effort estimation. Empirical Software Engineering 4.


More information

Design and Development of a Security Evaluation Platform Based on International Standards

Design and Development of a Security Evaluation Platform Based on International Standards Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

Damage detection in composite laminates using coin-tap method

Damage detection in composite laminates using coin-tap method Damage detecton n composte lamnates usng con-tap method S.J. Km Korea Aerospace Research Insttute, 45 Eoeun-Dong, Youseong-Gu, 35-333 Daejeon, Republc of Korea yaeln@kar.re.kr 45 The con-tap test has the

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,

More information

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ Effcent Strpng Technques for Varable Bt Rate Contnuous Meda Fle Servers æ Prashant J. Shenoy Harrck M. Vn Department of Computer Scence, Department of Computer Scences, Unversty of Massachusetts at Amherst

More information

Sciences Shenyang, Shenyang, China.

Sciences Shenyang, Shenyang, China. Advanced Materals Research Vols. 314-316 (2011) pp 1315-1320 (2011) Trans Tech Publcatons, Swtzerland do:10.4028/www.scentfc.net/amr.314-316.1315 Solvng the Two-Obectve Shop Schedulng Problem n MTO Manufacturng

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

Logistic Regression. Steve Kroon

Logistic Regression. Steve Kroon Logstc Regresson Steve Kroon Course notes sectons: 24.3-24.4 Dsclamer: these notes do not explctly ndcate whether values are vectors or scalars, but expects the reader to dscern ths from the context. Scenaro

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

8 Algorithm for Binary Searching in Trees

8 Algorithm for Binary Searching in Trees 8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the

More information

Sample Design in TIMSS and PIRLS

Sample Design in TIMSS and PIRLS Sample Desgn n TIMSS and PIRLS Introducton Marc Joncas Perre Foy TIMSS and PIRLS are desgned to provde vald and relable measurement of trends n student achevement n countres around the world, whle keepng

More information

Invoicing and Financial Forecasting of Time and Amount of Corresponding Cash Inflow

Invoicing and Financial Forecasting of Time and Amount of Corresponding Cash Inflow Dragan Smć Svetlana Smć Vasa Svrčevć Invocng and Fnancal Forecastng of Tme and Amount of Correspondng Cash Inflow Artcle Info:, Vol. 6 (2011), No. 3, pp. 014-021 Receved 13 Janyary 2011 Accepted 20 Aprl

More information

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Conversion between the vector and raster data structures using Fuzzy Geographical Entities Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

Credit Limit Optimization (CLO) for Credit Cards

Credit Limit Optimization (CLO) for Credit Cards Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

Fault tolerance in cloud technologies presented as a service

Fault tolerance in cloud technologies presented as a service Internatonal Scentfc Conference Computer Scence 2015 Pavel Dzhunev, PhD student Fault tolerance n cloud technologes presented as a servce INTRODUCTION Improvements n technques for vrtualzaton and performance

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Dynamic Resource Allocation for MapReduce with Partitioning Skew

Dynamic Resource Allocation for MapReduce with Partitioning Skew Ths artcle has been accepted for publcaton n a future ssue of ths journal, but has not been fully edted. Content may change pror to fnal publcaton. Ctaton nformaton: DOI 1.119/TC.216.253286, IEEE Transactons

More information

Searching for Interacting Features for Spam Filtering

Searching for Interacting Features for Spam Filtering Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng 100875, Chna 2 Software

More information

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets Improved Mnng of Software Complexty Data on Evolutonary Fltered Tranng Sets VILI PODGORELEC Insttute of Informatcs, FERI Unversty of Marbor Smetanova ulca 17, SI-2000 Marbor SLOVENIA vl.podgorelec@un-mb.s

More information

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001 Proceedngs of the Annual Meetng of the Amercan Statstcal Assocaton, August 5-9, 2001 LIST-ASSISTED SAMPLING: THE EFFECT OF TELEPHONE SYSTEM CHANGES ON DESIGN 1 Clyde Tucker, Bureau of Labor Statstcs James

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

Optimal Choice of Random Variables in D-ITG Traffic Generating Tool using Evolutionary Algorithms

Optimal Choice of Random Variables in D-ITG Traffic Generating Tool using Evolutionary Algorithms Optmal Choce of Random Varables n D-ITG Traffc Generatng Tool usng Evolutonary Algorthms M. R. Mosav* (C.A.), F. Farab* and S. Karam* Abstract: Impressve development of computer networks has been requred

More information

Overview of monitoring and evaluation

Overview of monitoring and evaluation 540 Toolkt to Combat Traffckng n Persons Tool 10.1 Overvew of montorng and evaluaton Overvew Ths tool brefly descrbes both montorng and evaluaton, and the dstncton between the two. What s montorng? Montorng

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

Lecture 2: Single Layer Perceptrons Kevin Swingler

Lecture 2: Single Layer Perceptrons Kevin Swingler Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses

More information

A GENETIC ALGORITHM-BASED METHOD FOR CREATING IMPARTIAL WORK SCHEDULES FOR NURSES

A GENETIC ALGORITHM-BASED METHOD FOR CREATING IMPARTIAL WORK SCHEDULES FOR NURSES 82 Internatonal Journal of Electronc Busness Management, Vol. 0, No. 3, pp. 82-93 (202) A GENETIC ALGORITHM-BASED METHOD FOR CREATING IMPARTIAL WORK SCHEDULES FOR NURSES Feng-Cheng Yang * and We-Tng Wu

More information

Construction Rules for Morningstar Canada Target Dividend Index SM

Construction Rules for Morningstar Canada Target Dividend Index SM Constructon Rules for Mornngstar Canada Target Dvdend Index SM Mornngstar Methodology Paper October 2014 Verson 1.2 2014 Mornngstar, Inc. All rghts reserved. The nformaton n ths document s the property

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo.

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo. ICSV4 Carns Australa 9- July, 007 RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL Yaoq FENG, Hanpng QIU Dynamc Test Laboratory, BISEE Chna Academy of Space Technology (CAST) yaoq.feng@yahoo.com Abstract

More information

Calculating the high frequency transmission line parameters of power cables

Calculating the high frequency transmission line parameters of power cables < ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

Product Quality and Safety Incident Information Tracking Based on Web

Product Quality and Safety Incident Information Tracking Based on Web Product Qualty and Safety Incdent Informaton Trackng Based on Web News 1 Yuexang Yang, 2 Correspondng Author Yyang Wang, 2 Shan Yu, 2 Jng Q, 1 Hual Ca 1 Chna Natonal Insttute of Standardzaton, Beng 100088,

More information

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success

More information

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA*

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* Luísa Farnha** 1. INTRODUCTION The rapd growth n Portuguese households ndebtedness n the past few years ncreased the concerns that debt

More information

An MILP model for planning of batch plants operating in a campaign-mode

An MILP model for planning of batch plants operating in a campaign-mode An MILP model for plannng of batch plants operatng n a campagn-mode Yanna Fumero Insttuto de Desarrollo y Dseño CONICET UTN yfumero@santafe-concet.gov.ar Gabrela Corsano Insttuto de Desarrollo y Dseño

More information

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features On-Lne Fault Detecton n Wnd Turbne Transmsson System usng Adaptve Flter and Robust Statstcal Features Ruoyu L Remote Dagnostcs Center SKF USA Inc. 3443 N. Sam Houston Pkwy., Houston TX 77086 Emal: ruoyu.l@skf.com

More information

Fast Fuzzy Clustering of Web Page Collections

Fast Fuzzy Clustering of Web Page Collections Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng Otto-von-Guercke-Unversty of Magdeburg Unverstätsplatz, D-396 Magdeburg,

More information

Mooring Pattern Optimization using Genetic Algorithms

Mooring Pattern Optimization using Genetic Algorithms 6th World Congresses of Structural and Multdscplnary Optmzaton Ro de Janero, 30 May - 03 June 005, Brazl Moorng Pattern Optmzaton usng Genetc Algorthms Alonso J. Juvnao Carbono, Ivan F. M. Menezes Luz

More information

J. Parallel Distrib. Comput.

J. Parallel Distrib. Comput. J. Parallel Dstrb. Comput. 71 (2011) 62 76 Contents lsts avalable at ScenceDrect J. Parallel Dstrb. Comput. journal homepage: www.elsever.com/locate/jpdc Optmzng server placement n dstrbuted systems n

More information

Support vector domain description

Support vector domain description Pattern Recognton Letters 20 (1999) 1191±1199 www.elsever.nl/locate/patrec Support vector doman descrpton Davd M.J. Tax *,1, Robert P.W. Dun Pattern Recognton Group, Faculty of Appled Scence, Delft Unversty

More information

Preventive Maintenance and Replacement Scheduling: Models and Algorithms

Preventive Maintenance and Replacement Scheduling: Models and Algorithms Preventve Mantenance and Replacement Schedulng: Models and Algorthms By Kamran S. Moghaddam B.S. Unversty of Tehran 200 M.S. Tehran Polytechnc 2003 A Dssertaton Proposal Submtted to the Faculty of the

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University Characterzaton of Assembly Varaton Analyss Methods A Thess Presented to the Department of Mechancal Engneerng Brgham Young Unversty In Partal Fulfllment of the Requrements for the Degree Master of Scence

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information