A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining

Zhexue Huang*
Cooperative Research Centre for Advanced Computational Systems
CSIRO Mathematical and Information Sciences
GPO Box 664, Canberra 2601, AUSTRALIA
email: zhexue.huang@cmis.csiro.au

Abstract

Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation because of its efficiency in clustering large data sets. However, working only on numeric values limits its use in data mining because data sets in data mining often contain categorical values. In this paper we present an algorithm, called k-modes, to extend the k-means paradigm to categorical domains. We introduce new dissimilarity measures to deal with categorical objects, replace means of clusters with modes, and use a frequency based method to update modes in the clustering process to minimise the clustering cost function. Tested with the well known soybean disease data set the algorithm has demonstrated a very good classification performance. Experiments on a very large health insurance data set consisting of half a million records and 34 categorical attributes show that the algorithm is scalable in terms of both the number of clusters and the number of records.

1 Introduction

Partitioning a set of objects into homogeneous clusters is a fundamental operation in data mining. The operation is needed in a number of data mining tasks, such as unsupervised classification and data summation, as well as segmentation of large heterogeneous data sets into smaller homogeneous subsets that can be easily managed, separately modelled and analysed. Clustering is a popular approach used to implement this operation. Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria. Statistical clustering methods (Anderberg 1973, Jain and Dubes 1988) use similarity measures to partition objects whereas conceptual clustering methods cluster objects according to the concepts objects carry (Michalski and Stepp 1983, Fisher 1987).

The most distinct characteristic of data mining is that it deals with very large data sets (gigabytes or even terabytes).
This requires the algorithms used in data mining to be scalable. However, most algorithms currently used in data mining do not scale well when applied to very large data sets because they were initially developed for other applications than data mining which involve small data sets. The study of scalable data mining algorithms has recently become a data mining research focus (Shafer et al. 1996).

In this paper we present a fast clustering algorithm used to cluster categorical data. The algorithm, called k-modes, is an extension to the well known k-means algorithm (MacQueen 1967). Compared to other clustering methods the k-means algorithm and its variants (Anderberg 1973) are efficient in clustering large data sets, thus very suitable for data mining. However, their use is often limited to numeric data because these algorithms minimise a cost function by calculating the means of clusters. Data mining applications frequently involve categorical data. The traditional approach to converting categorical data into numeric values does not necessarily produce meaningful results in the case where categorical domains are not ordered. The k-modes algorithm in this paper removes this limitation and extends the k-means paradigm to categorical domains whilst preserving the efficiency of the k-means algorithm.

In (Huang 1997) we have proposed an algorithm, called k-prototypes, to cluster large data sets with mixed numeric and categorical values. In the k-prototypes algorithm we define a dissimilarity measure that takes into account both numeric and categorical attributes. Assume $s_n$ is the dissimilarity measure on numeric attributes defined by the squared Euclidean distance and $s_c$ is the dissimilarity measure on categorical attributes defined as the number of mismatches of categories between two objects. We define the dissimilarity measure between two objects as $s_n + \gamma s_c$, where $\gamma$ is a weight to balance the two parts to avoid favouring either type of attribute.
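The mixed measure $s_n + \gamma s_c$ can be illustrated with a minimal sketch. The function name `mixed_dissimilarity`, its argument layout and the sample values are assumptions made for illustration only; they are not taken from (Huang 1997).

```python
def mixed_dissimilarity(x_num, y_num, x_cat, y_cat, gamma):
    """Return s_n + gamma * s_c for two objects with mixed attributes.

    x_num, y_num: numeric attribute values of the two objects
    x_cat, y_cat: categorical attribute values of the two objects
    gamma: weight balancing the numeric and categorical parts
    (hypothetical helper, not the paper's implementation)
    """
    # s_n: squared Euclidean distance on the numeric attributes
    s_n = sum((a - b) ** 2 for a, b in zip(x_num, y_num))
    # s_c: number of categorical attributes whose categories differ
    s_c = sum(a != b for a, b in zip(x_cat, y_cat))
    return s_n + gamma * s_c


# Two objects with two numeric and two categorical attributes each.
example = mixed_dissimilarity((1.0, 2.0), (2.0, 4.0), ("a", "b"), ("a", "c"), gamma=0.5)
```

With these sample values the numeric part is 1 + 4 = 5 and there is one categorical mismatch, so a larger `gamma` would let that single mismatch weigh more heavily against the numeric distance.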
The clustering process of the k-prototypes algorithm is similar to the k-means algorithm except that a new method is used to update the categorical attribute values of cluster prototypes. A problem in using that algorithm is to choose a proper weight. We have suggested the use of the average standard deviation of numeric attributes as a guide in choosing the weight.

* The author wishes to acknowledge that this work was carried out within the Cooperative Research Centre for Advanced Computational Systems (ACSys) established under the Australian Government's Cooperative Research Centres Program.

The k-modes algorithm presented in this paper is a simplification of the k-prototypes algorithm by only taking categorical attributes into account. Therefore, weight $\gamma$ is no longer necessary in the algorithm because of the disappearance of $s_n$. If numeric attributes are involved in a data set, we categorise them using a method as described in (Anderberg 1973). The biggest advantage of this algorithm is that it is scalable to very large data sets. Tested with a health insurance data set consisting of half a million records and 34 categorical attributes, this algorithm has shown a capability of clustering the data set into 100 clusters in about an hour using a single processor of a Sun Enterprise 4000 computer.

Ralambondrainy (1995) presented another approach to using the k-means algorithm to cluster categorical data. Ralambondrainy's approach needs to convert multiple category attributes into binary attributes (using 0 and 1 to represent either a category absent or present) and to treat the binary attributes as numeric in the k-means algorithm. If it is used in data mining, this approach requires handling a large number of binary attributes because data sets in data mining often have categorical attributes with hundreds or thousands of categories. This will inevitably increase both computational and space costs of the k-means algorithm. The other drawback is that the cluster means, given by real values between 0 and 1, do not indicate the characteristics of the clusters. Comparatively, the k-modes algorithm directly works on categorical attributes and produces the cluster modes, which describe the clusters and are thus very useful to the user in interpreting the clustering results.

Using Gower's similarity coefficient (Gower 1971) and other dissimilarity measures (Gowda and Diday 1991) one can use a hierarchical clustering method to cluster categorical or mixed data. However, the hierarchical clustering methods are not efficient in processing large data sets. Their use is limited to small data sets.

The rest of the paper is organised as follows.
Categorical data and its representation are described in Section 2. In Section 3 we briefly review the k-means algorithm and its important properties. In Section 4 we discuss the k-modes algorithm. In Section 5 we present some experimental results on two real data sets to show the classification performance and computational efficiency of the k-modes algorithm. We summarise our discussions and describe our future work plan in Section 6.

2 Categorical Data

Categorical data as referred to in this paper is the data describing objects which have only categorical attributes. The objects, called categorical objects, are a simplified version of the symbolic objects defined in (Gowda and Diday 1991). We consider all numeric (quantitative) attributes are categorised and do not consider categorical attributes that have combinational values, e.g., Languages-spoken (Chinese, English). The following two subsections define the categorical attributes and objects accepted by the algorithm.

2.1 Categorical Domains and Attributes

Let $A_1, A_2, \ldots, A_m$ be m attributes describing a space $\Omega$ and $DOM(A_1), DOM(A_2), \ldots, DOM(A_m)$ the domains of the attributes. A domain $DOM(A_j)$ is defined as categorical if it is finite and unordered, e.g., for any $a, b \in DOM(A_j)$, either $a = b$ or $a \ne b$. $A_j$ is called a categorical attribute. $\Omega$ is a categorical space if all $A_1, A_2, \ldots, A_m$ are categorical.

A categorical domain defined here contains only singletons. Combinational values like in (Gowda and Diday 1991) are not allowed. A special value, denoted by $\varepsilon$, is defined on all categorical domains and used to represent missing values. To simplify the dissimilarity measure we do not consider the conceptual inclusion relationships among values in a categorical domain like in (Kodratoff and Tecuci 1988) such that car and vehicle are two categorical values in a domain and conceptually a car is also a vehicle. However, such relationships may exist in real world databases.

2.2 Categorical Objects

Like in (Gowda and Diday 1991) a categorical object $X \in \Omega$ is logically represented as a conjunction of attribute-value pairs $[A_1 = x_1] \wedge [A_2 = x_2] \wedge \cdots \wedge [A_m = x_m]$, where $x_j \in DOM(A_j)$ for $1 \le j \le m$. An attribute-value pair $[A_j = x_j]$ is called a selector in (Michalski and Stepp 1983). Without ambiguity we represent X as a vector $[x_1, x_2, \ldots, x_m]$. We consider every object in $\Omega$ has exactly m attribute values. If the value of attribute $A_j$ is not available for an object X, then $A_j = \varepsilon$.

Let $\mathbf{X} = \{X_1, X_2, \ldots, X_n\}$ be a set of n categorical objects and $\mathbf{X} \subseteq \Omega$. Object $X_i$ is represented as $[x_{i,1}, x_{i,2}, \ldots, x_{i,m}]$. We write $X_i = X_k$ if $x_{i,j} = x_{k,j}$ for $1 \le j \le m$. The relation $X_i = X_k$ does not mean that $X_i$, $X_k$ are the same object in the real world database. It means the two objects have equal categorical values in attributes $A_1, A_2, \ldots, A_m$. For example, two patients in a data set may have equal values in attributes Sex, Disease and Treatment. However, they are distinguished in the hospital database by other attributes such as ID and Address which were not selected for clustering.
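The patient example can be made concrete with a small sketch. It assumes records are stored as dictionaries and uses `None` as a stand-in for the missing value $\varepsilon$; the helper `as_categorical_object` and all sample records are hypothetical.

```python
EPSILON = None  # assumption: None plays the role of the missing-value symbol


def as_categorical_object(record, attributes):
    """Project a record (a dict) onto the attributes selected for clustering.

    Attributes absent from the record are filled with EPSILON, so every
    object has exactly len(attributes) values.
    """
    return tuple(record.get(name, EPSILON) for name in attributes)


selected = ("Sex", "Disease", "Treatment")  # attributes chosen for clustering
patient_1 = {"ID": 17, "Address": "12 High St", "Sex": "F", "Disease": "flu", "Treatment": "rest"}
patient_2 = {"ID": 42, "Address": "3 Low Rd", "Sex": "F", "Disease": "flu", "Treatment": "rest"}
patient_3 = {"ID": 58, "Sex": "M", "Disease": "flu"}  # Treatment was not recorded

x1 = as_categorical_object(patient_1, selected)
x2 = as_categorical_object(patient_2, selected)
x3 = as_categorical_object(patient_3, selected)

objects = [x1, x2, x3]
n = len(objects)       # number of objects in the set
p = len(set(objects))  # number of objects that differ on the selected attributes
```

Here `x1 == x2` holds even though the two patients are different records, because ID and Address were not selected; this is the sense in which the relation $X_i = X_k$ is used.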

Assume $\mathbf{X}$ consists of n objects in which p objects are distinct. Let N be the cardinality of the Cartesian product $DOM(A_1) \times DOM(A_2) \times \cdots \times DOM(A_m)$. We have $p \le N$. However, n may be larger than N, which means there are duplicates in $\mathbf{X}$.

3 The K-means Algorithm

The k-means algorithm (MacQueen 1967, Anderberg 1973) is built upon four basic operations: (1) selection of the initial k means for k clusters, (2) calculation of the dissimilarity between an object and the mean of a cluster, (3) allocation of an object to the cluster whose mean is nearest to the object, (4) re-calculation of the mean of a cluster from the objects allocated to it so that the intra cluster dissimilarity is minimised. Except for the first operation, the other three operations are repeatedly performed in the algorithm until the algorithm converges. The essence of the algorithm is to minimise the cost function

$$E = \sum_{l=1}^{k} \sum_{i=1}^{n} y_{i,l}\, d(X_i, Q_l) \qquad (1)$$

where n is the number of objects in a data set $\mathbf{X}$, $X_i \in \mathbf{X}$, $Q_l$ is the mean of cluster l, and $y_{i,l}$ is an element of a partition matrix $Y_{n \times l}$ as in (Hand 1981). d is a dissimilarity measure usually defined by the squared Euclidean distance.

There exist a few variants of the k-means algorithm which differ in selection of the initial k means, dissimilarity calculations and strategies to calculate cluster means (Anderberg 1973, Bobrowski and Bezdek 1991). The sophisticated variants of the k-means algorithm include the well-known ISODATA algorithm (Ball and Hall 1967) and the fuzzy k-means algorithms (Ruspini 1969, 1973). Most k-means type algorithms have been proved convergent (MacQueen 1967, Bezdek 1980, Selim and Ismail 1984).

The k-means algorithm has the following important properties.

1. It is efficient in processing large data sets. The computational complexity of the algorithm is $O(tkmn)$, where m is the number of attributes, n is the number of objects, k is the number of clusters, and t is the number of iterations over the whole data set. Usually, $k, m, t \ll n$. In clustering large data sets the k-means algorithm is much faster than the hierarchical clustering algorithms whose general computational complexity is $O(n^2)$ (Murtagh 1992).
2. It often terminates at a local optimum (MacQueen 1967, Selim and Ismail 1984). To find out the global optimum, techniques such as deterministic annealing (Kirkpatrick et al. 1983, Rose et al. 1990) and genetic algorithms (Goldberg 1989, Murthy and Chowdhury 1996) can be incorporated with the k-means algorithm.
3. It works only on numeric values because it minimises a cost function by calculating the means of clusters.
4. The clusters have convex shapes (Anderberg 1973). Therefore, it is difficult to use the k-means algorithm to discover clusters with non-convex shapes.

One difficulty in using the k-means algorithm is to specify the number of clusters. Some variants like ISODATA include a procedure to search for the best k at the cost of some performance.

The k-means algorithm is best suited for data mining because of its efficiency in processing large data sets. However, working only on numeric values limits its use in data mining because data sets in data mining often have categorical values. Development of the k-modes algorithm to be discussed in the next section was motivated by the desire to remove this limitation and extend its use to categorical domains.

4 The K-modes Algorithm

The k-modes algorithm is a simplified version of the k-prototypes algorithm described in (Huang 1997). In this algorithm we have made three major modifications to the k-means algorithm, i.e., using different dissimilarity measures, replacing k means with k modes, and using a frequency based method to update modes. These modifications are discussed below.

4.1 Dissimilarity Measures

Let X, Y be two categorical objects described by m categorical attributes. The dissimilarity measure between X and Y can be defined by the total mismatches of the corresponding attribute categories of the two objects. The smaller the number of mismatches is, the more similar the two objects. Formally,

$$d(X, Y) = \sum_{j=1}^{m} \delta(x_j, y_j) \qquad (2)$$

where

$$\delta(x_j, y_j) = \begin{cases} 0 & (x_j = y_j) \\ 1 & (x_j \ne y_j) \end{cases} \qquad (3)$$

$d(X, Y)$ gives equal importance to each category of an attribute. If we take into account the frequencies of categories in a data set, we can define the dissimilarity measure as

$$d_{\chi^2}(X, Y) = \sum_{j=1}^{m} \frac{(n_{x_j} + n_{y_j})}{n_{x_j}\, n_{y_j}}\, \delta(x_j, y_j) \qquad (4)$$

where $n_{x_j}$, $n_{y_j}$ are the numbers of objects in the data set that have categories $x_j$ and $y_j$ for attribute j. Because
$d_{\chi^2}(X, Y)$ is similar to the chi-square distance in (Greenacre 1984), we call it chi-square distance. This dissimilarity measure gives more importance to rare categories than frequent ones. Eq. (4) is useful in discovering under-represented object clusters such as fraudulent claims in insurance databases.

4.2 Mode of a Set

Let $\mathbf{X}$ be a set of categorical objects described by categorical attributes $A_1, A_2, \ldots, A_m$.

Definition: A mode of $\mathbf{X}$ is a vector $Q = [q_1, q_2, \ldots, q_m] \in \Omega$ that minimises

$$D(Q, \mathbf{X}) = \sum_{i=1}^{n} d(X_i, Q) \qquad (5)$$

where $\mathbf{X} = \{X_1, X_2, \ldots, X_n\}$ and d can be either defined as Eq. (2) or Eq. (4). Here, Q is not necessarily an element of $\mathbf{X}$.

4.3 Find a Mode for a Set

Let $n_{c_{k,j}}$ be the number of objects having category $c_{k,j}$ in attribute $A_j$ and $f_r(A_j = c_{k,j} \mid \mathbf{X}) = n_{c_{k,j}} / n$ the relative frequency of category $c_{k,j}$ in $\mathbf{X}$.

Theorem: The function $D(Q, \mathbf{X})$ is minimised iff $f_r(A_j = q_j \mid \mathbf{X}) \ge f_r(A_j = c_{k,j} \mid \mathbf{X})$ for $q_j \ne c_{k,j}$ for all $j = 1, \ldots, m$.

The proof of the theorem is given in the Appendix.

The theorem defines a way to find Q from a given $\mathbf{X}$, and therefore is important because it allows the k-means paradigm to be used to cluster categorical data without losing its efficiency. The theorem implies that the mode of a data set $\mathbf{X}$ is not unique. For example, the mode of set {[a, b], [a, c], [c, b], [b, c]} can be either [a, b] or [a, c].

4.4 The k-modes Algorithm

Let $\{S_1, S_2, \ldots, S_k\}$ be a partition of $\mathbf{X}$, where $S_l \ne \emptyset$ for $1 \le l \le k$, and $\{Q_1, Q_2, \ldots, Q_k\}$ the modes of $\{S_1, S_2, \ldots, S_k\}$. The total cost of the partition is defined by

$$E = \sum_{l=1}^{k} \sum_{i=1}^{n} y_{i,l}\, d(X_i, Q_l) \qquad (6)$$

where $y_{i,l}$ is an element of a partition matrix $Y_{n \times l}$ as in (Hand 1981) and d can be either defined as Eq. (2) or Eq. (4).

Similar to the k-means algorithm, the objective of clustering $\mathbf{X}$ is to find a set $\{Q_1, Q_2, \ldots, Q_k\}$ that can minimise E. Although the form of this cost function is the same as Eq. (1), d is different. Eq. (6) can be minimised by the k-modes algorithm below.

The k-modes algorithm consists of the following steps (refer to (Huang 1997) for the detailed description of the algorithm):

1. Select k initial modes, one for each cluster.
2. Allocate an object to the cluster whose mode is the nearest to it according to d. Update the mode of the cluster after each allocation according to the Theorem.
3. After all objects have been allocated to clusters, retest the dissimilarity of objects against the current modes. If an object is found such that its nearest mode belongs to another cluster rather than its current one, reallocate the object to that cluster and update the modes of both clusters.
4. Repeat 3 until no object has changed clusters after a full cycle test of the whole data set.

Like the k-means algorithm the k-modes algorithm also produces locally optimal solutions that are dependent on the initial modes and the order of objects in the data set. In Section 5 we use a real example to show how appropriate initial mode selection methods can improve the clustering results.

In our current implementation of the k-modes algorithm we include two initial mode selection methods. The first method selects the first k distinct records from the data set as the initial k modes. The second method is implemented in the following steps.

1. Calculate the frequencies of all categories for all attributes and store them in a category array in the descending order of frequency as shown in Figure 1. Here, $c_{i,j}$ denotes category i of attribute j and $f(c_{i,j}) \ge f(c_{i+1,j})$, where $f(c_{i,j})$ is the frequency of category $c_{i,j}$.

$$\begin{array}{cccc} c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\ c_{2,1} & c_{2,2} & c_{2,3} & c_{2,4} \\ c_{3,1} & & c_{3,3} & c_{3,4} \\ c_{4,1} & & c_{4,3} & \\ & & c_{5,3} & \end{array}$$

Figure 1. The category array of a data set with 4 attributes having 4, 2, 5, 3 categories respectively.

2. Assign the most frequent categories equally to the initial k modes. For example in Figure 1, assume k = 3. We assign $Q_1 = [q_{1,1} = c_{1,1}, q_{1,2} = c_{2,2}, q_{1,3} = c_{3,3}, q_{1,4} = c_{1,4}]$, $Q_2 = [q_{2,1} = c_{2,1}, q_{2,2} = c_{1,2}, q_{2,3} = c_{4,3}, q_{2,4} = c_{2,4}]$ and $Q_3 = [q_{3,1} = c_{3,1}, q_{3,2} = c_{2,2}, q_{3,3} = c_{1,3}, q_{3,4} = c_{3,4}]$.
3. Start with $Q_1$. Select the record most similar to $Q_1$ and substitute $Q_1$ with the record as the first initial
mode. Then select the record most similar to $Q_2$ and substitute $Q_2$ with the record as the second initial mode. Continue this process until $Q_k$ is substituted. In these selections $Q_l \ne Q_t$ for $l \ne t$.

Step 3 is taken to avoid the occurrence of empty clusters. The purpose of this selection method is to make the initial modes diverse, which can result in better clustering results (see Section 5).

5 Experimental Results

We used the well known soybean disease data to test classification performance of the algorithm and another large data set selected from a health insurance database to test computational efficiency of the algorithm. The second data set consists of half a million records, each being described by 34 categorical attributes.

5.1 Tests on Soybean Disease Data

5.1.1 Test Data Sets

The soybean disease data is one of the standard test data sets used in the machine learning community. It has often been used to test conceptual clustering algorithms (Michalski and Stepp 1983, Fisher 1987). We chose this data set to test our algorithm because of its publicity and because all its attributes can be treated as categorical without categorisation.

The soybean data set has 47 observations, each being described by 35 attributes. Each observation is identified by one of the 4 diseases: Diaporthe Stem Canker, Charcoal Rot, Rhizoctonia Root Rot, and Phytophthora Rot. Except for Phytophthora Rot which has 17 observations, all other diseases have 10 observations each. Eq. (2) was used in the tests because all disease classes are almost equally distributed. Of the 35 attributes we only selected 21 because the other 14 have only one category.

To study the effect of record order, we created 100 test data sets by randomly reordering the 47 observations. By doing this we were also selecting different records for the initial modes using the first selection method. All disease identifications were removed from the test data sets.

5.1.2 Clustering Results

We used the k-modes algorithm to cluster each test data set into 4 clusters with the two initial mode selection methods and produced 200 clustering results.
For each clustering result we used a misclassification matrix to analyse the correspondence between clusters and the disease classes of the observations. Two misclassification matrices for the test data sets 1 and 9 are shown in Figure 2. The capital letters D, C, R, P in the first column of the matrices represent the 4 disease classes. In Figure 2(a) there is one to one correspondence between clusters and disease classes, which means the observations in the same disease classes were clustered into the same clusters. This represents a complete recovery of the 4 disease classes from the test data set. In Figure 2(b) two observations of the disease class P were misclassified into cluster 1 which was dominated by the observations of the disease class R. However, the observations in the other two disease classes were correctly clustered into clusters 3 and 4. This clustering result can also be considered good.

(a) Columns Cluster 1 to Cluster 4; row totals D 10, C 10, R 10, P 17, each class falling entirely in one cluster.
(b) Columns Cluster 1 to Cluster 4; D 10 and C 10 in clusters 3 and 4, R 10 in cluster 1, P 2 in cluster 1 and 15 in cluster 2.

Figure 2. Two misclassification matrices. (a) Correspondence between clusters of test data set 1 and disease classes. (b) Correspondence between clusters of test data set 9 and disease classes.

If we use the number of misclassified observations as a measure of a clustering result, we can summarise the 200 clustering results in Table 1. The first column in the table gives the number of misclassified observations. The second and third columns show the numbers of clustering results.

Table 1. Columns: Misclassified Observations; First Selection Method; Second Selection Method. (Table values are missing from this transcription.)

If we consider the number of misclassified observations less than 6 as a good clustering result, then 45 good results were produced with the first selection method and 64 good results with the second selection method. Both selection methods produced more than 10 complete recovery results (0 misclassification). These results indicate that if we randomly choose one test data set, we have a 45% chance to obtain a good clustering result with the first selection method and a 64% chance with the second selection method.
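A misclassification matrix of this kind can be tabulated in a few lines. The sketch below assumes that an observation counts as misclassified when it does not belong to the dominant disease class of its cluster; the paper does not spell out the counting rule, so this is one plausible reading. The sample labels mimic Figure 2(b), with the assignment of D and C to clusters 3 and 4 chosen arbitrarily.

```python
from collections import Counter


def misclassification_matrix(classes, clusters):
    """Count observations for every (disease class, cluster) pair."""
    return Counter(zip(classes, clusters))


def count_misclassified(classes, clusters):
    """Observations outside the dominant class of their cluster (assumed rule)."""
    matrix = misclassification_matrix(classes, clusters)
    dominant = {}
    for (_, cluster), count in matrix.items():
        dominant[cluster] = max(dominant.get(cluster, 0), count)
    return len(classes) - sum(dominant.values())


# 47 observations laid out as in Figure 2(b); D -> 3 and C -> 4 is an assumption.
classes = ["D"] * 10 + ["C"] * 10 + ["R"] * 10 + ["P"] * 17
clusters = [3] * 10 + [4] * 10 + [1] * 10 + [1] * 2 + [2] * 15

matrix = misclassification_matrix(classes, clusters)
misclassified = count_misclassified(classes, clusters)
```

For this layout the two P observations that fall into the R-dominated cluster are the only ones counted, which agrees with the description of Figure 2(b).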

Table 2 shows the relationships between the clustering results and the clustering costs (values of Eq. (6)). The numbers in brackets are the numbers of clustering results having the corresponding clustering cost values. All total mismatches of bad clustering results are greater than those of good clustering results. The minimal total mismatch number in these tests is 194 which is likely the global minimum. These relationships indicate that we can use the clustering cost values from several runs to choose a good clustering result if the original classification of data is unknown.

We did the same tests using a k-means algorithm which is based on the versions 3 and 5 of subroutine KMEAN in (Anderberg 1973). In these tests we simply treated all attributes as numeric and used the squared Euclidean distance as the dissimilarity measure. The initial means were selected by the first method. Of 100 clustering results we only got 4 good ones of which 2 had a complete recovery. Comparing the cost values of the 4 good clustering results with other clustering results, we found that the clustering results and the cost values are not related. Therefore, a good clustering result cannot be selected according to its cost value.

Table 2.
Misclassified    Total mismatches            Total mismatches
Observations     for method 1                for method 2
0                194(13)                     194(14)
1                194(7)                      194(7), 197(1)
2                194(12)                     194(25), 195(1)
3                195(2), 197(1), 201(1)      195(6), 196(2), 197(1)
4                195(2), 196(3), 197(2)      195(4), 196(1), 197(1)
5                197(2)                      197(1)
>5               (values missing)            (values missing)

The effect of initial modes on clustering results is shown in Table 3. The first column is the number of disease classes the initial modes have and the second is the corresponding number of runs with the number of disease classes in the initial modes. This table indicates that the more diverse the disease classes are in the initial modes, the better the clustering results. The initial modes selected by the second method have 3 disease types, therefore more good cluster results were produced than by the first method.

Table 3. Columns: No. of classes; No. of runs; Mean cost; Std Dev. (Table values are missing from this transcription.)

From the modes and category distributions of different attributes in different clusters the algorithm can also produce discriminative characteristics of clusters similar to those in (Michalski and Stepp 1983).

5.2 Tests on a Large Data Set

The purpose of this experiment was to test the scalability of the k-modes algorithm in clustering very large real world data sets. We selected a large data set from a health insurance database. The data set consists of half a million records, each being described by 34 categorical attributes of which 4 have more than 1000 categories each.

We tested two scalabilities of the algorithm using this large data set. The first one is the scalability of the algorithm against the number of clusters for a given number of objects and the second is the scalability against the number of objects for a given number of clusters. Figures 3 and 4 show the results produced using a single processor of a Sun Enterprise 4000 computer. The plots in the figures represent the average time performance of 5 independent runs.

Figure 3. Scalability to the number of clusters in clustering half a million records. (Horizontal axis: number of clusters; vertical axis: real run time in seconds.)

Figure 4. Scalability to the number of records clustered into 100 clusters. (Horizontal axis: number of records in 1000; vertical axis: real run time in seconds.)

These results are very encouraging because they show clearly a linear increase in time as both the number of clusters and number of records increase. Clustering half a million objects into 100 clusters took about an hour, which is quite acceptable. Compared with the results of clustering data with mixed values (Huang 1997), this algorithm is much faster than its previous version because it needs far fewer iterations to converge.

The above soybean disease data tests indicate that a good clustering result should be selected from multiple runs of the algorithm over the same data set with different record orders and/or different initial modes. This can be done in practice by running the algorithm in parallel on a parallel computing system. Other parts of the algorithm such as the operation to allocate an object to a cluster can also be parallelised to improve the performance.

6 Summary and Future Work

The biggest advantage of the k-means algorithm in data mining applications is its efficiency in clustering large data sets. However, its use is limited to numeric values. The k-modes algorithm presented in this paper has removed this limitation whilst preserving its efficiency. The k-modes algorithm has made the following extensions to the k-means algorithm:

1. replacing means of clusters with modes,
2. using new dissimilarity measures to deal with categorical objects, and
3. using a frequency based method to update modes of clusters.

These extensions allow us to use the k-means paradigm directly to cluster categorical data without need of data conversion.

Another advantage of the k-modes algorithm is that the modes give characteristic descriptions of clusters. These descriptions are very important to the user in interpreting clustering results.

Because data mining deals with very large data sets, scalability is a basic requirement to the data mining algorithms. Our experimental results have demonstrated that the k-modes algorithm is indeed scalable to very large and complex data sets in terms of both the number of records and the number of clusters. In fact the k-modes algorithm is faster than the k-means algorithm because our experiments have shown that the former often needs fewer iterations to converge than the latter.

Our future work plan is to develop and implement a parallel k-modes algorithm to cluster data sets with millions of objects. Such an algorithm is required in a number of data mining applications, such as partitioning very large heterogeneous sets of objects into a number of smaller and more manageable homogeneous subsets that can be more easily modelled and analysed, and detecting under-represented concepts, e.g., fraud in a very large number of insurance claims.
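The four steps of Section 4.4 can be sketched compactly. This is a minimal illustration under stated assumptions, not the author's implementation: it uses the simple matching measure of Eq. (2), the first initial mode selection method (first k distinct records), recomputes a mode from scratch instead of maintaining frequency tables, and moves an object only when another mode is strictly nearer. The names `k_modes`, `mode_of` and `max_passes` are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(x, y):
    """Eq. (2): number of attributes on which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def mode_of(objects):
    """Most frequent category of each attribute (one of possibly several modes)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


def k_modes(data, k, max_passes=100):
    # Step 1: the first k distinct records become the initial modes.
    modes = []
    for x in data:
        if x not in modes:
            modes.append(x)
        if len(modes) == k:
            break
    if len(modes) < k:
        raise ValueError("data has fewer than k distinct records")

    labels = [None] * len(data)
    members = [[] for _ in range(k)]  # indices of the objects in each cluster

    def nearest(x):
        return min(range(k), key=lambda l: matching_dissimilarity(x, modes[l]))

    def refresh(l):
        if members[l]:  # an emptied cluster keeps its last mode
            modes[l] = mode_of([data[i] for i in members[l]])

    # Step 2: allocate each object and update the receiving cluster's mode.
    for i, x in enumerate(data):
        l = nearest(x)
        labels[i] = l
        members[l].append(i)
        refresh(l)

    # Steps 3 and 4: retest all objects until a full pass moves nothing.
    for _ in range(max_passes):
        moved = False
        for i, x in enumerate(data):
            best, old = nearest(x), labels[i]
            if matching_dissimilarity(x, modes[best]) < matching_dissimilarity(x, modes[old]):
                members[old].remove(i)
                members[best].append(i)
                labels[i] = best
                refresh(old)
                refresh(best)
                moved = True
        if not moved:
            break

    cost = sum(matching_dissimilarity(x, modes[labels[i]]) for i, x in enumerate(data))
    return labels, modes, cost


data = [("a", "x", "p"), ("b", "y", "q"), ("a", "x", "q"),
        ("a", "x", "p"), ("b", "y", "q"), ("b", "y", "p")]
labels, modes, cost = k_modes(data, k=2)
```

Recomputing a mode with `mode_of` costs time proportional to the cluster size; an implementation aimed at large data sets would instead keep per-cluster category counts and adjust them as objects move. Since the result depends on the initial modes and the record order, several runs with different orders can be compared by their `cost` values.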
Acknowledgements

The author is grateful to Dr Markus Hegland at The Australian National University, Mr Peter Milne and Dr Graham Williams at CSIRO for their comments on the paper.

References

Anderberg, M. R. (1973) Cluster Analysis for Applications, Academic Press.
Ball, G. H. and Hall, D. J. (1967) A Clustering Technique for Summarizing Multivariate Data, Behavioral Science, 12.
Bezdek, J. C. (1980) A Convergence Theorem for the Fuzzy ISODATA Clustering Algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(8), pp. 1-8.
Bobrowski, L. and Bezdek, J. C. (1991) c-Means Clustering with the $l_1$ and $l_\infty$ Norms, IEEE Transactions on Systems, Man and Cybernetics, 21(3).
Fisher, D. H. (1987) Knowledge Acquisition Via Incremental Conceptual Clustering, Machine Learning, 2(2).
Goldberg, D. E. (1989) Genetic Algorithms in Search, Optimisation, and Machine Learning, Addison-Wesley.
Gowda, K. C. and Diday, E. (1991) Symbolic Clustering Using a New Dissimilarity Measure, Pattern Recognition, 24(6).
Gower, J. C. (1971) A General Coefficient of Similarity and Some of its Properties, BioMetrics, 27.
Greenacre, M. J. (1984) Theory and Applications of Correspondence Analysis, Academic Press.
Hand, D. J. (1981) Discrimination and Classification, John Wiley & Sons.
Huang, Z. (1997) Clustering Large Data Sets with Mixed Numeric and Categorical Values, In Proceedings of The First Pacific-Asia Conference on Knowledge Discovery and Data Mining, Singapore, World Scientific.
Jain, A. K. and Dubes, R. C. (1988) Algorithms for Clustering Data, Prentice Hall.
Kirkpatrick, S., Gelatt, C. D. and Vecchi, M. P. (1983) Optimisation by Simulated Annealing, Science, 220(4598).
Kodratoff, Y. and Tecuci, G. (1988) Learning Based on Conceptual Distance, IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6).
MacQueen, J. B. (1967) Some Methods for Classification and Analysis of Multivariate Observations, In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability.
Michalski, R. S. and Stepp, R. E. (1983) Automated Construction of Classifications: Conceptual Clustering Versus Numerical Taxonomy, IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(4).
Murtagh, F. (1992) Comments on Parallel Algorithms for Hierarchical Clustering and Cluster Validity, IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(10).
Murthy, C. A. and Chowdhury, N. (1996) In Search of Optimal Clusters Using Genetic Algorithms, Pattern Recognition Letters, 17.
Ralambondrainy, H. (1995) A Conceptual Version of the k-Means Algorithm, Pattern Recognition Letters, 16.
Rose, K., Gurewitz, E. and Fox, G. (1990) A Deterministic Annealing Approach to Clustering, Pattern Recognition Letters, 11.
Ruspini, E. R. (1969) A New Approach to Clustering, Information Control, 9.
Ruspini, E. R. (1973) New Experimental Results in Fuzzy Clustering, Information Sciences, 6.
Selim, S. Z. and Ismail, M. A. (1984) K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality, IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(1).
Shafer, J., Agrawal, R. and Mehta, M. (1996) SPRINT: A Scalable Parallel Classifier for Data Mining, In Proceedings of the 22nd VLDB Conference, Bombay, India.

Appendix

The theorem in Section 4.3 can be proved as follows ($A_j$ stands for $DOM(A_j)$ here):

Let $f_r(A_j = c_{k,j} \mid \mathbf{X}) = n_{c_{k,j}} / n$ be the relative frequency of category $c_{k,j}$ of attribute $A_j$, where n is the total number of objects in $\mathbf{X}$ and $n_{c_{k,j}}$ the number of objects having category $c_{k,j}$.

For the dissimilarity measure $d(x, y) = \sum_{j=1}^{m} \delta(x_j, y_j)$, we write

$$\sum_{i=1}^{n} d(X_i, Q) = \sum_{i=1}^{n} \sum_{j=1}^{m} \delta(x_{i,j}, q_j) = \sum_{j=1}^{m} \Big( \sum_{i=1}^{n} \delta(x_{i,j}, q_j) \Big) = \sum_{j=1}^{m} n \Big( 1 - \frac{n_{q_j}}{n} \Big) = \sum_{j=1}^{m} n \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big)$$

Because $n \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big) \ge 0$ for $1 \le j \le m$, $\sum_{i=1}^{n} d(X_i, Q)$ is minimised iff every $n \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big)$ is minimal. Thus, $f_r(A_j = q_j \mid \mathbf{X})$ must be maximal.

For the dissimilarity measure $d_{\chi^2}(x, y) = \sum_{j=1}^{m} \frac{(n_{x_j} + n_{y_j})}{n_{x_j}\, n_{y_j}}\, \delta(x_j, y_j)$, we write

$$\sum_{i=1}^{n} d_{\chi^2}(X_i, Q) = \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{(n_{x_{i,j}} + n_{q_j})}{n_{x_{i,j}}\, n_{q_j}}\, \delta(x_{i,j}, q_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( \frac{1}{n_{q_j}} + \frac{1}{n_{x_{i,j}}} \Big) \delta(x_{i,j}, q_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} \frac{\delta(x_{i,j}, q_j)}{n_{q_j}} + \sum_{j=1}^{m} \sum_{i=1}^{n} \frac{\delta(x_{i,j}, q_j)}{n_{x_{i,j}}}$$

Now we have

$$\sum_{i=1}^{n} \frac{\delta(x_{i,j}, q_j)}{n_{x_{i,j}}} = \sum_{k=1}^{n_{c_j}} \frac{n\, f_r(A_j = c_{k,j} \mid \mathbf{X})}{n_{c_{k,j}}} - \frac{n\, f_r(A_j = q_j \mid \mathbf{X})}{n_{q_j}} = n_{c_j} - 1$$

where $n_{c_j}$ is the number of categories in $A_j$ and $n_{c_{k,j}}$ the number of objects having category $c_{k,j}$. Consequently, we get

$$\sum_{i=1}^{n} d_{\chi^2}(X_i, Q) = \sum_{j=1}^{m} \frac{n}{n_{q_j}} \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big) + \sum_{j=1}^{m} (n_{c_j} - 1)$$

Because $\frac{n}{n_{q_j}} \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big) \ge 0$ and $\sum_{j=1}^{m} (n_{c_j} - 1)$ is a constant for a given $\mathbf{X}$, $\sum_{i=1}^{n} d_{\chi^2}(X_i, Q)$ is minimised iff every $\frac{n}{n_{q_j}} \big( 1 - f_r(A_j = q_j \mid \mathbf{X}) \big)$ is minimal. Thus, $f_r(A_j = q_j \mid \mathbf{X})$ must be maximal.
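The theorem can also be checked numerically on the small example set from Section 4.3. The sketch below is an illustration under the simple matching measure of Eq. (2) only: it enumerates every candidate vector built from the categories present in the set, and compares the minimisers of $D(Q, \mathbf{X})$ with the vectors formed from the most frequent category of each attribute. The helper names are hypothetical.

```python
from collections import Counter
from itertools import product


def total_dissimilarity(q, objects):
    """D(Q, X) under Eq. (2): summed mismatch counts between q and each object."""
    return sum(sum(a != b for a, b in zip(x, q)) for x in objects)


def all_modes(objects):
    """Every vector whose j-th entry is a most frequent category of attribute j."""
    choices = []
    for column in zip(*objects):
        counts = Counter(column)
        top = max(counts.values())
        choices.append(sorted(c for c, n in counts.items() if n == top))
    return [tuple(q) for q in product(*choices)]


X = [("a", "b"), ("a", "c"), ("c", "b"), ("b", "c")]  # example set from Section 4.3

# Brute force: evaluate D(Q, X) for every combination of observed categories.
domains = [sorted(set(column)) for column in zip(*X)]
costs = {q: total_dissimilarity(q, X) for q in product(*domains)}
best_cost = min(costs.values())
minimisers = sorted(q for q, c in costs.items() if c == best_cost)

frequency_modes = all_modes(X)
```

Both routes give the two vectors [a, b] and [a, c], which illustrates that the mode is not unique when an attribute has tied category frequencies.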


More information

IDENTIFICATION OF THE DYNAMICS OF THE GOOGLE S RANKING ALGORITHM. A. Khaki Sedigh, Mehdi Roudaki

IDENTIFICATION OF THE DYNAMICS OF THE GOOGLE S RANKING ALGORITHM. A. Khaki Sedigh, Mehdi Roudaki IDENIFICAION OF HE DYNAMICS OF HE GOOGLE S RANKING ALGORIHM A. Khak Sedgh, Mehd Roudak Cotrol Dvso, Departmet of Electrcal Egeerg, K.N.oos Uversty of echology P. O. Box: 16315-1355, ehra, Ira sedgh@eetd.ktu.ac.r,

More information

The Analysis of Development of Insurance Contract Premiums of General Liability Insurance in the Business Insurance Risk

The Analysis of Development of Insurance Contract Premiums of General Liability Insurance in the Business Insurance Risk The Aalyss of Developmet of Isurace Cotract Premums of Geeral Lablty Isurace the Busess Isurace Rsk the Frame of the Czech Isurace Market 1998 011 Scetfc Coferece Jue, 10. - 14. 013 Pavla Kubová Departmet

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

10.5 Future Value and Present Value of a General Annuity Due

10.5 Future Value and Present Value of a General Annuity Due Chapter 10 Autes 371 5. Thomas leases a car worth $4,000 at.99% compouded mothly. He agrees to make 36 lease paymets of $330 each at the begg of every moth. What s the buyout prce (resdual value of the

More information

of the relationship between time and the value of money.

of the relationship between time and the value of money. TIME AND THE VALUE OF MONEY Most agrbusess maagers are famlar wth the terms compoudg, dscoutg, auty, ad captalzato. That s, most agrbusess maagers have a tutve uderstadg that each term mples some relatoshp

More information
