A Comparative Study of Data Clustering Techniques


Khaled Hammouda
Prof. Fakhreddine Karray
Department of Systems Design Engineering
University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

Abstract: Data clustering is a process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than among groups. This paper reviews four of the most representative off-line clustering techniques: K-means clustering, Fuzzy C-means clustering, Mountain clustering, and Subtractive clustering. The techniques are implemented and tested against a medical problem of heart disease diagnosis. Performance and accuracy of the four techniques are presented and compared.

Index Terms: data clustering, k-means, fuzzy c-means, mountain, subtractive.

I. INTRODUCTION

DATA CLUSTERING is considered an interesting approach for finding similarities in data and putting similar data into groups. Clustering partitions a data set into several groups such that the similarity within a group is larger than that among groups [1]. The idea of data grouping, or clustering, is simple in its nature and is close to the human way of thinking; whenever we are presented with a large amount of data, we usually tend to summarize this huge amount of data into a small number of groups or categories in order to further facilitate its analysis. Moreover, most of the data collected in many problems seem to have some inherent properties that lend themselves to natural groupings. Nevertheless, finding these groupings or trying to categorize the data is not a simple task for humans unless the data is of low dimensionality (two or three dimensions at maximum). This is why some methods in soft computing have been proposed to solve this kind of problem. Those methods are called Data Clustering Methods and they are the subject of this paper.

Clustering algorithms are used extensively not only to organize and categorize data, but are also useful for data compression and model construction. By finding similarities in data, one can represent similar data with fewer symbols, for example. Also, if we can find groups of data, we can build a model of the problem based on those groupings. Another reason for clustering is to discover relevance knowledge in data. Francisco Azuaje et al. [2] implemented a Case Based Reasoning (CBR) system based on a Growing Cell Structure (GCS) model. Data can be stored in a knowledge base that is indexed or categorized by cases; this is what is called a Case Base. Each group of cases is assigned to a certain category. Using a Growing Cell Structure (GCS), data can be added or removed based on the learning scheme used. Later, when a query is presented to the model, the system retrieves the most relevant cases from the case base depending on how close those cases are to the query.

In this paper, four of the most representative off-line clustering techniques are reviewed: K-means (or Hard C-means) clustering, Fuzzy C-means clustering, Mountain clustering, and Subtractive clustering. These techniques are usually used in conjunction with radial basis function networks (RBFNs) and fuzzy modeling. The four techniques are implemented and tested against a medical diagnosis problem for heart disease.

The results are presented with a comprehensive comparison of the different techniques and of the effect of different parameters in the process.

The remainder of the paper is organized as follows. Section II presents an overview of data clustering and the underlying concepts. Section III presents each of the four clustering techniques in detail, along with the underlying mathematical foundations. Section IV introduces the implementation of the techniques and goes over the results of each technique, followed by a comparison of the results. A brief conclusion is presented in Section V. The MATLAB code listing of the four clustering techniques can be found in the appendix.

II. DATA CLUSTERING OVERVIEW

As mentioned earlier, data clustering is concerned with the partitioning of a data set into several groups such that the similarity within a group is larger than that among groups. This implies that the data set to be partitioned has to have an inherent grouping to some extent; otherwise, if the data is uniformly distributed, trying to find clusters of data will fail, or will lead to artificially introduced partitions. Another problem that may arise is the overlapping of data groups. Overlapping groupings sometimes reduce the efficiency of the clustering method, and this reduction is proportional to the amount of overlap between groupings.

Usually the techniques presented in this paper are used in conjunction with other sophisticated neural or fuzzy models. In particular, most of these techniques can be used as preprocessors for determining the initial locations for radial basis functions or fuzzy if-then rules.

The common approach of all the clustering techniques presented here is to find cluster centers that will represent each cluster. A cluster center is a way to tell where the heart of each cluster is located, so that later, when presented with an input vector, the system can tell which cluster this vector belongs to by measuring a similarity metric between the input vector and all the cluster centers, and determining which cluster is the nearest or most similar one.

Some of the clustering techniques rely on knowing the number of clusters a priori. In that case the algorithm tries to partition the data into the given number of clusters. K-means and Fuzzy C-means clustering are of that type. In other cases it is not necessary to have the number of clusters known from the beginning; instead, the algorithm starts by finding the first large cluster, and then goes on to find the second, and so on. Mountain and Subtractive clustering are of that type. In both cases a problem with a known number of clusters can be handled; however, if the number of clusters is not known, K-means and Fuzzy C-means clustering cannot be used.

Another aspect of clustering algorithms is their ability to be implemented in on-line or off-line mode. On-line clustering is a process in which each input vector is used to update the cluster centers according to this vector's position. The system in this case learns where the cluster centers are by introducing new input every time. In off-line mode, the system is presented with a training data set, which is used to find the cluster centers by analyzing all the input vectors in the training set. Once the cluster centers are found they are fixed, and they are used later to classify new input vectors. The techniques presented here are of the off-line type.

A brief overview of the four techniques is presented here. A full detailed discussion follows in the next section.

The first technique is K-means clustering [6] (or Hard C-means clustering, as compared to Fuzzy C-means clustering).
This technique has been applied to a variety of areas, including image and speech data compression [3, 4], data preprocessing for system modeling using radial basis function networks, and task decomposition in heterogeneous neural network architectures [5]. This algorithm relies on finding cluster centers by trying to minimize a cost function of dissimilarity (or distance) measure.

The second technique is Fuzzy C-means clustering, which was proposed by Bezdek in 1973 [1] as an improvement over the earlier Hard C-means clustering. In this technique each data point belongs to a cluster to a degree specified by a membership grade. As in K-means clustering, Fuzzy C-means clustering relies on minimizing a cost function of dissimilarity measure.

The third technique is Mountain clustering, proposed by Yager and Filev [1]. This technique calculates a mountain function (a density function) at every possible position in the data space, and chooses the position with the greatest density value as the center of the first cluster. It then destructs the effect of the first cluster's mountain function and finds the second cluster center. This process is repeated until the desired number of clusters has been found.

The fourth technique is Subtractive clustering, proposed by Chiu [1]. This technique is similar to mountain clustering, except that instead of calculating the density function at every possible position in the data space, it uses the positions of the data points to calculate the density function, thus reducing the number of calculations significantly.

III. DATA CLUSTERING TECHNIQUES

In this section a detailed discussion of each technique is presented. Implementation and results are presented in the following sections.

A. K-means Clustering

K-means clustering, or Hard C-means clustering, is an algorithm based on finding data clusters in a data set such that a cost function (or an objective function) of dissimilarity (or distance) measure is minimized [1]. In most cases this dissimilarity measure is chosen as the Euclidean distance.

A set of n vectors $x_j$, $j = 1, \ldots, n$, is to be partitioned into c groups $G_i$, $i = 1, \ldots, c$. The cost function, based on the Euclidean distance between a vector $x_k$ in group $i$ and the corresponding cluster center $c_i$, can be defined by:

$$J = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \Big( \sum_{k,\, x_k \in G_i} \| x_k - c_i \|^2 \Big), \qquad (1)$$

where $J_i = \sum_{k,\, x_k \in G_i} \| x_k - c_i \|^2$ is the cost function within group $i$.

The partitioned groups are defined by a $c \times n$ binary membership matrix $U$, where the element $u_{ij}$ is 1 if the j-th data point $x_j$ belongs to group $i$, and 0 otherwise. Once the cluster centers $c_i$ are fixed, the minimizing $u_{ij}$ for Equation (1) can be derived as follows:

$$u_{ij} = \begin{cases} 1 & \text{if } \| x_j - c_i \|^2 \le \| x_j - c_k \|^2 \text{ for each } k \ne i, \\ 0 & \text{otherwise,} \end{cases} \qquad (2)$$

which means that $x_j$ belongs to group $i$ if $c_i$ is the closest center among all centers. On the other hand, if the membership matrix is fixed, i.e. if $u_{ij}$ is fixed, then the optimal center $c_i$ that minimizes Equation (1) is the mean of all vectors in group $i$:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k, \qquad (3)$$

where $|G_i|$ is the size of $G_i$, or $|G_i| = \sum_{j=1}^{n} u_{ij}$.

The algorithm is presented with a data set $x_i$, $i = 1, \ldots, n$; it then determines the cluster centers $c_i$ and the membership matrix $U$ iteratively using the following steps:

Step 1: Initialize the cluster centers $c_i$, $i = 1, \ldots, c$. This is typically done by randomly selecting c points from among all of the data points.
Step 2: Determine the membership matrix U by Equation (2).
Step 3: Compute the cost function according to Equation (1). Stop if either it is below a certain tolerance value or its improvement over the previous iteration is below a certain threshold.
Step 4: Update the cluster centers according to Equation (3). Go to Step 2.

The performance of the K-means algorithm depends on the initial positions of the cluster centers, thus it is advisable to run the algorithm several times, each with a different set of initial cluster centers.
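As an illustration of the two alternating updates, one K-means iteration can be written compactly in vectorized MATLAB. This is only a sketch of Equations (1)-(3), not the appendix listing; it assumes the data points are the rows of x (m-by-n), the current centers are the rows of c (nc-by-n), and that no cluster loses all of its members.

% One K-means iteration: assign points to nearest centers, then
% move each center to the mean of its members.
for i = 1:nc
    dist2(i,:) = sum((x - repmat(c(i,:),m,1)).^2, 2)';  % squared distances
end
[dmin, nearest] = min(dist2);         % Equation (2): closest center wins
for i = 1:nc
    members = (nearest == i);
    c(i,:) = mean(x(members,:), 1);   % Equation (3): mean of group members
end
J = sum(dmin);                        % Equation (1): cost of this partition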

A discussion of the implementation issues is presented later in this paper.

B. Fuzzy C-means Clustering

Fuzzy C-means clustering (FCM) relies on the basic idea of Hard C-means clustering (HCM), with the difference that in FCM each data point belongs to a cluster to a degree of membership grade, while in HCM every data point either belongs to a certain cluster or not. So FCM employs fuzzy partitioning such that a given data point can belong to several groups, with the degree of belongingness specified by membership grades between 0 and 1. However, FCM still uses a cost function that is to be minimized while trying to partition the data set.

The membership matrix U is allowed to have elements with values between 0 and 1. However, the summation of degrees of belongingness of a data point to all clusters is always equal to unity:

$$\sum_{i=1}^{c} u_{ij} = 1, \qquad j = 1, \ldots, n. \qquad (4)$$

The cost function for FCM is a generalization of Equation (1):

$$J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^2, \qquad (5)$$

where $u_{ij}$ is between 0 and 1; $c_i$ is the cluster center of fuzzy group $i$; $d_{ij} = \| c_i - x_j \|$ is the Euclidean distance between the i-th cluster center and the j-th data point; and $m \in [1, \infty)$ is a weighting exponent.

The necessary conditions for Equation (5) to reach its minimum are

$$c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \qquad (6)$$

and

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}}. \qquad (7)$$

The algorithm works iteratively through the preceding two conditions until no more improvement is noticed. In a batch mode operation, FCM determines the cluster centers $c_i$ and the membership matrix U using the following steps:

Step 1: Initialize the membership matrix U with random values between 0 and 1 such that the constraints in Equation (4) are satisfied.
Step 2: Calculate c fuzzy cluster centers $c_i$, $i = 1, \ldots, c$, using Equation (6).
Step 3: Compute the cost function according to Equation (5). Stop if either it is below a certain tolerance value or its improvement over the previous iteration is below a certain threshold.
Step 4: Compute a new U using Equation (7). Go to Step 2.

As in K-means clustering, the performance of FCM depends on the initial membership matrix values; thereby it is advisable to run the algorithm several times, each time starting with different values of membership grades of the data points.
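Steps 2 and 4 of the FCM loop (Equations (6) and (7)) admit a similarly compact form. Again this is only a sketch under the same assumptions about x and c, with the membership matrix u of size nc-by-m and weighting exponent m_exp; note that when working with squared distances the exponent becomes 1/(m_exp - 1), which is equivalent to the 2/(m - 1) of Equation (7).

% Equation (6): centers as membership-weighted means.
um = u.^m_exp;
for i = 1:nc
    c(i,:) = (um(i,:)*x) ./ sum(um(i,:));
end
% Equation (7): memberships from the new centers.
for i = 1:nc
    d2(i,:) = sum((x - repmat(c(i,:),m,1)).^2, 2)';   % squared distances
end
for i = 1:nc
    for j = 1:m
        u(i,j) = 1 / sum((d2(i,j)./d2(:,j)).^(1/(m_exp-1)));
    end
end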

C. Mountain Clustering

The mountain clustering approach is a simple way to find cluster centers based on a density measure called the mountain function. This method is a simple way to find approximate cluster centers, and can be used as a preprocessor for other sophisticated clustering methods.

The first step in mountain clustering involves forming a grid on the data space, where the intersections of the grid lines constitute the potential cluster centers, denoted as a set V.

The second step entails constructing a mountain function representing a data density measure. The height of the mountain function at a point $v \in V$ is equal to

$$m(v) = \sum_{i=1}^{N} \exp\!\left( -\frac{\| v - x_i \|^2}{2\sigma^2} \right), \qquad (8)$$

where $x_i$ is the i-th data point and $\sigma$ is an application-specific constant. This equation states that the data density measure at a point v is affected by all the points $x_i$ in the data set, and this density measure is inversely proportional to the distance between the data points $x_i$ and the point under consideration v. The constant $\sigma$ determines the height as well as the smoothness of the resultant mountain function.

The third step involves selecting the cluster centers by sequentially destructing the mountain function. The first cluster center $c_1$ is determined by selecting the point with the greatest density measure. Obtaining the next cluster center requires eliminating the effect of the first cluster. This is done by revising the mountain function: a new mountain function is formed by subtracting a scaled Gaussian function centered at $c_1$:

$$m_{new}(v) = m(v) - m(c_1) \exp\!\left( -\frac{\| v - c_1 \|^2}{2\beta^2} \right). \qquad (9)$$

The subtracted amount eliminates the effect of the first cluster. Note that after subtraction, the new mountain function $m_{new}(v)$ reduces to zero at $v = c_1$. After subtraction, the second cluster center is selected as the point having the greatest value of the new mountain function. This process continues until a sufficient number of cluster centers is attained.

D. Subtractive Clustering

The problem with the previous clustering method, mountain clustering, is that its computation grows exponentially with the dimension of the problem, because the mountain function has to be evaluated at each grid point. Subtractive clustering solves this problem by using the data points as the candidates for cluster centers, instead of grid points as in mountain clustering. This means that the computation is now proportional to the problem size instead of the problem dimension. The actual cluster centers are not necessarily located at one of the data points, but in most cases this is a good approximation, especially with the reduced computation this approach introduces.

Since each data point is a candidate for cluster centers, a density measure at data point $x_i$ is defined as

$$D_i = \sum_{j=1}^{n} \exp\!\left( -\frac{\| x_i - x_j \|^2}{(r_a/2)^2} \right), \qquad (10)$$

where $r_a$ is a positive constant representing a neighborhood radius. Hence, a data point will have a high density value if it has many neighboring data points.

The first cluster center $x_{c_1}$ is chosen as the point having the largest density value $D_{c_1}$. Next, the density measure of each data point $x_i$ is revised as follows:

$$D_i = D_i - D_{c_1} \exp\!\left( -\frac{\| x_i - x_{c_1} \|^2}{(r_b/2)^2} \right), \qquad (11)$$

where $r_b$ is a positive constant which defines a neighborhood that has measurable reductions in density measure. Therefore, the data points near the first cluster center $x_{c_1}$ will have a significantly reduced density measure. After revising the density function, the next cluster center is selected as the point having the greatest density value. This process continues until a sufficient number of clusters is attained.
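For the two-cluster problem studied in the next section, the whole subtractive procedure reduces to a few lines. The sketch below assumes a helper density implementing Equation (10) (one possible implementation is given after the appendix) and data points in the rows of x (m-by-n); the radius value is illustrative.

% Subtractive clustering for two clusters.
ra = 0.5;                              % illustrative neighborhood radius
rb = 1.5*ra;
for i = 1:m
    D(i) = density(x(i,:), x, ra);     % Equation (10)
end
[Dc1, i1] = max(D);                    % first cluster center
c(1,:) = x(i1,:);
for i = 1:m
    % Equation (11): remove the influence of the first cluster
    D(i) = D(i) - Dc1*exp(-sum((x(i,:) - c(1,:)).^2) ./ ((rb/2)^2));
end
[Dc2, i2] = max(D);                    % second cluster center
c(2,:) = x(i2,:);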

IV. IMPLEMENTATION AND RESULTS

Having introduced the different clustering techniques and their basic mathematical foundations, we now turn to the discussion of these techniques on the basis of a practical study. This study involves the implementation of each of the four techniques introduced previously, and testing each one of them on a set of medical data related to a heart disease diagnosis problem.

The medical data used consists of 13 input attributes related to the clinical diagnosis of heart disease, and one output attribute which indicates whether the patient is diagnosed with the heart disease or not. The whole data set consists of 300 cases. The data set is partitioned into two sets: two-thirds of the data for training, and one-third for evaluation. The number of clusters into which the data set is to be partitioned is two; i.e. patients diagnosed with the heart disease, and patients not diagnosed with the heart disease.

Because of the high number of dimensions in the problem (13 dimensions), no visual representation of the clusters can be presented; only 2-D or 3-D clustering problems can be visually inspected. We therefore rely heavily on performance measures to evaluate the clustering techniques rather than on visual approaches.

As mentioned earlier, the similarity metric used to calculate the similarity between an input vector and a cluster center is the Euclidean distance. Since most similarity metrics are sensitive to large ranges of elements in the input vectors, each of the input variables must be normalized to within the unit interval [0, 1]; i.e. the data set has to be normalized to be within the unit hypercube.

Each clustering algorithm is presented with the training data set, and as a result two clusters are produced. The data in the evaluation set is then tested against the found clusters and an analysis of the results is conducted. The following sections present the results of each clustering technique, followed by a comparison of the four techniques. MATLAB code for each of the four techniques can be found in the appendix.

A. K-means Clustering

As mentioned in the previous section, K-means clustering works on finding the cluster centers by trying to minimize a cost function J. It alternates between updating the membership matrix and updating the cluster centers using Equations (2) and (3), respectively, until no further improvement in the cost function is noticed. Since the algorithm initializes the cluster centers randomly, its performance is affected by those initial cluster centers, so several runs of the algorithm are advised to get better results.

Evaluating the algorithm is realized by testing the accuracy of the evaluation set. After the cluster centers are determined, the evaluation data vectors are assigned to their respective clusters according to the distance between each vector and each of the cluster centers. An error measure is then calculated; the root mean square error (RMSE) is used for this purpose. Also an accuracy measure is calculated as the percentage of correctly classified vectors.

The algorithm was run 10 times to determine the best performance. Table 1 lists the results of those runs in terms of the number of iterations, RMSE, accuracy, and regression line slope; the accuracy achieved across the 10 runs was 78.0%, 78.0%, 80.0%, 78.0%, 60.0%, 51.0%, 51.0%, 80.0%, 80.0%, and 78.0%. Figure 1 shows a plot of the cost function over time for the best test case.

Table 1. K-means Clustering Performance Results.

Figure 1. K-means clustering cost function history (cost function vs. iteration).
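Both measures are computed the same way for all four techniques. As a minimal sketch, assuming evu holds the 0/1 assignments of the winning labeling and ev the true outputs:

% RMSE and accuracy of a 0/1 cluster assignment against the true labels.
rmse = norm(evu - ev) / sqrt(length(ev));        % root mean square error
accuracy = 100 * sum(evu == ev) / length(ev);    % percent correctly classified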

To further measure how accurately the identified clusters represent the actual classification of the data, a regression analysis is performed of the resultant clustering against the original classification. Performance is considered better if the regression line slope is close to 1. Figure 2 shows the regression analysis of the best test case.

Figure 2. Regression analysis of K-means clustering (data points, best linear fit, and the line A = T).

As seen from the results, the best case achieved 80% accuracy. This relatively moderate performance is related to the high dimensionality of the problem; having too many dimensions tends to disrupt the coupling of the data and introduces overlapping in some of these dimensions, which reduces the accuracy of clustering. It is noticed also that the cost function converges rapidly to a minimum value, as seen from the number of iterations in each test run. However, this has no effect on the accuracy measure.

B. Fuzzy C-means Clustering

FCM allows data points to have different degrees of membership to each of the clusters, thus eliminating the effect of the hard membership introduced by K-means clustering. This approach employs fuzzy measures as the basis for the membership matrix calculation and for the cluster center identification.

As is the case in K-means clustering, FCM starts by assigning random values to the membership matrix U, thus several runs have to be conducted to have a higher probability of getting good performance. However, the results showed no (or insignificant) variation in performance or accuracy when the algorithm was run several times.

For testing the results, every vector in the evaluation data set is assigned to one of the clusters with a certain degree of belongingness (as done in the training set). However, because the output values we have are crisp values (either 1 or 0), the evaluation set degrees of membership are defuzzified to be tested against the actual outputs.

The same performance measures applied in K-means clustering are used here; however, only the effect of the weighting exponent m is analyzed, since the random initial membership grades have an insignificant effect on the final cluster centers. Table 2 lists the results of the tests with the effect of varying the weighting exponent m; the accuracy for the tested values of m, in increasing order of m, was 78.0%, 78.0%, 77.0%, 78.0%, 79.0%, 77.0%, 77.0%, and 77.0%.

Table 2. Fuzzy C-means Clustering Performance Results.

It is noticed that very low or very high values of m reduce the accuracy; moreover, high values tend to increase the time taken by the algorithm to find the clusters.

A value of 2 seems adequate for this problem, since it gives good accuracy and requires a smaller number of iterations. Figure 3 shows the accuracy and number of iterations against the weighting exponent m.

Figure 3. Fuzzy C-means clustering performance (accuracy and number of iterations vs. the weighting exponent m).

In general, the FCM technique showed no improvement over K-means clustering for this problem. Both showed comparable accuracy; moreover, FCM was found to be slower than K-means because of the fuzzy calculations involved.

C. Mountain Clustering

Mountain clustering relies on dividing the data space into grid points and calculating a mountain function at every grid point. This mountain function is a representation of the density of data at this point.

The performance of mountain clustering is severely affected by the dimension of the problem; the computation needed rises exponentially with the dimension of the input data because the mountain function has to be evaluated at each grid point in the data space. For a problem with c clusters, n dimensions, m data points, and a grid size of g per dimension, the required number of calculations is

$$N = m\, g^{n} + (c-1)\, g^{n}, \qquad (12)$$

where the first term accounts for the first cluster and the second term for the remaining clusters. So for the problem at hand, with input data of 13 dimensions, 200 training inputs, and a grid size of 10 per dimension, the required number of mountain function calculations is approximately 2 x 10^15. In addition, the value of the mountain function needs to be stored for every grid point for later use in finding subsequent clusters, which requires g^n storage locations; for our problem this would be 10^13 storage locations. Obviously this is impractical for a problem of this dimension.

In order to be able to test this algorithm, the dimension of the problem has to be reduced to a reasonable number, e.g. 4 dimensions. This is achieved by randomly selecting 4 variables from the input data out of the original 13 and performing the test on those variables. Several tests involving differently selected random variables were conducted in order to have a better understanding of the results. Table 3 lists the results of 10 test runs on randomly selected variables; the accuracy achieved over the runs was 68.0%, 78.0%, 68.0%, 76.0%, 70.0%, 68.0%, 68.0%, 71.0%, 51.0%, and 78.0%.

Table 3. Mountain Clustering Performance Results.

The accuracy achieved ranged between 51% and 78%, with an average of 70%. Those results are quite discouraging compared to the results achieved with K-means and FCM clustering. This is due to the fact that not all of the variables of the input data contribute to the clustering process; only 4 are chosen at random to make it possible to conduct the tests. Moreover, even with only 4 attributes chosen for the tests, mountain clustering required far more time than any other technique during the tests.

This is because the number of computations required is exponentially proportional to the number of dimensions in the problem, as stated in Equation (12). So apparently mountain clustering is not suitable for problems of more than two or three dimensions.

D. Subtractive Clustering

This method is similar to mountain clustering, with the difference that the density function is calculated only at every data point instead of at every grid point, so the data points themselves are the candidates for cluster centers. This has the effect of reducing the number of computations significantly, making it linearly proportional to the number of input data points instead of exponentially proportional to the problem dimension. For a problem of c clusters and m data points, the required number of calculations is

$$N = m^{2} + (c-1)\, m, \qquad (13)$$

where the first term accounts for the first cluster and the second term for the remaining clusters. As seen from the equation, the number of calculations does not depend on the dimension of the problem. For the problem at hand, the number of computations required is only in the range of a few tens of thousands.

Since the algorithm is fixed and does not rely on any randomness, the results are fixed. However, we can test the effect of the two variables $r_a$ and $r_b$ on the accuracy of the algorithm. Those variables represent a radius of neighborhood beyond which the effect (or contribution) of other data points to the density function diminishes. Usually $r_b$ is taken to be $1.5 r_a$. Table 4 shows the results of varying $r_a$; the accuracy for increasing values of $r_a$ was 55.0%, 58.0%, 58.0%, 75.0%, 75.0%, 75.0%, 75.0%, 75.0%, and 58.0%. Figure 4 shows a plot of accuracy and RMSE against $r_a$.

Table 4. Subtractive Clustering Performance Results.

Figure 4. Subtractive clustering performance (accuracy and RMSE vs. r_a).

It is clear from the results that choosing $r_a$ very small or very large results in poor accuracy: if $r_a$ is chosen very small, the density function does not take into account the effect of neighboring data points, while if it is taken very large, the density function is affected by all the data points in the data space. So a value between 0.4 and 0.7 should be adequate for the radius of neighborhood.

As seen from Table 4, the maximum achieved accuracy was 75%, with an RMSE of 0.5. Compared to K-means and FCM, this result is a little behind the accuracy achieved by those techniques.

E. Results Summary and Comparison

According to the preceding discussion of the implementation of the four data clustering techniques and their results, it is useful to summarize the results and present some comparison of performance. A summary of the best achieved results for each of the four techniques is presented in Table 5, which compares the algorithms in terms of RMSE, accuracy, regression line slope, and computation time in seconds; the best accuracy achieved was 80% for K-means, 79% for FCM, 78% for Mountain, and 75% for Subtractive clustering.

Table 5. Performance Results Comparison.

From this comparison we can draw some remarks:

- K-means clustering produces fairly higher accuracy and lower RMSE than the other techniques, and requires less computation time.
- Mountain clustering has a very poor performance regarding its requirement for a huge number of computations and its low accuracy. However, we have to note that the tests conducted on mountain clustering were done using only part of the input variables in order to make it feasible to run the tests. Mountain clustering is suitable only for problems with two or three dimensions.
- FCM produces results close to those of K-means clustering, yet it requires more computation time than K-means because of the fuzzy measure calculations involved in the algorithm.
- In subtractive clustering, care has to be taken when choosing the value of the neighborhood radius $r_a$, since too small a radius results in neglecting the effect of neighboring data points, while too large a radius results in a neighborhood of all the data points, thus canceling the effect of the cluster.

Since none of the algorithms achieved sufficiently high accuracy rates, it is assumed that the problem data itself contains some overlapping in some of the dimensions; the high number of dimensions tends to disrupt the coupling of the data and reduce the accuracy of clustering.

As stated earlier in this paper, clustering algorithms are usually used in conjunction with radial basis function networks and fuzzy models. The techniques described here can be used as preprocessors for RBF networks for determining the centers of the radial basis functions. In such cases, more accuracy can be gained by using gradient descent or other advanced derivative-based optimization schemes for further refinement.

In fuzzy modeling, the cluster centers produced by the clustering techniques can be modeled as if-then rules, in ANFIS for example, where a training set (including inputs and outputs) is used to find cluster centers $(x_i, y_i)$ via clustering first, and a zero-order Sugeno fuzzy model is then formed in which the i-th rule is expressed as

If X is close to $x_i$ then Y is close to $y_i$,

which means that the i-th rule is based on the i-th cluster center identified by the clustering method. Again, after the structure is determined, backpropagation-type gradient descent and other optimization schemes can be applied to proceed with parameter identification.

V. CONCLUSION

Four clustering techniques have been reviewed in this paper, namely: K-means clustering, Fuzzy C-means clustering, Mountain clustering, and Subtractive clustering. These approaches solve the problem of categorizing data by partitioning a data set into a number of clusters based on some similarity measure, so that the similarity within each cluster is larger than among clusters. The four methods have been implemented and tested against a data set for medical diagnosis of heart disease. The comparative study done here is concerned with the accuracy of each algorithm, with care being taken toward the efficiency of calculation and other performance measures.

The medical problem presented has a high number of dimensions, which might involve some complicated relationships between the variables in the input data.

It was obvious that mountain clustering is not a good technique for problems with this high number of dimensions, due to its exponential proportionality to the dimension of the problem. K-means clustering seemed to outperform the other techniques for this type of problem. However, in problems where the number of clusters is not known, K-means and FCM cannot be used, leaving the choice only to mountain or subtractive clustering. Subtractive clustering seems to be a better alternative to mountain clustering, since it is based on the same idea and uses the data points as cluster center candidates instead of grid points; however, mountain clustering can lead to better results if the grid granularity is small enough to capture the potential cluster centers, though with the side effect of increasing the computation needed for the larger number of grid points.

Finally, the clustering techniques discussed here do not have to be used as stand-alone approaches; they can be used in conjunction with other neural or fuzzy systems for further refinement of the overall system performance.

VI. REFERENCES

[1] Jang, J.-S. R., Sun, C.-T., Mizutani, E., Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall, NJ, 1997.
[2] Azuaje, F., Dubitzky, W., Black, N., Adamson, K., "Discovering Relevance Knowledge in Data: A Growing Cell Structures Approach," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 30, No. 3, June 2000, p. 448.
[3] Lin, C., Lee, C., Neural Fuzzy Systems, Prentice Hall, NJ, 1996.
[4] Tsoukalas, L., Uhrig, R., Fuzzy and Neural Approaches in Engineering, John Wiley & Sons, Inc., NY, 1997.
[5] Nauck, D., Kruse, R., Klawonn, F., Foundations of Neuro-Fuzzy Systems, John Wiley & Sons Ltd., NY, 1997.
[6] Hartigan, J. A. and Wong, M. A., "A k-means clustering algorithm," Applied Statistics, Vol. 28, pp. 100-108, 1979.
[7] The MathWorks, Inc., Fuzzy Logic Toolbox: For Use with MATLAB, The MathWorks, Inc., 1999.

Appendix

K-means Clustering (MATLAB script)

% K-means clustering

% CLUSTERING PHASE

% Load the Training Set
TrSet = load('TrainingSet.txt');
[m,n] = size(TrSet);              % (m samples) x (n dimensions)

% the output (last column) values (0,1,2,3) are mapped to (0,1)
for i = 1:m
    if TrSet(i,end) >= 1
        TrSet(i,end) = 1;
    end
end

% find the range of each attribute (for normalization later)
for i = 1:n
    range(1,i) = min(TrSet(:,i));
    range(2,i) = max(TrSet(:,i));
end

x = Normalize(TrSet, range);      % normalize the data set to a hypercube
x(:,end) = [];                    % get rid of the output column
[m,n] = size(x);

nc = 2;                           % number of clusters = 2

% Initialize cluster centers to random points
c = zeros(nc,n);
for i = 1:nc
    rnd = int16(rand*m + 1);      % select a random vector from the input set
    c(i,:) = x(rnd,:);            % assign this vector value to cluster (i)
end

% Clustering Loop
delta = 1e-5;
n_iter = 1000;                    % maximum number of iterations
iter = 1;
while (iter < n_iter)
    % Determine the membership matrix U:
    % u(i,j) = 1 if euc_dist(x(j),c(i)) <= euc_dist(x(j),c(k)) for each k ~= i
    % u(i,j) = 0 otherwise
    for i = 1:nc
        for j = 1:m
            d = euc_dist(x(j,:),c(i,:));
            u(i,j) = 1;
            for k = 1:nc
                if k ~= i
                    if euc_dist(x(j,:),c(k,:)) < d
                        u(i,j) = 0;
                    end
                end
            end
        end
    end

    % Compute the cost function J
    J(iter) = 0;
    for i = 1:nc
        JJ(i) = 0;
        for k = 1:m
            if u(i,k) == 1
                JJ(i) = JJ(i) + euc_dist(x(k,:),c(i,:));
            end
        end
        J(iter) = J(iter) + JJ(i);
    end

    % Stop if either J is below a certain tolerance value,
    % or its improvement over the previous iteration is below a certain threshold
    str = sprintf('Iteration: %.0d, J=%d', iter, J(iter));
    disp(str);
    if (iter ~= 1) & (abs(J(iter-1) - J(iter)) < delta)
        break;
    end

    % Update the cluster centers:
    % c(i) = mean of all vectors belonging to cluster (i)
    for i = 1:nc
        sum_x = 0;
        G(i) = sum(u(i,:));
        for k = 1:m
            if u(i,k) == 1
                sum_x = sum_x + x(k,:);
            end
        end
        c(i,:) = sum_x ./ G(i);
    end

    iter = iter + 1;
end   % while

disp('Clustering Done.');

% TESTING PHASE

% Load the evaluation data set
EvalSet = load('EvaluationSet.txt');
[m,n] = size(EvalSet);
for i = 1:m
    if EvalSet(i,end) >= 1
        EvalSet(i,end) = 1;
    end
end
x = Normalize(EvalSet, range);
x(:,end) = [];
[m,n] = size(x);

% Assign evaluation vectors to their respective clusters according
% to their distance from the cluster centers
for i = 1:nc
    for j = 1:m
        d = euc_dist(x(j,:),c(i,:));
        evu(i,j) = 1;
        for k = 1:nc
            if k ~= i
                if euc_dist(x(j,:),c(k,:)) < d
                    evu(i,j) = 0;
                end
            end
        end
    end
end

% Analyze results
ev = EvalSet(:,end)';
rmse(1) = norm(evu(1,:)-ev)/sqrt(length(evu(1,:)));
rmse(2) = norm(evu(2,:)-ev)/sqrt(length(evu(2,:)));
if rmse(1) < rmse(2)
    r = 1;
else
    r = 2;
end
str = sprintf('Testing Set RMSE: %f', rmse(r));
disp(str);
ctr = 0;
for i = 1:m
    if evu(r,i) == ev(i)

        ctr = ctr + 1;
    end
end
str = sprintf('Testing Set accuracy: %.1f%%', ctr*100/m);
disp(str);
[m,b,r] = postreg(evu(r,:),ev);   % Regression Analysis
disp(sprintf('r = %.3f', r));

Fuzzy C-means Clustering (MATLAB script)

% Fuzzy C-means clustering

% CLUSTERING PHASE

% Load the Training Set
TrSet = load('TrainingSet.txt');
[m,n] = size(TrSet);              % (m samples) x (n dimensions)

% the output (last column) values (0,1,2,3) are mapped to (0,1)
for i = 1:m
    if TrSet(i,end) >= 1
        TrSet(i,end) = 1;
    end
end

% find the range of each attribute (for normalization later)
for i = 1:n
    range(1,i) = min(TrSet(:,i));
    range(2,i) = max(TrSet(:,i));
end

x = Normalize(TrSet, range);      % normalize the data set to a hypercube
x(:,end) = [];                    % get rid of the output column
[m,n] = size(x);

nc = 2;                           % number of clusters = 2

% Initialize the membership matrix with random values between 0 and 1
% such that the summation of membership degrees for each vector equals unity
u = zeros(nc,m);
for i = 1:m
    u(1,i) = rand;
    u(2,i) = 1 - u(1,i);
end

% Clustering Loop
m_exp = 2;                        % weighting exponent m
delta = 1e-5;
n_iter = 1000;
iter = 1;
while (iter < n_iter)
    % Calculate the fuzzy cluster centers
    for i = 1:nc
        sum_ux = 0;
        sum_u = 0;
        for j = 1:m
            sum_ux = sum_ux + (u(i,j)^m_exp)*x(j,:);
            sum_u = sum_u + (u(i,j)^m_exp);
        end
        c(i,:) = sum_ux ./ sum_u;
    end

    % Compute the cost function J
    J(iter) = 0;
    for i = 1:nc
        JJ(i) = 0;
        for j = 1:m
            JJ(i) = JJ(i) + (u(i,j)^m_exp)*euc_dist(x(j,:),c(i,:));
        end

        J(iter) = J(iter) + JJ(i);
    end

    % Stop if either J is below a certain tolerance value,
    % or its improvement over the previous iteration is below a certain threshold
    str = sprintf('Iteration: %.0d, J=%d', iter, J(iter));
    disp(str);
    if (iter ~= 1) & (abs(J(iter-1) - J(iter)) < delta)
        break;
    end

    % Update the membership matrix U
    % (euc_dist returns the squared distance, so the exponent 1/(m_exp-1)
    % is equivalent to the 2/(m-1) of Equation (7))
    for i = 1:nc
        for j = 1:m
            sum_d = 0;
            for k = 1:nc
                sum_d = sum_d + (euc_dist(c(i,:),x(j,:))/euc_dist(c(k,:),x(j,:)))^(1/(m_exp-1));
            end
            u(i,j) = 1/sum_d;
        end
    end

    iter = iter + 1;
end   % while

disp('Clustering Done.');

% TESTING PHASE

% Load the evaluation data set
EvalSet = load('EvaluationSet.txt');
[m,n] = size(EvalSet);
for i = 1:m
    if EvalSet(i,end) >= 1
        EvalSet(i,end) = 1;
    end
end
x = Normalize(EvalSet, range);
x(:,end) = [];
[m,n] = size(x);

% Assign evaluation vectors to their respective clusters according
% to their distance from the cluster centers
for i = 1:nc
    for j = 1:m
        sum_d = 0;
        for k = 1:nc
            sum_d = sum_d + (euc_dist(c(i,:),x(j,:))/euc_dist(c(k,:),x(j,:)))^(1/(m_exp-1));
        end
        evu(i,j) = 1/sum_d;
    end
end

% defuzzify the membership matrix
for j = 1:m
    if evu(1,j) >= evu(2,j)
        evu(1,j) = 1;
        evu(2,j) = 0;
    else
        evu(1,j) = 0;
        evu(2,j) = 1;
    end
end

% Analyze results
ev = EvalSet(:,end)';
rmse(1) = norm(evu(1,:)-ev)/sqrt(length(evu(1,:)));
rmse(2) = norm(evu(2,:)-ev)/sqrt(length(evu(2,:)));
if rmse(1) < rmse(2)
    r = 1;
else
    r = 2;
end

str = sprintf('Testing Set RMSE: %f', rmse(r));
disp(str);
ctr = 0;
for i = 1:m
    if evu(r,i) == ev(i)
        ctr = ctr + 1;
    end
end
str = sprintf('Testing Set accuracy: %.1f%%', ctr*100/m);
disp(str);
[m,b,r] = postreg(evu(r,:),ev);   % Regression Analysis
disp(sprintf('r = %.3f', r));

Mountain Clustering (MATLAB script)

% Mountain Clustering

% Setup the training data

% Load the Training Set
TrSet = load('TrainingSet.txt');
[m,n] = size(TrSet);              % (m samples) x (n dimensions)

% the output (last column) values (0,1,2,3) are mapped to (0,1)
for i = 1:m
    if TrSet(i,end) >= 1
        TrSet(i,end) = 1;
    end
end

% find the range of each attribute (for normalization later)
for i = 1:n
    range(1,i) = min(TrSet(:,i));
    range(2,i) = max(TrSet(:,i));
end

x = Normalize(TrSet, range);      % normalize the data set to a hypercube
x(:,end) = [];                    % get rid of the output column
[m,n] = size(x);

% Due to memory and speed limitations, the number of attributes
% will be set to a maximum of 4 attributes. Extra attributes will
% be dropped at random.
n_dropped = 0;
if n > 4
    for i = 1:(n-4)
        attr = ceil(rand*(n-i+1));
        x(:,attr) = [];
        dropped(i) = attr;        % save dropped attributes' positions
        n_dropped = n_dropped + 1;
    end
    [m,n] = size(x);
end

% First: setup a grid matrix of n dimensions (V)
% (n = the dimension of input data vectors)
% The gridding granularity is 'gr' = # of grid points per dimension
gr = 10;

% setup the dimension vector [d1 d2 d3 ... dn]
v_dim = gr * ones([1 n]);

% setup the mountain matrix
M = zeros(v_dim);
sigma = 0.1;

% Second: calculate the mountain function at every grid point

% setup some aiding variables:
% cur(i) = number of grid points spanned by dimensions 1..i
cur = ones([1 n]);
for i = 1:n
    for j = 1:i
        cur(i) = cur(i)*v_dim(j);
    end
end

max_M = 0;    % greatest density value
max_v = 0;    % cluster center position
disp('Finding Cluster 1...');

% loop over each grid point
for i = 1:cur(1,end)
    % calculate the vector indexes
    idx = i;
    for j = n:-1:2
        dim(j) = ceil(idx/cur(j-1));
        idx = idx - cur(j-1)*(dim(j)-1);
    end
    dim(1) = idx;

    % dim is holding the current point index vector,
    % but needs to be normalized to the range [0,1]
    v = dim ./ gr;

    % calculate the mountain function for the current point
    M(i) = mnt(v,x,sigma);
    if M(i) > max_M
        max_M = M(i);
        max_v = v;
        max_i = i;
    end

    % report progress
    if mod(i,5000) == 0
        str = sprintf('vector %.0d/%.0d; M(v)=%.1f', i, cur(1,end), M(i));
        disp(str);
    end
end

% Third: select the first cluster center by choosing the point
% with the greatest density value
c(1,:) = max_v;
c1 = max_i;
str = sprintf('Cluster 1:');      disp(str);
str = sprintf('%4.1f', c(1,:));   disp(str);
str = sprintf('M=%.3f', max_M);   disp(str);

% CLUSTER 2
Mnew = zeros(v_dim);
max_M = 0;
max_v = 0;
beta = 0.1;
disp('Finding Cluster 2...');
for i = 1:cur(1,end)
    % calculate the vector indexes
    idx = i;
    for j = n:-1:2
        dim(j) = ceil(idx/cur(j-1));
        idx = idx - cur(j-1)*(dim(j)-1);
    end
    dim(1) = idx;

    % dim is holding the current point index vector,

    % but needs to be normalized to the range [0,1]
    v = dim ./ gr;

    % calculate the REVISED mountain function for the current point
    Mnew(i) = M(i) - M(c1)*exp((-euc_dist(v,c(1,:)))./(2*beta^2));
    if Mnew(i) > max_M
        max_M = Mnew(i);
        max_v = v;
        max_i = i;
    end

    % report progress
    if mod(i,5000) == 0
        str = sprintf('vector %.0d/%.0d; Mnew(v)=%.1f', i, cur(1,end), Mnew(i));
        disp(str);
    end
end

c(2,:) = max_v;
str = sprintf('Cluster 2:');      disp(str);
str = sprintf('%4.1f', c(2,:));   disp(str);
str = sprintf('M=%.3f', max_M);   disp(str);

% Evaluation

% Load the evaluation data set
EvalSet = load('EvaluationSet.txt');
[m,n] = size(EvalSet);
for i = 1:m
    if EvalSet(i,end) >= 1
        EvalSet(i,end) = 1;
    end
end
x = Normalize(EvalSet, range);
x(:,end) = [];
[m,n] = size(x);

% drop the attributes corresponding to the ones dropped in the training set
for i = 1:n_dropped
    x(:,dropped(i)) = [];
end
[m,n] = size(x);

% Assign every test vector to its nearest cluster
for i = 1:2
    for j = 1:m
        d = euc_dist(x(j,:),c(i,:));
        evu(i,j) = 1;
        for k = 1:2
            if k ~= i
                if euc_dist(x(j,:),c(k,:)) < d
                    evu(i,j) = 0;
                end
            end
        end
    end
end

% Analyze results
ev = EvalSet(:,end)';
rmse(1) = norm(evu(1,:)-ev)/sqrt(length(evu(1,:)));
rmse(2) = norm(evu(2,:)-ev)/sqrt(length(evu(2,:)));
if rmse(1) < rmse(2)
    r = 1;
else
    r = 2;
end

str = sprintf('Testing Set RMSE: %f', rmse(r));
disp(str);
ctr = 0;
for i = 1:m
    if evu(r,i) == ev(i)
        ctr = ctr + 1;
    end
end
str = sprintf('Testing Set accuracy: %.1f%%', ctr*100/m);
disp(str);
[m,b,r] = postreg(evu(r,:),ev);   % Regression Analysis
disp(sprintf('r = %.3f', r));

Subtractive Clustering (MATLAB script)

% Subtractive Clustering

% Setup the training data

% Load the Training Set
TrSet = load('TrainingSet.txt');
[m,n] = size(TrSet);              % (m samples) x (n dimensions)

% the output (last column) values (0,1,2,3) are mapped to (0,1)
for i = 1:m
    if TrSet(i,end) >= 1
        TrSet(i,end) = 1;
    end
end

% find the range of each attribute (for normalization later)
for i = 1:n
    range(1,i) = min(TrSet(:,i));
    range(2,i) = max(TrSet(:,i));
end

x = Normalize(TrSet, range);      % normalize the data set to a hypercube
x(:,end) = [];                    % get rid of the output column
[m,n] = size(x);

% First: Initialize the density matrix and some variables
D = zeros([m 1]);
ra = 1.0;

% Second: calculate the density function at every data point
% setup some aiding variables
max_D = 0;    % greatest density value
max_x = 0;    % cluster center position
disp('Finding Cluster 1...');

% loop over each data point
for i = 1:m
    % calculate the density function for the current point
    D(i) = density(x(i,:),x,ra);
    if D(i) > max_D
        max_D = D(i);
        max_x = x(i,:);
        max_i = i;
    end

    % report progress
    if mod(i,50) == 0
        str = sprintf('vector %.0d/%.0d; D(v)=%.1f', i, m, D(i));
        disp(str);
    end
end

% Third: select the first cluster center by choosing the point
% with the greatest density value
c(1,:) = max_x;
c1 = max_i;
str = sprintf('Cluster 1:');      disp(str);
str = sprintf('%4.1f', c(1,:));   disp(str);
str = sprintf('D=%.3f', max_D);   disp(str);

% CLUSTER 2
Dnew = zeros([m 1]);
max_D = 0;
max_x = 0;
rb = 1.5*ra;
disp('Finding Cluster 2...');
for i = 1:m
    % calculate the REVISED density function for the current point
    Dnew(i) = D(i) - D(c1)*exp((-euc_dist(x(i,:),c(1,:)))./((rb/2)^2));
    if Dnew(i) > max_D
        max_D = Dnew(i);
        max_x = x(i,:);
        max_i = i;
    end

    % report progress
    if mod(i,50) == 0
        str = sprintf('vector %.0d/%.0d; Dnew(v)=%.1f', i, m, Dnew(i));
        disp(str);
    end
end

c(2,:) = max_x;
str = sprintf('Cluster 2:');      disp(str);
str = sprintf('%4.1f', c(2,:));   disp(str);
str = sprintf('D=%.3f', max_D);   disp(str);

% Evaluation

% Load the evaluation data set
EvalSet = load('EvaluationSet.txt');
[m,n] = size(EvalSet);
for i = 1:m
    if EvalSet(i,end) >= 1
        EvalSet(i,end) = 1;
    end
end
x = Normalize(EvalSet, range);
x(:,end) = [];
[m,n] = size(x);

% Assign every test vector to its nearest cluster
for i = 1:2
    for j = 1:m

        dd = euc_dist(x(j,:),c(i,:));
        evu(i,j) = 1;
        for k = 1:2
            if k ~= i
                if euc_dist(x(j,:),c(k,:)) < dd
                    evu(i,j) = 0;
                end
            end
        end
    end
end

% Analyze results
ev = EvalSet(:,end)';
rmse(1) = norm(evu(1,:)-ev)/sqrt(length(evu(1,:)));
rmse(2) = norm(evu(2,:)-ev)/sqrt(length(evu(2,:)));
if rmse(1) < rmse(2)
    r = 1;
else
    r = 2;
end
str = sprintf('Testing Set RMSE: %f', rmse(r));
disp(str);
ctr = 0;
for i = 1:m
    if evu(r,i) == ev(i)
        ctr = ctr + 1;
    end
end
str = sprintf('Testing Set accuracy: %.1f%%', ctr*100/m);
disp(str);
[m,b,r] = postreg(evu(r,:),ev);   % Regression Analysis
disp(sprintf('r = %.3f', r));
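The scripts above rely on four helper routines (Normalize, euc_dist, mnt, and density) whose listings did not survive in this copy of the appendix. The following are minimal sketches of what they plausibly look like, reconstructed from how they are called and from Equations (8) and (10); they are assumptions, not the original listings. Each function goes in its own M-file. euc_dist is assumed to return the squared Euclidean distance, which is consistent with its use in the cost functions and in the Gaussian terms of Equations (9) and (11).

function y = Normalize(A, range)
% Assumed helper: scale each column of A to [0,1] using the per-column
% minima (range(1,:)) and maxima (range(2,:)) found on the training set.
[m,n] = size(A);
y = zeros(m,n);
for i = 1:n
    y(:,i) = (A(:,i) - range(1,i)) ./ (range(2,i) - range(1,i));
end

function d = euc_dist(a, b)
% Assumed helper: squared Euclidean distance between row vectors a and b.
d = sum((a - b).^2);

function h = mnt(v, x, sigma)
% Assumed helper: mountain function of Equation (8) at grid point v,
% summed over all data points (the rows of x).
h = sum(exp(-sum((x - repmat(v,size(x,1),1)).^2, 2) ./ (2*sigma^2)));

function D = density(xi, x, ra)
% Assumed helper: subtractive-clustering density measure of Equation (10)
% at data point xi, summed over all data points (the rows of x).
D = sum(exp(-sum((x - repmat(xi,size(x,1),1)).^2, 2) ./ ((ra/2)^2)));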


More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

Implementation of Deutsch's Algorithm Using Mathcad

Implementation of Deutsch's Algorithm Using Mathcad Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College Feature selecton for ntruson detecton Slobodan Petrovć NISlab, Gjøvk Unversty College Contents The feature selecton problem Intruson detecton Traffc features relevant for IDS The CFS measure The mrmr measure

More information

Fixed income risk attribution

Fixed income risk attribution 5 Fxed ncome rsk attrbuton Chthra Krshnamurth RskMetrcs Group chthra.krshnamurth@rskmetrcs.com We compare the rsk of the actve portfolo wth that of the benchmark and segment the dfference between the two

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,

More information

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,

More information

Period and Deadline Selection for Schedulability in Real-Time Systems

Period and Deadline Selection for Schedulability in Real-Time Systems Perod and Deadlne Selecton for Schedulablty n Real-Tme Systems Thdapat Chantem, Xaofeng Wang, M.D. Lemmon, and X. Sharon Hu Department of Computer Scence and Engneerng, Department of Electrcal Engneerng

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

Data Visualization by Pairwise Distortion Minimization

Data Visualization by Pairwise Distortion Minimization Communcatons n Statstcs, Theory and Methods 34 (6), 005 Data Vsualzaton by Parwse Dstorton Mnmzaton By Marc Sobel, and Longn Jan Lateck* Department of Statstcs and Department of Computer and Informaton

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

Cluster Analysis. Cluster Analysis

Cluster Analysis. Cluster Analysis Cluster Analyss Cluster Analyss What s Cluster Analyss? Types of Data n Cluster Analyss A Categorzaton of Maor Clusterng Methos Parttonng Methos Herarchcal Methos Densty-Base Methos Gr-Base Methos Moel-Base

More information

Tools for Privacy Preserving Distributed Data Mining

Tools for Privacy Preserving Distributed Data Mining Tools for Prvacy Preservng Dstrbuted Data Mnng hrs lfton, Murat Kantarcoglu, Jadeep Vadya Purdue Unversty Department of omputer Scences 250 N Unversty St West Lafayette, IN 47907-2066 USA (clfton, kanmurat,

More information

Loop Parallelization

Loop Parallelization - - Loop Parallelzaton C-52 Complaton steps: nested loops operatng on arrays, sequentell executon of teraton space DECLARE B[..,..+] FOR I :=.. FOR J :=.. I B[I,J] := B[I-,J]+B[I-,J-] ED FOR ED FOR analyze

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent

More information

Fuzzy Regression and the Term Structure of Interest Rates Revisited

Fuzzy Regression and the Term Structure of Interest Rates Revisited Fuzzy Regresson and the Term Structure of Interest Rates Revsted Arnold F. Shapro Penn State Unversty Smeal College of Busness, Unversty Park, PA 68, USA Phone: -84-865-396, Fax: -84-865-684, E-mal: afs@psu.edu

More information

The Journal of Systems and Software

The Journal of Systems and Software The Journal of Systems and Software 82 (2009) 241 252 Contents lsts avalable at ScenceDrect The Journal of Systems and Software journal homepage: www. elsever. com/ locate/ jss A study of project selecton

More information

Inter-Ing 2007. INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007.

Inter-Ing 2007. INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007. Inter-Ing 2007 INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007. UNCERTAINTY REGION SIMULATION FOR A SERIAL ROBOT STRUCTURE MARIUS SEBASTIAN

More information

Sensor placement for leak detection and location in water distribution networks

Sensor placement for leak detection and location in water distribution networks Sensor placement for leak detecton and locaton n water dstrbuton networks ABSTRACT R. Sarrate*, J. Blesa, F. Near, J. Quevedo Automatc Control Department, Unverstat Poltècnca de Catalunya, Rambla de Sant

More information

Implementations of Web-based Recommender Systems Using Hybrid Methods

Implementations of Web-based Recommender Systems Using Hybrid Methods Internatonal Journal of Computer Scence & Applcatons Vol. 3 Issue 3, pp 52-64 2006 Technomathematcs Research Foundaton Implementatons of Web-based Recommender Systems Usng Hybrd Methods Janusz Sobeck Insttute

More information

v a 1 b 1 i, a 2 b 2 i,..., a n b n i.

v a 1 b 1 i, a 2 b 2 i,..., a n b n i. SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are

More information

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB.

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. INDEX 1. Load data usng the Edtor wndow and m-fle 2. Learnng to save results from the Edtor wndow. 3. Computng the Sharpe Rato 4. Obtanng the Treynor Rato

More information

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000 Problem Set 5 Solutons 1 MIT s consderng buldng a new car park near Kendall Square. o unversty funds are avalable (overhead rates are under pressure and the new faclty would have to pay for tself from

More information

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ Effcent Strpng Technques for Varable Bt Rate Contnuous Meda Fle Servers æ Prashant J. Shenoy Harrck M. Vn Department of Computer Scence, Department of Computer Scences, Unversty of Massachusetts at Amherst

More information

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University Characterzaton of Assembly Varaton Analyss Methods A Thess Presented to the Department of Mechancal Engneerng Brgham Young Unversty In Partal Fulfllment of the Requrements for the Degree Master of Scence

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information

Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms

Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms Internatonal Journal of Appled Informaton Systems (IJAIS) ISSN : 2249-0868 Foundaton of Computer Scence FCS, New York, USA Volume 7 No.7, August 2014 www.jas.org Cluster Analyss of Data Ponts usng Parttonng

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

A machine vision approach for detecting and inspecting circular parts

A machine vision approach for detecting and inspecting circular parts A machne vson approach for detectng and nspectng crcular parts Du-Mng Tsa Machne Vson Lab. Department of Industral Engneerng and Management Yuan-Ze Unversty, Chung-L, Tawan, R.O.C. E-mal: edmtsa@saturn.yzu.edu.tw

More information

Activity Scheduling for Cost-Time Investment Optimization in Project Management

Activity Scheduling for Cost-Time Investment Optimization in Project Management PROJECT MANAGEMENT 4 th Internatonal Conference on Industral Engneerng and Industral Management XIV Congreso de Ingenería de Organzacón Donosta- San Sebastán, September 8 th -10 th 010 Actvty Schedulng

More information

An Adaptive and Distributed Clustering Scheme for Wireless Sensor Networks

An Adaptive and Distributed Clustering Scheme for Wireless Sensor Networks 2007 Internatonal Conference on Convergence Informaton Technology An Adaptve and Dstrbuted Clusterng Scheme for Wreless Sensor Networs Xnguo Wang, Xnmng Zhang, Guolang Chen, Shuang Tan Department of Computer

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

Lecture 3: Force of Interest, Real Interest Rate, Annuity

Lecture 3: Force of Interest, Real Interest Rate, Annuity Lecture 3: Force of Interest, Real Interest Rate, Annuty Goals: Study contnuous compoundng and force of nterest Dscuss real nterest rate Learn annuty-mmedate, and ts present value Study annuty-due, and

More information