Searching for Interacting Features for Spam Filtering

Size: px
Start display at page:

Download "Searching for Interacting Features for Spam Filtering"

Transcription

1 Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng , Chna 2 Software Insttute, Nanjng Unversty, Nanjng, Chna 3 Department of Electrcal Engneerng, Helsnk Unversty of Technology, Otakaar 5 A, Espoo, Fnland Abstract. In ths paper, we propose a novel feature selecton method INTERACT to select relevant words of emals for spam emal flterng,.e. classfyng an emal as spam or legtmate. Four tradtonal feature selecton methods n text categorzaton doman, Informaton Gan, Gan Rato, Ch Squared, and RelefF, are also used for performance comparson. Three classfers, Support Vector Machne (SVM), Naïve Bayes and a novel classfer Locally Weghted learnng wth Naïve Bayes (LWNB) are dscussed n ths paper. Four popular datasets are employed as the benchmark corpora n our experments to examne the capabltes of these fve feature selecton methods and the three classfers. In our smulatons, we dscover that the LWNB mproves the Naïve Bayes and gan hgher predcton results by learnng local models, and ts performance s sometmes better than that of the SVM. Our study also shows the INTERACT can result n better performances of classfers than the other four tradtonal methods for the spam emal flterng. Key words: Interactng Features, Feature Selecton, Naïve Bayes, Spam Flterng. 1 Introducton The ncreasng popularty of electronc mals has ntrgued drect marketers to flood the malboxes of mllons of users wth unsolcted messages. These messages are usually referred to as spam or, more formally, Unsolcted Bulk E-mal (UBE), and may advertse anythng, from vacatons to get-rch schemes [1]. The negatve effect of spam has nfluenced people s daly lves: fllng malboxes, engulfng mportant personal mals, wastng network bandwdth, consumng users' tme and energy to solve t, not to menton all the other problems assocated wth t (crashed mal-servers, pornography advertsements sent to chldren, etc.). A study n 1997 ndcated that the spam messages consttuted approxmately 10% of the ncomng messages to a corporate network [4]. CAUBE.AU reports that ther statstcs show the volume of spam s ncreasng at an alarmng rate, and some people clam they are even Correspondng author.

2 abandonng ther emal accounts because of spam [3]. Ths stuaton seems to be worsenng wth tme, and wthout approprate counter-measures, spam messages could eventually undermne the usablty of e-mals. These serous threats from spam make the spam flterng, whose task s to rule out unsolcted emals automatcally from the emal stream, more mportant and n need of solvng. In recent years, many studes address the ssue of spam flterng based on machne learnng, because the attempts to ntroduce legal measures aganst spam malng have lmted effect. Several supervsed learnng algorthms have been successfully appled to spam flterng: Naïve Bayes [5,6,7,8], Support Vector Machne [9,10], Memory Based Learnng methods [11,12], and Decson Tree [13]. Among these classfcaton methods, the Naïve Bayes s partcularly attractve for spam flterng, as ts performance s surprsngly good [12]. The Naïve Bayes classfer has been the flterng engne of many commercal ant-spam software. Therefore, n ths paper, we am at mprovng the predcton ablty of the Naïve Bayes by ntroducng locally learned model. In order to tran or test classfers, t s necessary to go through large corpus wth spam and legtmate emals. E-mals of corpuses have to be preprocessed to extract ther words (features) belongng to the message subjects, the bodes and/or the attachments. As the number of features n a corpus can end up beng very hgh, t s usual to choose those features that better represent each message before carryng out the flter tranng to prevent the classfers from over-fttng [14]. The effectveness of the classfers reles on the approprate choce of these features, the preprocessng steps of the e-mal features extracton, and the selecton of the most representatve features are crucal for the performance of the flters [15]. In ths paper, a novel feature selecton method INTERACT and a novel classfer LWNB are ntroduced to deal wth spam flterng. The remander of ths paper s organzed as follows. Secton 2 demonstrates the INTERACT algorthm for the spam flterng. We explan the prncples of the e-mal representaton and preprocessng n Secton 3. Classfers used n ths paper are presented n Secton 4. We report the performances of the four feature selecton methods and three classfers usng F measure and accuracy n Secton 5. Secton 6 concludes our study wth a few remarks and conclusons. 2 INTERACT Algorthm Interactng features challenge the current feature selecton methods for classfcaton. A feature by tself may have lttle correlaton wth the target concept. However, when t s combned wth some other features, they can be strongly correlated wth the target concept [2]. Many tradtonal feature selecton methods usually unntentonally remove these features, and thus result n the poor classfcaton performances. The INTERACT algorthm can effcently handle the feature nteracton wth much lower tme cost than the tradtonal methods. A bref descrpton of the INTERACT algorthm s presented below, and more detals can be found n [2]. The INTERACT algorthm searches for the nteractng features by solvng two key problems: how to update c-contrbuton effectvely, and how to deal wth the feature

3 order problem? C-contrbuton of a feature s an ndcator about how sgnfcantly the elmnaton of that feature wll affect consstency. Especally, the C-contrbuton of an rrelevant feature s zero. To solve the frst problem, the INTERACT algorthm calculates the C-contrbuton effcently wth a hashng mechansm [2]: each nstance s nserted nto the hash table, and ts values of those features n S lst are used as the hash keys, where S lst s the set of the ranked features not yet elmnated (S lst s ntalzed wth the full set of features). Instances wth the same hash keys wll be nserted nto the same entry n the hash table and cover the old nformaton of the labels. For the second problem, we assume that the set of features can be dvded nto subset S 1 ncludng relevant features, and subset S 2 contanng rrelevant ones. The INTERACT algorthm ntends to remove the features n S 2 frst, and preserve features n S 1, whch more probably reman n the fnal set of selected features. The INTERACT algorthm acheves ths target by applyng a heurstc to rank the ndvdual features usng symmetrcal uncertanty (SU) n an descendng order so that the (heurstcally) most relevant feature s postoned at the begnnng of the lst. SU has been descrbed n the nformaton theory books and numercal recpes. It s often used as a fast correlaton measure to evaluate the relevance of ndvdual features [12,17]. The INTERACT s a flterng algorthm that employs backward elmnaton to remove the features wth no or low C-contrbuton. Gven a full set wth N features and a class attrbute C, the INTERACT fnds a feature subset S best for the class concept [2]. The algorthm conssts of two major parts: frstly, the features are ranked n the descendng order based on ther Symmetrcal Uncertanty values; secondly, the features are evaluated one by one startng from the end of the ranked feature lst. The process s shown as follows. Algorthm 1. INTERACT Algorthm. Input: Output: Process: F s the full features set wth N features{f 1,F 2,, F N }; C s the class label; δ s a predefned threshold. S best s subset of selected features. S best = for = 1 to N then calculate SU F,c for F append F to S best end sort S best n descendng order accordng to SU,c F last element of S best repeat f F NULL then p c-contrbuton of F f p δ then remove F from S best end end untl F = NULL return S best

4 3 Preprocessng of Corpus and Message Representaton 3.1 Feature Selecton Methods for Comparson Other four feature selecton methods are used n ths paper to test the capablty of the 2 INTERACT algorthm. They are Ch Squared (.e. χ ) Statstc, Informaton Gan, Gan Rato, and RelefF. Ther defntons are gven as follows. In the followng formulas, m s the number of classes (n spam flterng doman, m s 2), and C denotes the th class. V represents the number of parttons a feature can splt the tranng set nto. Let N s the total number of samples, and NC s that of class ( v). In the vth partton, NC denotes the number of samples belongng to class. Ch Squared: The Ch Squared Statstc s calculated by comparng the obtaned frequency wth the pror frequency of the same class. The defnton s: v ( ) ( v) m V v 2 ( N ) 2 C NC χ =. (1) ( v) = 1 v= 1 N C where ( ) ( v) NC = ( N / N) N C denotes the pror frequency. Informaton Gan: Informaton Gan s based on the feature s mpact on the decreasng entropy, and s defned as follows: N N N N InfoGan = [ ( )log( )] [ ( ) ( )log( )] m V ( v) m ( v) ( v) C C N C C ( v) ( v) = 1 N N v= 1 N = 1 N N. (2) Gan Rato: Gan Rato s frstly used n C4.5, whch s defned as (3): m NC N C GanRato = InfoGan /[ ( )log( )]. (3) = 1 N N RelefF: The key dea of Relef s to estmate the features accordng to how well ther values dstngush among the nstances that are near to each other. The RelefF s an extenson of the Relef, mprovng the Relef algorthm by estmatng the probabltes more relably and extendng to deal wth the ncomplete and multclass data sets. More detals can be found n [17]. 3.2 Corpus Preprocessng and Message Representaton Each e-mal n the corpora s represented as a set of words. After analyzng all the e- mals of a corpus, a dctonary wth N words/features s formed. Every e-mal s represented as a feature vector ncludng N elements, and the th word of the vector s a bnary varable representng whether ths word s n ths e-mal. Durng preprocessng, we perform the word stemmng, stop-word removable and Document

5 Frequency Threshold (DFT), n order to reduce the dmenson of feature space. The HTML tags of the e-mals are also removed durng preprocessng. Fnally, we extract the frst 5,000 tokens of the dctonary accordng to ther mutual nformaton to form the corpora used n ths paper. 4 Classfers for Spam Flterng In ths paper, we use three classfers to test the capabltes of the aforementoned feature selecton methods. The three classfers are Support Vector Machne (SVM), Naïve Bayes, and Locally Weghted learnng wth Naïve Bayes (LWNB) that s an mprovement of Naïve Bayes frstly ntroduced nto spam flterng doman by us. We here only brefly ntroduce the LWNB, and more detals can be found n [1]. In the LWNB, the Naïve Bayes s learned locally n the same way as the lnear regresson s used n locally weghted lnear regresson. A local Naïve Bayes model s ft to a subset of the data that s n the neghborhood of the nstance, whose class value s to be predcted [1]. The tranng samples n ths neghborhood are weghted, and further ones are assgned wth less weght. The classfcaton s then obtaned from these Naïve Bayes models. The subset of the data used to tran each locally weghted Naïve Bayes model s determned by a nearest neghbors algorthm. In the LWNB, the frst k nearest neghbors are selected to form ths subset, where k s a user-specfed parameter. How to determne the weght of each nstance of the subset? As n [1], we use a lnear weghtng functon n our experments, whch s defned as: f = 1 d / d, (4) lnear k where d s the Eucldean dstance to the th nearest neghbor x. Obvously, by usng f lnear, the weght decreases lnearly wth the dstance. Emprcal study shows the LWNB s not partcularly senstve to the choce of k as long as k s not too small [1]. Too small k may cause the local Naïve Bayes model to ft the nose n the data. The Naïve Bayes calculates the posteror probablty of class c for a test nstance wth m attrbute values a 1, a 2,, a m as follows: pc ( ) pa ( c) m l 1 2 m l j= 1 j l = C m pc 1 j= 1 paj c = pc ( a, a,..., a ) [ ( ) ( )], (5) where C s the total number of classes. In the LWNB, the ndvdual probabltes on the rght-hand sde of (5) are estmated based on the weghted data. The pror probablty for class c l becomes: 1 + I( c ) 0 = c l w = pc ( l ) = n C+ w n = 0, (6) where c s the class value of the th tranng nstance, and the ndcator functon I(x=y) s 1 ff x = y.

6 The attrbute of data s assumed nomnal, and as for the numerc attrbutes, they are dscretzed. The condtonal probablty of a j s gven by: 1 + I( a = a ) I( c = c ) w = 0 j j l j cl = n nj + I( a ) 0 j = aj w = pa ( ) n, (7) n j s the number of values of attrbute j, and a j s the value of attrbute j of th nstance. 5 Experments and Analyss 5.1 Corpus n Smulatons The experments are based on four popular benchmark corpora, PU1, PU2, PUA, and Lng Spam, whch are all avalable on [16]. In all PU corpora and Lng Spam corpus, attachments, html tags, and header felds other than the subjects are removed, leavng only subject lnes and mal body texts. In order to address prvacy, each token of a corpus s encoded to a unque nteger. The detals about each corpus are gven below. PU1 Corpus: The PU1 corpus conssts of 1,099 messages, whch has 481 spam messages and 618 legtmated ones. The spam rate s 43.77%. PU2 Corpus: The PU2 corpus contans less messages than PU1, whch has 721 messages. Among them, there are 579 messages labeled legtmate and 142 spam. PUA Corpus: The PUA corpus has 1,142 messages, half of whch,.e., 571 messages, are marked as spam and the other half legtmate. Lng Spam Corpus: The Lng spam corpus ncludes 2,412 legtmate messages from a lngustc malng lst and 481 spam ones collected by the author. The spam rate s 16.63%. Dfferent from PU corpora, the messages of Lng spam corpus come from dfferent sources: the legtmate messages are collected from a spam-free, topcspecfc malng lst and the spam ones from a personal malbox. Therefore, the dstrbuton of mals s less smlar from the normal user s mal stream, whch makes the messages of Lng spam corpus easly separated. 5.2 Performance Measures We use two popular evaluaton metrcs of the text categorzaton doman to measure the performance of the classfers: accuracy and F measure. Accuracy: Accuracy s the percentage of the correct predctons n the total predctons. It s defned as follows: Pc Accuracy = 100%. (8) P t

7 where P c s the number of the correct predctons, and P t s the number of the total predctons. The hgher of the accuracy, the better. F measure: The defnton of F measure s as follows: 2R P F = R+ P, (9) where R represents Recall, whch s the percentage of the messages for a gven category that are classfed correctly; P s the Precson, the percentage of the predcted messages for a gven class that are classfed correctly. F measure ranges from 0 to 1, and the hgher, the better. 5.3 Results and Analyss The followng classfcaton performance s measured through a 10-fold crossvaldaton. We select all of the nteractng features,.e., features wth non-negatve C- contrbuton. Table 1 summarzes the results of dmenson reducton after the INTERACT selects the features. Table 1. Summary of results of INTERACT selected features on the four benchmark corpora. PU1 PU2 PUA Lng Spam Num. of features wth non-negatve c-contrbuton From Table 1, we can fnd that the dmensons of data have been reduced sharply after removng rrelevant features by the INTERACT. Therefore, we just run the classfers on these data rather than reducng them further by adjustng the parameter δ. From Table 1, we also can conclude that there are many rrelevant words/features exstng n corpus for the spam flterng, and more than 99% of the features are removed by the INTERACT. The followng hstograms show the performances of the three classfers, SVM (usng lnear kernel), Naïve Bayes, and LWNB, on the four corpora. As for other four feature selecton methods for comparson, we select the frst M features accordng to the features scores, where M s the number of the nteractng features found by the INTERACT algorthm. From Fg. 1 and Fg. 2, we dscover that the INTERACT algorthm can mprove the performances of all the three classfers. Ther performances on the reduced corpus are equal to or better than those on the full corpus, evaluated by the accuracy and F measure. For example, the performances of the SVM on PU1 and PU2 corpora reduced by the INTERACT s equal to those on the full corpora, and ts performance on PUA corpus reduced by the INTERACT s better than that on the full corpus. However, the performance of the SVM on Lng Spam corpus reduced by the INTERACT s slghtly worse than that on the full corpus. The feature selecton capablty of the INTERACT s obvously better than the other popular feature selecton methods. The compettve performances of the classfers on the data handled by the INTERACT show that only a few relevant words can stll dstngush between the spam and legtmate emals. Ths s true n practce, for example, t s

8 well known that the words buy, purchase, jobs, usually appear n the spam e- mals, and they thus are useful emal category dstngushers. 95% 96% 94% INTERACT Ch Squared InfoGan 95% INTERACT Ch Squared InfoGan GanRato RelefF Full GanRato RelefF Full 93% 94% Accuracy 92% 91% Accuracy 93% 92% 91% 90% 90% 89% 89% 88% 88% SVM LWNB NaveBayes SVM LWNB NaveBayes (a) Results on PU1 (b) Results on PU2 96% 100% 94% 98% 96% 92% 94% Accuracy 90% 88% INTERACT Ch Squared InfoGan GanRato RelefF Full Accuracy 92% 90% 88% INTERACT Ch Squared InfoGan 86% 86% GanRato RelefF Full 84% 84% 82% SVM LWNB NaveBayes (c) Results on PUA 82% SVM LWNB NaveBayes (d) Results on Lng Spam Fg. 1. Performances of aforementoned three classfers and four feature selecton methods on PU1, PU2, PUA, and Lng Spam benchmark corpora wth accuracy evaluaton measure. The performance of the LWNB s also promsng. On Lng Spam corpus, ts performance s even better than that of the SVM, whch s a well-known powerful classfer. On PU1 and Lng Spam corpora, the LWNB successfully mproves the performance of the Naïve Bayes by usng locally weghted model. However, ts performance s worse than that of the Naïve Bayes on PU2 and PUA corpora. The reason may be that the task of the spam flterng suts the hypothess of the class condtonal ndependence of the Naïve Bayes, that s, gven the class label of the others, the frequences of the words n one emal are condtonally ndependent of one another. Based on a careful observaton, we have another queston why the LWNB performs poorly on full corpus? The reason s: there are many rrelevant features exstng on full corpus, whch can be also concluded from the feature selecton results by performng the INTERACT. When determnng the neghbors, all the features take part n calculatng dstance, and too many rrelevant features conceal the truly useful effects of the relevant features, and therefore result n that the LWNB fnds the wrong or rrelevant neghbors to generate locally weghted Naïve Bayes models. However, the LWNB s stll a promsng classfer for the spam flterng, when combned wth some excellent feature selecton methods, such as the INTERACT.

9 INTERACT Ch Squared InfoGan GanRato RelefF Full 0.92 INTERACT Ch Squared InfoGan GanRato RelefF Full F-measure F-measure SVM LWNB NaveBayes SVM LWNB NaveBayes (a) Results on PU1 (b) Results on PU F-measure F-measure INTERACT Ch Squared InfoGan GanRato RelefF Full 0.86 INTERACT Ch Squared InfoGan GanRato RelefF Full SVM LWNB NaveBayes (c) Results on PUA 0.75 SVM LWNB NaveBayes (d) Results on Lng Spam Fg. 2. Performances of aforementoned classfers and four feature selecton methods on PU1, PU2, PUA and Lng Spam benchmark corpora wth F measure evaluaton measure. 6 Conclusons In ths paper, we present our work on the spam flterng. Frstly, we ntroduce the INTERACT algorthm to select nteractng words/features for the spam flterng. Other four tradtonal feature selecton methods are also performed n the experments for performance comparson. Secondly, we propose a novel classfer LWNB to mprove the performance of the Naïve Bayes, a most popular classfer n the spam flterng area, to deal wth the spam flterng. Totally, three classfers, SVM, Naïve Bayes and LWNB, are run on four corpora preprocessed by the fve feature selecton methods and correspondng full corpora n our smulatons. Two popular evaluaton metrcs, accuracy and F measure, are used to measure the performances of these three classfers. Our emprcal study shows that the INTERACT feature selecton can mprove all of the three classfers performances, and ts feature selecton ablty s better than that of the four tradtonal feature selecton methods. We brefly analyze the reason why the INTERACT and other four methods can work together to perform well. We also fnd out that the LWNB can mprove the performance of the Naïve Bayes, whch s sometmes superor to the SVM. Acknowledgements. The research work presented n ths paper was supported by grants from the Natonal Natural Scence Foundaton of Chna (Project No.

10 ). X. Z. Gao's research work was funded by the Academy of Fnland under Grant References 1. Frank, E., Hall, M., Pfahrnger, B.: Locally Weghted Nave Bayes. Proceedngs of the Conference on Uncertanty n Artfcal Intellgence. Morgan Kaufmann, pp , Zhao, Z., Lu, H.: Searchng for Interactng Features. In: Proc. of Internatonal Jont Conference on Artfcal Intellgence (IJCAI), Hyderabad, Inda, CAUBE.AU: Cranor, L.F., LaMaccha, B.A.: Spam! Communcatons of ACM, 41(8):74-83, Saham, M, Dumas, S., Heckerman, D., et al.: A Bayesan approach to flterng junk e- mal. In Learnng for Text Categorzaton, Madson, Wsconsn, Schneder, K.M.: A Comparson of Event Models for Naïve Bayes Ant-Spam E-Mal Flterng. In: Proc. of the 10th Conference of the European Chapter of the Assocaton for Computatonal Lngustcs, Budapest, Hungary, pp , Androutsopoulos, I, Palouras, G., Karkaletss, V., et al.: Learnng to Flter Spam E-mal: A Comparson of a Naïve Bayesan and a Memory-based Approach. In: Proc. of the Workshop on Machne Learnng and Textual Informaton Access, pp. 1-13, Zhang, L., Zhu, J., Yao, T.: An Evaluaton of Statstcal Spam Flterng Technques. ACM Trans. Asan Lang. Inf. Process, 3(4): , Drucker, H., Wu, D., Vapnk, V.N.: Support vector machnes for spam categorzaton. IEEE Trans. on Neural Networks, 10(5): , Kolcz, A., Alspector, J.: SVM-based flterng of e-mal spam wth content-specfc msclassfcaton costs. In: Proc. of the TextDM'01 Workshop on Text Mnng - held at the 2001 IEEE Internatonal Conference on Data Mnng, Sakks, G., Androutsopoulos, I., Palouras, G., et al.: A Memory-Based Approach to Ant- Spam Flterng for Malng Lsts. Informaton Retreval, 6(1):49-73, Yu, L., Lu, H.: Feature selecton for hgh-dmensonal data: a fast correlaton-based flter soluton. In: Proc. of ICML 2003, Carreras, X., Marquez, L.: Boostng trees for ant-spam emal flterng. In: Proc. of RANLP01, 4th Int. Conference on Recent Advances n Natural Language Processng, Méndez, J.R., Iglesas, E.L., Fdez-Rverola, F., et al.: Analyzng the Impact of Corpus Preprocessng on Ant-Spam Flterng Software. Research on Computng Scence, 17: , Méndez, J.R., Fdez-Rverola, F., Díaz, F., et al.: A Comparatve Performance Study of Feature Selecton Methods for the Ant-spam Flterng Doman. In: Proc. of Advances n Data Mnng, Applcatons n Medcne, Web Mnng, Marketng, Image and Sgnal Mnng (ICDM 2006), pp , Emal Benchmark Corpus, Kononenko, I.: Estmatng Attrbutes: Analyss and Extensons of Relef. In: Proc. of European Conference on Machne Learnng, pp , 1994.

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College Feature selecton for ntruson detecton Slobodan Petrovć NISlab, Gjøvk Unversty College Contents The feature selecton problem Intruson detecton Traffc features relevant for IDS The CFS measure The mrmr measure

More information

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network 700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

Study on CET4 Marks in China s Graded English Teaching

Study on CET4 Marks in China s Graded English Teaching Study on CET4 Marks n Chna s Graded Englsh Teachng CHE We College of Foregn Studes, Shandong Insttute of Busness and Technology, P.R.Chna, 264005 Abstract: Ths paper deploys Logt model, and decomposes

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Web Spam Detection Using Machine Learning in Specific Domain Features

Web Spam Detection Using Machine Learning in Specific Domain Features Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

Communication Networks II Contents

Communication Networks II Contents 8 / 1 -- Communcaton Networs II (Görg) -- www.comnets.un-bremen.de Communcaton Networs II Contents 1 Fundamentals of probablty theory 2 Traffc n communcaton networs 3 Stochastc & Marovan Processes (SP

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Max Wellng Department of Computer Scence Unversty of Toronto 10 Kng s College Road Toronto, M5S 3G5 Canada wellng@cs.toronto.edu Abstract Ths s a note to explan support vector machnes.

More information

Product Quality and Safety Incident Information Tracking Based on Web

Product Quality and Safety Incident Information Tracking Based on Web Product Qualty and Safety Incdent Informaton Trackng Based on Web News 1 Yuexang Yang, 2 Correspondng Author Yyang Wang, 2 Shan Yu, 2 Jng Q, 1 Hual Ca 1 Chna Natonal Insttute of Standardzaton, Beng 100088,

More information

A novel Method for Data Mining and Classification based on

A novel Method for Data Mining and Classification based on A novel Method for Data Mnng and Classfcaton based on Ensemble Learnng 1 1, Frst Author Nejang Normal Unversty;Schuan Nejang 641112,Chna, E-mal: lhan-gege@126.com Abstract Data mnng has been attached great

More information

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by 6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng

More information

A DATA MINING APPLICATION IN A STUDENT DATABASE

A DATA MINING APPLICATION IN A STUDENT DATABASE JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features On-Lne Fault Detecton n Wnd Turbne Transmsson System usng Adaptve Flter and Robust Statstcal Features Ruoyu L Remote Dagnostcs Center SKF USA Inc. 3443 N. Sam Houston Pkwy., Houston TX 77086 Emal: ruoyu.l@skf.com

More information

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,

More information

1 Approximation Algorithms

1 Approximation Algorithms CME 305: Dscrete Mathematcs and Algorthms 1 Approxmaton Algorthms In lght of the apparent ntractablty of the problems we beleve not to le n P, t makes sense to pursue deas other than complete solutons

More information

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS 21 22 September 2007, BULGARIA 119 Proceedngs of the Internatonal Conference on Informaton Technologes (InfoTech-2007) 21 st 22 nd September 2007, Bulgara vol. 2 INVESTIGATION OF VEHICULAR USERS FAIRNESS

More information

Assessing Student Learning Through Keyword Density Analysis of Online Class Messages

Assessing Student Learning Through Keyword Density Analysis of Online Class Messages Assessng Student Learnng Through Keyword Densty Analyss of Onlne Class Messages Xn Chen New Jersey Insttute of Technology xc7@njt.edu Brook Wu New Jersey Insttute of Technology wu@njt.edu ABSTRACT Ths

More information

Nonlinear data mapping by neural networks

Nonlinear data mapping by neural networks Nonlnear data mappng by neural networks R.P.W. Dun Delft Unversty of Technology, Netherlands Abstract A revew s gven of the use of neural networks for nonlnear mappng of hgh dmensonal data on lower dmensonal

More information

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering Lecture 7a Clusterng Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square Clusterng Groups together smlar nstances n the data sample Basc clusterng problem: dstrbute data nto k dfferent groups such that

More information

8 Algorithm for Binary Searching in Trees

8 Algorithm for Binary Searching in Trees 8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the

More information

Semantic Link Analysis for Finding Answer Experts *

Semantic Link Analysis for Finding Answer Experts * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 51-65 (2012) Semantc Lnk Analyss for Fndng Answer Experts * YAO LU 1,2,3, XIAOJUN QUAN 2, JINGSHENG LEI 4, XINGLIANG NI 1,2,3, WENYIN LIU 2,3 AND YINLONG

More information

Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System

Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System Mnng Feature Importance: Applyng Evolutonary Algorthms wthn a Web-based Educatonal System Behrouz MINAEI-BIDGOLI 1, and Gerd KORTEMEYER 2, and Wllam F. PUNCH 1 1 Genetc Algorthms Research and Applcatons

More information

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)

More information

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets Improved Mnng of Software Complexty Data on Evolutonary Fltered Tranng Sets VILI PODGORELEC Insttute of Informatcs, FERI Unversty of Marbor Smetanova ulca 17, SI-2000 Marbor SLOVENIA vl.podgorelec@un-mb.s

More information

Set. algorithms based. 1. Introduction. System Diagram. based. Exploration. 2. Index

Set. algorithms based. 1. Introduction. System Diagram. based. Exploration. 2. Index ISSN (Prnt): 1694-0784 ISSN (Onlne): 1694-0814 www.ijcsi.org 236 IT outsourcng servce provder dynamc evaluaton model and algorthms based on Rough Set L Sh Sh 1,2 1 Internatonal School of Software, Wuhan

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

Planning for Marketing Campaigns

Planning for Marketing Campaigns Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns A study on the ablty of Support Vector Regresson and Neural Networks to Forecast Basc Tme Seres Patterns Sven F. Crone, Jose Guajardo 2, and Rchard Weber 2 Lancaster Unversty, Department of Management

More information

Using Association Rule Mining: Stock Market Events Prediction from Financial News

Using Association Rule Mining: Stock Market Events Prediction from Financial News Usng Assocaton Rule Mnng: Stock Market Events Predcton from Fnancal News Shubhang S. Umbarkar 1, Prof. S. S. Nandgaonkar 2 1 Savtrba Phule Pune Unversty, Vdya Pratshtan s College of Engneerng, Vdya Nagar,

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Adaptive Intrusion Detection based on Boosting and Naïve Bayesian Classifier

Adaptive Intrusion Detection based on Boosting and Naïve Bayesian Classifier Adaptve Intruson Detecton based on Boostng and Naïve Bayesan Classfer Dewan Md. Fard Department of CSE Jahangrnagar Unversty Dhaka-1342, Bangladesh Mohammad Zahdur Rahman Department of CSE Jahangrnagar

More information

Network Security Situation Evaluation Method for Distributed Denial of Service

Network Security Situation Evaluation Method for Distributed Denial of Service Network Securty Stuaton Evaluaton Method for Dstrbuted Denal of Servce Jn Q,2, Cu YMn,2, Huang MnHuan,2, Kuang XaoHu,2, TangHong,2 ) Scence and Technology on Informaton System Securty Laboratory, Bejng,

More information

Recurrence. 1 Definitions and main statements

Recurrence. 1 Definitions and main statements Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.

More information

Web Object Indexing Using Domain Knowledge *

Web Object Indexing Using Domain Knowledge * Web Object Indexng Usng Doman Knowledge * Muyuan Wang Department of Automaton Tsnghua Unversty Bejng 100084, Chna (86-10)51774518 Zhwe L, Le Lu, We-Yng Ma Mcrosoft Research Asa Sgma Center, Hadan Dstrct

More information

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Study on Model of Risks Assessment of Standard Operation in Rural Power Network Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsn-yng Wu b a Professor (Management Scence), Natonal Chao

More information

The Analysis of Covariance. ERSH 8310 Keppel and Wickens Chapter 15

The Analysis of Covariance. ERSH 8310 Keppel and Wickens Chapter 15 The Analyss of Covarance ERSH 830 Keppel and Wckens Chapter 5 Today s Class Intal Consderatons Covarance and Lnear Regresson The Lnear Regresson Equaton TheAnalyss of Covarance Assumptons Underlyng the

More information

Invoicing and Financial Forecasting of Time and Amount of Corresponding Cash Inflow

Invoicing and Financial Forecasting of Time and Amount of Corresponding Cash Inflow Dragan Smć Svetlana Smć Vasa Svrčevć Invocng and Fnancal Forecastng of Tme and Amount of Correspondng Cash Inflow Artcle Info:, Vol. 6 (2011), No. 3, pp. 014-021 Receved 13 Janyary 2011 Accepted 20 Aprl

More information

Using Content-Based Filtering for Recommendation 1

Using Content-Based Filtering for Recommendation 1 Usng Content-Based Flterng for Recommendaton 1 Robn van Meteren 1 and Maarten van Someren 2 1 NetlnQ Group, Gerard Brandtstraat 26-28, 1054 JK, Amsterdam, The Netherlands, robn@netlnq.nl 2 Unversty of

More information

The Journal of Systems and Software

The Journal of Systems and Software The Journal of Systems and Software 82 (2009) 241 252 Contents lsts avalable at ScenceDrect The Journal of Systems and Software journal homepage: www. elsever. com/ locate/ jss A study of project selecton

More information

Lecture 18: Clustering & classification

Lecture 18: Clustering & classification O CPS260/BGT204. Algorthms n Computatonal Bology October 30, 2003 Lecturer: Pana K. Agarwal Lecture 8: Clusterng & classfcaton Scrbe: Daun Hou Open Problem In HomeWor 2, problem 5 has an open problem whch

More information

A spam filtering model based on immune mechanism

A spam filtering model based on immune mechanism Avalable onlne www.jocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):2533-2540 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A spam flterng model based on mmune mechansm Ya-png

More information

ERP Software Selection Using The Rough Set And TPOSIS Methods

ERP Software Selection Using The Rough Set And TPOSIS Methods ERP Software Selecton Usng The Rough Set And TPOSIS Methods Under Fuzzy Envronment Informaton Management Department, Hunan Unversty of Fnance and Economcs, No. 139, Fengln 2nd Road, Changsha, 410205, Chna

More information

Rank Based Clustering For Document Retrieval From Biomedical Databases

Rank Based Clustering For Document Retrieval From Biomedical Databases Jayanth Mancassamy et al /Internatonal Journal on Computer Scence and Engneerng Vol.1(2), 2009, 111-115 Rank Based Clusterng For Document Retreval From Bomedcal Databases Jayanth Mancassamy Department

More information

An Inductive Fuzzy Classification Approach applied to Individual Marketing

An Inductive Fuzzy Classification Approach applied to Individual Marketing An Inductve Fuzzy Classfcaton Approach appled to Indvdual Marketng Mchael Kaufmann, Andreas Meer Abstract A data mnng methodology for an nductve fuzzy classfcaton s ntroduced. The nducton step s based

More information

A Load-Balancing Algorithm for Cluster-based Multi-core Web Servers

A Load-Balancing Algorithm for Cluster-based Multi-core Web Servers Journal of Computatonal Informaton Systems 7: 13 (2011) 4740-4747 Avalable at http://www.jofcs.com A Load-Balancng Algorthm for Cluster-based Mult-core Web Servers Guohua YOU, Yng ZHAO College of Informaton

More information

BUSINESS PROCESS PERFORMANCE MANAGEMENT USING BAYESIAN BELIEF NETWORK. 0688, dskim@ssu.ac.kr

BUSINESS PROCESS PERFORMANCE MANAGEMENT USING BAYESIAN BELIEF NETWORK. 0688, dskim@ssu.ac.kr Proceedngs of the 41st Internatonal Conference on Computers & Industral Engneerng BUSINESS PROCESS PERFORMANCE MANAGEMENT USING BAYESIAN BELIEF NETWORK Yeong-bn Mn 1, Yongwoo Shn 2, Km Jeehong 1, Dongsoo

More information

Calculating the high frequency transmission line parameters of power cables

Calculating the high frequency transmission line parameters of power cables < ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,

More information

Keywords : classifier, Association rules, data mining, healthcare, Associative Classifiers, CBA, CMAR, CPAR, MCAR. GJCST Classification : H.2.

Keywords : classifier, Association rules, data mining, healthcare, Associative Classifiers, CBA, CMAR, CPAR, MCAR. GJCST Classification : H.2. Global Journal of Computer Scence and Technology Volume 11 Issue 22 Verson 1.0 Type: Double Blnd Peer Revewed Internatonal Research Journal Publsher: Global Journals Inc. (USA) Onlne ISSN: 0975-4172 &

More information

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence

More information

Semantic Content Enrichment of Sensor Network Data for Environmental Monitoring

Semantic Content Enrichment of Sensor Network Data for Environmental Monitoring Proceedngs of the Twenty-Seventh Internatonal Florda Artfcal Intellgence Research Socety Conference Semantc Content Enrchment of Sensor Network Data for Envronmental Montorng Dustn R. Franz and Rcardo

More information

Online Wireless Mesh Network Traffic Classification using Machine Learning

Online Wireless Mesh Network Traffic Classification using Machine Learning Journal of Computatonal Informaton Systems 7:5 (2011) 1524-1532 Avalable at http://www.jofcs.com Onlne Wreless Mesh Network Traffc Classfcaton usng Machne Learnng Chengje GU 1,, Shuny ZHANG 1, Xaozhen

More information

3 Supervised Learning

3 Supervised Learning 3 Supervsed Learnng Supervsed learnng has been a great success n real-world applcatons. It s used n almost every doman, ncludng text and Web domans. Supervsed learnng s also called classfcaton or nductve

More information

On Mean Squared Error of Hierarchical Estimator

On Mean Squared Error of Hierarchical Estimator S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance Calbraton Method Instances of the Cell class (one nstance for each FMS cell) contan ADC raw data and methods assocated wth each partcular FMS cell. The calbraton method ncludes event selecton (Class Cell

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

Design of Output Codes for Fast Covering Learning using Basic Decomposition Techniques

Design of Output Codes for Fast Covering Learning using Basic Decomposition Techniques Journal of Computer Scence (7): 565-57, 6 ISSN 59-66 6 Scence Publcatons Desgn of Output Codes for Fast Coverng Learnng usng Basc Decomposton Technques Aruna Twar and Narendra S. Chaudhar, Faculty of Computer

More information

Probabilistic Latent Semantic User Segmentation for Behavioral Targeted Advertising*

Probabilistic Latent Semantic User Segmentation for Behavioral Targeted Advertising* Probablstc Latent Semantc User Segmentaton for Behavoral Targeted Advertsng* Xaohu Wu 1,2, Jun Yan 2, Nng Lu 2, Shucheng Yan 3, Yng Chen 1, Zheng Chen 2 1 Department of Computer Scence Bejng Insttute of

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

Time Domain simulation of PD Propagation in XLPE Cables Considering Frequency Dependent Parameters

Time Domain simulation of PD Propagation in XLPE Cables Considering Frequency Dependent Parameters Internatonal Journal of Smart Grd and Clean Energy Tme Doman smulaton of PD Propagaton n XLPE Cables Consderng Frequency Dependent Parameters We Zhang a, Jan He b, Ln Tan b, Xuejun Lv b, Hong-Je L a *

More information

Machine Learning and Software Quality Prediction: As an Expert System

Machine Learning and Software Quality Prediction: As an Expert System I.J. Informaton Engneerng and Electronc Busness, 2014, 2, 9-27 Publshed Onlne Aprl 2014 n MECS (http://www.mecs-press.org/) DOI: 10.5815/jeeb.2014.02.02 Machne Learnng and Software Qualty Predcton: As

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

Multi-sensor Data Fusion for Cyber Security Situation Awareness

Multi-sensor Data Fusion for Cyber Security Situation Awareness Avalable onlne at www.scencedrect.com Proceda Envronmental Scences 0 (20 ) 029 034 20 3rd Internatonal Conference on Envronmental 3rd Internatonal Conference on Envronmental Scence and Informaton Applcaton

More information

Prediction of Disability Frequencies in Life Insurance

Prediction of Disability Frequencies in Life Insurance Predcton of Dsablty Frequences n Lfe Insurance Bernhard Köng Fran Weber Maro V. Wüthrch October 28, 2011 Abstract For the predcton of dsablty frequences, not only the observed, but also the ncurred but

More information

When do data mining results violate privacy? Individual Privacy: Protect the record

When do data mining results violate privacy? Individual Privacy: Protect the record When do data mnng results volate prvacy? Chrs Clfton March 17, 2004 Ths s jont work wth Jashun Jn and Murat Kantarcıoğlu Indvdual Prvacy: Protect the record Indvdual tem n database must not be dsclosed

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information

Vehicle Detection and Tracking in Video from Moving Airborne Platform

Vehicle Detection and Tracking in Video from Moving Airborne Platform Journal of Computatonal Informaton Systems 10: 12 (2014) 4965 4972 Avalable at http://www.jofcs.com Vehcle Detecton and Trackng n Vdeo from Movng Arborne Platform Lye ZHANG 1,2,, Hua WANG 3, L LI 2 1 School

More information

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching) Face Recognton Problem Face Verfcaton Problem Face Verfcaton (1:1 matchng) Querymage face query Face Recognton (1:N matchng) database Applcaton: Access Control www.vsage.com www.vsoncs.com Bometrc Authentcaton

More information

Prediction of Disability Frequencies in Life Insurance

Prediction of Disability Frequencies in Life Insurance 1 Predcton of Dsablty Frequences n Lfe Insurance Bernhard Köng 1, Fran Weber 1, Maro V. Wüthrch 2 Abstract: For the predcton of dsablty frequences, not only the observed, but also the ncurred but not yet

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University Characterzaton of Assembly Varaton Analyss Methods A Thess Presented to the Department of Mechancal Engneerng Brgham Young Unversty In Partal Fulfllment of the Requrements for the Degree Master of Scence

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

Feature Selection via Correlation Coefficient Clustering

Feature Selection via Correlation Coefficient Clustering JOURNAL OF SOFTWARE, VOL. 5, NO. 1, DECEMBER 010 1371 Feature Selecton va Correlaton Coeffcent Clusterng Hu-Huang Hsu Department of Computer Scence and Informaton Engneerng, Tamkang Unversty, Tape, 5137,

More information

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and POLYSA: A Polynomal Algorthm for Non-bnary Constrant Satsfacton Problems wth and Mguel A. Saldo, Federco Barber Dpto. Sstemas Informátcos y Computacón Unversdad Poltécnca de Valenca, Camno de Vera s/n

More information