Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach [1]

Ion Androutsopoulos, Georgios Paliouras, Vangelis Karkaletsis, Georgios Sakkis, Constantine D. Spyropoulos and Panagiotis Stamatopoulos

Software and Knowledge Engineering Laboratory
Institute of Informatics and Telecommunications
National Centre for Scientific Research "Demokritos"
Ag. Paraskevi, Athens, Greece
e-mail: {ionandr, paliourg, vangelis, costass}@iit.demokritos.gr

Department of Informatics, University of Athens
TYPA Buildings, Panepistimiopolis, Athens, Greece
e-mail: {stud0926, T.Stamatopoulos}@di.uoa.gr

Abstract

We investigate the performance of two machine learning algorithms in the context of anti-spam filtering. The increasing volume of unsolicited bulk e-mail (spam) has generated a need for reliable anti-spam filters. Filters of this type have so far been based mostly on keyword patterns that are constructed by hand and perform poorly. The Naive Bayesian classifier has recently been suggested as an effective method to construct automatically anti-spam filters with superior performance. We investigate thoroughly the performance of the Naive Bayesian filter on a publicly available corpus, contributing towards standard benchmarks. At the same time, we compare the performance of the Naive Bayesian filter to an alternative memory-based learning approach, after introducing suitable cost-sensitive evaluation measures. Both methods achieve very accurate spam filtering, outperforming clearly the keyword-based filter of a widely used e-mail reader.

1. Introduction

Electronic mail is an efficient and increasingly popular communication medium. Like every powerful medium, however, it is prone to misuse. One such case of misuse is the blind posting of unsolicited e-mail messages, also known as spam, to very large numbers of recipients. Spam messages are typically sent using bulk-mailers and address lists harvested from web pages and newsgroup archives. They vary significantly in content, from vacation advertisements to get-rich schemes. The common feature of these messages is that they are usually of little interest to the majority of the recipients. In some cases, they may even be harmful, e.g. spam messages advertising pornographic sites may be read by children.
Apart from wasting time and bandwidth, spam e-mail also costs money to users with dial-up connections. A 1997 study (Cranor & LaMacchia 1998) reported that spam messages constituted approximately 10% of the incoming messages to a corporate network. The situation seems to be worsening, and without appropriate counter-measures, spam messages could eventually undermine the usability of e-mail.

[1] Proceedings of the workshop "Machine Learning and Textual Information Access", H. Zaragoza, P. Gallinari, and M. Rajman (Eds.), 4th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2000), Lyon, France, September 2000.

Attempts to introduce legal measures against spam mailing have had limited effect. [2] A more effective solution is to develop tools to help recipients identify or remove automatically spam messages. Such tools, called anti-spam filters, vary in functionality from blacklists of frequent spammers to content-based filters. The latter are generally more powerful, as spammers often use fake addresses. Existing content-based filters search for particular keyword patterns in the messages. These patterns need to be crafted by hand, and to achieve better results they need to be tuned to each user and to be constantly maintained (Cranor & LaMacchia 1998), a tedious task, requiring expertise that a user may not have.

We address the issue of anti-spam filtering with the aid of machine learning. We examine supervised learning methods, which learn to identify spam e-mail after receiving training on messages that have been manually classified as spam or non-spam (hereafter legitimate). Learning algorithms of this type have been applied to several text categorization tasks (e.g. Apte & Damerau 1994, Lewis 1996, Dagan et al. 1997), including classifying e-mail into folders (Cohen 1996, Payne & Edwards 1997), or identifying interesting news articles (Lang 1995; see also Spertus 1997). Recently, Sahami et al. (1998) trained a Naive Bayesian classifier (Duda & Hart 1973, Mitchell 1997) for anti-spam filtering, reporting impressive performance on unseen messages. To our knowledge, this is the only previous attempt to apply machine learning to anti-spam filtering.

We have constructed a new benchmark corpus, which is a mixture of spam messages and messages sent via a moderated (and, hence, spam-free) mailing list. The corpus is made publicly available for other researchers to use as a benchmark. [3] Using this corpus, we performed a thorough evaluation of the Naive Bayesian algorithm, used in (Sahami et al. 1998), after introducing new cost-sensitive evaluation metrics. These are necessary to get an objective picture of the performance of the algorithm, when the cost of misclassification differs for the two classes (spam and legitimate).
Furthermore, we used 10-fold cross-validation to get a more unbiased performance estimate, and investigated the effect of attribute-set size, an issue that had not been examined in (Sahami et al. 1998).

Another important contribution of the work presented here is the comparison of the Naive Bayesian classifier with another learning method, namely the memory-based classifier of TiMBL (Daelemans et al. 1999). We chose a memory-based classifier on the grounds that spam messages cover a very broad range of topics. This suggests that memory-based algorithms, which attempt to classify messages by finding similar previously received messages, may perform equally well as algorithms that attempt to learn unifying characteristics of spam messages. Our results confirmed this suspicion, and TiMBL achieved high classification accuracy. On average, the two learning methods performed equally well, with the best method depending on the exact usage scenario of the filter. Both methods outperformed clearly the keyword-based filter of Outlook 2000, a widely used e-mail reader. [4] [5]

The remainder of this paper is organized as follows: section 2 describes our benchmark corpus; section 3 discusses preprocessing steps that are needed before applying the learning algorithms; section 4 presents the learning algorithms that we used; section 5 introduces cost-sensitive evaluation measures; section 6 discusses our experimental results; and section 7 concludes.

[2] Consult [URL missing] and [URL missing].
[3] The corpus is available from the publications section of [URL missing].
[4] Outlook 2000 is a trademark of Microsoft Corporation. Outlook's documentation points to a file containing the patterns of its anti-spam filter. We tried both a case-sensitive and a case-insensitive version of these patterns, and use the best-performing version in each experiment.
[5] An earlier summary of our experiments with the Naive Bayesian classifier, not comprising the experiments with TiMBL and Outlook 2000, can be found in (Androutsopoulos et al. 2000a).

2. Corpus collection

The benchmark corpus that we constructed is a mixture of spam messages and messages received via the Linguist list, a moderated mailing list about the profession and science of linguistics. [6] The corpus, dubbed Ling-Spam, consists of 2893 messages:

- 2412 Linguist messages, obtained by randomly downloading digests from the list's archives, breaking the digests into their messages, and removing text added by the list's server.
- 481 spam messages, received by the first author. Attachments, HTML tags, and duplicate spam messages received on the same day were not included.

Spam messages are 16.6% of the corpus, a figure close to the spam incoming rates of the authors, and rates reported in (Sahami et al. 1998) and (Cranor & LaMacchia 1998). Although the Linguist messages are more topic-specific than most users' incoming e-mail, they are less standardized than one might expect (e.g. they contain job postings, software availability announcements, even flame-like responses). Hence, useful preliminary conclusions about anti-spam filtering can be reached with Ling-Spam, until better public corpora become available. [7] With a more direct interpretation, our experiments can also be seen as a study on anti-spam filters for open un-moderated mailing lists or newsgroups.

3. Corpus preprocessing

For every message in Ling-Spam, a vector representation $\vec{x} = \langle x_1, x_2, x_3, \ldots, x_n \rangle$ was computed, where $x_1, \ldots, x_n$ are the values of attributes $X_1, \ldots, X_n$, much as in the vector space model (Salton & McGill 1983). Following (Sahami et al. 1998), all attributes are binary: $X_i = 1$ if some characteristic represented by $X_i$ is present in the message; otherwise $X_i = 0$. In our experiments, attributes correspond to words, i.e. each attribute shows if a particular word (e.g. "adult") occurs in the message. It is also possible, however, to introduce attributes corresponding to phrases (e.g. showing if "be over 21" is present) or non-textual properties (e.g. whether or not a message contains attachments; see Sahami et al. 1998).

As in (Sahami et al. 1998), to select among all possible attributes (in our case, all possible word-attributes), we compute the mutual information ($MI$) of each candidate attribute $X$ with the category-denoting variable $C$:

$$MI(X; C) = \sum_{x \in \{0,1\},\; c \in \{\mathit{spam},\, \mathit{legit}\}} P(X = x, C = c) \cdot \log \frac{P(X = x, C = c)}{P(X = x) \cdot P(C = c)}$$

The attributes with the $m$ highest $MI$-scores are then selected. The probabilities are estimated from the training corpus as frequency ratios. [8] To avoid treating forms of the same word as different attributes, a lemmatizer was applied to Ling-Spam, substituting each word by its base form (e.g. "earning" becomes "earn"). [9]

[6] The Linguist list is archived at [URL missing].
[7] To address privacy issues, we have recently started experimenting with suitably encoded personal e-mail folders. Consult (Androutsopoulos et al. 2000b).
[8] Consult (Mitchell 1997) for more elaborate estimates.
[9] We used morph, a lemmatizer included in GATE. See [URL missing].
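The attribute-selection step of section 3 can be illustrated with a minimal Python sketch. It assumes each message has already been reduced to a set of lemmatized words; the function names `mutual_information` and `select_attributes`, the alphabetical tie-breaking, and the four toy messages are hypothetical choices made for illustration and are not taken from the paper. Probabilities are estimated as plain frequency ratios, and value combinations that never occur contribute nothing to the sum.

```python
import math
from collections import Counter


def mutual_information(docs, labels, word):
    """MI(X; C) for the binary attribute X = 'word occurs in the message'."""
    n = len(docs)
    # Joint counts of (attribute value, category) over the training messages.
    joint = Counter((word in doc, label) for doc, label in zip(docs, labels))
    count_x, count_c = Counter(), Counter()
    for (x, c), count in joint.items():
        count_x[x] += count
        count_c[c] += count
    mi = 0.0
    for (x, c), count in joint.items():
        p_xc = count / n
        mi += p_xc * math.log(p_xc / ((count_x[x] / n) * (count_c[c] / n)))
    return mi


def select_attributes(docs, labels, m):
    """Return the m words with the highest MI scores (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda w: (-mutual_information(docs, labels, w), w))
    return ranked[:m]


# Toy data, invented for this example only.
docs = [{"free", "money"}, {"free", "offer"},
        {"syntax", "workshop"}, {"syntax", "job"}]
labels = ["spam", "spam", "legit", "legit"]
print(select_attributes(docs, labels, 2))
```

In the toy data, "free" and "syntax" each separate the two categories perfectly, so they score $\log 2$ and are the two attributes retained.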

4. Classification of e-mail messages

We now turn to the learning algorithms we experimented with.

4.1. Naive Bayesian classification

From Bayes' theorem and the theorem of total probability, the probability that a document $d$ with vector $\vec{x} = \langle x_1, \ldots, x_n \rangle$ belongs to category $c$ is:

$$P(C = c \mid \vec{X} = \vec{x}) = \frac{P(C = c) \cdot P(\vec{X} = \vec{x} \mid C = c)}{\sum_{k \in \{\mathit{spam},\, \mathit{legit}\}} P(C = k) \cdot P(\vec{X} = \vec{x} \mid C = k)}$$

In practice, the probabilities $P(\vec{X} \mid C)$ are impossible to estimate without simplifying assumptions, because the possible values of $\vec{X}$ are too many and there are also data sparseness problems. The Naive Bayesian classifier assumes that $X_1, \ldots, X_n$ are conditionally independent given the category $C$, which yields:

$$P(C = c \mid \vec{X} = \vec{x}) = \frac{P(C = c) \cdot \prod_{i=1}^{n} P(X_i = x_i \mid C = c)}{\sum_{k \in \{\mathit{spam},\, \mathit{legit}\}} P(C = k) \cdot \prod_{i=1}^{n} P(X_i = x_i \mid C = k)}$$

$P(X_i \mid C)$ and $P(C)$ are easy to estimate from the frequencies of the training corpus. A large number of empirical studies have found the Naive Bayesian classifier to be surprisingly effective (Langley et al. 1992, Domingos & Pazzani 1996), despite the fact that the independence assumption is usually overly simplistic. [10]

Mistakenly blocking a legitimate message (classifying a legitimate message as spam) is generally a more severe error than letting a spam message pass the filter (classifying a spam message as legitimate). Let $\mathit{legit} \to \mathit{spam}$ and $\mathit{spam} \to \mathit{legit}$ denote the two error types. Invoking a decision-theoretic notion of cost, we assume that $\mathit{legit} \to \mathit{spam}$ is $\lambda$ times more costly than $\mathit{spam} \to \mathit{legit}$. A message is classified as spam if the following criterion is met:

$$\frac{P(C = \mathit{spam} \mid \vec{X} = \vec{x})}{P(C = \mathit{legitimate} \mid \vec{X} = \vec{x})} > \lambda$$

To the extent that the independence assumption holds and the probability estimates are accurate, a classifier based on this criterion achieves optimal results (Duda & Hart 1973). In our case, $P(C = \mathit{spam} \mid \vec{X} = \vec{x}) = 1 - P(C = \mathit{legitimate} \mid \vec{X} = \vec{x})$, and the classification criterion is equivalent to:

$$P(C = \mathit{spam} \mid \vec{X} = \vec{x}) > t, \quad \text{with } t = \frac{\lambda}{1 + \lambda}, \; \lambda = \frac{t}{1 - t}$$

In the experiments of (Sahami et al. 1998), $t$ was set to 0.999, which corresponds to $\lambda = 999$. That is, mistakenly blocking a legitimate message was taken to be as bad as letting 999 spam messages pass the filter. When blocked messages are discarded without further processing, setting $\lambda$ to such a high value is reasonable, because in that case most users would consider losing a legitimate message unacceptable. Alternative usage scenarios are possible, however, and lower $\lambda$ values are reasonable in those cases.

For example, rather than being deleted, a blocked message could be returned to the sender, with an automatically inserted apology paragraph. The extra paragraph would explain that a filter blocked the message, and it would ask the sender to repost the message to a different, private un-filtered e-mail address of the recipient (see also Hall 1998). The private address would never be advertised (e.g. on web pages or newsgroups), making it unlikely to receive spam mail directly. Furthermore, the apology paragraph could include a frequently changing riddle (e.g. "Include in the subject the capital of France.") to ensure that spam messages are not forwarded automatically to the private address by robots that scan returned messages for new e-mail addresses. Messages sent to the private address without the correct riddle answer would be deleted automatically. (Spammers cannot afford the time to answer thousands of riddles.)

In the scenario of the previous paragraph, $\lambda = 9$ ($t = 0.9$) seems more reasonable: blocking a legitimate message is penalized mildly more than letting a spam message pass, to account for the fact that recovering from a blocked legitimate message requires overall more work (counting the sender's extra work to repost it) than recovering from a spam message that passed the filter (deleting it manually).

A third scenario would be to assume that the anti-spam filter simply flags messages it considers to be spam, without removing them from the user's mailbox (e.g. to help the user prioritize the reading of the messages). In that case, $\lambda = 1$ ($t = 0.5$) seems reasonable, since neither of the two error types is significantly graver than the other.

[10] Consult (Friedman et al. 1997) for Bayesian classifiers with less restrictive independence assumptions.

4.2. Memory-based classification

The second method that we evaluated belongs to the family of memory-based (or instance-based) methods (Mitchell 1997). The common feature of these methods is that they store all training instances in a memory structure, and use them directly for classification.
The simplest form of memory structure is the multi-dimensional space defined by the attributes in the instance vectors. Each training instance is represented as a point in that space. The classification procedure is usually a variant of the simple k-nearest-neighbor (k-NN) algorithm. k-NN assigns to each new unseen instance the majority class among the k training instances that are closest to the unseen instance (its k-neighborhood).

We used the memory-based classification algorithm implemented in the TiMBL software (Daelemans et al. 1999). TiMBL provides a basic memory-based classification algorithm and extensions to address issues such as efficient computation of the k-neighborhood and attribute weighting. We only used the basic algorithm, which is a variant of k-NN. One important difference from k-NN is the definition of the k-neighborhood. TiMBL considers all the training instances at the k closest distances from the unseen instance. If there is more than one neighbor at each distance, the algorithm examines many more than k neighbors. In such cases, a small value of k is necessary, to avoid considering instances that are very different from the unseen one. A further addition we made to the basic TiMBL algorithm is a postprocessing stage to take λ into account. This simply multiplies the number of legitimate neighbors by λ, before deciding on the majority class in the neighborhood.
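The two decision rules of section 4 can be sketched side by side in Python. This is an illustration under stated assumptions, not the code used in the experiments and not TiMBL's API: the names `threshold`, `NaiveBayes` and `knn_classify` are hypothetical, the add-one smoothing of the Naive Bayesian estimates is an assumption (the paper estimates probabilities as frequency ratios), the distance is assumed to be the number of differing binary attributes, and a tied vote is assumed to go to the legitimate class. The toy training vectors are invented.

```python
import math
from collections import Counter


def threshold(lam):
    """t = lam / (1 + lam): P(spam | x) > t is the same test as
    P(spam | x) / P(legit | x) > lam."""
    return lam / (1.0 + lam)


class NaiveBayes:
    """Naive Bayes over binary attribute vectors, classes 'spam' and 'legit'."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.prior, self.p_one = {}, {}
        for c in self.classes:
            rows = [v for v, y in zip(vectors, labels) if y == c]
            self.prior[c] = len(rows) / len(vectors)
            # Assumption: add-one smoothing keeps every estimate away from 0 and 1.
            self.p_one[c] = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        return self

    def posterior_spam(self, x):
        log_joint = {}
        for c in self.classes:
            log_p = math.log(self.prior[c])
            for x_i, p in zip(x, self.p_one[c]):
                log_p += math.log(p if x_i else 1.0 - p)
            log_joint[c] = log_p
        top = max(log_joint.values())  # subtract the maximum before exponentiating
        total = sum(math.exp(v - top) for v in log_joint.values())
        return math.exp(log_joint["spam"] - top) / total

    def classify(self, x, lam):
        return "spam" if self.posterior_spam(x) > threshold(lam) else "legit"


def knn_classify(vectors, labels, x, k, lam):
    """Vote among every training instance at the k closest distances from x."""
    # Assumption: distance = number of differing binary attributes.
    dists = [sum(a != b for a, b in zip(v, x)) for v in vectors]
    cutoff = sorted(set(dists))[:k][-1]
    votes = Counter(y for d, y in zip(dists, labels) if d <= cutoff)
    # Legitimate neighbors count lam times; a tie goes to 'legit' (assumption).
    return "spam" if votes["spam"] > lam * votes["legit"] else "legit"


# Toy data, invented for this example only.
train = [(1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1)]
labels = ["spam", "spam", "legit", "legit"]
message = (1, 1, 0)

nb = NaiveBayes().fit(train, labels)
print(nb.classify(message, lam=1), nb.classify(message, lam=999))
print(knn_classify(train, labels, message, k=3, lam=1),
      knn_classify(train, labels, message, k=3, lam=9))
```

On the toy data the Naive Bayesian posterior for the message is about 0.9, so it is flagged at λ = 1 (t = 0.5) but passes at λ = 999 (t = 0.999). With k = 3 the neighborhood holds two spam and one legitimate instance, which is a spam majority at λ = 1 but not once the legitimate neighbor is counted nine times.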

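5. Measures to evaluate classification performance

In classification tasks, performance is often measured in terms of accuracy ($Acc$) or error rate ($Err = 1 - Acc$). Let $N_{\mathit{legit}}$ and $N_{\mathit{spam}}$ be the total numbers of legitimate and spam messages, respectively, to be classified by the filter, and $n_{Y \to Z}$ the number of messages belonging to category $Y$ that the filter classified as belonging to category $Z$ ($Y, Z \in \{\mathit{legit}, \mathit{spam}\}$). Then:

$$Acc = \frac{n_{\mathit{legit} \to \mathit{legit}} + n_{\mathit{spam} \to \mathit{spam}}}{N_{\mathit{legit}} + N_{\mathit{spam}}} \qquad Err = \frac{n_{\mathit{legit} \to \mathit{spam}} + n_{\mathit{spam} \to \mathit{legit}}}{N_{\mathit{legit}} + N_{\mathit{spam}}}$$

Accuracy and error rate assign equal weights to the two error types ($\mathit{legit} \to \mathit{spam}$ and $\mathit{spam} \to \mathit{legit}$). However, $\mathit{legit} \to \mathit{spam}$ is $\lambda$ times more costly than $\mathit{spam} \to \mathit{legit}$. To make accuracy and error rate sensitive to this cost difference, each legitimate message is treated, for evaluation purposes, as if it were $\lambda$ messages. That is, when a legitimate message is blocked, this counts as $\lambda$ errors; and when it passes the filter, this counts as $\lambda$ successes. This leads to the following definitions of weighted accuracy ($WAcc$) and weighted error rate ($WErr = 1 - WAcc$):

$$WAcc = \frac{\lambda \cdot n_{\mathit{legit} \to \mathit{legit}} + n_{\mathit{spam} \to \mathit{spam}}}{\lambda \cdot N_{\mathit{legit}} + N_{\mathit{spam}}} \qquad WErr = \frac{\lambda \cdot n_{\mathit{legit} \to \mathit{spam}} + n_{\mathit{spam} \to \mathit{legit}}}{\lambda \cdot N_{\mathit{legit}} + N_{\mathit{spam}}}$$

The values of accuracy and error rate (or their weighted versions) are often misleadingly high. To get a clear picture of a classifier's performance, it is common to compare its accuracy or error rate to those of a simplistic "baseline" approach. We use the case where no filter is present as our baseline: legitimate messages are (correctly) never blocked, and spam messages (mistakenly) always pass. The weighted accuracy and weighted error rate of the baseline are:

$$WAcc^{b} = \frac{\lambda \cdot N_{\mathit{legit}}}{\lambda \cdot N_{\mathit{legit}} + N_{\mathit{spam}}} \qquad WErr^{b} = \frac{N_{\mathit{spam}}}{\lambda \cdot N_{\mathit{legit}} + N_{\mathit{spam}}}$$

The total cost ratio ($TCR$) allows the performance of a filter to be compared easily to that of the baseline:

$$TCR = \frac{WErr^{b}}{WErr} = \frac{N_{\mathit{spam}}}{\lambda \cdot n_{\mathit{legit} \to \mathit{spam}} + n_{\mathit{spam} \to \mathit{legit}}}$$

Greater $TCR$ values indicate better performance. For $TCR < 1$, the baseline (not using the filter) is better.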
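As a worked illustration of these measures, the following Python sketch computes WAcc, WErr, the baseline error and TCR from the four counts $n_{Y \to Z}$. The `Counts` container, the function names, and the example counts are hypothetical and chosen only to show how the same filter output is scored under the three λ values discussed in section 4.1.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Outcome counts n_{Y->Z} of one filter run (field names are illustrative)."""
    legit_to_legit: int
    legit_to_spam: int
    spam_to_spam: int
    spam_to_legit: int

    @property
    def n_legit(self):
        return self.legit_to_legit + self.legit_to_spam

    @property
    def n_spam(self):
        return self.spam_to_spam + self.spam_to_legit


def weighted_accuracy(c, lam):
    return (lam * c.legit_to_legit + c.spam_to_spam) / (lam * c.n_legit + c.n_spam)


def weighted_error(c, lam):
    return (lam * c.legit_to_spam + c.spam_to_legit) / (lam * c.n_legit + c.n_spam)


def baseline_weighted_error(c, lam):
    # No filter: every spam message passes, no legitimate message is blocked.
    return c.n_spam / (lam * c.n_legit + c.n_spam)


def total_cost_ratio(c, lam):
    cost = lam * c.legit_to_spam + c.spam_to_legit
    return float("inf") if cost == 0 else c.n_spam / cost


# Invented counts: 241 legitimate messages (1 blocked), 48 spam messages (8 missed).
counts = Counts(legit_to_legit=240, legit_to_spam=1, spam_to_spam=40, spam_to_legit=8)
for lam in (1, 9, 999):
    print(lam, round(weighted_accuracy(counts, lam), 4),
          round(total_cost_ratio(counts, lam), 4))
```

With these counts a single blocked legitimate message is enough to push TCR below 1 at λ = 999 (48 / 1007), while the same filter output gives TCR = 48 / 9 at λ = 1.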
If cost is proportional to wasted time, an intuitive meaning for $TCR$ is the following: it measures how much time is wasted to delete manually all spam messages when no filter is used ($N_{\mathit{spam}}$), compared to the time wasted to delete manually any spam messages that passed the filter ($n_{\mathit{spam} \to \mathit{legit}}$) plus the time needed to recover from mistakenly blocked legitimate messages ($\lambda \cdot n_{\mathit{legit} \to \mathit{spam}}$).

For the benefit of readers more familiar with information retrieval and extraction tasks, our experimental results are also presented in terms of spam recall ($SR$) and spam precision ($SP$):

$$SR = \frac{n_{\mathit{spam} \to \mathit{spam}}}{N_{\mathit{spam}}} \qquad SP = \frac{n_{\mathit{spam} \to \mathit{spam}}}{n_{\mathit{spam} \to \mathit{spam}} + n_{\mathit{legit} \to \mathit{spam}}}$$

Spam recall measures the percentage of spam messages that the filter manages to block (intuitively its effectiveness), while spam precision measures the degree to which the blocked messages are indeed spam (the filter's "safety"). Despite their intuitiveness, it is difficult to compare the performance of different filters using recall and precision: each filter (or filter configuration) yields a pair of recall and precision results; without a single unifying measure, like $TCR$, that incorporates the cost difference between the two error types, it is difficult to decide which pair is better. [11]

[11] The F-measure, used in information retrieval and extraction to combine recall and precision (e.g. Riloff & Lehnert 1994), cannot be used here, because its weighting factor cannot be related to the cost difference of the two error types.

6. Experimental results

We performed three sets of experiments on Ling-Spam, corresponding to the three scenarios (parameter λ) that were described in section 4.1. In each scenario, we varied the number of selected attributes from 50 to 700 by 50, each time retaining the attributes with the highest $MI$ scores. 10-fold cross-validation was used in all experiments: Ling-Spam was partitioned randomly into ten parts, and the experiment was repeated ten times, each time reserving a different part for testing, and using the remaining nine parts for training. $WAcc$ was then averaged over the ten iterations, and $TCR$ was computed as $WErr^{b}$ divided by the average $WErr$.

The figures below show the average performance of each method in each experiment, including the $TCR$ scores we obtained with Outlook's patterns. At the end of this section we select the best-performing configuration for each filter and scenario, and perform tests to establish statistically significant differences.

6.1. Scenario 1: Flagging spam messages (λ=1)

[Figure 1. TCR scores for λ=1. x-axis: number of retained attributes; y-axis: TCR; series: Naive Bayesian, TiMBL(1), TiMBL(2), TiMBL(10), Outlook patterns.]

In this scenario, the misclassification cost is identical for both error types. Figure 1 shows the corresponding results. The most important finding here is that both learning methods achieve very accurate classification, improving significantly on the baseline. Both methods perform better with small numbers of attributes. Their performance deteriorates as the size of the attribute set increases, which is due to the known sensitivity of the methods to data sparseness, caused by increasing the number of attributes. The Naive Bayesian classifier performs best for 100 attributes, while TiMBL does best with the smallest attribute set size (50).

TiMBL's performance was evaluated for three different values of k (1, 2, 10). The method seems to perform best for small k values. For k = 10, the performance of the method falls to a very low level, improving only slightly on the base case. This is due to the large number of ties for each of the k (= 10) distances, which leads to a very large neighborhood (> 500 neighbors). In such cases, the behavior of the classifier approximates that of the default rule, which classifies everything according to the majority class (legitimate in our case). This is also responsible for the insensitivity of the method to the number of attributes for k = 10. Outlook's keyword patterns perform very poorly compared to the other two methods, with the exception of TiMBL for k = 10, which does even worse.

6.2. Scenario 2: Notifying senders about blocked messages (λ=9)

[Figure 2. TCR scores for λ=9. x-axis: number of retained attributes; y-axis: TCR; series: Naive Bayesian, TiMBL(1), TiMBL(2), TiMBL(10), Outlook patterns.]

Here we increased the cost of misclassifying legitimate messages, by setting λ = 9. Figure 2 shows the corresponding results. Compared to the previous scenario, the most important difference is the lower improvement of the learning methods over the baseline. This is due to the increased performance of the baseline as λ increases: without a filter all legitimate messages are retained, and this becomes beneficial as λ increases, making it harder to beat the baseline. The two methods also seem to be less sensitive to the size of the attribute set for λ = 9. This can be explained by the fact that after a certain number of attributes, the classification performance approaches its lowest possible value asymptotically. Another interesting observation is that the performance of Outlook's patterns falls below the base case, i.e. one is better off not using the filter.

6.3. Scenario 3: Removing blocked messages (λ=999)

[Figure 3. TCR scores for λ=999. x-axis: number of retained attributes; y-axis: TCR; series: Naive Bayesian, TiMBL(1), TiMBL(2), TiMBL(10), Outlook patterns.]

In the third scenario a large λ value is used (999). In this case, the choice to use any filter at all becomes doubtful, as the performance of the baseline increases to a level that any improvement on it is very hard. It is worth noting that this λ value was the one used in (Sahami et al. 1998). Figure 3 presents the scenario's results. As expected, now all methods have difficulties achieving better results than the baseline. One exception is TiMBL for k = 10, which is consistently higher than the base case by a small margin. This is again an effect of the very large neighborhood, which now classifies most messages as legitimate, due to the large value of λ. The only instances that are classified as spam are those lying in an area of the instance space that is solely occupied by spam training instances, i.e. the most certain cases of unseen spam messages. TiMBL for k = 2 manages to achieve a significant improvement over the baseline for several consecutive attribute set sizes. Although the Naive Bayesian classifier achieves better performance for 300 attributes, this is the only point where it improves over the baseline. In practical applications, pinpointing the optimal attribute set size is infeasible, and hence TiMBL for k = 2 is to be preferred. The reason for the abrupt fluctuations in the performance of the methods is that a single misclassification of a legitimate message causes a very large fall in $TCR$. This happens, for example, with TiMBL for k = 10 and 550 attributes.

6.4. Best-performing configurations

Having investigated the effect of attribute set size, we now concentrate on the attribute set sizes for which each learning method performs best, and examine whether the differences between the methods are statistically significant. Table 1 presents the results for each method and λ value by decreasing $TCR$. Paired single-tailed t-tests on $WAcc$ show that the performance differences between the filter configurations of table 1 are all statistically significant at p < 0.05, with the following exceptions (italics in table 1): Naive Bayesian and TiMBL (k = 1, 2) for λ = 1; Naive Bayesian and TiMBL (k = 1) for λ = 9.

We note that for a corpus of similar spam rate, (Sahami et al. 1998) reports 92.3% spam precision and 80.0% spam recall using the Naive Bayesian classifier at 500 attributes and λ = 999. No principled comparison to these results can be made, however, as they were obtained using a different corpus and additional manually selected phrasal and non-textual attributes.

Table 1: Results on the Ling-Spam corpus using the best configurations
Columns: filter used; λ; no. of attributes; spam recall (%); spam precision (%); weighted accuracy (%); TCR. The numeric cells are missing from this transcription; within each λ value the rows are ordered by decreasing TCR.
λ = 1: Naive Bayesian; TiMBL(1); TiMBL(2); Outlook patterns; TiMBL(10); Baseline (no filter)
λ = 9: Naive Bayesian; TiMBL(1); TiMBL(2); TiMBL(10); Baseline (no filter); Outlook patterns
λ = 999: Naive Bayesian; TiMBL(2); TiMBL(10); Baseline (no filter); TiMBL(1); Outlook patterns

7. Conclusions

We performed a thorough evaluation of two learning methods on the task of anti-spam filtering, using a corpus that we made publicly available, and suitable cost-sensitive evaluation measures. Both methods achieved very high classification accuracy and clearly outperformed the anti-spam keyword patterns of a widely used e-mail reader. Our findings suggest that it is entirely feasible to construct learning-based anti-spam filters when spam messages are simply to be flagged, or when additional mechanisms are available to inform the senders of blocked messages. When no such mechanisms are present, a memory-based approach appears to be more viable, but great care is needed to configure the filter appropriately.

We are currently examining alternative learning methods for the same task, including attribute-weighted versions of the memory-based algorithm. We also plan to explore alternative attribute selection techniques, including term extraction methods to move from word to phrasal attributes.

Acknowledgements

Part of this work was performed using text categorization machinery developed within the context of ADIET (Adaptive Information Extraction Technology), a bilateral cooperation project funded by the governments of Greece and France.

References

Androutsopoulos I., J. Koutsias, K.V. Chandrinos, G. Paliouras, and C.D. Spyropoulos. 2000a. An Evaluation of Naive Bayesian Anti-Spam Filtering. Proceedings of the Workshop on Machine Learning in the New Information Age, 11th European Conference on Machine Learning, Barcelona, Spain.

Androutsopoulos I., J. Koutsias, K.V. Chandrinos, and C.D. Spyropoulos. 2000b. An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Encrypted Personal Messages. Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece (forthcoming).

Apte, C. and F. Damerau. 1994. Automated Learning of Decision Rules for Text Categorization. ACM Transactions on Information Systems, 12(3).

Cohen, W.W. 1996. Learning Rules that Classify E-Mail. Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, Stanford, California.

Cranor, L.F. and B.A. LaMacchia. 1998. Spam! Communications of ACM, 41(8).

Daelemans, W., J. Zavrel, K. van der Sloot and A. van den Bosch. 1999. TiMBL: Tilburg Memory Based Learner, version 2.0, Reference Guide. ILK, Computational Linguistics, Tilburg University.

Dagan, I., Y. Karov and D. Roth. 1997. Mistake-Driven Learning in Text Categorization. Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing, pages 55-63, Providence, Rhode Island.

Domingos, P. and M. Pazzani. 1996. Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Proceedings of the 13th International Conference on Machine Learning, Bari, Italy.

Duda, R.O. and P.E. Hart. 1973. Bayes Decision Theory. Chapter 2 in Pattern Classification and Scene Analysis. John Wiley.

Friedman, N., D. Geiger and M. Goldszmidt. 1997. Bayesian Network Classifiers. Machine Learning, 29(2/3).

Hall, R.J. 1998. How to Avoid Unwanted Email. Communications of ACM, 41(3).

Lang, K. 1995. Newsweeder: Learning to Filter Netnews. Proceedings of the 12th International Conference on Machine Learning, Stanford, California.

Langley, P., W. Iba and K. Thompson. 1992. An Analysis of Bayesian Classifiers. Proceedings of the 10th National Conference on Artificial Intelligence, San Jose, California.

Lewis, D. 1996. Training Algorithms for Linear Text Classifiers. Proceedings of the 19th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, Konstanz, Germany.

Mitchell, T.M. 1997. Machine Learning. McGraw-Hill.

Payne, T.R. and P. Edwards. 1997. Interface Agents that Learn: An Investigation of Learning Issues in a Mail Agent Interface. Applied Artificial Intelligence, 11(1):1-32.

Riloff, E. and W. Lehnert. 1994. Information Extraction as a Basis for High-Precision Text Classification. ACM Transactions on Information Systems, 12(3).

Sahami, M., S. Dumais, D. Heckerman, and E. Horvitz. 1998. A Bayesian Approach to Filtering Junk E-Mail. Learning for Text Categorization: Papers from the AAAI Workshop, pages 55-62, Madison, Wisconsin. AAAI Technical Report WS.

Salton, G. and M.J. McGill. 1983. Introduction to Modern Information Retrieval. McGraw-Hill.

Spertus, E. 1997. Smokey: Automatic Recognition of Hostile Messages. Proceedings of the 14th National Conference on Artificial Intelligence and the 9th Conference on Innovative Applications of Artificial Intelligence, Providence, Rhode Island.


More information

of the relationship between time and the value of money.

of the relationship between time and the value of money. TIME AND THE VALUE OF MONEY Most agrbusess maagers are famlar wth the terms compoudg, dscoutg, auty, ad captalzato. That s, most agrbusess maagers have a tutve uderstadg that each term mples some relatoshp

More information
