Adversarial Classification


Nilesh Dalvi, Pedro Domingos, Mausam, Sumit Sanghai, Deepak Verma
Department of Computer Science and Engineering
University of Washington, Seattle, WA, U.S.A.

ABSTRACT

Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detection, intrusion detection, fraud detection, surveillance and counter-terrorism, this is far from the case: the data is actively manipulated by an adversary seeking to make the classifier produce false negatives. In these domains, the performance of a classifier can degrade rapidly after it is deployed, as the adversary learns to defeat it. Currently the only solution to this is repeated, manual, ad hoc reconstruction of the classifier. In this paper we develop a formal framework and algorithms for this problem. We view classification as a game between the classifier and the adversary, and produce a classifier that is optimal given the adversary's optimal strategy. Experiments in a spam detection domain show that this approach can greatly outperform a classifier learned in the standard way, and (within the parameters of the problem) automatically adapt the classifier to the adversary's evolving manipulations.

Categories and Subject Descriptors

H.2.8 [Database Management]: Database Applications - data mining; I.2.6 [Artificial Intelligence]: Learning - concept learning, induction, parameter learning; I.5.1 [Pattern Recognition]: Models - statistical; I.5.2 [Pattern Recognition]: Design Methodology - classifier design and evaluation, feature evaluation and selection; G.3 [Mathematics of Computing]: Probability and Statistics - multivariate statistics

General Terms

Algorithms

Keywords

Cost-sensitive learning, game theory, naive Bayes, spam detection, integer linear programming

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. KDD'04, August 22-25, 2004, Seattle, Washington, USA. Copyright 2004 ACM 1-58113-888-1/04/0008 ...$5.00.

1. INTRODUCTION

Many major applications of KDD share a characteristic that has so far received little attention from the research community: the presence of an adversary actively manipulating the data to defeat the data miner. In these domains, deployment of a KDD system causes the data to change so as to make the system ineffective. For example, in the domain of email spam detection, standard classifiers like naive Bayes were initially quite successful (e.g., [23]). Unfortunately, spammers soon learned to fool them by inserting non-spam words into emails, breaking up spam ones with spurious punctuation, etc. Once spam filters were modified to detect these tricks, spammers started using new ones [4]. Effectively, spammers and data miners are engaged in a never-ending game where data miners continually come up with new ways to detect spam, and spammers continually come up with new ways to avoid detection. Similar arms races are found in many other domains: computer intrusion detection, where new attacks circumvent the defenses put in place against old ones [7]; fraud detection, where perpetrators learn to avoid the actions that previously gave them away [5, 25]; counter-terrorism, where terrorists disguise their identity and activities in ever-shifting ways []; aerial surveillance, where targets are camouflaged with increasing sophistication [22]; comparison shopping, where merchants continually change their Web sites to avoid wrapping by shopbots [3]; file sharing, where media companies try to detect and frustrate illegal copying, and users find ways to circumvent the obstacles [4]; Web search, where webmasters manipulate pages and links to inflate their rankings, and search engines re-engineer their ranking functions to deflate them back again [9, 6]; etc.
In many of these domains, researchers have noted the presence of adaptive adversaries and the need to take them into account (e.g., [4, 5, ]), but to our knowledge no systematic approach for this has so far been developed. The result is that the performance of deployed KDD systems in adversarial domains can degrade rapidly over time, and much human effort and cost is incurred in repeatedly bringing the systems back up to the desired performance level. This paper proposes a first step towards automating this process. While complete automation will never be possible, we believe our approach and its future extensions have the potential to significantly improve the speed and cost-effectiveness of keeping KDD systems up to date with their adversaries. Notice that adversarial problems cannot simply be solved by learners that account for concept drift (e.g., []): while these learners allow the data-generating process to change

over time, they do not allow this change to be a function of the classifier itself. We first formalize the problem as a game between a cost-sensitive classifier and a cost-sensitive adversary (Section 2). Focusing on the naive Bayes classifier (Section 3), we describe the optimal strategy for the adversary against a standard (adversary-unaware) classifier (Section 4), and the optimal strategy for a classifier playing against this strategy (Section 5). We provide efficient algorithms for computing or approximating these strategies. Experiments in a spam detection domain illustrate the sometimes very large utility gains that an adversary-aware classifier can yield, and its ability to co-evolve with the adversary (Section 6). We conclude with a discussion of future research directions (Section 7).

2. PROBLEM DEFINITION

Consider a vector variable X = (X_1, ..., X_i, ..., X_n), where X_i is the ith feature or attribute, and let the instance space X be the set of possible values of X. An instance x is a vector where feature X_i has the value x_i. Instances can belong to one of two classes: positive (malicious) or negative (innocent). Innocent instances are generated i.i.d. (independent and identically distributed) from a distribution P(X|-), and malicious ones likewise from P(X|+). The global distribution is thus P(X) = P(-)P(X|-) + P(+)P(X|+). Let the training set S and test set T be two sets of (x, y) pairs, where x is generated according to P(X) and y is the true class of x. We define adversarial classification as a game between two players: Classifier, which attempts to learn from S a function y_C = C(x) that will correctly predict the classes of instances in T, and Adversary, which attempts to make Classifier classify positive instances in T as negative by modifying those instances from x to x' = A(x). (Adversary cannot modify negative instances, and thus A(x) = x for all negative x.) Classifier is characterized by a set of cost/utility parameters (see Table 1 for a summary of the notation used in this paper):

1. V_i: Cost of measuring X_i. Depending on their costs, Classifier may choose not to measure some features.

2.
U_C(y_C, y): Utility of classifying as y_C an instance with true class y. Typically, U_C(+,-) < 0 and U_C(-,+) < 0, denoting the cost of misclassifying an instance (costs being negative utilities), and U_C(+,+) > 0, U_C(-,-) > 0.

Adversary has a corresponding set of parameters:

1. W_i(x_i, x_i'): Cost of changing the ith feature from x_i to x_i'. W_i(x_i, x_i) = 0 for all x_i. We will also use W(x, x') to represent the cost of changing an instance x to x' (which is simply the sum of the costs of all the individual feature changes made).

2. U_A(y_C, y): Utility accrued by Adversary when Classifier classifies as y_C an instance of class y. Typically, U_A(-,+) > 0, U_A(+,+) < 0 and U_A(-,-) = U_A(+,-) = 0, and we will assume this henceforth.

The goal of Classifier is to build a classifier C that will maximize its expected utility, taking into account that instances may have been modified by Adversary:

U_C = Σ_{(x,y) ∈ X×Y} P(x, y) [U_C(C(A(x)), y) − Σ_{X_i ∈ X_C(x)} V_i]   (1)

where Y = {+, -} and X_C(x) ⊆ {X_1, ..., X_n} is the set of features measured by C, possibly dependent on x. We call C the optimal strategy of Classifier. The goal of Adversary is to find a feature change strategy A that will maximize its own expected utility:

U_A = Σ_{(x,y) ∈ X×Y} P(x, y) [U_A(C(A(x)), y) − W(x, A(x))]   (2)

We call A the optimal strategy of Adversary. Notice that Adversary will not change instances if the cost of doing so exceeds the utility of fooling Classifier. For example, a spammer will not modify his emails to the point where they no longer help sell his product. In practice, U_C and U_A are estimated by averages over T: U_C = (1/|T|) Σ_{(x,y) ∈ T} [U_C(C(A(x)), y) − Σ_{X_i ∈ X_C(x)} V_i], etc. Given two players, the actions available to each, and the payoffs from each combination of actions, classical game theory is concerned with finding a combination of strategies such that neither player can gain by unilaterally changing its strategy. This combination is known as a Nash equilibrium [7]. In our case, the actions are classifiers C and feature change strategies A, and the payoffs are U_C and U_A.
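The empirical test-set average of the Classifier utility in Equation 1 can be sketched as follows; the toy classifier, utility matrix, and measurement costs are illustrative values, not the paper's experimental settings.

```python
# Hypothetical sketch of the empirical estimate of Equation 1, averaged
# over a test set T. All parameter values below are made up.

def classifier_utility(T, classify, measured_features, U_C, V):
    """Average utility of `classify` over test pairs (x, y).

    T                 -- list of (x, y) pairs, x a tuple of feature values
    classify          -- function x -> predicted class ('+' or '-')
    measured_features -- function x -> set of measured feature indices
    U_C               -- dict mapping (y_C, y) -> utility
    V                 -- list of per-feature measurement costs
    """
    total = 0.0
    for x, y in T:
        total += U_C[(classify(x), y)] - sum(V[i] for i in measured_features(x))
    return total / len(T)

# Toy example: a classifier that flags any instance containing "free".
T = [(("free", "pills"), "+"), (("hello", "world"), "-")]
classify = lambda x: "+" if "free" in x else "-"
U_C = {("+", "+"): 1, ("-", "-"): 1, ("+", "-"): -10, ("-", "+"): -1}
V = [0.0, 0.0]  # negligible measurement costs, as in the spam domain
print(classifier_utility(T, classify, lambda x: {0, 1}, U_C, V))  # both correct -> 1.0
```

Because measurement costs are subtracted per measured feature, a classifier that measures an expensive feature must improve its predictions by more than that cost to be worth it.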
As the following theorem shows, some realizations of the adversarial classification game always have a Nash equilibrium.

Theorem 2.1. Consider a classification game with a binary cost model for Adversary, i.e., given a pair of instances x and x', Adversary can either change x to x' (incurring a unit cost) or it cannot (the cost is infinite). This game always has a Nash equilibrium, which can be found in time polynomial in the number of instances.

We omit the proof due to lack of space. Unfortunately, the calculation of the Nash equilibrium requires complete and perfect knowledge of the probabilities of all the instances, which in practice Adversary and Classifier will not have. Computing Nash equilibria will generally be intractable. The chief difficulty is that even in finite domains the number of available actions is doubly exponential in the number of features n. The best known algorithms for finding Nash equilibria in general (nonzero) sum games have worst-case exponential time in the number of actions, making them triply exponential in our case. Even using the more general notion of correlated equilibria, for which polynomial algorithms exist, the computational cost is still doubly exponential. Recent years have seen substantial work on computationally tractable approaches to game theory, but they focus mainly on scaling up with the number of players, not the number of actions [2]. Further, equilibrium strategies, either mixed or pure, assume optimal play on the part of the opponent, which is highly unrealistic in our case. When this assumption is not met, standard game theory gives no guidance on how to play. (This, and computational intractability, have significantly limited its practical use.) We thus leave the general existence and form of Nash or other equilibria in adversarial classification as an open

question, and propose instead to start from a set of assumptions that more closely resembles the way adversarial classification takes place in practice: Classifier initially operates assuming the data is untainted (i.e., A(x) = x for all x); Adversary then deploys an optimal plan A(x) against this classifier; Classifier in turn deploys an optimal classifier C(A(x)) against this adversary, etc. This approach has some commonality with evolutionary game theory [26], but the latter makes a number of assumptions that are inappropriate in our case (infinite population of players repeatedly matched at random, symmetric payoff matrices, players having offspring proportional to average payoff, etc.).

Table 1: Summary of the notation used in this paper.

X = (X_1, X_2, ..., X_n): Instance.
P(x): Probability distribution of untainted data.
X_i: ith feature (attribute).
X_i, X: Domain of X_i and of X, respectively.
x, x_i: An instance and the ith attribute of that instance.
S, T: Training and test set.
y_C = C(x): The Classifier function.
x_A = A(x): The Adversary transformation.
V_i: Cost of measuring X_i.
U_C(y_C, y): Utility for Classifier of classifying as y_C an instance of class y.
W_i(x_i, x_i'), W(x, x'): Cost of changing the ith feature from x_i to x_i', and instance x to x', respectively.
U_A(y_C, y): Utility accrued by Adversary when Classifier classifies as y_C an instance of class y.
X_C(x): Set of features measured by C.
LO_C(x_i): Log-odds or contribution of the ith attribute to the naive Bayes classifier (ln [P(X_i = x_i|+) / P(X_i = x_i|-)]).
gap(x): LO_C(x) − log [(U_C(-,-) − U_C(+,-)) / (U_C(+,+) − U_C(-,+))]; gap(x) > 0 classifies x as positive.
ΔU_A: Adversary's utility gain from successfully camouflaging a positive instance (U_A(-,+) − U_A(+,+)).
ΔLO_{i,x_i'}: Gain towards making x negative by changing the ith feature to x_i' (LO_C(x_i) − LO_C(x_i')).
MCC(x): Nearest instance (cost-wise) to x which naive Bayes classifies as negative.
x[X_i = x_i']: An instance identical to x except that the ith attribute is changed to x_i' ∈ X_i.
P_A(x): Probability distribution after Adversary has modified the data.
In this paper, we focus mainly on the single-shot version of the adversarial classification game: one move by each of the players. We touch only briefly on the repeated version of the game, where players continue to make moves indefinitely. A number of learning approaches to repeated games have been proposed [6], but these are also intractable in large action spaces. Other learning approaches focus on games with sequential states (e.g., [5]), while classification is stateless. We make the assumption, standard in game theory, that all parameters of both players are known to each other. Although this is unlikely to be the case in practice, it is generally plausible that each player will be able to make a rough guess of the other's (and, indeed, its own) parameters. Classification with imprecisely known costs and other parameters has been well studied in KDD (e.g., [2]), and extending this to the adversarial case is an important item for future work.

3. COST-SENSITIVE LEARNING

In this paper, we will focus on naive Bayes as the classifier to be made adversary-aware [2]. Naive Bayes is attractive because of its simplicity, efficiency, and excellent performance in a wide range of applications, including adversarial ones like spam detection [23]. Naive Bayes estimates the probability that an instance x belongs to class y as

P(y|x) = P(y) P(x|y) / P(x) = (P(y) / P(x)) Π_{i=1}^{n} P(x_i|y)   (3)

and predicts the class with highest P(y|x). The denominator P(x) is independent of the class, and can be ignored. P(x|y) = Π_{i=1}^{n} P(x_i|y) is the naive Bayes assumption. The relevant probabilities are learned simply by counting the corresponding occurrences in the training set S. We begin by extending naive Bayes to incorporate the measurement costs V_i and classification utilities U_C(y_C, y) defined in the previous section, and to maximize the expected utility (Equation 1). For now, we assume that no adversary is present (i.e., A(x) = x for all x). We remove this restriction in the next sections. Cost-sensitive learning has been the object of substantial study in the KDD literature [, 27].
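The naive Bayes estimate of Equation 3 can be sketched with toy word-presence features and hypothetical conditional probabilities (not the paper's implementation):

```python
# Toy sketch of the naive Bayes posterior of Equation 3 for a two-word
# vocabulary. The prior and conditional tables are made-up values.

def nb_posterior(x, prior, cond):
    """x: tuple of feature values; cond[y][i][x_i] = P(X_i = x_i | y)."""
    score = {}
    for y in ("+", "-"):
        p = prior[y]
        for i, x_i in enumerate(x):
            p *= cond[y][i][x_i]      # naive Bayes independence assumption
        score[y] = p                   # P(y) * P(x | y)
    total = score["+"] + score["-"]    # the denominator P(x)
    return score["+"] / total          # P(+ | x)

prior = {"+": 0.5, "-": 0.5}
cond = {
    "+": [{1: 0.8, 0: 0.2}, {1: 0.3, 0: 0.7}],   # P(x_i | spam)
    "-": [{1: 0.1, 0: 0.9}, {1: 0.4, 0: 0.6}],   # P(x_i | non-spam)
}
print(round(nb_posterior((1, 0), prior, cond), 3))  # 0.903
```

In the paper's setting these conditional probabilities are learned by counting occurrences in the training set S.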
Given a classification utility matrix U_C(y_C, y), the Bayes optimal prediction for an instance x is the class y_C that maximizes the conditional utility U(y_C|x):

U(y_C|x) = Σ_{y ∈ Y} P(y|x) U_C(y_C, y)   (4)

This is simply Equation 1 conditioned on a particular x, and ignoring the adversary and measurement costs V_i. In naive Bayes, P(y|x) is computed using Equation 3. Measurement costs are incorporated into the choice of which subset of features to measure, X_C ⊆ {X_1, ..., X_n}. Intuitively, we want to measure feature X_i only if this improves the expected utility by more than V_i. Since a feature's effect on U_C will in general depend on what other features are being measured, finding the optimal X_C requires a potentially exponential search. In practice, X_C can be found using standard feature selection algorithms with U_C as the evaluation function. We use greedy forward selection ([3])

in our experiments. (Feature selection can also be carried out online, but we do not pursue that approach here.)

4. ADVERSARY STRATEGY

In this section, we formalize the notion of an optimal strategy for Adversary. We model it as a constrained optimization problem, which can be formulated as an integer linear program. We then propose a pseudo-linear time solution to the integer LP, based on dynamic programming. We make the following assumptions.

Assumption 1. Complete Information: Both Classifier and Adversary know all the relevant parameters: V_i, U_C, W_i, U_A and the naive Bayes model learned by Classifier on S (including X_C, P(y), and P(x_i|y) for each feature and class).

Assumption 2. Adversary assumes that Classifier is unaware of its presence (i.e., Adversary assumes that C(x) is the naive Bayes model described in the previous section).

To defeat Classifier, Adversary needs only to modify features in X_C, since the others are not measured. From Equation 3:

log [P(+|x) / P(-|x)] = log [P(+) / P(-)] + Σ_{X_i ∈ X_C} log [P(x_i|+) / P(x_i|-)]   (5)

For brevity, we will use the notation LO_C(x) = log [P(+|x) / P(-|x)] and LO_C(x_i) = log [P(x_i|+) / P(x_i|-)], where LO is short for log odds. Naive Bayes classifies an instance x as positive if the expected utility of doing so exceeds that of classifying it as negative, i.e., if U_C(+,+)P(+|x) + U_C(+,-)P(-|x) > U_C(-,+)P(+|x) + U_C(-,-)P(-|x), or

P(+|x) / P(-|x) > [U_C(-,-) − U_C(+,-)] / [U_C(+,+) − U_C(-,+)]   (6)

Let the log of the right hand side be LT(U_C) (log threshold). Then naive Bayes classifies instance x as positive if LO_C(x) > LT(U_C), or equivalently if gap(x) > 0, where gap(x) = LO_C(x) − LT(U_C). If the instance is classified as negative, Adversary does not need to do anything. Let us assume, then, that x is classified as positive, i.e., gap(x) > 0. The objective of Adversary is to make some set of feature changes to x that will cause it to be classified as negative, while incurring the minimum possible cost. This causes Adversary to gain a utility of ΔU_A = U_A(-,+) − U_A(+,+). Thus Adversary will transform x as long as the total cost incurred is less than ΔU_A, and not otherwise.
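The equivalence just derived can be checked numerically: with hypothetical posteriors and a made-up utility matrix, maximizing the conditional utility of Equation 4 makes exactly the same predictions as the gap(x) > 0 test of Equations 5-6.

```python
# Sketch (illustrative values only) checking that the utility-maximizing
# rule (Equation 4) coincides with the log-odds threshold rule gap(x) > 0.
import math

U_C = {("+", "+"): 1, ("-", "-"): 1, ("+", "-"): -10, ("-", "+"): -1}

def predict_by_utility(p_plus):
    posterior = {"+": p_plus, "-": 1 - p_plus}
    U = {y_C: sum(posterior[y] * U_C[(y_C, y)] for y in ("+", "-"))
         for y_C in ("+", "-")}
    return "+" if U["+"] > U["-"] else "-"

def predict_by_gap(p_plus):
    LO = math.log(p_plus / (1 - p_plus))                      # LO_C(x)
    LT = math.log((U_C[("-", "-")] - U_C[("+", "-")]) /
                  (U_C[("+", "+")] - U_C[("-", "+")]))        # LT(U_C) = log(11/2)
    return "+" if LO - LT > 0 else "-"                        # gap(x) > 0

# The two rules agree everywhere; with this asymmetric matrix the
# break-even posterior is P(+|x) = 11/13, about 0.846, rather than 0.5.
for p in (0.1, 0.5, 0.8, 0.9, 0.99):
    assert predict_by_utility(p) == predict_by_gap(p)
print(predict_by_utility(0.8), predict_by_utility(0.9))
```

The heavy false-positive penalty pushes the decision threshold far above 0.5, which is why the adversary only needs to drive the log odds below LT(U_C), not below zero.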
We formulate the problem of finding an optimal strategy for Adversary as an integer linear program. Recall that X_i is the domain of X_i. For x_i' ∈ X_i, let δ_{i,x_i'} be an integer (binary) variable which takes the value one if the feature X_i is changed from x_i to x_i', and zero otherwise. Let the new data item thus obtained be x'. The cost of transforming x to x' is W(x, x') = Σ_i W_i(x_i, x_i'), and the resulting change in log odds is LO_C(x') − LO_C(x) = Σ_i [LO_C(x_i') − LO_C(x_i)]. Define ΔLO_{i,x_i'} = LO_C(x_i) − LO_C(x_i'). This is the gain in Adversary's objective of making the instance negative. Note that ΔLO_{i,x_i} = 0; this represents the case where X_i has not been changed. To transform x so that the new instance is classified as negative, Adversary needs to change the values of some features such that the sum of their gains (decrease in log odds) is more than gap(x). Thus, to find the minimum cost changes required to transform this instance into a negative instance, we need to solve the following integer (binary) linear program:

min Σ_{X_i ∈ X_C} Σ_{x_i' ∈ X_i} W_i(x_i, x_i') δ_{i,x_i'}

s.t. Σ_{X_i ∈ X_C} Σ_{x_i' ∈ X_i} ΔLO_{i,x_i'} δ_{i,x_i'} ≥ gap(x)

Σ_{x_i' ∈ X_i} δ_{i,x_i'} ≤ 1,   δ_{i,x_i'} ∈ {0, 1}

The binary δ_{i,x_i'} values encode which features are changed to which values. The objective minimizes the cost incurred in this transformation. The first constraint makes sure that the new instance will be classified as negative. The second constraint encodes the requirement that a feature can only have a single value in an instance. We will call the transformed instance obtained by solving this integer linear program the minimum cost camouflage (MCC) of x. In other words, MCC(x) is the nearest instance (cost-wise) to x which naive Bayes classifies as negative. After solving this integer LP, Adversary transforms the instance only if the minimum cost obtained is less than ΔU_A. Therefore, letting NB(x) be the naive Bayes class prediction for x,

A(x) = MCC(x) if NB(x) = + and W(x, MCC(x)) < ΔU_A; A(x) = x otherwise.   (7)

The above integer (binary) LP problem is NP-hard, as the 0-1 knapsack problem can be reduced to it [8]. However, a pseudo-linear time algorithm can be obtained by discretizing LO_C, which allows dynamic programming to be used.
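This dynamic program over discretized log odds can be sketched with memoized top-down recursion; the per-feature gain and cost tables below are hypothetical, and the sketch is not the authors' implementation.

```python
# Sketch of the FindMCC dynamic program with memoization, in discretized
# log-odds space. delta_LO[i][v] and cost[i][v] are hypothetical tables:
# the (integer) log-odds gain and the cost of changing feature i+1 to v.
from functools import lru_cache

def find_mcc(n, W, delta_LO, cost):
    """Minimum-cost changes to the first n features whose total log-odds
    gain is at least W. Returns (min_cost, changes), where changes is a
    list of (feature, value) pairs, or (inf, None) if infeasible."""
    INF = float("inf")

    @lru_cache(maxsize=None)
    def rec(i, w):
        if w <= 0:                     # gap already filled
            return (0, ())
        if i == 0:                     # no features left, gap unfilled
            return (INF, None)
        best_cost, best_list = rec(i - 1, w)   # leave feature i unchanged
        for v, gain in delta_LO[i - 1].items():
            if gain > 0:
                sub_cost, sub_list = rec(i - 1, w - gain)
                if sub_list is not None and sub_cost + cost[i - 1][v] < best_cost:
                    best_cost = sub_cost + cost[i - 1][v]
                    best_list = sub_list + ((i, v),)
        return (best_cost, best_list)

    c, lst = rec(n, W)
    return c, (None if lst is None else list(lst))

# Two binary features: changing feature 1 gains 2 at cost 1, changing
# feature 2 gains 3 at cost 5. To fill a gap of 4 we need both changes.
delta_LO = [{"on": 2}, {"on": 3}]
cost = [{"on": 1}, {"on": 5}]
print(find_mcc(2, 4, delta_LO, cost))  # (6, [(1, 'on'), (2, 'on')])
```

Memoization ensures each (i, w) state is solved once, matching the pseudo-linear running time discussed below; the adversary then applies the returned change list only if its cost is below ΔU_A.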
Although the algorithm is approximate, it can compute the solution to arbitrary precision. The procedure is shown in Algorithm 1. Function FindMCC(i, w) computes the minimum cost needed to change the log odds of x by w using only the first i features. It returns the pair (MinCost, MinList), where MinCost is the minimum cost and MinList is a list of feature-value pairs denoting the changes to be made to x. (In each pair, i is the feature index and x_i' is the value it should be changed to.) To obtain the optimal adversary strategy, we need to compute FindMCC(n, W), where the integer W is gap(x) after discretization and n is the number of features in X_C. Note that ΔLO_{i,x_i'} is now a (non-negative) integer in the discretized log odds space. The algorithm can be efficiently implemented using top-down recursion with memoization (so that no recursive call is computed more than once). Note that although the features can be considered in any order, some orderings may find solutions faster than others. If, in the discretized space, instance x requires a gap of W to be filled by the transformation, then the algorithm runs in time at most O(W Σ_i |X_i|) (since FindMCC is evaluated for at most W values of w per feature, and each evaluation takes O(|X_i|) time). Hence it is pseudo-linear in the number of features. Pseudo-linearity may be expensive for large values of W or for cases where features have large domains. We now present two pruning rules, one for use in the first situation, and one for the second.

Algorithm 1 FindMCC(i, w)
  if w ≤ 0 then return (0, {}) end if
  if i = 0 then return (∞, Undefined) end if
  MinCost ← ∞
  MinList ← Undefined
  for x_i' ∈ X_i do
    if ΔLO_{i,x_i'} > 0 then
      (CurCost, CurList) ← FindMCC(i − 1, w − ΔLO_{i,x_i'})
      CurCost ← CurCost + W_i(x_i, x_i')
      CurList ← CurList + (i, x_i')
      if CurCost < MinCost then
        MinCost ← CurCost
        MinList ← CurList
      end if
    end if
  end for
  return (MinCost, MinList)

Algorithm 2 A(x)
  W ← gap(x) (discretized)
  (MinCost, MinList) ← FindMCC(n, W)
  if NB(x) = + and MinCost < ΔU_A then
    newx ← x
    for all (i, x_i') ∈ MinList do
      newx_i ← x_i'
    end for
    return newx
  else
    return x
  end if

Lemma 4.1. If max_{i,x_i'} [ΔLO_{i,x_i'} / W_i(x_i, x_i')] < gap(x) / ΔU_A, then A(x) = x.

This lemma is easy to prove and can be used to detect the instances for which MinCost > ΔU_A. Instances which are positive by very large gap(x) values can thus be pruned early on, and we need to run the algorithm only for more reasonable values of gap(x). Our second pruning strategy can be employed in situations where the cost metric is sufficiently coarsely discretized. We globally sort all the (i, x_i') tuples in increasing order of W_i(x_i, x_i'). For identical values of W_i(x_i, x_i'), we use decreasing order of ΔLO_{i,x_i'} as the secondary key. For a particular (i, W_i(x_i, x_i')) combination, over all x_i', we can remove all but the first entry in the list. This is valid because, if X_i is changed in the optimal solution, then taking the value x_i' with the highest ΔLO_{i,x_i'} will also yield the optimal solution. We can prune even further by only considering the first k tuples in each W such that Σ_{j=1}^{k} ΔLO_{j,x_j'} > gap(x) and Σ_{j=1}^{k−1} ΔLO_{j,x_j'} < gap(x). It is easy to see that this pruning does not affect the optimal solution. Thus, if the feature-changing costs W_i are sufficiently coarsely discretized, we will never need to consider more than a few tuples for each integer value of W. Our algorithm will thus run efficiently even when the domains of features are large.

5. CLASSIFIER STRATEGY

We now describe how Classifier can adapt to the adversary strategy described in the previous section. We derive the optimal C(x) taking into account A(x), and give an efficient algorithm for computing it. We make the following additional assumptions.

Assumption 3. Classifier assumes that Adversary uses its optimal strategy to modify test instances (Algorithm 2).

Assumption 4. The training set S used for learning the initial naive Bayes classifier is not tampered with by Adversary (i.e., S is drawn from the real distribution of adversarial and non-adversarial data).

Assumption 5. For all X_i, W_i(x_i, x_i') is a semi-metric, i.e., it has the following properties:
1. W_i(x_i, x_i') ≥ 0, and the equality holds iff x_i = x_i'.
2. W_i(x_i, x_i'') ≤ W_i(x_i, x_i') + W_i(x_i', x_i'').

The above also implies that W(x, x'') ≤ W(x, x') + W(x', x''). The triangle inequality for cost holds in most real domains. This is because to change a feature from x_i to x_i'' the adversary always has the option of changing it via x_i', i.e., with x_i' as an intermediate value. The goal of Classifier, as in Section 3, is to predict for each instance x' the class that maximizes its conditional utility (Equation 4). The difference is that now we want to take into account the fact that Adversary has tampered with the data. Of all the probabilities used by Classifier (Equation 3), the only one that is changed by Adversary is P(x|+); P(+), P(-) and P(x|-) remain unaltered. Let P_A(x'|+) be the post-adversary version of P(x'|+). Then

P_A(x'|+) = Σ_{x ∈ X} P(x|+) P_A(x'|x, +)   (8)

In other words, the probability of observing an instance x' is the probability that the adversary generates some instance x and then modifies it into x', summed over all x. Since P_A(x'|x, +) = 1 if A(x) = x' and P_A(x'|x, +) = 0 otherwise,

P_A(x'|+) = Σ_{x ∈ X_A(x')} P(x|+)   (9)

where X_A(x') = {x : x' = A(x)}. There are two cases where Adversary will leave an instance x untampered (i.e., A(x) = x): when naive Bayes predicts it is negative, since then no action is necessary, and when there is no transformation of x whose cost is lower than the utility gained by making it appear negative. Thus

P_A(x'|+) = Σ_{x ∈ X'_A(x')} P(x|+) + I(x') P(x'|+)   (10)

where X'_A(x') = X_A(x') \ {x'}, I(x') = 1 if NB(x') = - or W(x', MCC(x')) ≥ ΔU_A, and I(x') = 0 otherwise (see Equation 7 and Algorithm 2). The untampered probabilities P(x'|+) are estimated using the naive Bayes model (Equation 3): P(x'|+) = Π_{X_i ∈ X_C} P(X_i = x_i'|+). The optimal adversary-aware classification algorithm C(x') is shown below, with P̂() used to denote the probability P() estimated from the training data S. P̂_A(x'|+) is given by Equation 10 using the empirical estimates of P(x|+). The second term in Equation 10, I(x')P(x'|+), is easy to compute given calls to NB(x') and Algorithm 1 to determine if x' has a feasible camouflage. The remainder of this section is devoted to efficiently computing the first term, Σ_{x ∈ X'_A(x')} P(x|+).

Algorithm 3 C(x')
  P-(x') ← P̂(-) Π_i P̂(X_i = x_i'|-)
  P+(x') ← P̂(+) P̂_A(x'|+)
  U(+|x') ← P+(x') U_C(+,+) + P-(x') U_C(+,-)
  U(-|x') ← P+(x') U_C(-,+) + P-(x') U_C(-,-)
  if U(+|x') > U(-|x') then return + else return -

One solution is to iterate through all possible positive examples and check if x' is their minimum cost camouflage. This is, of course, not feasible. We now study some theoretical properties of the MCC function which will later be used to prune this search. Recall that if NB(x) = - then gap(x) ≤ 0, and vice versa. We define x[X_i = x_i'] as a data instance which is identical to x except that its ith feature is changed to x_i'.

Lemma 5.1. Let x_A be any positive instance and let x' = MCC(x_A). Then, for all i such that (x_A)_i ≠ x_i', gap(x') + LO_C((x_A)_i) − LO_C(x_i') > 0.

Proof. Let x'' = x'[X_i = (x_A)_i]. This implies that W(x_A, x'') < W(x_A, x'), since x'' differs from x_A on one less feature than x'. Also gap(x'') = gap(x') + LO_C((x_A)_i) − LO_C(x_i'). Since x' is MCC(x_A) and W(x_A, x'') < W(x_A, x'), NB(x'') must be +, and therefore gap(x'') > 0, proving the result.

Given a negative instance x', for each feature i we compute all values v that satisfy Lemma 5.1. To compute X_A(x'), we only need to take combinations of these feature-value pairs and check if x' is their MCC. This can substantially reduce the number of positive instances in our search space.
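The quantity being estimated here, the post-adversary distribution of Equations 9-10, can be computed by brute force when the instance space is tiny. A hypothetical sketch (the instances and strategy A below are made up):

```python
# Equation 9 by brute force on a small enumerated instance space: the
# post-adversary probability of observing x' is the total untainted
# probability of the instances the adversary maps to x'.

def post_adversary_dist(P_plus, A):
    """P_A(x'|+) = sum of P(x|+) over x in X_A(x') = {x : A(x) = x'}."""
    P_A = {}
    for x, p in P_plus.items():
        P_A[A(x)] = P_A.get(A(x), 0.0) + p
    return P_A

# "loud" spam gets camouflaged as "quiet"; "quiet" and "plain" are left
# alone (already classified negative, or too costly to change).
P_plus = {"loud": 0.5, "quiet": 0.3, "plain": 0.2}
A = {"loud": "quiet", "quiet": "quiet", "plain": "plain"}.get
print(post_adversary_dist(P_plus, A))  # {'quiet': 0.8, 'plain': 0.2}
```

The pruning results that follow exist precisely because this enumeration over all positive instances is infeasible in realistic feature spaces.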
The search space can still potentially contain an exponential number of instances. However, after we employ the next theorem, we obtain a fast algorithm for estimating the set X_A(x'). Notice that the optimal feature subset X_C for the adversary-aware classifier may be different from the adversary-unaware one, but can be found using the same methods (see Section 3).

Theorem 5.2. Let x_A be a positive instance such that x' = MCC(x_A). Let D be the set of features that are changed in x_A to obtain x'. Let E be a non-trivial subset of D, and let x'_A be an instance that matches x' for all features in E and x_A for all others, i.e., (x'_A)_i = x_i' if X_i ∈ E, (x'_A)_i = (x_A)_i otherwise. Then x' = MCC(x'_A).

Proof. By contradiction. Suppose x'' = MCC(x'_A) and x'' ≠ x'. Then W(x'_A, x'') < W(x'_A, x'). Also, since E ⊆ D, by definition of W we have W(x_A, x') = W(x_A, x'_A) + W(x'_A, x'). So by the triangle inequality

W(x_A, x'') ≤ W(x_A, x'_A) + W(x'_A, x'') < W(x_A, x'_A) + W(x'_A, x') = W(x_A, x')

Thus W(x_A, x'') < W(x_A, x'), which gives a contradiction, since then x' ≠ MCC(x_A). This completes the proof.

The above theorem implies that if x_A is a positive instance such that x' ≠ MCC(x_A), then x' cannot be the MCC of any other instance x'_A such that the changed features from x'_A to x' form a superset of the changed features from x_A to x'. We now use the following result to obtain bounds on X_A(x').

Corollary 5.3. Let FV be the set of feature-value pairs that satisfy Lemma 5.1. Let GV ⊆ FV be such that (i, x_i) ∈ GV iff x'[X_i = x_i] ∈ X_A(x'). Then for every x_A ∈ X_A(x'), the set of feature-value pairs where x_A and x' differ forms a subset of GV.

From the above corollary, after we compute GV, we only need to consider the combinations of feature-value pairs that are in GV and change those in the observed instance x'. Theorem 5.2 also implies that performing single changes from GV returns instances in X_A(x'). This gives us the following bounds on Σ_{x ∈ X'_A(x')} P(x|+).

Theorem 5.4. Let x' be any instance and let GV be the set defined in Corollary 5.3.
Let G = {i : (i, x_i) ∈ GV} and let X_i^G = {x_i ∈ X_i : (i, x_i) ∈ GV}. Then

Σ_{(i,x_i) ∈ GV} P(x'[X_i = x_i]|+) ≤ Σ_{x ∈ X'_A(x')} P(x|+) ≤ P(x'|+) Π_{i ∈ G} [1 + Σ_{x_i ∈ X_i^G} P(x'[X_i = x_i]|+) / P(x'|+)]

Proof. The proof of the first inequality follows directly from Theorem 5.2, which states that changing any single feature of x' to any value in GV returns an instance from X_A(x'). To prove the second inequality, we observe that the expression on the right side, when expanded, gives the sum of probabilities of all possible changes in x' due to the set GV, and X'_A(x') is a subset of those instances.

Given these bounds, we can classify a test instance as follows. If plugging the lower bound into Algorithm 3 gives U(+|x') > U(-|x'), then the instance can be safely classified as positive. Similarly, if using the upper bound gives U(+|x') < U(-|x'), then the instance is negative. If GV is large, so is the lower bound on P_A(X_A(x')|+). If GV is small, we can do an exhaustive search over the subsets of GV and check if each of the items considered belongs to X_A(x'). In our experiments we find that using the lower bound for making predictions works well in practice.

6. EXPERIMENTS

We implemented an adversarial classifier system for the spam filtering domain. Spam is an attractive testbed for our methods because of its practical importance, its rapidly evolving adversarial nature, the wide availability of data (in contrast to many other adversarial domains), the fact that naive Bayes is the de facto standard classifier in this area, and its richness as a challenge problem for KDD [4]. One disadvantage of spam as a testbed is that feature measurement costs are generally negligible, leaving this part of our framework untested. (In contrast, in a domain like counter-terrorism feature measurements are a major issue, often requiring large numbers of personnel and expensive equipment, raising privacy issues, and imposing costs on millions of individuals and transactions.) We used the following two datasets in our experiments:

Ling-Spam [24]: This corpus contains the legitimate discussions on a linguistics mailing list and the spam emails received by the list. There are 2412 non-spam messages and 481 spam ones. Thus, around 16.6% of the corpus is spam.

Email-Data [9]: This corpus consists of texts from 1431 emails, with 642 non-spam messages (conferences (370) and jobs (272)) and 789 spam ones.

Each of these datasets was divided into ten parts for ten-fold cross-validation. We defined three scenarios, as described below, and applied our implementation of naive Bayes (NB) and the adversary-aware classifier (AC) to each. We used ifile [2] for preprocessing emails.

6.1 Scenarios

The three spam filtering scenarios that we implemented differ in how the email is represented for the classifier, how the adversary can modify the features, and at what cost.

Add Words (AW): This is the simplest scenario. The binomial model of text for classification is used [8]: there is one Boolean feature per word, denoting whether or not the word is present in the email. The only way to modify the email is by adding words which are not already present, and each word added incurs unit cost.
This is akin to saying that the original mail has content that the spammer is not willing to change, and thus the spammer only adds unnecessary words to fool the spam detector. In this model, Adversary's strategy reduces to a greedy search where it adds words in decreasing order of their log-odds gain ΔLO.

Add Length (AL): This model is very similar to AW, except that the cost of adding a word is proportional to the number of characters in it. This corresponds to a hypothetical situation where Adversary needs to pay a certain amount per bit of mail transmitted, and wants to minimize the number of characters sent.

Synonym (SYN): Generally, spammers want to avoid detection while preserving the semantic content of their messages. Thus, in this scenario we consider the case where Adversary changes the mail by replacing the existing words with other semantically similar words. For example, a spammer attempting to sell a product would like to send emails claiming it to be cheap, but without the use of words like free, sale, etc. This is because the naive Bayes classifier uses the presence or absence of specific words with high LO_C to classify emails, independent of their actual meaning. Given the above intent, we define this scenario as follows. We use the multinomial model of text [8]: an email is viewed as a sequence of word positions, with one feature per position, and the domain of each feature is the set of words in the vocabulary. In this case, the number of times a word occurs in an email is important. However, the word order is disregarded (i.e., the probability of word occurrence is assumed to be independent of location). For each word, we obtain a list of synonyms from the WordNet lexical database [28]. A word in an email can then be changed only to one of its synonyms, at unit cost. It is easy to see that the costs used in all scenarios are metrics, so we can apply Lemma 5.1 and Theorem 5.2.

Table 2: Utility matrices for Adversary and Classifier used in the experiments.

        (+,+)   (+,-)          (-,+)   (-,-)
U_A       0       0              20      0
U_C       1    -1/-10/-100      -1       1
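The AW adversary strategy described above (greedy addition of the most innocent-looking absent words) can be sketched as follows; the word list and log-odds values are hypothetical.

```python
# Sketch of the Add Words greedy strategy: add absent words, most
# spam-score-reducing first, until gap(x) is closed or the budget
# (Delta-U_A in unit word costs) is spent. Illustrative values only.

def add_words_attack(email_words, LO_C, gap, budget, unit_cost=1):
    """Return the list of words to add, most gap-reducing first.

    email_words -- set of words already in the email (cannot be re-added)
    LO_C        -- dict word -> log-odds ln[P(w|+)/P(w|-)]; adding a word
                   with negative LO_C lowers the spam log odds by |LO_C|
    """
    added, spent = [], 0
    for word in sorted(LO_C, key=LO_C.get):   # most "innocent" words first
        if gap <= 0 or spent + unit_cost > budget:
            break
        if word not in email_words and LO_C[word] < 0:
            added.append(word)
            gap += LO_C[word]        # each added word closes part of the gap
            spent += unit_cost
    return added if gap <= 0 else []  # give up if the gap cannot be closed

LO_C = {"linguistics": -2.0, "meeting": -1.5, "viagra": 3.0}
print(add_words_attack({"viagra"}, LO_C, gap=3.0, budget=20))
```

Returning the empty list when the gap cannot be closed mirrors Equation 7: the adversary leaves the instance unaltered rather than pay for a camouflage that fails or costs more than ΔU_A.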
The classification utility matrix U_A for Adversary we used is such that whenever a spam email is classified as non-spam the adversary receives a utility of 20, and all other entries are 0. Thus, in the SYN and AW scenarios 20 word replacements/additions are allowed. In the AL scenario, the cost of adding a character is set to 0.1, and as a result 200 character additions are allowed. For Classifier, we ran the experiments with three different utility matrices (U_C). All matrices had a utility of 1 for correctly classifying an email and a penalty (negative utility) of 1 for incorrectly classifying a spam email as non-spam. The penalty for incorrectly classifying a non-spam email as spam was set to 1 in one matrix, 10 in another, and 100 in the third. This reflects the fact that, in spam filtering, the critical and dominant cost is that of false positives: letting a single spam email get through to the user is a relatively insignificant event, but filtering out a non-spam email is highly undesirable (and potentially disastrous). The different (+, -) values correspond to the different values of the λ parameter in Sakkis et al. [24]. Table 2 summarizes the utility parameters used in the experiments.

6.2 Results

The results of running the various algorithms on the Ling-Spam and Email-Data datasets are shown in Figures 1 and 2 respectively.

Figure 1: Utility results on the Ling-Spam dataset for different values of (+, -).

Figure 2: Utility results on the Email-Data set for different values of (+, -).

The figures show the average utilities obtained (with a maximum value of 1.0) by naive Bayes and the adversary-aware classifier under the different scenarios and different U_C matrices. The utility of naive Bayes on the original, untampered data (NB-PLAIN) is represented by the black bar on the left. The remaining black bars represent the performance of naive Bayes on tainted data in the three scenarios, and the white bars the performance of the corresponding adversary-aware classifier. We observe that Adversary significantly degrades the performance of naive Bayes in all three scenarios and with all three Classifier utility matrices. This effect is more pronounced in the Email-Data set because it has a higher percentage of spam emails than Ling-Spam. For naive Bayes on Email-Data, the cost of misclassifying spam emails exceeds the utility of the correct predictions, causing the overall utility to be negative. In contrast, Classifier was able to correctly identify a large percentage of the spam emails in all cases, and its accuracy on non-spam emails was also quite high.

To help in interpreting these results, we report the numbers of false negatives and false positives for the Ling-Spam dataset in Table 3. We observe that as the misclassification penalty for non-spam increases, fewer non-spam emails are classified incorrectly, but naturally more spam emails are misclassified as non-spam. Notice that the adversarial classifier never produces false positives (except for the SYN scenario with (+, -) = 1). As a result, its average utility stays approximately constant even when (+, -) changes by two orders of magnitude. An interesting observation is that Adversary's manipulations can actually help Classifier to reduce the number of false positives. This is because Adversary is unlikely to send a spam email unaltered, and as a result many non-spam emails which were previously classified as positive are now classified as negative.

Table 3: False positives and false negatives for naive Bayes and the adversary-aware classifier on the Ling-Spam dataset. The total number of positives in this dataset is 481, and the total number of negatives is 2412.

We also compared the running times of our algorithms for the three scenarios, for both the adversary and classifier strategies. For both the AW and SYN models, the average running times were less than 5 ms per email.
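The average utilities plotted in Figures 1 and 2 can, in principle, be computed from a confusion matrix and a utility matrix. A minimal sketch, assuming the utility values stated above (1 for each correct decision, -1 for a false negative, and a false-positive penalty of -1, -10, or -100); the `average_utility` function and the `u_10` example matrix are illustrative names, not the paper's code.

```python
def average_utility(tp, tn, fp, fn, u):
    """Average classifier utility over a test set, given confusion
    counts and a utility matrix u[(predicted, actual)], where '+'
    denotes spam and '-' denotes non-spam."""
    n = tp + tn + fp + fn
    total = (tp * u[("+", "+")] + tn * u[("-", "-")]
             + fp * u[("+", "-")] + fn * u[("-", "+")])
    return total / n

# One of the three utility matrices described in the text:
# correct decisions earn 1, a false negative costs 1, and the
# false-positive penalty here is the middle setting, 10.
u_10 = {("+", "+"): 1, ("-", "-"): 1, ("+", "-"): -10, ("-", "+"): -1}
```

This also shows why naive Bayes can end up with negative overall utility on Email-Data: once the weighted cost of its errors exceeds the reward for its correct predictions, the sum above drops below zero.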
For AL, the average running time of the classifier strategy was less than 5 ms per email, while the running time of the adversary strategy was around 50 ms per email. The adversary running time for AW was small because one can use a simple greedy algorithm to implement the adversary strategy. In the SYN model, the search space is small because there are few synonyms per word. Hence the time taken by both algorithms is small. However, in the AL model, when the log odds of emails were high (> 50), the adversary took longer. On the other hand, the adversarial classifier, after using the pruning strategies, had to consider very few instances, and these had small log odds. Hence its running time was quite small. From the experiments we can conclude that in practice we can use the pruning strategies for Adversary and Classifier to reduce the search space and time without compromising accuracy.

To simulate the effects of non-adversarial concept drift (a reality in spam and many other adversarial domains), we also tried classifying the emails in the Email-Data set using NB and AC trained on the Ling-Spam data. As the frequencies of spam and non-spam emails are different in the two datasets, we ran the classifiers without considering the class priors. For both algorithms, the results obtained were only marginally worse than the results obtained by training on the Email-Data set itself, demonstrating the robustness of the algorithms.

6.3 Repeated Game

In Sections 4 and 5 we discussed one round of the game that goes on between Adversary and Classifier. It consists of one ply of Adversary, in which it finds the best strategy to fool Classifier, and then one ply of Classifier to adapt to it. Both parties can continue playing this game. However, Classifier is no longer using a simple naive Bayes algorithm. In these experiments, we make the simplifying assumption that Adversary continues to model the classifier as naive Bayes, and uses the techniques that we have developed for naive Bayes.
At the end of each round, Adversary learns a naive Bayes classifier based on the outputs of the actual classifier that Classifier is using in that round. We denote the classifier used by Classifier in round i by C_i.

Let NB_i be the classifier that Adversary learns from it. Then A_i(x) is defined as the optimal adversary strategy (as in Algorithm 2) to fool NB_i instead of the original NB learned on S. The data coming from Adversary in round i is A_i applied to the original test data to produce T_i, i.e., T_i = A_i(T). Classifier uses Algorithm 3 based on NB_i to classify T_i, i.e., Y_i = C_i(T_i). The key insight is that a new naive Bayes model can be trained on (T_i, Y_i), and that can serve as NB_{i+1} for the next round. We compared the performance of NB_i with that of C_i and found them to be very similar, justifying our assumption, as Adversary is not reacting to a crippled Classifier but to one which performs almost as well as the optimal Classifier. This procedure can then be repeated for an arbitrary number of rounds.

Figure 3 shows the results of this experiment on the Ling-Spam dataset for the AW scenario. The X-axis is the round of the game, and the Y-axis is the average utility obtained by C_i (the i-th adversary-aware classifier). The graphs also show the average utility obtained by NB_i(T_i), to demonstrate the effect of using an adversary-aware classifier at each round. In all rounds of the game, Classifier using the adversary-aware strategy performs significantly better than the plain naive Bayes. As expected, the difference is highest when the penalty for misclassifying non-spam is 1. Furthermore, in this scenario Classifier and Adversary never reach an equilibrium, and utility alternates between two values. This is surprising at first glance, but a closer examination elucidates the reason. In the AW scenario, Adversary can only add words. So the only way of tampering with an instance is to add good words with very low (negative) log odds (based on NB_i in the i-th round). Let the top few good words be GW_i. These would have a high frequency of occurrence in spam emails of T_i. When NB_{i+1} is learned on (T_i, Y_i), these words no longer have a low log odds and hence are not in GW_{i+1}. Thus, A_{i+1} ignores these words, so they regain a low log odds in the next round's model and re-enter the good-word set! This phenomenon causes the log odds of a word to oscillate, giving rise to the periodic average utility in Fig.
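The round structure just described can be written schematically. The four callables below stand in for components the paper defines elsewhere (naive Bayes training, Algorithm 2's optimal adversary, Algorithm 3's adversary-aware classifier); this loop only fixes the data flow T_i = A_i(T), Y_i = C_i(T_i), NB_{i+1} learned from (T_i, Y_i), and is an illustrative sketch rather than the authors' code.

```python
def repeated_game(initial_nb, learn_nb, optimal_attack, aware_classify,
                  test_set, rounds):
    """Schematic repeated game: each round, the adversary attacks its
    current naive Bayes model of the classifier, the adversary-aware
    classifier labels the tampered data, and the adversary re-learns
    naive Bayes from those labels for the next round."""
    nb = initial_nb
    history = []
    for _ in range(rounds):
        tampered = [optimal_attack(nb, x) for x in test_set]  # T_i = A_i(T)
        labels = [aware_classify(nb, x) for x in tampered]    # Y_i = C_i(T_i)
        history.append((tampered, labels))
        nb = learn_nb(tampered, labels)                       # NB_{i+1} from (T_i, Y_i)
    return history
```

Note that the attack is applied to the original test data each round, not to the previous round's tampered data, matching the definition T_i = A_i(T) above.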
3.

Figure 3: Utility of naive Bayes and the adversarial classifier for a repeated game in the AW scenario and Ling-Spam dataset. The number in parentheses is (+,-).

7. FUTURE WORK

This paper is only the first step in a potentially very rich research direction. The next steps include:

Repeated games. In reality, Adversary and Classifier never cease to evolve against each other. Thus, we need to find the optimal strategy A for Adversary taking into account that an adversarial classifier C(A(x)) is being used, then find the optimal strategy for Classifier taking A into account, and so on indefinitely. To what extent this can be done analytically is a key open question.

Theory. We would like to answer questions like: What are the most general conditions under which adversarial classification problems have Nash or correlated equilibria? If so, what form do they take, and are there cases where they can be computed efficiently? Under what conditions do repeated games converge to these equilibria? Etc.

Incomplete information. When Classifier and Adversary do not know each other's parameters, and Adversary does not know the exact form of the classifier, additional learning needs to occur, and the optimal strategies need to be made robust to imprecise knowledge.

Approximately optimal strategies. When finding the optimal strategy is too computationally expensive, approximate solutions and weaker notions of optimality become necessary. Also, real-world adversaries will often act suboptimally, and it would be good to take this into account.

Generalization to other classifiers. We would like to extend the ideas in this paper to classifiers like decision trees, nearest neighbor, support vector machines, etc.

Interaction with humans. Because adversaries are resourceful and unpredictable, adversarial classifiers will always require regular human intervention. The goal is to make this as easy and productive as possible.
For example, extending the framework to allow new features to be added at each round of the game could be a good way to combine human and automatic refinement of the classifier.

Multiple adversaries. Classification games are often played against more than one adversary at a time (e.g., multiple spammers, intruders, fraud perpetrators, terrorist groups, etc.). Handling this case is a natural but nontrivial extension of our framework.

Variants of the problem. Our problem definition does not fit all classification games, but it could be extended appropriately. For example, adversaries may produce innocent as well as malicious instances, they may deliberately seek to make the classifier produce false positives, detection of some malicious instances may deter them from producing more, etc.

Other domains and tasks. We would like to apply adversarial classifiers to computer intrusion detection, fraud detection, face recognition, etc., and to develop adversarial extensions to related data mining tasks (e.g., adversarial ranking for search engines).

8. CONCLUSION

In domains ranging from spam detection to counter-terrorism, classifiers have to contend with adversaries manipulating the data to produce false negatives. This paper formalizes the problem and extends the naive Bayes classifier to optimally detect and reclassify tainted instances, by taking into account the adversary's optimal feature-changing strategy. When applied to spam detection in a variety of scenarios, this approach consistently outperforms the standard naive Bayes, sometimes by a large margin. Research in this direction has the potential to produce KDD systems that are more robust to adversary manipulations and require less human intervention to keep up with them.

ACKNOWLEDGMENTS

We are grateful to Daniel Lowd, Foster Provost and Ted Senator for their insightful comments on a draft of this paper. This research was partly supported by a Sloan Fellowship awarded to the second author.

9. REFERENCES

[1] P. Domingos. MetaCost: A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 155-164, San Diego, CA, 1999. ACM Press.
[2] P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29:103-130, 1997.
[3] R. B. Doorenbos, O. Etzioni, and D. S. Weld. A scalable comparison-shopping agent for the World-Wide Web. In Proceedings of the First International Conference on Autonomous Agents, pages 39-48, Marina del Rey, CA, 1997. ACM Press.
[4] T. Fawcett. "In vivo" spam filtering: A challenge problem for KDD. SIGKDD Explorations, 5(2):140-148, 2003.
[5] T. Fawcett and F. Provost. Adaptive fraud detection. Data Mining and Knowledge Discovery, 1(3):291-316, 1997.
[6] D. Fudenberg and D. Levine. The Theory of Learning in Games. MIT Press, Cambridge, MA, 1998.
[7] D. Fudenberg and J. Tirole. Game Theory. MIT Press, Cambridge, MA, 1991.
[8] M. R. Garey and D. S. Johnson. Computers and Intractability. Freeman, New York, NY, 1979.
[9] L. Guernsey. Retailers rise in Google rankings as rivals cry foul.
New York Times, November 2, 2003.
[10] G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 97-106, San Francisco, CA, 2001. ACM Press.
[11] D. Jensen, M. Rattigan, and H. Blau. Information awareness: A prospective technical assessment. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, 2003. ACM Press.
[12] M. Kearns. Computational game theory. Tutorial, Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA. mkearns/nips02tutorial/.
[13] R. Kohavi and G. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2), 1997.
[14] B. Krebs. Online piracy spurs high-tech arms race. Washington Post, June 26, 2003.
[15] M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the Eleventh International Conference on Machine Learning, pages 157-163, New Brunswick, NJ, 1994. Morgan Kaufmann.
[16] B. Lloyd. Been gazumped by Google? Trying to make sense of the Florida update. Search Engine Guide, November 25, 2003.
[17] M. V. Mahoney and P. K. Chan. Learning nonstationary models of normal network traffic for detecting novel attacks. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Canada, 2002. ACM Press.
[18] A. McCallum and K. Nigam. A comparison of event models for naive Bayes text classification. In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
[19] F. Nielsen. Email data. rem/datasets/.
[20] F. Provost and T. Fawcett. Robust classification for imprecise environments. Machine Learning, 42:203-231, 2001.
[21] J. Rennie. ifile spam classifier.
[22] P. Robertson and J. M. Brady. Adaptive image analysis for aerial surveillance. IEEE Intelligent Systems, 14(3):30-36, 1999.
[23] M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz. A Bayesian approach to filtering junk e-mail.
In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
[24] G. Sakkis, I. Androutsopoulos, G. Paliouras, V. Karkaletsis, C. D. Spyropoulos, and P. Stamatopoulos. A memory-based approach to anti-spam filtering for mailing lists. Information Retrieval, volume 6. Kluwer, 2003.
[25] T. Senator. Ongoing management and application of discovered knowledge in a large regulatory organization: A case study of the use and impact of NASD Regulation's Advanced Detection System (ADS). In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 44-53, Boston, MA, 2000. ACM Press.
[26] J. M. Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK, 1982.
[27] P. Turney. Cost-sensitive learning bibliography. Online bibliography, NRC Institute for Information Technology, Ottawa, Canada.
[28] WordNet 2.0: A lexical database for the English language. wn/.


More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information

Fisher Markets and Convex Programs

Fisher Markets and Convex Programs Fsher Markets and Convex Programs Nkhl R. Devanur 1 Introducton Convex programmng dualty s usually stated n ts most general form, wth convex objectve functons and convex constrants. (The book by Boyd and

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS Chrs Deeley* Last revsed: September 22, 200 * Chrs Deeley s a Senor Lecturer n the School of Accountng, Charles Sturt Unversty,

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits Lnear Crcuts Analyss. Superposton, Theenn /Norton Equalent crcuts So far we hae explored tmendependent (resste) elements that are also lnear. A tmendependent elements s one for whch we can plot an / cure.

More information

When Network Effect Meets Congestion Effect: Leveraging Social Services for Wireless Services

When Network Effect Meets Congestion Effect: Leveraging Social Services for Wireless Services When Network Effect Meets Congeston Effect: Leveragng Socal Servces for Wreless Servces aowen Gong School of Electrcal, Computer and Energy Engeerng Arzona State Unversty Tempe, AZ 8587, USA xgong9@asuedu

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

Logistic Regression. Steve Kroon

Logistic Regression. Steve Kroon Logstc Regresson Steve Kroon Course notes sectons: 24.3-24.4 Dsclamer: these notes do not explctly ndcate whether values are vectors or scalars, but expects the reader to dscern ths from the context. Scenaro

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Conversion between the vector and raster data structures using Fuzzy Geographical Entities Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Planning for Marketing Campaigns

Planning for Marketing Campaigns Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract

More information

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000 Problem Set 5 Solutons 1 MIT s consderng buldng a new car park near Kendall Square. o unversty funds are avalable (overhead rates are under pressure and the new faclty would have to pay for tself from

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance Calbraton Method Instances of the Cell class (one nstance for each FMS cell) contan ADC raw data and methods assocated wth each partcular FMS cell. The calbraton method ncludes event selecton (Class Cell

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

A Performance Analysis of View Maintenance Techniques for Data Warehouses

A Performance Analysis of View Maintenance Techniques for Data Warehouses A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao

More information

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Period and Deadline Selection for Schedulability in Real-Time Systems

Period and Deadline Selection for Schedulability in Real-Time Systems Perod and Deadlne Selecton for Schedulablty n Real-Tme Systems Thdapat Chantem, Xaofeng Wang, M.D. Lemmon, and X. Sharon Hu Department of Computer Scence and Engneerng, Department of Electrcal Engneerng

More information

Sketching Sampled Data Streams

Sketching Sampled Data Streams Sketchng Sampled Data Streams Florn Rusu, Aln Dobra CISE Department Unversty of Florda Ganesvlle, FL, USA frusu@cse.ufl.edu adobra@cse.ufl.edu Abstract Samplng s used as a unversal method to reduce the

More information

Simple Interest Loans (Section 5.1) :

Simple Interest Loans (Section 5.1) : Chapter 5 Fnance The frst part of ths revew wll explan the dfferent nterest and nvestment equatons you learned n secton 5.1 through 5.4 of your textbook and go through several examples. The second part

More information

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture A Desgn Method of Hgh-avalablty and Low-optcal-loss Optcal Aggregaton Network Archtecture Takehro Sato, Kuntaka Ashzawa, Kazumasa Tokuhash, Dasuke Ish, Satoru Okamoto and Naoak Yamanaka Dept. of Informaton

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

行 政 院 國 家 科 學 委 員 會 補 助 專 題 研 究 計 畫 成 果 報 告 期 中 進 度 報 告

行 政 院 國 家 科 學 委 員 會 補 助 專 題 研 究 計 畫 成 果 報 告 期 中 進 度 報 告 行 政 院 國 家 科 學 委 員 會 補 助 專 題 研 究 計 畫 成 果 報 告 期 中 進 度 報 告 畫 類 別 : 個 別 型 計 畫 半 導 體 產 業 大 型 廠 房 之 設 施 規 劃 計 畫 編 號 :NSC 96-2628-E-009-026-MY3 執 行 期 間 : 2007 年 8 月 1 日 至 2010 年 7 月 31 日 計 畫 主 持 人 : 巫 木 誠 共 同

More information

On the Interaction between Load Balancing and Speed Scaling

On the Interaction between Load Balancing and Speed Scaling On the Interacton between Load Balancng and Speed Scalng Ljun Chen, Na L and Steven H. Low Engneerng & Appled Scence Dvson, Calforna Insttute of Technology, USA Abstract Speed scalng has been wdely adopted

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

Distributed Multi-Target Tracking In A Self-Configuring Camera Network

Distributed Multi-Target Tracking In A Self-Configuring Camera Network Dstrbuted Mult-Target Trackng In A Self-Confgurng Camera Network Crstan Soto, B Song, Amt K. Roy-Chowdhury Department of Electrcal Engneerng Unversty of Calforna, Rversde {cwlder,bsong,amtrc}@ee.ucr.edu

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

SPECIALIZED DAY TRADING - A NEW VIEW ON AN OLD GAME

SPECIALIZED DAY TRADING - A NEW VIEW ON AN OLD GAME August 7 - August 12, 2006 n Baden-Baden, Germany SPECIALIZED DAY TRADING - A NEW VIEW ON AN OLD GAME Vladmr Šmovć 1, and Vladmr Šmovć 2, PhD 1 Faculty of Electrcal Engneerng and Computng, Unska 3, 10000

More information

Activity Scheduling for Cost-Time Investment Optimization in Project Management

Activity Scheduling for Cost-Time Investment Optimization in Project Management PROJECT MANAGEMENT 4 th Internatonal Conference on Industral Engneerng and Industral Management XIV Congreso de Ingenería de Organzacón Donosta- San Sebastán, September 8 th -10 th 010 Actvty Schedulng

More information

Enabling P2P One-view Multi-party Video Conferencing

Enabling P2P One-view Multi-party Video Conferencing Enablng P2P One-vew Mult-party Vdeo Conferencng Yongxang Zhao, Yong Lu, Changja Chen, and JanYn Zhang Abstract Mult-Party Vdeo Conferencng (MPVC) facltates realtme group nteracton between users. Whle P2P

More information

The Application of Fractional Brownian Motion in Option Pricing

The Application of Fractional Brownian Motion in Option Pricing Vol. 0, No. (05), pp. 73-8 http://dx.do.org/0.457/jmue.05.0..6 The Applcaton of Fractonal Brownan Moton n Opton Prcng Qng-xn Zhou School of Basc Scence,arbn Unversty of Commerce,arbn zhouqngxn98@6.com

More information

Software project management with GAs

Software project management with GAs Informaton Scences 177 (27) 238 241 www.elsever.com/locate/ns Software project management wth GAs Enrque Alba *, J. Francsco Chcano Unversty of Málaga, Grupo GISUM, Departamento de Lenguajes y Cencas de

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Lecture 2: Single Layer Perceptrons Kevin Swingler

Lecture 2: Single Layer Perceptrons Kevin Swingler Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses

More information

Sngle Snk Buy at Bulk Problem and the Access Network

Sngle Snk Buy at Bulk Problem and the Access Network A Constant Factor Approxmaton for the Sngle Snk Edge Installaton Problem Sudpto Guha Adam Meyerson Kamesh Munagala Abstract We present the frst constant approxmaton to the sngle snk buy-at-bulk network

More information

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

On the Interaction between Load Balancing and Speed Scaling

On the Interaction between Load Balancing and Speed Scaling On the Interacton between Load Balancng and Speed Scalng Ljun Chen and Na L Abstract Speed scalng has been wdely adopted n computer and communcaton systems, n partcular, to reduce energy consumpton. An

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information