Stackelberg Games for Adversarial Prediction Problems
Michael Brückner
Department of Computer Science, University of Potsdam, Germany

Tobias Scheffer
Department of Computer Science, University of Potsdam, Germany

ABSTRACT

The standard assumption of identically distributed training and test data is violated when test data are generated in response to a predictive model. This becomes apparent, for example, in the context of spam filtering, where a service provider employs a spam filter and the spam sender can take this filter into account when generating new emails. We model the interaction between learner and data generator as a Stackelberg competition in which the learner plays the role of the leader and the data generator may react to the leader's move. We derive an optimization problem to determine the solution of this game and present several instances of the Stackelberg prediction game. We show that the Stackelberg prediction game generalizes existing prediction models. Finally, we explore properties of the discussed models empirically in the context of spam filtering.

Categories and Subject Descriptors
I.5.1 [Pattern Recognition]: Models - statistical; H.4.3 [Information Systems Applications]: Communications Applications - electronic mail

General Terms
Theory, Algorithms

Keywords
Adversarial Classification, Stackelberg Competition, Prediction Game, Spam Filtering

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. KDD'11, August 21-24, 2011, San Diego, California, USA. Copyright 2011 ACM.

1. INTRODUCTION

A common assumption on which most learning algorithms are based is that training and test data are governed by identical distributions. However, in a variety of applications, the distribution that governs data at application time may be influenced by an adversary whose interests conflict with those of the learner. Consider, for instance, the following three scenarios. In computer and network security, scripts that control attacks are engineered with botnet and intrusion detection systems in mind. Credit card fraudsters adapt their unauthorized use of credit cards (in particular, amounts charged per transaction and per day, and the type of businesses that amounts are charged from) so as not to trigger alerting mechanisms employed by credit card companies. Email spam senders design message templates that are instantiated by nodes of botnets; templates are specifically designed to produce a low spam score with current spam filters. The domain of spam filtering will serve as a running example throughout the paper. In all of these applications, assailants factor information about countermeasures that are being employed into the process of data generation.

The interaction between learner and data generators can be modeled as a game in which one player controls the predictive model whereas another exercises some control over the process of data generation. The adversary's influence on the generation of the data can be mathematically modeled as a transformation that is imposed on the distribution that governs the data at training time. The transformed distribution then governs the data at application time. The optimization criterion of either player takes as arguments both the predictive model chosen by the learner and the transformation carried out by the adversary.

Typically, this problem is modeled under the worst-case assumption that the adversary desires to impose the highest possible costs on the learner. This amounts to a zero-sum game in which the loss of one player is the gain of the other. In this setting, both players can maximize their expected outcome by following a minimax strategy. El Ghaoui et al. [5] derive a minimax model for input data that are known to lie within some hyper-rectangles around the training instances.
Their solution minimizes the worst-case loss over all possible choices of the data in these intervals. Lanckriet et al. [0] study the minimax probability machine. This classifier minimizes the maximal probability of misclassifying new instances for a given mean and covariance matrix of each class. Geometrically, this solution corresponds to a minimax strategy with hyper-ellipsoids around the training instances, rather than hyper-rectangles. Similarly, worst-case solutions to classification games in which the adversary deletes input features or performs arbitrary feature transformation have been studied [3, 6, 7, 4, 4].

Several applications motivate problem settings in which the goals of the learner and the data generator, while still conflicting, are not necessarily entirely antagonistic. For instance, a fraudster's goal of maximizing the profit made from exploiting phished account information is not the inverse of an email service provider's goal of achieving a high spam recognition rate at close-to-zero false positives. When playing a minimax strategy, one often makes overly pessimistic assumptions about the adversary's behavior and may not necessarily obtain an optimal outcome.

For games that do not exhibit the zero-sum property, a game-theoretic model has been studied that assumes both players to commit to their actions simultaneously []; that is, without information about the opponent's course of action. When the parameter space of the learner's model and the adversary's transformation and both players' loss functions satisfy specific criteria (e.g., the loss functions have to be monotonic with distinct monotonicity and twice differentiable), then the prediction game has a unique Nash equilibrium that can be found by solving a compact optimization problem []. The Nash equilibrium is a combination of parameters for the predictive model and the adversary's transformation which has the property that neither player benefits by unilaterally deviating from it.

For the learner, playing the Nash equilibrium instead of the minimax strategy is an optimal course of action under the following sufficient conditions. First, the adversary has to be trusted to behave rationally in the sense of maximizing their profit by playing a Nash strategy, too. If the learner plays the Nash equilibrium but the adversary deviates from that equilibrium, then both players may fare arbitrarily poorly. Secondly, a unique equilibrium needs to exist, since a combination of actions from two distinct equilibria may lead to an arbitrarily poor outcome for either player. Thirdly, the adversary must not have any information about the predictive model that the learner commits to before generating the data. In practice, this assumption can be violated when the adversary is able to probe the predictive model. If any of the above three conditions is violated, no guarantees on the optimality can be given and, consequently, a learner may be ill-advised to play the Nash equilibrium.
In practice, a spam sender may follow heuristics derived from past experience and experiments with the filter. Such a setting in which both players act non-simultaneously can be modeled as a Stackelberg competition, which allows one player (the follower) to be potentially fully informed about the move of the other player (the leader). We model adversarial learning as a Stackelberg competition in which the learner acts as leader by committing to a predictive model in the first step. The model is then disclosed to the follower (the data generator), who then gets to transform the input distribution.

Some authors [9, 2] study the case in which the data generator acts as leader and the learner as follower. This reflects a setting in which the adversary discloses how the future distribution will differ from the current distribution before the learner has to commit to a model, which contradicts the intuition of an adversarial model-building problem. When the data generator acts as leader and discloses the data transformation, the learner only has to solve a simple optimization problem in order to minimize the risk on the transformed data points.

The rest of this paper is organized as follows. Section 2 introduces the problem setting. We formalize the Stackelberg prediction game, derive an optimization problem to determine the Stackelberg equilibrium, and show how to employ kernel functions in Section 3. In Section 4, we present three instances of the SPG and discuss their relation to existing prediction models. We report on experiments on spam filtering in Section 5; Section 6 concludes.

2. PROBLEM SETTING

We study prediction games between two players: the learner ($v = -1$) and an adversary, the data generator ($v = +1$). In our running example of spam filtering, we study the competition between recipient and senders, not competition among senders. To this end, $v = -1$ refers to the recipient whereas $v = +1$ models the entirety of all legitimate and abusive senders as a single, amalgamated player.
In the past, the data generator $v = +1$ produced a sample $D = \{(x_i, y_i)\}_{i=1}^n$ of $n$ training instances $x_i \in \mathcal{X}$ with corresponding class labels $y_i \in \mathcal{Y} = \{-1, +1\}$. These object-class pairs are drawn according to a training distribution with density function $p(x, y)$. By contrast, future object-class pairs, produced by the data generator at application time, are drawn from some test distribution with density $\dot{p}(x, y)$ which may differ from $p(x, y)$.

The task of the learner $v = -1$ is to select the parameters $w \in \mathbb{R}^m$ of a predictive model $h(x) = \operatorname{sign} f_w(x)$ implemented in terms of a generalized linear decision function $f_w : \mathcal{X} \to \mathbb{R}$ with $f_w(x) = w^T \phi(x)$ and feature mapping $\phi : \mathcal{X} \to \mathbb{R}^m$. The learner's theoretical costs at application time are given by

$$\theta_{-1}(w, \dot{p}) = \sum_{\mathcal{Y}} \int_{\mathcal{X}} c_{-1}(x, y)\, \ell_{-1}(f_w(x), y)\, \dot{p}(x, y)\, dx,$$

where weighting function $c_{-1} : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ and loss function $\ell_{-1} : \mathbb{R} \times \mathcal{Y} \to \mathbb{R}$ detail the weighted loss $c_{-1}(x, y)\, \ell_{-1}(f_w(x), y)$ that the learner incurs when the predictive model classifies instance $x$ as $h(x) = \operatorname{sign} f_w(x)$ while the true label is $y$. The positive class- and instance-specific weighting factors $c_{-1}(x, y)$ with $E[c_{-1}(X, Y)] = 1$ specify the importance of minimizing the loss $\ell_{-1}(f_w(x), y)$ for the corresponding object-class pair $(x, y)$. For instance, in spam filtering, the correct classification of non-spam messages can be business-critical for service providers while failing to detect spam messages runs up processing and storage costs, depending on the size of the message.

The data generator $v = +1$ can modify the data generation process for future instances. In practice, spam senders update their campaign templates which are disseminated to the nodes of botnets. Formally, the data generator transforms the training distribution with density $p$ to the test distribution with density $\dot{p}$. The data generator incurs transformation costs by modifying the data generation process, which are quantified by $\Omega_{+1}(p, \dot{p})$. This term acts as a regularizer on the transformation and may implicitly constrain the shift that can be imposed on the distribution, depending on the nature of the application that is to be modeled.
For instance, the sender may not be allowed to alter the training distribution for non-spam messages, or to modify the nature of the messages by changing the label from spam to non-spam or vice versa. Additionally, changing the training distribution for spam messages may run up costs depending on the extent of distortion inflicted on the informational payload.
The theoretical costs of the data generator at application time are the sum of the expected prediction costs and the transformation costs,

$$\theta_{+1}(w, \dot{p}) = \sum_{\mathcal{Y}} \int_{\mathcal{X}} c_{+1}(x, y)\, \ell_{+1}(f_w(x), y)\, \dot{p}(x, y)\, dx + \Omega_{+1}(p, \dot{p}).$$

In analogy to the learner's costs, $c_{+1}(x, y)\, \ell_{+1}(f_w(x), y)$ quantifies the loss that the data generator incurs when instance $x$ is labeled as $h(x) = \operatorname{sign} f_w(x)$ while the true label is $y$. The weighting factors $c_{+1}(x, y)$ with $E[c_{+1}(X, Y)] = 1$ express the significance of $(x, y)$ from the perspective of the data generator. In our example scenario, this allows us to reflect that costs of correctly or incorrectly classified instances may vary greatly across different physical senders that are aggregated into the amalgamated player.

Since the theoretical costs of both players depend on the test distribution, they cannot, for all practical purposes, be calculated. Hence, we focus on a regularized, empirical counterpart of the theoretical costs based on the training sample $D$. The empirical counterpart $\hat{\Omega}_{+1}(D, \dot{D})$ of the data generator's regularizer $\Omega_{+1}(p, \dot{p})$ penalizes the divergence between training sample $D = \{(x_i, y_i)\}_{i=1}^n$ and a perturbed training sample $\dot{D} = \{(\dot{x}_i, y_i)\}_{i=1}^n$ that would be the outcome of applying the transformation that translates $p$ into $\dot{p}$ to sample $D$. The learner's cost function, instead of integrating over $\dot{p}$, sums over the elements of the perturbed training sample $\dot{D}$. The players' empirical cost functions can still only be evaluated after the learner has committed to parameters $w$ and the data generator to a transformation from training to test density function, but this transformation need only be represented in terms of the effects that it will have on the training sample $D$. The transformed training sample $\dot{D}$ must not be mistaken for test data; test data will be generated under $\dot{p}$ at application time after the players have committed to their actions.
The empirical costs incurred by the predictive model $h$ with parameters $w$ and the shift from $p$ to $\dot{p}$ amount to

$$\hat{\theta}_{-1}(w, \dot{D}) = \sum_{i=1}^n c_{-1,i}\, \ell_{-1}(f_w(\dot{x}_i), y_i) + \rho_{-1} \hat{\Omega}_{-1}(w), \qquad (1)$$

$$\hat{\theta}_{+1}(w, \dot{D}) = \sum_{i=1}^n c_{+1,i}\, \ell_{+1}(f_w(\dot{x}_i), y_i) + \rho_{+1} \hat{\Omega}_{+1}(D, \dot{D}), \qquad (2)$$

where we have replaced the weighting terms $c_v(\dot{x}_i, y_i)$ by constant cost factors $c_{v,i} > 0$ with $\sum_i c_{v,i} = 1$. The learner's regularizer $\hat{\Omega}_{-1}(w)$ in (1) accounts for the fact that $\dot{D}$ does not constitute the test data itself, but is merely a training sample transformed to reflect the test distribution and then used to learn the model parameters $w$. The trade-off between the empirical loss and the regularizer is controlled by each player's regularization parameter $\rho_v > 0$ for $v \in \{-1, +1\}$.

In our analysis, we estimate the transformation costs by the average squared $\ell_2$-distance between $x_i$ and $\dot{x}_i$ in feature space,

$$\hat{\Omega}_{+1}(D, \dot{D}) = \frac{1}{n} \sum_{i=1}^n \frac{1}{2} \|\phi(\dot{x}_i) - \phi(x_i)\|^2. \qquad (3)$$

The learner's regularizer $\hat{\Omega}_{-1}$ penalizes the complexity of the predictive model $h(x) = \operatorname{sign} f_w(x)$. For our analysis, we consider Tikhonov regularization which, for linear decision functions $f_w$, reduces to the squared $\ell_2$-norm of $w$,

$$\hat{\Omega}_{-1}(w) = \frac{1}{2} \|w\|^2. \qquad (4)$$

Note that either player's empirical costs $\hat{\theta}_v(w, \dot{D})$ depend on both players' actions. The concept of an optimal choice of model parameters $w$ regardless of the adversary's choice of a data transformation is therefore not well-defined. In the following section, we will refer to the Stackelberg model, which identifies the concept of an optimal move of the leader that minimizes $\hat{\theta}_{-1}$ over $w$ under the assumption that the follower will react by minimizing $\hat{\theta}_{+1}$ over $\dot{D}$ given the parameters $w$ chosen by the leader.

3. STACKELBERG PREDICTION GAME

We model the prediction game as a Stackelberg competition; we refer to the resulting model as the Stackelberg prediction game (SPG). A Stackelberg game is one of the simplest dynamic games: In the first stage, the leader (in our case, the learner) decides on a predictive model $h(x) = \operatorname{sign} f_w(x)$ with parameters $w$.
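The empirical costs (1) and (2) with regularizers (3) and (4) can be evaluated directly once both players' moves are fixed. The following is a minimal sketch, not the authors' implementation. It assumes an identity feature map (instances are given as feature vectors), the logistic loss $\log(1 + e^{-yz})$ for the learner and $\log(1 + e^{z})$ for the data generator, and the function names are hypothetical.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def learner_cost(w, X_dot, y, c, rho):
    """Equation (1): weighted learner loss on the transformed sample
    plus rho times the Tikhonov regularizer (4). Assumed loss: logistic."""
    loss = sum(ci * softplus(-yi * dot(w, xi)) for ci, xi, yi in zip(c, X_dot, y))
    return loss + rho * 0.5 * dot(w, w)

def generator_cost(w, X, X_dot, y, c, rho):
    """Equation (2): weighted data-generator loss on the transformed sample
    plus rho times the average squared feature-space shift (3).
    Assumed loss: log(1 + exp(z)), which does not depend on the label."""
    n = len(X)
    loss = sum(ci * softplus(dot(w, xi)) for ci, xi in zip(c, X_dot))
    shift = sum(0.5 * sum((a - b) ** 2 for a, b in zip(xd, x))
                for xd, x in zip(X_dot, X)) / n
    return loss + rho * shift

# Toy sample: two instances, equal cost factors that sum to one.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [+1, -1]
c = [0.5, 0.5]
X_dot = [[0.5, 0.0], [0.0, 1.0]]   # the first instance was moved by the data generator
w = [1.0, -1.0]
print(learner_cost(w, X_dot, y, c, rho=0.1))
print(generator_cost(w, X, X_dot, y, c, rho=1.0))
```

Both functions take the transformed sample as an argument, which mirrors the observation that neither cost can be evaluated before both players have committed to their actions.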
In the second stage, the data generator, who plays the part of the follower, observes the leader's decision and chooses a transformation that changes the distribution of past instances into the distribution of future instances. In this scenario, the learner has to commit to a set of parameters unilaterally whereas the data generator can take the model parameters $w$ into account when preparing the data transformation.

The optimality of a Stackelberg equilibrium, which we will now introduce, rests on the assumption that the follower (the data generator) will act rationally in the sense of choosing a transformation that minimizes the resulting costs $\hat{\theta}_{+1}$ given the disclosed $w$. To reach minimal costs given $w$, the data generator has to identify a sample $\dot{D}$ that constitutes a global minimum of the cost function $\hat{\theta}_{+1}(w, \dot{D})$. There may be several global minima with identical values of the cost function; in general, the data generator has to identify any element $\dot{D}$ from the set of optimal responses to $w$,

$$\dot{\mathcal{D}}(w) = \Big\{ \{(\dot{x}_i, y_i)\}_{i=1}^n : \{\dot{x}_i\}_{i=1}^n \in \operatorname*{argmin}_{\dot{x}_1, \ldots, \dot{x}_n \in \mathcal{X}} \hat{\theta}_{+1}\big(w, \{(\dot{x}_i, y_i)\}_{i=1}^n\big) \Big\}.$$

Identifying an element $\dot{D} \in \dot{\mathcal{D}}(w)$ amounts to solving a regular optimization problem because $w$ can be observed before $\dot{D}$ has to be chosen.

A Stackelberg equilibrium is now identified by backward induction. Assuming that the data generator will decide for any $\dot{D} \in \dot{\mathcal{D}}(w)$, the learner has to choose model parameters $w^*$ that minimize the learner's cost function $\hat{\theta}_{-1}$ for any of the possible reactions $\dot{D} \in \dot{\mathcal{D}}(w)$ that are optimal for the data generator:

$$w^* \in \operatorname*{argmin}_{w \in \mathbb{R}^m} \max_{\dot{D} \in \dot{\mathcal{D}}(w)} \hat{\theta}_{-1}(w, \dot{D}). \qquad (5)$$

An action $w^*$ that minimizes the learner's costs and a corresponding optimal action $\dot{D}^* \in \dot{\mathcal{D}}(w^*)$ of the data generator are called a Stackelberg equilibrium. The Stackelberg equilibrium is a special case of a subgame perfect equilibrium, which is an extension of the Nash equilibrium for games that are played non-simultaneously.
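Backward induction as in (5) can be illustrated by brute force on a toy problem. The sketch below is an illustration under stated assumptions rather than a practical solver: the feature space is one-dimensional, both players use logistic-type losses, the transformation cost is the average squared distance with factor $\rho_{+1}/(2n)$ per instance, and the leader simply searches a grid of candidate weights. All names are hypothetical. Because the follower's objective is strictly convex here, the best response is unique and the inner maximum in (5) is trivial.

```python
import math

def sigmoid(z):
    """Stable logistic function; the derivative of log(1 + exp(z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def best_response(w, x, c_plus, rho_plus, n):
    """Follower's move for one instance: minimize
    c_plus * softplus(w * x_dot) + rho_plus / (2 n) * (x_dot - x)^2 over x_dot
    by bisection on the increasing derivative of this convex objective."""
    def grad(x_dot):
        return c_plus * w * sigmoid(w * x_dot) + rho_plus / n * (x_dot - x)
    radius = n * c_plus * abs(w) / rho_plus   # the minimizer cannot move further
    lo, hi = x - radius, x + radius
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def learner_cost(w, xs, ys, c_minus, rho_minus):
    loss = sum(c_minus * softplus(-y * w * x) for x, y in zip(xs, ys))
    return loss + 0.5 * rho_minus * w * w

def stackelberg_by_grid(xs, ys, rho_minus, rho_plus, grid):
    """Leader's move: evaluate each candidate w against the follower's best response."""
    n = len(xs)
    c = 1.0 / n
    best = None
    for w in grid:
        moved = [best_response(w, x, c, rho_plus, n) for x in xs]
        cost = learner_cost(w, moved, ys, c, rho_minus)
        if best is None or cost < best[1]:
            best = (w, cost)
    return best

xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1, -1, 1, 1]
grid = [i / 100.0 for i in range(0, 301)]
w_star, cost_star = stackelberg_by_grid(xs, ys, rho_minus=0.1, rho_plus=1.0, grid=grid)
print(w_star, cost_star)
```

The grid search scales exponentially with the dimension of $w$ and is meant only to make the two-stage structure concrete.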
3.1 Finding a Stackelberg Equilibrium

Equation 5 establishes a hierarchical mathematical program (specifically, a bilevel optimization problem) with upper-level objective $\hat{\theta}_{-1}$ and lower-level objective $\hat{\theta}_{+1}$:

$$\min_{w \in \mathbb{R}^m} \max_{\forall i : \dot{x}_i \in \mathcal{X}} \hat{\theta}_{-1}\big(w, \{(\dot{x}_i, y_i)\}_{i=1}^n\big) \qquad (6)$$

$$\text{s.t.} \quad \{\dot{x}_i\}_{i=1}^n \in \operatorname*{argmin}_{\dot{x}_1, \ldots, \dot{x}_n \in \mathcal{X}} \hat{\theta}_{+1}\big(w, \{(\dot{x}_i, y_i)\}_{i=1}^n\big) \qquad (7)$$

Bilevel programs are intrinsically hard to solve. Even the simplest instance in which all constraints and objectives are linear is known to be NP-hard [8]. The main difficulties arise from the constraints $\dot{x}_i \in \mathcal{X}$ of the lower-level optimization problem, which generally render constraint (7) of the upper-level optimization problem non-differentiable in $w$, even if $\hat{\theta}_{+1}$ is continuously differentiable in $w$ and $\dot{x}_i$ for $i = 1, \ldots, n$. Numerous approaches that address bilevel programs have been studied, for instance, based on gradient descent, penalty function, and trust-region methods; see, for instance, [2] for a detailed survey. Commonly, these methods reformulate the optimization problem into a mathematical program with equilibrium constraints. In this, the lower-level optimization problem is replaced by its Karush-Kuhn-Tucker (KKT) conditions. The resulting optimization problem with equilibrium constraints can be solved approximately by relaxing the complementary conditions [5]. However, these methods do not necessarily converge to a local optimum and are applicable to small problems only.

That is why we focus on a special case of the above bilevel program. The following theorem reformulates the lower-level optimization problem into an unconstrained problem such that constraint (7) becomes continuously differentiable in $w$. This requires the feature space induced by mapping $\phi$, but not necessarily the input space $\mathcal{X}$, to be unrestricted and the data generator's loss function $\ell_{+1}(z, y)$ to be convex and continuously differentiable in $z \in \mathbb{R}$.

Theorem 1. Let the leader's cost function $\hat{\theta}_{-1}$ and the follower's cost function $\hat{\theta}_{+1}$ be defined as in (1) and (2) with regularizers $\hat{\Omega}_{-1}$ and $\hat{\Omega}_{+1}$ defined as in (4) and (3), respectively.
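Let feature mapping $\phi : \mathcal{X} \to \mathbb{R}^m$ be surjective, and let the data generator's loss function $\ell_{+1}(z, y)$ be convex and continuously differentiable with respect to $z \in \mathbb{R}$ for any fixed $y \in \mathcal{Y}$. Now let weight vector $w^* \in \mathbb{R}^m$ and factors $\tau_1^*, \ldots, \tau_n^* \in \mathbb{R}$ be a solution of the optimization problem

$$\min_{w,\, \forall i : \tau_i} \; \sum_{i=1}^n c_{-1,i}\, \ell_{-1}\big(f_w(x_i) + \tau_i \|w\|^2, y_i\big) + \frac{\rho_{-1}}{2} \|w\|^2 \qquad (8)$$

$$\text{s.t.} \quad \forall i : \; 0 = \tau_i + \frac{n\, c_{+1,i}}{\rho_{+1}}\, \ell'_{+1}\big(f_w(x_i) + \tau_i \|w\|^2, y_i\big).$$

Then the Stackelberg prediction game in Equation 6 attains an equilibrium at $(w^*, \dot{D}^*)$ with $\dot{D}^* = \{(\dot{x}_i^*, y_i)\}_{i=1}^n$ and $\dot{x}_i^* \in \{\dot{x} \in \mathcal{X} : \phi(\dot{x}) = \phi(x_i) + \tau_i^* w^*\}$.

Proof. Constraint (7) says that $\{\dot{x}_i\}_{i=1}^n$ has to be a solution of the restricted optimization problem

$$\min_{\forall i : \dot{x}_i \in \mathcal{X}} \; \sum_{i=1}^n c_{+1,i}\, \ell_{+1}\big(w^T \phi(\dot{x}_i), y_i\big) + \frac{\rho_{+1}}{2n} \|\phi(\dot{x}_i) - \phi(x_i)\|^2.$$

As the objective as well as the constraints are entirely defined in terms of $\dot{\mathbf{x}}_i = \phi(\dot{x}_i)$, this condition is equivalent to enforcing $\{\dot{\mathbf{x}}_i\}_{i=1}^n$ to be a solution of the unrestricted optimization problem

$$\min_{\forall i : \dot{\mathbf{x}}_i \in \mathbb{R}^m} \; \sum_{i=1}^n c_{+1,i}\, \ell_{+1}\big(w^T \dot{\mathbf{x}}_i, y_i\big) + \frac{\rho_{+1}}{2n} \|\dot{\mathbf{x}}_i - \phi(x_i)\|^2. \qquad (9)$$

This solution is uniquely defined for any fixed $w$, as loss function $\ell_{+1}(z, y)$ is required to be convex in $z$, and consequently in $\dot{\mathbf{x}}_i$, and the term $\|\dot{\mathbf{x}}_i - \phi(x_i)\|^2$ is quadratic in $\dot{\mathbf{x}}_i$ and therefore strictly convex for any fixed $\phi(x_i)$. Given $w \in \mathbb{R}^m$ and minimizer $\dot{\mathbf{x}}_i \in \mathbb{R}^m$, the set $\mathcal{X}_i^w = \{\dot{x} \in \mathcal{X} : \phi(\dot{x}) = \dot{\mathbf{x}}_i\}$ contains all instances $\dot{x}$ which correspond to the optimally transformed instance in feature space $\dot{\mathbf{x}}_i$. Since $\phi$ is surjective, $\mathcal{X}_i^w$ is guaranteed to be non-empty, and consequently, for any solution $\{\dot{\mathbf{x}}_i\}_{i=1}^n$, there exists at least one corresponding set of instances $\{\dot{x}_i\}_{i=1}^n$. As $\phi$ is not required to be a bijective mapping, there may exist multiple instances $\dot{x} \in \mathcal{X}_i^w$ which are optimal in the sense of minimizing the data generator's loss. However, since all of these instances share the same feature representation $\dot{\mathbf{x}}_i$, the inner maximization of the upper-level optimization problem in (6) vanishes,

$$\min_{w \in \mathbb{R}^m} \max_{\forall i : \dot{x}_i \in \mathcal{X}_i^w} \hat{\theta}_{-1}\big(w, \{(\dot{x}_i, y_i)\}_{i=1}^n\big) = \min_{w \in \mathbb{R}^m} \sum_{i=1}^n c_{-1,i}\, \ell_{-1}\big(w^T \dot{\mathbf{x}}_i, y_i\big) + \frac{\rho_{-1}}{2} \|w\|^2, \qquad (10)$$

where $\{\dot{\mathbf{x}}_i\}_{i=1}^n$ is the solution of Optimization Problem 9. Since (9) is convex, this constraint can be replaced by its complementary conditions, which are given by $\nabla_{\dot{\mathbf{x}}_i} \hat{\theta}_{+1}(w, \dot{D}) = 0$ for $i = 1, \ldots, n$, where

$$\nabla_{\dot{\mathbf{x}}_i} \hat{\theta}_{+1}(w, \dot{D}) = c_{+1,i}\, \ell'_{+1}\big(w^T \dot{\mathbf{x}}_i, y_i\big)\, w + \frac{\rho_{+1}}{n} \big(\dot{\mathbf{x}}_i - \phi(x_i)\big). \qquad (11)$$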
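The mapped instance $\dot{\mathbf{x}}_i$ that satisfies the $i$-th complementary condition is given by $\dot{\mathbf{x}}_i = \phi(x_i) + \tau_i w$ with

$$\tau_i = -\frac{n\, c_{+1,i}}{\rho_{+1}}\, \ell'_{+1}\big(w^T \dot{\mathbf{x}}_i, y_i\big) = -\frac{n\, c_{+1,i}}{\rho_{+1}}\, \ell'_{+1}\big(w^T \phi(x_i) + \tau_i w^T w, y_i\big) = -\frac{n\, c_{+1,i}}{\rho_{+1}}\, \ell'_{+1}\big(f_w(x_i) + \tau_i \|w\|^2, y_i\big). \qquad (12)$$

When replacing $\dot{\mathbf{x}}_i$ by $\phi(x_i) + \tau_i w$ in the upper-level Optimization Problem 10 and enforcing Equation 12, Optimization Problem 8 follows. Hence, a solution $w^*$ of (8) with corresponding $\tau_1^*, \ldots, \tau_n^*$ is also a solution of (6) with $\dot{x}_i^* \in \mathcal{X}_i^{w^*} = \{\dot{x} \in \mathcal{X} : \phi(\dot{x}) = \phi(x_i) + \tau_i^* w^*\}$.

The objective as well as the constraints of the optimization problem in Theorem 1 are generally not jointly convex in $w$ and $\tau_1, \ldots, \tau_n$. However, under the assumptions of the following proposition, a locally optimal solution can still be found efficiently by standard SQP solvers.

Proposition 1. Let loss function $\ell_{-1}(z, y)$ be twice continuously differentiable and loss function $\ell_{+1}(z, y)$ be convex and thrice continuously differentiable with respect to $z \in \mathbb{R}$ for any fixed $y \in \mathcal{Y}$. Then, a point satisfying the KKT conditions of the optimization problem in Equation 8 can be obtained by sequential quadratic programming (SQP) methods.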
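The reformulation in Theorem 1 replaces each follower move in $\mathbb{R}^m$ by a single scalar $\tau_i$. A minimal sketch of this step follows, assuming the data generator uses the logistic loss $\log(1 + e^z)$ (whose derivative is the logistic function) and assuming the transformation cost (3) is an average, so that the constraint factor is $n\, c_{+1,i} / \rho_{+1}$. It solves the scalar constraint of (8) by bisection for a fixed $w$ and then checks that the resulting point $\phi(x_i) + \tau_i w$ makes the gradient (11) vanish. This is not an SQP solver, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Stable logistic function; the derivative of log(1 + exp(z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def solve_tau(f, w_norm_sq, a):
    """Root of h(tau) = tau + a * sigmoid(f + tau * w_norm_sq).
    h is strictly increasing with h(-a) < 0 < h(0), so bisect on [-a, 0]."""
    lo, hi = -a, 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid + a * sigmoid(f + mid * w_norm_sq) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def transformed_features(w, phi_x, c_plus, rho_plus, n):
    """Return tau_i and the follower's optimal point phi(x_i) + tau_i * w."""
    a = n * c_plus / rho_plus
    tau = solve_tau(dot(w, phi_x), dot(w, w), a)
    return tau, [p + tau * wj for p, wj in zip(phi_x, w)]

def follower_gradient(w, phi_x, x_dot, c_plus, rho_plus, n):
    """Gradient (11) of the follower's cost with respect to one mapped instance."""
    g = c_plus * sigmoid(dot(w, x_dot))
    return [g * wj + rho_plus / n * (xd - p) for wj, xd, p in zip(w, x_dot, phi_x)]

w = [1.0, -2.0]
phi_x = [0.5, 1.0]
tau, x_dot = transformed_features(w, phi_x, c_plus=0.25, rho_plus=2.0, n=4)
print(tau, x_dot, follower_gradient(w, phi_x, x_dot, 0.25, 2.0, 4))
```

Since $\tau_i \le 0$ for this loss, every instance is shifted against the direction of $w$, that is, toward lower decision values, which matches the data generator's interest in avoiding high scores.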
The objective as well as the constraints in (8) are twice continuously differentiable with respect to $w$ and $\tau_i$ for $i = 1, \ldots, n$. Hence, the corresponding complementary conditions are continuously differentiable, which is a sufficient condition to apply SQP methods; this proves Proposition 1.

3.2 Applying Kernels

Theorem 1 states that a Stackelberg equilibrium with parameter vector $w^* \in \mathbb{R}^m$ can be obtained by solving the optimization problem in (8), which requires an explicit feature representation $\phi(x_i)$ of the training instances. However, in some applications, such a feature mapping is unwieldy or does not exist. Instead, one is often equipped with a kernel function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ which measures the similarity between two instances. Generally, kernel function $k$ is assumed to be a positive-semidefinite kernel such that it can be stated in terms of a scalar product in the corresponding reproducing kernel Hilbert space; i.e., $\exists \phi$ with $k(x, x') = \phi(x)^T \phi(x')$. Making use of the representer theorem [3], we can now express weight vector $w^*$ as a linear combination of the mapped training instances; that is,

$$w^* = \sum_{i=1}^n \alpha_i^* \phi(x_i) \qquad (13)$$

where feature mapping $\phi$ is implicitly defined by kernel $k$. When substituting $w$ in (8) by (13), the squared norm of $w$ and decision function $f_w$ can be completely expressed in terms of the kernel,

$$\|w\|^2 = \sum_{j,k=1}^n \alpha_j \alpha_k\, k(x_j, x_k), \qquad (14)$$

$$f_w(x_i) = \sum_{j=1}^n \alpha_j\, k(x_i, x_j). \qquad (15)$$

Hence, the optimization problem in (8) can be reformulated into an optimization problem over $\tau_1, \ldots, \tau_n \in \mathbb{R}$ and the dual weights $\alpha_1, \ldots, \alpha_n \in \mathbb{R}$ without the need of an explicit feature mapping $\phi$. However, inferring an optimal transformed sample $\dot{D}^*$ still requires the knowledge of an explicit mapping $\phi$ and its inverse $\phi^{-1}$. Of course, this is not a restriction as we are interested in the predictive model $f_w$ rather than the transformed sample $\dot{D}^*$.

Note that for computational reasons, it may be advisable to first construct an explicit feature mapping from the kernel matrix and then to train the Stackelberg model in the primal. For instance, we can employ the kernel PCA map

$$\phi : x \mapsto K^{-1/2} \big[k(x, x_1), \ldots, k(x, x_n)\big]^T, \qquad (16)$$

where $K$ denotes the kernel matrix with $K_{ij} = k(x_i, x_j)$.
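The kernel PCA map (16) can be computed from an eigenvalue decomposition of the kernel matrix. The sketch below is one possible way to do this with NumPy; it is an illustration, not the code used for the experiments. As an assumption, eigenvalues below a relative tolerance `tol` (a hypothetical parameter) are treated as zero, which yields a pseudo-inverse square root when $K$ is singular. The function names are hypothetical.

```python
import numpy as np

def inv_sqrt_psd(K, tol=1e-10):
    """(Pseudo-)inverse square root of a symmetric positive-semidefinite matrix."""
    K = np.asarray(K, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(K)          # eigendecomposition for symmetric K
    scale = np.zeros_like(eigvals)
    keep = eigvals > tol * max(eigvals.max(), 1.0)
    scale[keep] = 1.0 / np.sqrt(eigvals[keep])    # drop (near-)zero eigenvalues
    return (eigvecs * scale) @ eigvecs.T          # V diag(scale) V^T

def kernel_pca_features(K_inv_sqrt, k_vec):
    """Equation (16): map an instance given its kernel values k(x, x_1..x_n)."""
    return K_inv_sqrt @ np.asarray(k_vec, dtype=float)

# Linear kernel on three instances in two dimensions: K is singular (rank 2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = X @ X.T
K_inv_sqrt = inv_sqrt_psd(K)
Phi = np.stack([kernel_pca_features(K_inv_sqrt, K[:, i]) for i in range(len(X))])
print(np.allclose(Phi @ Phi.T, K))
```

The final check confirms that scalar products of the mapped training instances reproduce the kernel matrix, which is the property a primal solver relies on.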
Within our experiments presented in Section 5, where we use linear kernels, we study all three variants: computing the model in input space, computing the kernelized version, and computing the PCA map-induced variant. Even though all variants yield the same solution, using an explicit PCA mapping is generally fastest for reasonable $n$. Matrix $K^{-1/2}$ can be computed directly from the eigenvalue decomposition of the kernel matrix $K$; in case it is singular we use the pseudo-inverse of $K^{1/2}$.

4. INSTANCES OF THE SPG

By the choice of $\ell_v$, distinct instances of the Stackelberg prediction game (SPG) can be identified which, to some extent, generalize existing prediction models such as the SVM for invariances [4] and the SVM with uneven margins [].

4.1 SPG with Worst-Case Loss

The SPG with worst-case loss is an instance of the Stackelberg prediction game that is characterized by an antagonicity of the weighted empirical costs of learner and data generator; that is, the data generator employs the loss function $\ell_{+1}^{wc}(z, y) = -\ell_{-1}(z, y)$ and cost factors $c_{+1,i} = c_{-1,i}$. Loss functions $\ell_{+1}^{wc}$ and $\ell_{-1}$ cannot both be convex at the same time (except for an inappropriate linear function) and so the requirements of both Theorem 1 and Proposition 1 are violated. As we cannot apply Theorem 1, we consider the original optimization problem (Equations 6-7). We substitute $\ell_{+1}^{wc}$ and $c_{+1,i}$ in the objective (Equation 2) of the lower-level optimization problem

$$\min_{\forall i : \dot{x}_i \in \mathcal{X}} \; \sum_{i=1}^n c_{+1,i}\, \ell_{+1}^{wc}\big(f_w(\dot{x}_i), y_i\big) + \frac{\rho_{+1}}{2n} \|\phi(\dot{x}_i) - \phi(x_i)\|^2$$

which decouples into $n$ maximization problems

$$\max_{\dot{x}_i \in \mathcal{X}} \; c_{-1,i}\, \ell_{-1}\big(f_w(\dot{x}_i), y_i\big) - \frac{\rho_{+1}}{2n} \|\phi(\dot{x}_i) - \phi(x_i)\|^2. \qquad (17)$$

An equivalent formulation of (17) is given by

$$\max_{\dot{x}_i \in \mathcal{X}_i} \; \ell_{-1}\big(f_w(\dot{x}_i), y_i\big) \qquad (18)$$

where $\mathcal{X}_i = \{\dot{x} \in \mathcal{X} : c_{-1,i} = \frac{\rho_{+1}}{2n} \|\phi(\dot{x}) - \phi(x_i)\|^2\}$ are feasible sets of transformed instances. The difference between both formulations is that in (18), regularization parameter $\rho_{+1}$ explicitly restricts the amount of transformation of each instance $x_i$.
As now the inner maximization of the upper-level optimization problem in (6) can be stated in terms of the solution of the lower-level optimization problem, $\ell_{-1}(f_w(\dot{x}_i^*), y_i)$, the entire bilevel optimization problem reduces to the following constrained minimization problem.

$$\min_{w,\, \forall i : \xi_i} \; \sum_{i=1}^n c_{-1,i}\, \xi_i + \frac{\rho_{-1}}{2} \|w\|^2 \qquad (19)$$

$$\text{s.t.} \quad \forall i : \; \xi_i \ge 0, \quad \xi_i \ge \max_{\dot{x}_i \in \mathcal{X}_i} \ell_{-1}\big(f_w(\dot{x}_i), y_i\big) \qquad (20)$$

If the lower-level maximization problem (20) has a unique solution for any fixed $w \in \mathbb{R}^m$, then the above optimization problem can be solved by gradient descent where in each iteration the maximization problem in (20) has to be solved for the current iterate $w_k$ (see, e.g., [4]). In case the learner chooses the hinge loss,

$$\ell_{-1}^{h}(z, y) = \max(0, 1 - yz), \qquad (21)$$

the SPG with worst-case loss reduces to an instance of the SVM for invariances [4].

4.2 SPG with Linear Loss

A second instance of the Stackelberg prediction game is the SPG with linear loss in which the data generator employs a linear loss function,

$$\ell_{+1}^{lin}(z, y) = z,$$

which penalizes high decision values $z$ independently of the class. This choice is appropriate, for instance, in spam filtering where the data generator is purely interested in the delivery of an email $x$, which becomes unlikely for large values of $z$, independently of the corresponding true class $y$. For the linear loss, which is continuously differentiable and convex, the constraints in (8) reduce to

$$\tau_i = -\frac{n\, c_{+1,i}}{\rho_{+1}} \qquad (22)$$

for $i = 1, \ldots, n$. When choosing the hinge loss (21) for the learner and replacing $\tau_i$ in (8) by (22) we arrive at the following minimization problem.

$$\min_{w,\, \forall i : \xi_i} \; \sum_{i=1}^n c_{-1,i}\, \xi_i + \frac{\rho_{-1}}{2} \|w\|^2$$

$$\text{s.t.} \quad \forall i : \; \xi_i \ge 0, \quad \xi_i \ge 1 - y_i \Big( w^T \phi(x_i) - \frac{n\, c_{+1,i}}{\rho_{+1}} \|w\|^2 \Big)$$

The latter constraints can be reformulated to

$$y_i\, w^T \phi(x_i) \ge 1 + y_i \kappa_i - \xi_i$$

which amounts to the constraints of the SVM with uneven margins []. The only syntactic distinction is that $\kappa_i = \frac{n\, c_{+1,i}}{\rho_{+1}} \|w\|^2$ is indirectly defined by $\rho_{+1}$ and $c_{+1,i}$; however, for each choice of $\kappa_i \ge 0$ in the SVM with uneven margins, there exist appropriate parameters $\rho_{+1}$ and $c_{+1,i}$ of an equivalent SPG with linear loss and vice versa.

Consider the special case of equal factors $c_{+1,i} = c_{+1,j}$, and consequently $\kappa = \kappa_i = \kappa_j$, for all $i, j = 1, \ldots, n$. Then the margin of negative instances becomes $1 - \kappa$ whereas the margin of positive instances is $1 + \kappa$. In our example of spam filtering, this goes with the intuition that the margin of spam instances that vary greatly has to be larger than the margin of non-spam instances that remain almost unmodified. This effect is stronger when the data generator's regularization parameter $\rho_{+1}$ is small. By contrast, if $\rho_{+1}$ goes to infinity, and consequently $\kappa$ attains zero, then the SPG with linear loss reduces to the regular SVM.

4.3 SPG with Logistic Loss

Finally, this section introduces the SPG with logistic loss. This instantiation meets the preconditions of Theorem 1 and Proposition 1, and the resulting optimization criterion can be solved with standard tools. The learner may use any loss function that is convex and twice continuously differentiable (Equation 23 details the loss function used in our experiments) while the data generator uses the logistic loss

$$\ell_{+1}^{log}(z, y) = \log(1 + e^{z})$$

which again penalizes large decision values $z$.
The rationale behind this loss function is that the data generator experiences costs when the learner blocks an event, i.e., produces a high decision function value for an instance. For instance, a legitimate email sender experiences costs when a legitimate email is erroneously blocked, just as an abusive sender, also amalgamated into the data generator, experiences costs when spam messages are blocked. Cost function $\ell_{+1}^{log}$ approaches zero for small values of the decision function. Now, the constraints in (8) resolve to $g_i(w, \tau_i) = 0$ for $i = 1, \ldots, n$ with

$$g_i(w, \tau_i) = \tau_i \Big(1 + e^{-f_w(x_i) - \tau_i \|w\|^2}\Big) + \frac{n\, c_{+1,i}}{\rho_{+1}}.$$

Functions $g_i(w, \tau_i)$ are not jointly convex in $w$ and $\tau_i$. However, as they are smooth (i.e., infinitely differentiable) in both arguments, their roots can be obtained efficiently and, consequently, the resulting optimization problem

$$\min_{w,\, \forall i : \tau_i} \; \sum_{i=1}^n c_{-1,i}\, \ell_{-1}\big(f_w(x_i) + \tau_i \|w\|^2, y_i\big) + \frac{\rho_{-1}}{2} \|w\|^2 \quad \text{s.t.} \quad \forall i : \; 0 = g_i(w, \tau_i)$$

can be solved by standard SQP solvers.

5. EXPERIMENTAL EVALUATION

The goal of this section is to explore the relative strengths and weaknesses of the discussed instances of Stackelberg prediction games and existing baseline methods in the context of spam filtering. We compare a regular support vector machine (SVM), logistic regression (LogReg), the SVM for invariances with feature scaling (Invar-SVM, [4]), Nash logistic regression (Nash, []), and the Stackelberg instances SPG with worst-case loss (SPG wc, cf. Section 4.1), SPG with linear loss (SPG lin, cf. Section 4.2), and the SPG with logistic loss (SPG log, cf. Section 4.3). For all Stackelberg instances we choose the logistic loss function

$$\ell_{-1}^{log}(z, y) = \log(1 + e^{-yz}) \qquad (23)$$

for the learner, which is convex and smooth, and consequently satisfies Proposition 1. In the absence of prior knowledge on the instance-specific costs, we set $c_{v,i} = \frac{1}{n}$ for all $v \in \{-1, +1\}$, $i = 1, \ldots, n$ and train all methods in the PCA map induced feature space. To solve the nonlinear program of the SPG with logistic loss we use the Ipopt solver [6].

We use four corpora detailed in Table 1: The first data set contains emails of an email service provider (ESP) collected between 2007 and 2010.
The second (Mailinglist) is a collection of emails from publicly available mailing lists augmented by spam emails from Bruce Guenter's spam trap of the same time period. The third corpus (Private) contains newsletters and spam and non-spam emails of the authors. The last corpus is the NIST TREC 2007 spam corpus. All emails are tokenized, converted into binary bag-of-word vectors, and sorted chronologically.

Table 1: Data sets used in the experiments.

data set     instances  features  delivery period
ESP          69,62      54,73     0/06/ /04/200
Mailinglist  28,7       266,378   0/04/999-3/05/2006
Private      08,78      582,00    0/08/2005-3/03/200
TREC         ,496       24,839    04/08/ /06/2007

Our evaluation protocol is as follows. We use the 4,000 oldest emails as training portion and set the remaining emails aside as test instances. We use the F-measure (that is, the harmonic mean of precision and recall) as evaluation measure and train all methods 20 times on a stratified subset of 200 spam and 200 non-spam messages sampled from the training portion. In order to tune the regularization parameters we perform a 5-fold cross validation on the training sample within each repetition of an experiment and for each method separately.

In the first experiment, we evaluate all methods into the future by processing the test set in chronological order. Each test sample is split into 20 disjoint subsets. We average the F-measure on each of those subsets over the 20 models trained on different samples drawn from the training portion for each method and perform a paired t-test.

[Figure 1: F-measure of predictive models over time on the ESP, Mailinglist, Private, and TREC 2007 corpora for SVM, LogReg, Invar-SVM, Nash, SPG wc, SPG lin, and SPG log. Error bars indicate standard errors.]

Figure 1 shows that, for all data sets, the Stackelberg prediction games with linear loss and with logistic loss outperform the regular SVM and logistic regression that do not explicitly factor the adversary into the optimization criterion. On the ESP corpus, the SPG with linear loss is slightly better than the SPG with logistic loss whereas for the Mailinglist corpus the SPG with logistic loss outperforms the SPG with linear loss. On the TREC 2007 data set, most of the methods behave comparably with a slight advantage for the Nash logistic regression and the SPG instances with logistic loss and linear loss. The period over which the TREC 2007 data have been collected is very short; therefore we believe that the training and test instances are governed by nearly identical distributions. Consequently the game-theoretic models do not gain a significant advantage over logistic regression that assumes iid samples. For the other three data sets, the game-theoretic models outperform the iid baselines.

Table 2 shows aggregated results over all four data sets. For each point in each of the diagrams of Figure 1, we conduct a pairwise comparison of all methods based on a paired t-test at a given confidence level $\alpha$. When a difference is significant, we count this as a win for the method that achieves a higher F-measure. Each line of Table 2 details the wins and, set in italics, the losses of one method against all other methods. The Stackelberg prediction game with logistic loss has more wins than it has losses against each of the other methods.
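The win counting behind Table 2 can be sketched as follows. This is an illustrative reconstruction of the protocol, not the authors' evaluation code. It computes the F-measure as the harmonic mean of precision and recall, forms a paired t statistic over the models trained on different samples, and counts a win when the statistic exceeds a critical value. The critical value `t_crit` is a hypothetical parameter that the caller has to choose to match the desired confidence level and the degrees of freedom; all function names are assumptions.

```python
import math

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (spam) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

def paired_t_statistic(a, b):
    """t statistic of the paired differences a_k - b_k (len(a) - 1 degrees of freedom)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    if var == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def count_wins(scores_a, scores_b, t_crit):
    """scores_a[s] and scores_b[s] hold the F-measures of all trained models
    on test subset s. Returns (wins of a, wins of b) over all subsets."""
    wins_a = wins_b = 0
    for a, b in zip(scores_a, scores_b):
        t = paired_t_statistic(a, b)
        if t > t_crit:
            wins_a += 1
        elif t < -t_crit:
            wins_b += 1
    return wins_a, wins_b

# Two test subsets, three trained models each (made-up numbers for illustration).
scores_a = [[0.90, 0.91, 0.92], [0.80, 0.70, 0.75]]
scores_b = [[0.50, 0.52, 0.51], [0.78, 0.73, 0.74]]
print(count_wins(scores_a, scores_b, t_crit=4.303))
```

In the protocol described above, each list would hold one F-measure per trained model and there would be one list per test subset and corpus.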
The Stackelberg prediction game with linear loss has more wins than losses against each of the other methods except for the SPG with logistic loss and the Nash logistic regression. The ranking continues with the Invar-SVM, the SPG with worst-case loss, logistic regression, and the regular SVM, which loses more frequently than it wins against all other methods.

Table 2: Results of paired t-test over all corpora: number of trials in which each method (row) has significantly outperformed each other method (column) vs. number of times it was outperformed.
method      SVM    LogReg  Invar-SVM  Nash   SPG wc  SPG lin  SPG log
SVM         0:0    6:44    2:64       0:72   8:50    6:54     6:69
LogReg      44:6   0:0     3:4        0:72   0:29    6:48     5:57
Invar-SVM   64:2   4:3     0:0        6:40   39:0    20:23    8:30
Nash        72:0   72:0    40:6       0:0    57:2    33:7     4:6
SPG wc      50:8   29:0    0:39       2:57   0:0     7:46     9:48
SPG lin     54:6   48:6    23:20      7:33   46:7    0:0      0:23
SPG log     69:6   57:5    30:8       6:4    48:9    23:0     0:0

To study the predictive performance as well as the running time behavior with respect to the size of the data set, we train the baselines and the three SPG instances for a varying number of training examples. We report on the results for the representative ESP data set in Figure 2. Except for SPG wc, the game models significantly outperform the trivial baseline methods SVM and logistic regression, especially for small corpus sizes. However, this comes at the price of considerably higher computational cost. Among the game models, the Stackelberg instance SPG lin clearly outperforms all reference methods with respect to efficiency. However, the larger the data set, the larger the differences in computational cost, while at the same time the discrepancy in predictive performance diminishes.

Figure 2: Predictive performance (left) and execution time in seconds (right) on the ESP corpus for varying sizes of the training data set (methods: SVM, LogReg, Invar-SVM, Nash, SPG wc, SPG lin, SPG log).

The data generator's regularizer that we use in the experiments does not distinguish between modifications of spam and non-spam messages. In reality, most senders of legitimate messages do not deliberately change their writing behavior so as to bypass spam filters, perhaps with the exception of senders of legitimate newsletters who must be careful not to trigger filtering mechanisms. In a final experiment, we want to study whether the Stackelberg model reflects this aspect of reality. Table 3 shows the average number of modifications (i.e., word additions and deletions) performed by the sender per spam and per non-spam email depending on the sender's regularization parameter for fixed ρ.

Table 3: Average number of word additions and deletions per instance for SPG log (columns: non-spam additions, non-spam deletions, spam additions, spam deletions).

As expected, the number of transformations increases in inverse proportion to the regularization parameter. Even for equal cost factors c_{v,i}, non-spam messages are rarely modified because the interests of sender and recipient are coherent for legitimate messages.

6. CONCLUSIONS
We model adversarial prediction problems as a game in which a learner has to commit to a predictive model using past data, whereas the data generator may choose a transformation function after the predictive model has been disclosed, which then defines the test distribution. This model reflects applications such as the detection of network attacks and spam filtering in which an assailant can probe the filter. The cost functions of learner and data generator are generally conflicting but are not constrained to be perfectly antagonistic.
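The leader-follower structure described above can be illustrated on a small finite game. The sketch below is a toy example under stated assumptions, not the optimization problem derived in the paper: both players choose from a finite set of actions, the cost matrices are invented for illustration, and ties in the follower's response are broken in favor of the first minimizer.

```python
def stackelberg_solution(leader_cost, follower_cost):
    """Leader commits first; the follower observes the choice and best-responds.

    leader_cost[i][j], follower_cost[i][j]: costs when the leader plays i and
    the follower plays j (hypothetical toy matrices).
    Returns (leader action, follower response, leader cost).
    """
    best = None
    for i, row in enumerate(follower_cost):
        # Follower minimizes its own cost given the disclosed leader action i.
        j = min(range(len(row)), key=row.__getitem__)
        cost = leader_cost[i][j]
        if best is None or cost < best[2]:
            best = (i, j, cost)
    return best


def worst_case_solution(leader_cost):
    """Zero-sum view: the leader minimizes its maximal cost over all responses."""
    i = min(range(len(leader_cost)), key=lambda k: max(leader_cost[k]))
    return i, max(leader_cost[i])


# Costs are conflicting but not perfectly antagonistic.
LEADER = [[1, 5], [2, 3]]
FOLLOWER = [[0, 4], [3, 1]]

print(stackelberg_solution(LEADER, FOLLOWER))  # (0, 0, 1)
print(worst_case_solution(LEADER))             # (1, 3)
```

In this toy game, the leader that anticipates a cost-minimizing follower plays action 0 and incurs cost 1, whereas the worst-case leader guards against a response the follower has no incentive to choose and incurs cost 3.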
Playing the Stackelberg equilibrium instead of a worst-case strategy based on a zero-sum model is advisable when the data generator can be assumed to behave rationally in the sense of minimizing a cost function. In contrast to the Nash strategy, the Stackelberg model does not rely on the existence of a unique equilibrium, nor on the assumptions that the adversary has no information about the predictive model and is able to identify and follow the equilibrium strategy. We derived a compact optimization problem that determines the solution of the resulting Stackelberg prediction game. We showed that the Stackelberg model generalizes existing prediction models such as the SVM with uneven margins and the SVM for invariances. We evaluated spam filters resulting from a regular SVM, logistic regression, existing game-theoretic models, and three instances of the Stackelberg game on several spam-filtering data sets. The relative performance of the distinct game-theoretic models varies, but we observe that, when compared to any other model, the Stackelberg model with logistic loss has more wins than losses.

Acknowledgments
This work was supported by the German Science Foundation (DFG) under grant SCHE 540/2- and by STRATO AG.
7. REFERENCES
[1] M. Brückner and T. Scheffer. Nash equilibria of static prediction games. In Advances in Neural Information Processing Systems. MIT Press.
[2] B. Colson, P. Marcotte, and G. Savard. An overview of bilevel optimization. Annals of Operations Research, 153.
[3] O. Dekel and O. Shamir. Learning to classify with missing and corrupted features. In Proceedings of the International Conference on Machine Learning. ACM.
[4] O. Dekel, O. Shamir, and L. Xiao. Learning to classify with missing and corrupted features. Machine Learning, 81(2):149-178, 2010.
[5] L. E. Ghaoui, G. R. G. Lanckriet, and G. Natsoulis. Robust classification with interval data. Technical Report UCB/CSD, EECS Department, University of California, Berkeley.
[6] A. Globerson and S. T. Roweis. Nightmare at test time: robust learning by feature deletion. In Proceedings of the International Conference on Machine Learning. ACM.
[7] A. Globerson, C. H. Teo, A. J. Smola, and S. T. Roweis. Dataset Shift in Machine Learning, chapter An adversarial view of covariate shift and a minimax approach. MIT Press.
[8] R. Jeroslow. The polynomial hierarchy and a simple model for competitive analysis. Mathematical Programming, 32:146-164, 1985.
[9] M. Kantarcioglu, B. Xi, and C. Clifton. Classifier evaluation and attribute selection against active adversaries. Data Mining and Knowledge Discovery, 22(1-2):291-335, 2011.
[10] G. R. G. Lanckriet, L. E. Ghaoui, C. Bhattacharyya, and M. I. Jordan. A robust minimax approach to classification. Journal of Machine Learning Research, 3.
[11] Y. Li and J. Shawe-Taylor. The SVM with uneven margins and Chinese document categorization. In Proceedings of the Pacific Asia Conference on Language, Information and Computation.
[12] W. Liu and S. Chawla. A game theoretical model for adversarial learning. In ICDM Workshops. IEEE Computer Society.
[13] B. Schölkopf, R. Herbrich, and A. J. Smola. A generalized representer theorem. In COLT: Proceedings of the Workshop on Computational Learning Theory. Morgan Kaufmann Publishers, 2001.
[14] C. H. Teo, A. Globerson, S. T. Roweis, and A. J. Smola. Convex learning with invariances. In Advances in Neural Information Processing Systems. MIT Press.
[15] S. Veelken. A New Relaxation Scheme for Mathematical Programs with Equilibrium Constraints: Theory and Numerical Experience. PhD thesis, Technische Universität München.
[16] A. Wächter and L. T. Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming, 106:25-57, 2006.
