Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers


Foundations and Trends® in Machine Learning, Vol. 3, No. 1 (2010)
© 2011 S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein

Stephen Boyd (1), Neal Parikh (2), Eric Chu (3), Borja Peleato (4) and Jonathan Eckstein (5)

(1) Electrical Engineering Department, Stanford University, Stanford, CA 94305, USA
(2) Computer Science Department, Stanford University, Stanford, CA 94305, USA
(3) Electrical Engineering Department, Stanford University, Stanford, CA 94305, USA
(4) Electrical Engineering Department, Stanford University, Stanford, CA 94305, USA
(5) Management Science and Information Systems Department and RUTCOR, Rutgers University, Piscataway, NJ 08854, USA
Contents

1 Introduction
2 Precursors
  2.1 Dual Ascent
  2.2 Dual Decomposition
  2.3 Augmented Lagrangians and the Method of Multipliers
3 Alternating Direction Method of Multipliers
  3.1 Algorithm
  3.2 Convergence
  3.3 Optimality Conditions and Stopping Criterion
  3.4 Extensions and Variations
  3.5 Notes and References
4 General Patterns
  4.1 Proximity Operator
  4.2 Quadratic Objective Terms
  4.3 Smooth Objective Terms
  4.4 Decomposition
5 Constrained Convex Optimization
  5.1 Convex Feasibility
  5.2 Linear and Quadratic Programming
6 $\ell_1$-Norm Problems
  6.1 Least Absolute Deviations
  6.2 Basis Pursuit
  6.3 General $\ell_1$ Regularized Loss Minimization
  6.4 Lasso
  6.5 Sparse Inverse Covariance Selection
7 Consensus and Sharing
  7.1 Global Variable Consensus Optimization
  7.2 General Form Consensus Optimization
  7.3 Sharing
8 Distributed Model Fitting
  8.1 Examples
  8.2 Splitting across Examples
  8.3 Splitting across Features
9 Nonconvex Problems
  9.1 Nonconvex Constraints
  9.2 Bi-convex Problems
10 Implementation
  10.1 Abstract Implementation
  10.2 MPI
  10.3 Graph Computing Frameworks
  10.4 MapReduce
11 Numerical Examples
  11.1 Small Dense Lasso
  11.2 Distributed $\ell_1$ Regularized Logistic Regression
  11.3 Group Lasso with Feature Splitting
  11.4 Distributed Large-Scale Lasso with MPI
  11.5 Regressor Selection
12 Conclusions
Acknowledgments
A Convergence Proof
References
Abstract

Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas-Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for $\ell_1$ problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
1 Introduction

In all applied fields, it is now commonplace to attack problems through data analysis, particularly through the use of statistical and machine learning algorithms on what are often large datasets. In industry, this trend has been referred to as "Big Data", and it has had a significant impact in areas as varied as artificial intelligence, internet applications, computational biology, medicine, finance, marketing, journalism, network analysis, and logistics.

Though these problems arise in diverse application domains, they share some key characteristics. First, the datasets are often extremely large, consisting of hundreds of millions or billions of training examples; second, the data is often very high-dimensional, because it is now possible to measure and store very detailed information about each example; and third, because of the large scale of many applications, the data is often stored or even collected in a distributed manner. As a result, it has become of central importance to develop algorithms that are both rich enough to capture the complexity of modern data, and scalable enough to process huge datasets in a parallelized or fully decentralized fashion. Indeed, some researchers [92] have suggested that even highly complex and structured problems may succumb most easily to relatively simple models trained on vast datasets.
Many such problems can be posed in the framework of convex optimization. Given the significant work on decomposition methods and decentralized algorithms in the optimization community, it is natural to look to parallel optimization algorithms as a mechanism for solving large-scale statistical tasks. This approach also has the benefit that one algorithm could be flexible enough to solve many problems.

This review discusses the alternating direction method of multipliers (ADMM), a simple but powerful algorithm that is well suited to distributed convex optimization, and in particular to problems arising in applied statistics and machine learning. It takes the form of a decomposition-coordination procedure, in which the solutions to small local subproblems are coordinated to find a solution to a large global problem. ADMM can be viewed as an attempt to blend the benefits of dual decomposition and augmented Lagrangian methods for constrained optimization, two earlier approaches that we review in §2. It turns out to be equivalent or closely related to many other algorithms as well, such as Douglas-Rachford splitting from numerical analysis, Spingarn's method of partial inverses, Dykstra's alternating projections method, Bregman iterative algorithms for $\ell_1$ problems in signal processing, proximal methods, and many others. The fact that it has been reinvented in different fields over the decades underscores the intuitive appeal of the approach.

It is worth emphasizing that the algorithm itself is not new, and that we do not present any new theoretical results. It was first introduced in the mid-1970s by Gabay, Mercier, Glowinski, and Marrocco, though similar ideas emerged as early as the mid-1950s. The algorithm was studied throughout the 1980s, and by the mid-1990s, almost all of the theoretical results mentioned here had been established. The fact that ADMM was developed so far in advance of the ready availability of large-scale distributed computing systems and massive optimization problems may account for why it is not as widely known today as we believe it should be.

The main contributions of this review can be summarized as follows:

(1) We provide a simple, cohesive discussion of the extensive literature in a way that emphasizes and unifies the aspects of primary importance in applications.
(2) We show, through a number of examples, that the algorithm is well suited for a wide variety of large-scale distributed modern problems. We derive methods for decomposing a wide class of statistical problems by training examples and by features, which is not easily accomplished in general.

(3) We place a greater emphasis on practical large-scale implementation than most previous references. In particular, we discuss the implementation of the algorithm in cloud computing environments using standard frameworks and provide easily readable implementations of many of our examples.

Throughout, the focus is on applications rather than theory, and a main goal is to provide the reader with a kind of "toolbox" that can be applied in many situations to derive and implement a distributed algorithm of practical use. Though the focus here is on parallelism, the algorithm can also be used serially, and it is interesting to note that with no tuning, ADMM can be competitive with the best known methods for some problems.

While we have emphasized applications that can be concisely explained, the algorithm would also be a natural fit for more complicated problems in areas like graphical models. In addition, though our focus is on statistical learning problems, the algorithm is readily applicable in many other cases, such as in engineering design, multi-period portfolio optimization, time series analysis, network flow, or scheduling.

Outline

We begin in §2 with a brief review of dual decomposition and the method of multipliers, two important precursors to ADMM. This section is intended mainly for background and can be skimmed. In §3, we present ADMM, including a basic convergence theorem, some variations on the basic version that are useful in practice, and a survey of some of the key literature. A complete convergence proof is given in appendix A.

In §4, we describe some general patterns that arise in applications of the algorithm, such as cases when one of the steps in ADMM can
be carried out particularly efficiently. These general patterns will recur throughout our examples. In §5, we consider the use of ADMM for some generic convex optimization problems, such as constrained minimization and linear and quadratic programming. In §6, we discuss a wide variety of problems involving the $\ell_1$ norm. It turns out that ADMM yields methods for these problems that are related to many state-of-the-art algorithms. This section also clarifies why ADMM is particularly well suited to machine learning problems.

In §7, we present consensus and sharing problems, which provide general frameworks for distributed optimization. In §8, we consider distributed methods for generic model fitting problems, including regularized regression models like the lasso and classification models like support vector machines.

In §9, we consider the use of ADMM as a heuristic for solving some nonconvex problems. In §10, we discuss some practical implementation details, including how to implement the algorithm in frameworks suitable for cloud computing applications. Finally, in §11, we present the details of some numerical experiments.
2 Precursors

In this section, we briefly review two optimization algorithms that are precursors to the alternating direction method of multipliers. While we will not use this material in the sequel, it provides some useful background and motivation.

2.1 Dual Ascent

Consider the equality-constrained convex optimization problem

$$
\begin{array}{ll}
\text{minimize} & f(x) \\
\text{subject to} & Ax = b,
\end{array}
\tag{2.1}
$$

with variable $x \in \mathbf{R}^n$, where $A \in \mathbf{R}^{m \times n}$ and $f : \mathbf{R}^n \to \mathbf{R}$ is convex. The Lagrangian for problem (2.1) is

$$
L(x, y) = f(x) + y^T (Ax - b)
$$

and the dual function is

$$
g(y) = \inf_x L(x, y) = -f^*(-A^T y) - b^T y,
$$

where $y$ is the dual variable or Lagrange multiplier, and $f^*$ is the convex conjugate of $f$; see [20, §3.3] or [140, §12] for background. The dual
problem is

$$
\text{maximize} \quad g(y),
$$

with variable $y \in \mathbf{R}^m$. Assuming that strong duality holds, the optimal values of the primal and dual problems are the same. We can recover a primal optimal point $x^\star$ from a dual optimal point $y^\star$ as

$$
x^\star = \operatorname*{argmin}_x L(x, y^\star),
$$

provided there is only one minimizer of $L(x, y^\star)$. (This is the case if, e.g., $f$ is strictly convex.) In the sequel, we will use the notation $\operatorname{argmin}_x F(x)$ to denote any minimizer of $F$, even when $F$ does not have a unique minimizer.

In the dual ascent method, we solve the dual problem using gradient ascent. Assuming that $g$ is differentiable, the gradient $\nabla g(y)$ can be evaluated as follows. We first find $x^+ = \operatorname{argmin}_x L(x, y)$; then we have $\nabla g(y) = Ax^+ - b$, which is the residual for the equality constraint. The dual ascent method consists of iterating the updates

$$
x^{k+1} := \operatorname*{argmin}_x L(x, y^k) \tag{2.2}
$$

$$
y^{k+1} := y^k + \alpha^k (Ax^{k+1} - b), \tag{2.3}
$$

where $\alpha^k > 0$ is a step size, and the superscript is the iteration counter. The first step (2.2) is an $x$-minimization step, and the second step (2.3) is a dual variable update. The dual variable $y$ can be interpreted as a vector of prices, and the $y$-update is then called a price update or price adjustment step. This algorithm is called dual ascent since, with appropriate choice of $\alpha^k$, the dual function increases in each step, i.e., $g(y^{k+1}) > g(y^k)$.

The dual ascent method can be used even in some cases when $g$ is not differentiable. In this case, the residual $Ax^{k+1} - b$ is not the gradient of $g$, but the negative of a subgradient of $-g$. This case requires a different choice of the $\alpha^k$ than when $g$ is differentiable, and convergence is not monotone; it is often the case that $g(y^{k+1}) \not> g(y^k)$. In this case, the algorithm is usually called the dual subgradient method [152].

If $\alpha^k$ is chosen appropriately and several other assumptions hold, then $x^k$ converges to an optimal point and $y^k$ converges to an optimal
dual point. However, these assumptions do not hold in many applications, so dual ascent often cannot be used. As an example, if $f$ is a nonzero affine function of any component of $x$, then the $x$-update (2.2) fails, since $L$ is unbounded below in $x$ for most $y$.

2.2 Dual Decomposition

The major benefit of the dual ascent method is that it can lead to a decentralized algorithm in some cases. Suppose, for example, that the objective $f$ is separable (with respect to a partition or splitting of the variable into subvectors), meaning that

$$
f(x) = \sum_{i=1}^N f_i(x_i),
$$

where $x = (x_1, \ldots, x_N)$ and the variables $x_i \in \mathbf{R}^{n_i}$ are subvectors of $x$. Partitioning the matrix $A$ conformably as

$$
A = [A_1 \; \cdots \; A_N],
$$

so $Ax = \sum_{i=1}^N A_i x_i$, the Lagrangian can be written as

$$
L(x, y) = \sum_{i=1}^N L_i(x_i, y) = \sum_{i=1}^N \left( f_i(x_i) + y^T A_i x_i - (1/N) y^T b \right),
$$

which is also separable in $x$. This means that the $x$-minimization step (2.2) splits into $N$ separate problems that can be solved in parallel. Explicitly, the algorithm is

$$
x_i^{k+1} := \operatorname*{argmin}_{x_i} L_i(x_i, y^k) \tag{2.4}
$$

$$
y^{k+1} := y^k + \alpha^k (Ax^{k+1} - b). \tag{2.5}
$$

The $x$-minimization step (2.4) is carried out independently, in parallel, for each $i = 1, \ldots, N$. In this case, we refer to the dual ascent method as dual decomposition.

In the general case, each iteration of the dual decomposition method requires a broadcast and a gather operation. In the dual update step (2.5), the equality constraint residual contributions $A_i x_i^{k+1}$ are
collected (gathered) in order to compute the residual $Ax^{k+1} - b$. Once the (global) dual variable $y^{k+1}$ is computed, it must be distributed (broadcast) to the processors that carry out the $N$ individual $x_i$ minimization steps (2.4).

Dual decomposition is an old idea in optimization, and traces back at least to the early 1960s. Related ideas appear in well known work by Dantzig and Wolfe [44] and Benders [13] on large-scale linear programming, as well as in Dantzig's seminal book [43]. The general idea of dual decomposition appears to be originally due to Everett [69], and is explored in many early references [107, 84, 117, 14]. The use of nondifferentiable optimization, such as the subgradient method, to solve the dual problem is discussed by Shor [152]. Good references on dual methods and decomposition include the book by Bertsekas [16, chapter 6] and the survey by Nedić and Ozdaglar [131] on distributed optimization, which discusses dual decomposition methods and consensus problems. A number of papers also discuss variants on standard dual decomposition, such as [129].

More generally, decentralized optimization has been an active topic of research since the 1980s. For instance, Tsitsiklis and his co-authors worked on a number of decentralized detection and consensus problems involving the minimization of a smooth function $f$ known to multiple agents [160, 161, 17]. Some good reference books on parallel optimization include those by Bertsekas and Tsitsiklis [17] and Censor and Zenios [31]. There has also been some recent work on problems where each agent has its own convex, potentially nondifferentiable, objective function [130]. See [54] for a recent discussion of distributed methods for graph-structured optimization problems.

2.3 Augmented Lagrangians and the Method of Multipliers

Augmented Lagrangian methods were developed in part to bring robustness to the dual ascent method, and in particular, to yield convergence without assumptions like strict convexity or finiteness of $f$. The augmented Lagrangian for (2.1) is

$$
L_\rho(x, y) = f(x) + y^T (Ax - b) + (\rho/2) \|Ax - b\|_2^2, \tag{2.6}
$$
where $\rho > 0$ is called the penalty parameter. (Note that $L_0$ is the standard Lagrangian for the problem.) The augmented Lagrangian can be viewed as the (unaugmented) Lagrangian associated with the problem

$$
\begin{array}{ll}
\text{minimize} & f(x) + (\rho/2) \|Ax - b\|_2^2 \\
\text{subject to} & Ax = b.
\end{array}
$$

This problem is clearly equivalent to the original problem (2.1), since for any feasible $x$ the term added to the objective is zero. The associated dual function is $g_\rho(y) = \inf_x L_\rho(x, y)$.

The benefit of including the penalty term is that $g_\rho$ can be shown to be differentiable under rather mild conditions on the original problem. The gradient of the augmented dual function is found the same way as with the ordinary Lagrangian, i.e., by minimizing over $x$, and then evaluating the resulting equality constraint residual. Applying dual ascent to the modified problem yields the algorithm

$$
x^{k+1} := \operatorname*{argmin}_x L_\rho(x, y^k) \tag{2.7}
$$

$$
y^{k+1} := y^k + \rho (Ax^{k+1} - b), \tag{2.8}
$$

which is known as the method of multipliers for solving (2.1). This is the same as standard dual ascent, except that the $x$-minimization step uses the augmented Lagrangian, and the penalty parameter $\rho$ is used as the step size $\alpha^k$. The method of multipliers converges under far more general conditions than dual ascent, including cases when $f$ takes on the value $+\infty$ or is not strictly convex.

It is easy to motivate the choice of the particular step size $\rho$ in the dual update (2.8). For simplicity, we assume here that $f$ is differentiable, though this is not required for the algorithm to work. The optimality conditions for (2.1) are primal and dual feasibility, i.e.,

$$
Ax^\star - b = 0, \qquad \nabla f(x^\star) + A^T y^\star = 0,
$$

respectively. By definition, $x^{k+1}$ minimizes $L_\rho(x, y^k)$, so

$$
\begin{aligned}
0 &= \nabla_x L_\rho(x^{k+1}, y^k) \\
  &= \nabla_x f(x^{k+1}) + A^T \left( y^k + \rho (Ax^{k+1} - b) \right) \\
  &= \nabla_x f(x^{k+1}) + A^T y^{k+1}.
\end{aligned}
$$
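The updates (2.7) and (2.8) can be sketched on a toy problem. The following is a minimal illustration, not an implementation from the review: it assumes the particular objective $f(x) = (1/2)\|x - c\|_2^2$ and a single equality constraint $a^T x = b$, so that the $x$-update has a closed form via the Sherman-Morrison formula. The function name `method_of_multipliers` and the data are hypothetical. The last lines check numerically that each iterate is dual feasible, as the derivation above predicts.

```python
def method_of_multipliers(c, a, b, rho=1.0, iters=30):
    """Method of multipliers, updates (2.7)-(2.8), for the illustrative problem
        minimize (1/2)||x - c||_2^2   subject to   a^T x = b,
    where c and a are lists of floats and b is a scalar (assumed setup)."""
    s = sum(ai * ai for ai in a)                     # a^T a
    y = 0.0                                          # scalar dual variable
    x = list(c)
    r = sum(ai * xi for ai, xi in zip(a, x)) - b     # initial residual
    for _ in range(iters):
        # x-update (2.7): the minimizer of L_rho(x, y) solves
        #   (I + rho a a^T) x = c + (rho b - y) a,
        # which Sherman-Morrison inverts in closed form.
        v = [ci + (rho * b - y) * ai for ci, ai in zip(c, a)]
        atv = sum(ai * vi for ai, vi in zip(a, v))
        x = [vi - rho * ai * atv / (1.0 + rho * s) for vi, ai in zip(v, a)]
        # Dual update (2.8): step size equal to the penalty parameter rho.
        r = sum(ai * xi for ai, xi in zip(a, x)) - b
        y = y + rho * r
    return x, y, r


c, a, b = [1.0, 2.0, 3.0], [1.0, 2.0, 2.0], 2.0
x, y, r = method_of_multipliers(c, a, b)
# Dual feasibility of the iterate: grad f(x) + a * y should vanish.
dual_gap = max(abs((xi - ci) + ai * y) for xi, ci, ai in zip(x, c, a))
print("x =", x, "y =", y, "primal residual =", r, "dual gap =", dual_gap)
```

For this data the solution can be worked out by hand as $x^\star = (0, 0, 1)$ and $y^\star = 1$, which gives a simple check on the output.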
We see that by using $\rho$ as the step size in the dual update, the iterate $(x^{k+1}, y^{k+1})$ is dual feasible. As the method of multipliers proceeds, the primal residual $Ax^{k+1} - b$ converges to zero, yielding optimality.

The greatly improved convergence properties of the method of multipliers over dual ascent come at a cost. When $f$ is separable, the augmented Lagrangian $L_\rho$ is not separable, so the $x$-minimization step (2.7) cannot be carried out separately in parallel for each $x_i$. This means that the basic method of multipliers cannot be used for decomposition. We will see how to address this issue next.

Augmented Lagrangians and the method of multipliers for constrained optimization were first proposed in the late 1960s by Hestenes [97, 98] and Powell [138]. Many of the early numerical experiments on the method of multipliers are due to Miele et al. [124, 125, 126]. Much of the early work is consolidated in a monograph by Bertsekas [15], who also discusses similarities to older approaches using Lagrangians and penalty functions [6, 5, 71], as well as a number of generalizations.
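To make concrete why the penalty term destroys separability, as noted earlier in this section, it may help to expand the quadratic term in an illustrative two-block case ($N = 2$ is an assumption made here for brevity), using the partition $A = [A_1 \; A_2]$ from §2.2:

```latex
\|A_1 x_1 + A_2 x_2 - b\|_2^2
  = \|A_1 x_1\|_2^2 + \|A_2 x_2\|_2^2 + \|b\|_2^2
    + 2\, x_1^T A_1^T A_2\, x_2
    - 2\, b^T A_1 x_1 - 2\, b^T A_2 x_2 .
```

Every term except $2 x_1^T A_1^T A_2 x_2$ involves only one block. That cross term couples $x_1$ and $x_2$ unless $A_1^T A_2 = 0$, so $L_\rho$ cannot in general be minimized over the two blocks independently.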
3 Alternating Direction Method of Multipliers

3.1 Algorithm

ADMM is an algorithm that is intended to blend the decomposability of dual ascent with the superior convergence properties of the method of multipliers. The algorithm solves problems in the form

$$
\begin{array}{ll}
\text{minimize} & f(x) + g(z) \\
\text{subject to} & Ax + Bz = c
\end{array}
\tag{3.1}
$$

with variables $x \in \mathbf{R}^n$ and $z \in \mathbf{R}^m$, where $A \in \mathbf{R}^{p \times n}$, $B \in \mathbf{R}^{p \times m}$, and $c \in \mathbf{R}^p$. We will assume that $f$ and $g$ are convex; more specific assumptions will be discussed in §3.2. The only difference from the general linear equality-constrained problem (2.1) is that the variable, called $x$ there, has been split into two parts, called $x$ and $z$ here, with the objective function separable across this splitting. The optimal value of the problem (3.1) will be denoted by

$$
p^\star = \inf \{ f(x) + g(z) \mid Ax + Bz = c \}.
$$

As in the method of multipliers, we form the augmented Lagrangian

$$
L_\rho(x, z, y) = f(x) + g(z) + y^T (Ax + Bz - c) + (\rho/2) \|Ax + Bz - c\|_2^2.
$$
ADMM consists of the iterations

$$
x^{k+1} := \operatorname*{argmin}_x L_\rho(x, z^k, y^k) \tag{3.2}
$$

$$
z^{k+1} := \operatorname*{argmin}_z L_\rho(x^{k+1}, z, y^k) \tag{3.3}
$$

$$
y^{k+1} := y^k + \rho (Ax^{k+1} + Bz^{k+1} - c), \tag{3.4}
$$

where $\rho > 0$. The algorithm is very similar to dual ascent and the method of multipliers: it consists of an $x$-minimization step (3.2), a $z$-minimization step (3.3), and a dual variable update (3.4). As in the method of multipliers, the dual variable update uses a step size equal to the augmented Lagrangian parameter $\rho$.

The method of multipliers for (3.1) has the form

$$
(x^{k+1}, z^{k+1}) := \operatorname*{argmin}_{x, z} L_\rho(x, z, y^k)
$$

$$
y^{k+1} := y^k + \rho (Ax^{k+1} + Bz^{k+1} - c).
$$

Here the augmented Lagrangian is minimized jointly with respect to the two primal variables. In ADMM, on the other hand, $x$ and $z$ are updated in an alternating or sequential fashion, which accounts for the term alternating direction. ADMM can be viewed as a version of the method of multipliers where a single Gauss-Seidel pass [90, §10.1] over $x$ and $z$ is used instead of the usual joint minimization. Separating the minimization over $x$ and $z$ into two steps is precisely what allows for decomposition when $f$ or $g$ are separable.

The algorithm state in ADMM consists of $z^k$ and $y^k$. In other words, $(z^{k+1}, y^{k+1})$ is a function of $(z^k, y^k)$. The variable $x^k$ is not part of the state; it is an intermediate result computed from the previous state $(z^{k-1}, y^{k-1})$.

If we switch (relabel) $x$ and $z$, $f$ and $g$, and $A$ and $B$ in the problem (3.1), we obtain a variation on ADMM with the order of the $x$-update step (3.2) and $z$-update step (3.3) reversed. The roles of $x$ and $z$ are almost symmetric, but not quite, since the dual update is done after the $z$-update but before the $x$-update.
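The three updates (3.2)-(3.4) can be sketched on a small instance. The example below is illustrative only and rests on assumptions not made in the text above: it takes $f(x) = (1/2)\|x - a\|_2^2$, $g(z) = \lambda \|z\|_1$, $A = I$, $B = -I$ and $c = 0$, for which both minimizations have closed forms (the $z$-update is elementwise soft thresholding). The names `admm_l1` and `soft_threshold` are hypothetical. Note that only `z` and `y` are carried between iterations, matching the description of the algorithm state.

```python
def soft_threshold(v, kappa):
    """Minimizer of kappa*|t| + (1/2)(t - v)^2 over scalar t."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1(a, lam, rho=1.0, iters=100):
    """Unscaled ADMM (3.2)-(3.4) for the assumed toy problem
        minimize (1/2)||x - a||_2^2 + lam*||z||_1   subject to   x - z = 0."""
    n = len(a)
    z = [0.0] * n            # state: z^k
    y = [0.0] * n            # state: y^k
    x = [0.0] * n            # intermediate result, recomputed every iteration
    for _ in range(iters):
        # x-update (3.2): set the gradient x - a + y + rho*(x - z) to zero.
        x = [(a[i] - y[i] + rho * z[i]) / (1.0 + rho) for i in range(n)]
        # z-update (3.3): soft thresholding of x + y/rho at level lam/rho.
        z = [soft_threshold(x[i] + y[i] / rho, lam / rho) for i in range(n)]
        # dual update (3.4) with step size rho; here Ax + Bz - c = x - z.
        y = [y[i] + rho * (x[i] - z[i]) for i in range(n)]
    return x, z, y


x, z, y = admm_l1([3.0, 0.5, -2.0], lam=1.0)
print("x =", x)
print("z =", z)
```

For this separable problem the exact solution is the soft thresholding of `a` at level `lam`, which makes the sketch easy to check by hand.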
3.1.1 Scaled Form

ADMM can be written in a slightly different form, which is often more convenient, by combining the linear and quadratic terms in the augmented Lagrangian and scaling the dual variable. Defining the residual $r = Ax + Bz - c$, we have

$$
\begin{aligned}
y^T r + (\rho/2) \|r\|_2^2 &= (\rho/2) \|r + (1/\rho) y\|_2^2 - (1/2\rho) \|y\|_2^2 \\
&= (\rho/2) \|r + u\|_2^2 - (\rho/2) \|u\|_2^2,
\end{aligned}
$$

where $u = (1/\rho) y$ is the scaled dual variable. Using the scaled dual variable, we can express ADMM as

$$
x^{k+1} := \operatorname*{argmin}_x \left( f(x) + (\rho/2) \|Ax + Bz^k - c + u^k\|_2^2 \right) \tag{3.5}
$$

$$
z^{k+1} := \operatorname*{argmin}_z \left( g(z) + (\rho/2) \|Ax^{k+1} + Bz - c + u^k\|_2^2 \right) \tag{3.6}
$$

$$
u^{k+1} := u^k + Ax^{k+1} + Bz^{k+1} - c. \tag{3.7}
$$

Defining the residual at iteration $k$ as $r^k = Ax^k + Bz^k - c$, we see that

$$
u^k = u^0 + \sum_{j=1}^k r^j,
$$

the running sum of the residuals.

We call the first form of ADMM above, given by (3.2)-(3.4), the unscaled form, and the second form (3.5)-(3.7) the scaled form, since it is expressed in terms of a scaled version of the dual variable. The two are clearly equivalent, but the formulas in the scaled form of ADMM are often shorter than in the unscaled form, so we will use the scaled form in the sequel. We will use the unscaled form when we wish to emphasize the role of the dual variable or to give an interpretation that relies on the (unscaled) dual variable.

3.2 Convergence

There are many convergence results for ADMM discussed in the literature; here, we limit ourselves to a basic but still very general result that applies to all of the examples we will consider. We will make one
assumption about the functions $f$ and $g$, and one assumption about problem (3.1).

Assumption 1. The (extended-real-valued) functions $f : \mathbf{R}^n \to \mathbf{R} \cup \{+\infty\}$ and $g : \mathbf{R}^m \to \mathbf{R} \cup \{+\infty\}$ are closed, proper, and convex.

This assumption can be expressed compactly using the epigraphs of the functions: the function $f$ satisfies assumption 1 if and only if its epigraph

$$
\operatorname{epi} f = \{ (x, t) \in \mathbf{R}^n \times \mathbf{R} \mid f(x) \leq t \}
$$

is a closed nonempty convex set.

Assumption 1 implies that the subproblems arising in the $x$-update (3.2) and $z$-update (3.3) are solvable, i.e., there exist $x$ and $z$, not necessarily unique (without further assumptions on $A$ and $B$), that minimize the augmented Lagrangian. It is important to note that assumption 1 allows $f$ and $g$ to be nondifferentiable and to assume the value $+\infty$. For example, we can take $f$ to be the indicator function of a closed nonempty convex set $\mathcal{C}$, i.e., $f(x) = 0$ for $x \in \mathcal{C}$ and $f(x) = +\infty$ otherwise. In this case, the $x$-minimization step (3.2) will involve solving a constrained quadratic program over $\mathcal{C}$, the effective domain of $f$.

Assumption 2. The unaugmented Lagrangian $L_0$ has a saddle point.

Explicitly, there exist $(x^\star, z^\star, y^\star)$, not necessarily unique, for which

$$
L_0(x^\star, z^\star, y) \leq L_0(x^\star, z^\star, y^\star) \leq L_0(x, z, y^\star)
$$

holds for all $x$, $z$, $y$.

By assumption 1, it follows that $L_0(x^\star, z^\star, y^\star)$ is finite for any saddle point $(x^\star, z^\star, y^\star)$. This implies that $(x^\star, z^\star)$ is a solution to (3.1), so $Ax^\star + Bz^\star = c$ and $f(x^\star) < \infty$, $g(z^\star) < \infty$. It also implies that $y^\star$ is dual optimal, and the optimal values of the primal and dual problems are equal, i.e., that strong duality holds. Note that we make no assumptions about $A$, $B$, or $c$, except implicitly through assumption 2; in particular, neither $A$ nor $B$ is required to be full rank.
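Returning to the indicator-function example given under assumption 1, the constrained quadratic program can be written out explicitly. This is a short worked form, using the scaled dual variable $u^k = (1/\rho) y^k$ from §3.1.1; the second line covers only the special case $A = I$, which is an assumption made here for illustration, and $\Pi_{\mathcal{C}}$ denotes Euclidean projection onto $\mathcal{C}$ (notation introduced here):

```latex
x^{k+1} = \operatorname*{argmin}_{x \in \mathcal{C}} \;
          (\rho/2)\, \| Ax + Bz^k - c + u^k \|_2^2 ,
\qquad
x^{k+1} = \Pi_{\mathcal{C}}\!\left( c - Bz^k - u^k \right)
          \quad \text{when } A = I .
```

In the general case the $x$-update is a least-squares problem restricted to $\mathcal{C}$; with $A = I$ it reduces to a projection, which is cheap for simple sets such as boxes or affine sets.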
3.2.1 Convergence

Under assumptions 1 and 2, the ADMM iterates satisfy the following:

Residual convergence. $r^k \to 0$ as $k \to \infty$, i.e., the iterates approach feasibility.

Objective convergence. $f(x^k) + g(z^k) \to p^\star$ as $k \to \infty$, i.e., the objective function of the iterates approaches the optimal value.

Dual variable convergence. $y^k \to y^\star$ as $k \to \infty$, where $y^\star$ is a dual optimal point.

A proof of the residual and objective convergence results is given in appendix A. Note that $x^k$ and $z^k$ need not converge to optimal values, although such results can be shown under additional assumptions.

3.2.2 Convergence in Practice

Simple examples show that ADMM can be very slow to converge to high accuracy. However, it is often the case that ADMM converges to modest accuracy, sufficient for many applications, within a few tens of iterations. This behavior makes ADMM similar to algorithms like the conjugate gradient method, for example, in that a few tens of iterations will often produce acceptable results of practical use. However, the slow convergence of ADMM also distinguishes it from algorithms such as Newton's method (or, for constrained problems, interior-point methods), where high accuracy can be attained in a reasonable amount of time. While in some cases it is possible to combine ADMM with a method for producing a high accuracy solution from a low accuracy solution [64], in the general case ADMM will be practically useful mostly in cases when modest accuracy is sufficient. Fortunately, this is usually the case for the kinds of large-scale problems we consider. Also, in the case of statistical and machine learning problems, solving a parameter estimation problem to very high accuracy often yields little to no improvement in actual prediction performance, the real metric of interest in applications.
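The residual and objective convergence statements of §3.2.1 can be observed numerically on a small instance. The sketch below is illustrative and uses assumptions of its own: the scaled form (3.5)-(3.7) applied to $f(x) = (1/2)\|x - a\|_2^2$, with $g$ the indicator function of a box $[lo, hi]^n$, $A = I$, $B = -I$ and $c = 0$. The function name `admm_box` and the data are hypothetical. It records $\|r^k\|_2$ and $f(x^k) + g(z^k)$ at every iteration; $g(z^k) = 0$ because each $z^k$ lies in the box.

```python
import math


def admm_box(a, lo, hi, rho=1.0, iters=60):
    """Scaled-form ADMM (3.5)-(3.7) for the assumed toy problem
        minimize (1/2)||x - a||_2^2 + I_box(z)   subject to   x - z = 0,
    where I_box is the indicator function of the box [lo, hi]^n."""
    n = len(a)
    z = [0.0] * n
    u = [0.0] * n            # scaled dual variable u = y / rho
    x = [0.0] * n
    history = []             # (||r^k||_2, f(x^k) + g(z^k)) per iteration
    for _ in range(iters):
        # x-update (3.5): gradient x - a + rho*(x - z + u) set to zero.
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update (3.6): Euclidean projection of x + u onto the box.
        z = [min(max(x[i] + u[i], lo), hi) for i in range(n)]
        # u-update (3.7): add the primal residual r = x - z.
        r = [x[i] - z[i] for i in range(n)]
        u = [u[i] + r[i] for i in range(n)]
        objective = 0.5 * sum((x[i] - a[i]) ** 2 for i in range(n))
        history.append((math.sqrt(sum(ri * ri for ri in r)), objective))
    return x, z, u, history


x, z, u, history = admm_box([2.0, -0.5, 0.3], lo=0.0, hi=1.0)
for k in (0, 4, 9, 19, 59):
    print("iteration", k + 1, "residual norm", history[k][0],
          "objective", history[k][1])
```

For this data the optimal point is the projection of `a` onto the box, $(1, 0, 0.3)$, with optimal value $p^\star = 0.625$, so the printed objective values can be compared against a known target.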
3.3 Optimality Conditions and Stopping Criterion

The necessary and sufficient optimality conditions for the ADMM problem (3.1) are primal feasibility,

$$
Ax^\star + Bz^\star - c = 0, \tag{3.8}
$$

and dual feasibility,

$$
0 \in \partial f(x^\star) + A^T y^\star \tag{3.9}
$$

$$
0 \in \partial g(z^\star) + B^T y^\star. \tag{3.10}
$$

Here, $\partial$ denotes the subdifferential operator; see, e.g., [140, 19, 99]. (When $f$ and $g$ are differentiable, the subdifferentials $\partial f$ and $\partial g$ can be replaced by the gradients $\nabla f$ and $\nabla g$, and $\in$ can be replaced by $=$.)

Since $z^{k+1}$ minimizes $L_\rho(x^{k+1}, z, y^k)$ by definition, we have that

$$
\begin{aligned}
0 &\in \partial g(z^{k+1}) + B^T y^k + \rho B^T (Ax^{k+1} + Bz^{k+1} - c) \\
  &= \partial g(z^{k+1}) + B^T y^k + \rho B^T r^{k+1} \\
  &= \partial g(z^{k+1}) + B^T y^{k+1}.
\end{aligned}
$$

This means that $z^{k+1}$ and $y^{k+1}$ always satisfy (3.10), so attaining optimality comes down to satisfying (3.8) and (3.9). This phenomenon is analogous to the iterates of the method of multipliers always being dual feasible; see §2.3.

Since $x^{k+1}$ minimizes $L_\rho(x, z^k, y^k)$ by definition, we have that

$$
\begin{aligned}
0 &\in \partial f(x^{k+1}) + A^T y^k + \rho A^T (Ax^{k+1} + Bz^k - c) \\
  &= \partial f(x^{k+1}) + A^T \left( y^k + \rho r^{k+1} + \rho B (z^k - z^{k+1}) \right) \\
  &= \partial f(x^{k+1}) + A^T y^{k+1} + \rho A^T B (z^k - z^{k+1}),
\end{aligned}
$$

or equivalently,

$$
\rho A^T B (z^{k+1} - z^k) \in \partial f(x^{k+1}) + A^T y^{k+1}.
$$

This means that the quantity

$$
s^{k+1} = \rho A^T B (z^{k+1} - z^k)
$$

can be viewed as a residual for the dual feasibility condition (3.9). We will refer to $s^{k+1}$ as the dual residual at iteration $k + 1$, and to $r^{k+1} = Ax^{k+1} + Bz^{k+1} - c$ as the primal residual at iteration $k + 1$.
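The two residuals can be computed directly from the iterates. The sketch below is an illustration under stated assumptions, not code from the review: it reuses a toy problem with $f(x) = (1/2)\|x - a\|_2^2$, $g(z) = \lambda \|z\|_1$, $A = I$, $B = -I$, $c = 0$, so that $s^{k+1} = -\rho (z^{k+1} - z^k)$. The function name `admm_step_with_residuals` is hypothetical. Because $f$ is differentiable here, the inclusion derived above becomes the equality $s^{k+1} = \nabla f(x^{k+1}) + A^T y^{k+1}$, which the loop checks at each iteration, together with condition (3.10), which for this $g$ reads $|y_i^{k+1}| \leq \lambda$.

```python
def admm_step_with_residuals(a, lam, rho, z, y):
    """One unscaled ADMM iteration for the assumed toy problem
        minimize (1/2)||x - a||_2^2 + lam*||z||_1   subject to   x - z = 0
    (A = I, B = -I, c = 0). Returns the new iterates plus the primal
    residual r^{k+1} and dual residual s^{k+1} = rho * A^T B (z^{k+1} - z^k)."""
    n = len(a)
    x_new = [(a[i] - y[i] + rho * z[i]) / (1.0 + rho) for i in range(n)]
    z_new = []
    for i in range(n):
        w = x_new[i] + y[i] / rho
        kappa = lam / rho
        # Soft thresholding: closed-form z-update for the l1 term.
        z_new.append(w - kappa if w > kappa else w + kappa if w < -kappa else 0.0)
    r = [x_new[i] - z_new[i] for i in range(n)]        # primal residual
    y_new = [y[i] + rho * r[i] for i in range(n)]
    s = [-rho * (z_new[i] - z[i]) for i in range(n)]   # dual residual, B = -I
    return x_new, z_new, y_new, r, s


a, lam, rho = [3.0, 0.5, -2.0], 1.0, 2.0
z, y = [0.0] * 3, [0.0] * 3
for k in range(40):
    x, z, y, r, s = admm_step_with_residuals(a, lam, rho, z, y)
    # Residual of condition (3.9): grad f(x^{k+1}) + A^T y^{k+1} equals s^{k+1}.
    grad_plus_dual = [(x[i] - a[i]) + y[i] for i in range(3)]
    assert all(abs(grad_plus_dual[i] - s[i]) < 1e-9 for i in range(3))
    # Condition (3.10) holds at every iterate: y^{k+1} lies in lam * d|z^{k+1}|.
    assert all(abs(y[i]) <= lam + 1e-9 for i in range(3))
print("final primal residual:", r)
print("final dual residual:  ", s)
```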
In summary, the optimality conditions for the ADMM problem consist of three conditions, (3.8)-(3.10). The last condition (3.10) always holds for $(x^{k+1}, z^{k+1}, y^{k+1})$; the residuals for the other two, (3.8) and (3.9), are the primal and dual residuals $r^{k+1}$ and $s^{k+1}$, respectively. These two residuals converge to zero as ADMM proceeds. (In fact, the convergence proof in appendix A shows $B(z^{k+1} - z^k)$ converges to zero, which implies $s^k$ converges to zero.)

3.3.1 Stopping Criteria

The residuals of the optimality conditions can be related to a bound on the objective suboptimality of the current point, i.e., $f(x^k) + g(z^k) - p^\star$. As shown in the convergence proof in appendix A, we have

$$
f(x^k) + g(z^k) - p^\star \leq -(y^k)^T r^k + (x^k - x^\star)^T s^k. \tag{3.11}
$$

This shows that when the residuals $r^k$ and $s^k$ are small, the objective suboptimality also must be small. We cannot use this inequality directly in a stopping criterion, however, since we do not know $x^\star$. But if we guess or estimate that $\|x^k - x^\star\|_2 \leq d$, we have that

$$
f(x^k) + g(z^k) - p^\star \leq -(y^k)^T r^k + d \|s^k\|_2 \leq \|y^k\|_2 \|r^k\|_2 + d \|s^k\|_2.
$$

The middle or righthand terms can be used as an approximate bound on the objective suboptimality (which depends on our guess of $d$).

This suggests that a reasonable termination criterion is that the primal and dual residuals must be small, i.e.,

$$
\|r^k\|_2 \leq \epsilon^{\mathrm{pri}} \quad \text{and} \quad \|s^k\|_2 \leq \epsilon^{\mathrm{dual}}, \tag{3.12}
$$

where $\epsilon^{\mathrm{pri}} > 0$ and $\epsilon^{\mathrm{dual}} > 0$ are feasibility tolerances for the primal and dual feasibility conditions (3.8) and (3.9), respectively. These tolerances can be chosen using an absolute and relative criterion, such as

$$
\begin{aligned}
\epsilon^{\mathrm{pri}} &= \sqrt{p}\, \epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}} \max\{ \|Ax^k\|_2, \|Bz^k\|_2, \|c\|_2 \}, \\
\epsilon^{\mathrm{dual}} &= \sqrt{n}\, \epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}} \|A^T y^k\|_2,
\end{aligned}
$$

where $\epsilon^{\mathrm{abs}} > 0$ is an absolute tolerance and $\epsilon^{\mathrm{rel}} > 0$ is a relative tolerance. (The factors $\sqrt{p}$ and $\sqrt{n}$ account for the fact that the $\ell_2$ norms are in $\mathbf{R}^p$ and $\mathbf{R}^n$, respectively.) A reasonable value for the relative stopping
criterion might be $\epsilon^{\mathrm{rel}} = 10^{-3}$ or $10^{-4}$, depending on the application. The choice of absolute stopping criterion depends on the scale of the typical variable values.

3.4 Extensions and Variations

Many variations on the classic ADMM algorithm have been explored in the literature. Here we briefly survey some of these variants, organized into groups of related ideas. Some of these methods can give superior convergence in practice compared to the standard ADMM presented above. Most of the extensions have been rigorously analyzed, so the convergence results described above are still valid (in some cases, under some additional conditions).

3.4.1 Varying Penalty Parameter

A standard extension is to use possibly different penalty parameters $\rho^k$ for each iteration, with the goal of improving the convergence in practice, as well as making performance less dependent on the initial choice of the penalty parameter. In the context of the method of multipliers, this approach is analyzed in [142], where it is shown that superlinear convergence may be achieved if $\rho^k \to \infty$. Though it can be difficult to prove the convergence of ADMM when $\rho$ varies by iteration, the fixed-$\rho$ theory still applies if one just assumes that $\rho$ becomes fixed after a finite number of iterations.

A simple scheme that often works well is (see, e.g., [96, 169]):

$$
\rho^{k+1} :=
\begin{cases}
\tau^{\mathrm{incr}} \rho^k & \text{if } \|r^k\|_2 > \mu \|s^k\|_2 \\
\rho^k / \tau^{\mathrm{decr}} & \text{if } \|s^k\|_2 > \mu \|r^k\|_2 \\
\rho^k & \text{otherwise,}
\end{cases}
\tag{3.13}
$$

where $\mu > 1$, $\tau^{\mathrm{incr}} > 1$, and $\tau^{\mathrm{decr}} > 1$ are parameters. Typical choices might be $\mu = 10$ and $\tau^{\mathrm{incr}} = \tau^{\mathrm{decr}} = 2$. The idea behind this penalty parameter update is to try to keep the primal and dual residual norms within a factor of $\mu$ of one another as they both converge to zero.

The ADMM update equations suggest that large values of $\rho$ place a large penalty on violations of primal feasibility and so tend to produce
More informationProject Networks With MixedTime Constraints
Project Networs Wth MxedTme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa
More informationv a 1 b 1 i, a 2 b 2 i,..., a n b n i.
SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are
More informationFisher Markets and Convex Programs
Fsher Markets and Convex Programs Nkhl R. Devanur 1 Introducton Convex programmng dualty s usually stated n ts most general form, wth convex objectve functons and convex constrants. (The book by Boyd and
More informationLuby s Alg. for Maximal Independent Sets using Pairwise Independence
Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent
More informationForecasting the Direction and Strength of Stock Market Movement
Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract  Stock market s one of the most complcated systems
More informationLogistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification
Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson
More informationPowerofTwo Policies for Single Warehouse MultiRetailer Inventory Systems with Order Frequency Discounts
Powerofwo Polces for Sngle Warehouse MultRetaler Inventory Systems wth Order Frequency Dscounts José A. Ventura Pennsylvana State Unversty (USA) Yale. Herer echnon Israel Insttute of echnology (Israel)
More informationarxiv:1311.2444v1 [cs.dc] 11 Nov 2013
FLEXIBLE PARALLEL ALGORITHMS FOR BIG DATA OPTIMIZATION Francsco Facchne 1, Smone Sagratella 1, Gesualdo Scutar 2 1 Dpt. of Computer, Control, and Management Eng., Unversty of Rome La Sapenza", Roma, Italy.
More informationQuestions that we may have about the variables
Antono Olmos, 01 Multple Regresson Problem: we want to determne the effect of Desre for control, Famly support, Number of frends, and Score on the BDI test on Perceved Support of Latno women. Dependent
More informationFeature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College
Feature selecton for ntruson detecton Slobodan Petrovć NISlab, Gjøvk Unversty College Contents The feature selecton problem Intruson detecton Traffc features relevant for IDS The CFS measure The mrmr measure
More informationLoop Parallelization
  Loop Parallelzaton C52 Complaton steps: nested loops operatng on arrays, sequentell executon of teraton space DECLARE B[..,..+] FOR I :=.. FOR J :=.. I B[I,J] := B[I,J]+B[I,J] ED FOR ED FOR analyze
More informationJoint Scheduling of Processing and Shuffle Phases in MapReduce Systems
Jont Schedulng of Processng and Shuffle Phases n MapReduce Systems Fangfe Chen, Mural Kodalam, T. V. Lakshman Department of Computer Scence and Engneerng, The Penn State Unversty Bell Laboratores, AlcatelLucent
More information8 Algorithm for Binary Searching in Trees
8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the
More informationIMPROVEMENT OF CONVERGENCE CONDITION OF THE SQUAREROOT INTERVAL METHOD FOR MULTIPLE ZEROS 1
Nov Sad J. Math. Vol. 36, No. 2, 2006, 009 IMPROVEMENT OF CONVERGENCE CONDITION OF THE SQUAREROOT INTERVAL METHOD FOR MULTIPLE ZEROS Modrag S. Petkovć 2, Dušan M. Mloševć 3 Abstract. A new theorem concerned
More information6. EIGENVALUES AND EIGENVECTORS 3 = 3 2
EIGENVALUES AND EIGENVECTORS The Characterstc Polynomal If A s a square matrx and v s a nonzero vector such that Av v we say that v s an egenvector of A and s the correspondng egenvalue Av v Example :
More informationModule 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..
More information9.1 The Cumulative Sum Control Chart
Learnng Objectves 9.1 The Cumulatve Sum Control Chart 9.1.1 Basc Prncples: Cusum Control Chart for Montorng the Process Mean If s the target for the process mean, then the cumulatve sum control chart s
More informationGraph Theory and Cayley s Formula
Graph Theory and Cayley s Formula Chad Casarotto August 10, 2006 Contents 1 Introducton 1 2 Bascs and Defntons 1 Cayley s Formula 4 4 Prüfer Encodng A Forest of Trees 7 1 Introducton In ths paper, I wll
More informationCapital asset pricing model, arbitrage pricing theory and portfolio management
Captal asset prcng model, arbtrage prcng theory and portfolo management Vnod Kothar The captal asset prcng model (CAPM) s great n terms of ts understandng of rsk decomposton of rsk nto securtyspecfc rsk
More informationThe Analysis of Outliers in Statistical Data
THALES Project No. xxxx The Analyss of Outlers n Statstcal Data Research Team Chrysses Caron, Assocate Professor (P.I.) Vaslk Karot, Doctoral canddate Polychrons Economou, Chrstna Perrakou, Postgraduate
More informationThe eigenvalue derivatives of linear damped systems
Control and Cybernetcs vol. 32 (2003) No. 4 The egenvalue dervatves of lnear damped systems by YeongJeu Sun Department of Electrcal Engneerng IShou Unversty Kaohsung, Tawan 840, R.O.C emal: yjsun@su.edu.tw
More informationgreatest common divisor
4. GCD 1 The greatest common dvsor of two ntegers a and b (not both zero) s the largest nteger whch s a common factor of both a and b. We denote ths number by gcd(a, b), or smply (a, b) when there s no
More informationA Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy Scurve Regression
Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy Scurve Regresson ChengWu Chen, Morrs H. L. Wang and TngYa Hseh Department of Cvl Engneerng, Natonal Central Unversty,
More informationCan Auto Liability Insurance Purchases Signal Risk Attitude?
Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? ChuShu L Department of Internatonal Busness, Asa Unversty, Tawan ShengChang
More informationMultiplePeriod Attribution: Residuals and Compounding
MultplePerod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens
More informationInequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001.
Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.
More informationA hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):18841889 Research Artcle ISSN : 09757384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel
More informationParallel Algorithms for Big Data Optimization
Parallel Algorthms for Bg Data Optmzaton 1 Francsco Facchne, Smone Sagratella, and Gesualdo Scutar Senor Member, IEEE Abstract We propose a decomposton framework for the parallel optmzaton of the sum of
More informationHow Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence
1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh
More informationStudy on CET4 Marks in China s Graded English Teaching
Study on CET4 Marks n Chna s Graded Englsh Teachng CHE We College of Foregn Studes, Shandong Insttute of Busness and Technology, P.R.Chna, 264005 Abstract: Ths paper deploys Logt model, and decomposes
More informationCausal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting
Causal, Explanatory Forecastng Assumes causeandeffect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of
More informationHYPOTHESIS TESTING OF PARAMETERS FOR ORDINARY LINEAR CIRCULAR REGRESSION
HYPOTHESIS TESTING OF PARAMETERS FOR ORDINARY LINEAR CIRCULAR REGRESSION Abdul Ghapor Hussn Centre for Foundaton Studes n Scence Unversty of Malaya 563 KUALA LUMPUR Emal: ghapor@umedumy Abstract Ths paper
More informationOptimal resource capacity management for stochastic networks
Submtted for publcaton. Optmal resource capacty management for stochastc networks A.B. Deker H. Mlton Stewart School of ISyE, Georga Insttute of Technology, Atlanta, GA 30332, ton.deker@sye.gatech.edu
More informationOn the Optimal Control of a Cascade of HydroElectric Power Stations
On the Optmal Control of a Cascade of HydroElectrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;
More informationDescriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications
CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary
More informationA Probabilistic Theory of Coherence
A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want
More informationMultivariate EWMA Control Chart
Multvarate EWMA Control Chart Summary The Multvarate EWMA Control Chart procedure creates control charts for two or more numerc varables. Examnng the varables n a multvarate sense s extremely mportant
More informationwhere the coordinates are related to those in the old frame as follows.
Chapter 2  Cartesan Vectors and Tensors: Ther Algebra Defnton of a vector Examples of vectors Scalar multplcaton Addton of vectors coplanar vectors Unt vectors A bass of noncoplanar vectors Scalar product
More informationState function: eigenfunctions of hermitian operators> normalization, orthogonality completeness
Schroednger equaton Basc postulates of quantum mechancs. Operators: Hermtan operators, commutators State functon: egenfunctons of hermtan operators> normalzaton, orthogonalty completeness egenvalues and
More informationThe Greedy Method. Introduction. 0/1 Knapsack Problem
The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton
More information2.4 Bivariate distributions
page 28 2.4 Bvarate dstrbutons 2.4.1 Defntons Let X and Y be dscrete r.v.s defned on the same probablty space (S, F, P). Instead of treatng them separately, t s often necessary to thnk of them actng together
More informationDEFINING %COMPLETE IN MICROSOFT PROJECT
CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMISP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,
More informationRobust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School
Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management
More informationPoint cloud to point cloud rigid transformations. Minimizing Rigid Registration Errors
Pont cloud to pont cloud rgd transformatons Russell Taylor 600.445 1 600.445 Fall 000014 Copyrght R. H. Taylor Mnmzng Rgd Regstraton Errors Typcally, gven a set of ponts {a } n one coordnate system and
More informationI. SCOPE, APPLICABILITY AND PARAMETERS Scope
D Executve Board Annex 9 Page A/R ethodologcal Tool alculaton of the number of sample plots for measurements wthn A/R D project actvtes (Verson 0) I. SOPE, PIABIITY AD PARAETERS Scope. Ths tool s applcable
More informationIntroduction: Analysis of Electronic Circuits
/30/008 ntroducton / ntroducton: Analyss of Electronc Crcuts Readng Assgnment: KVL and KCL text from EECS Just lke EECS, the majorty of problems (hw and exam) n EECS 3 wll be crcut analyss problems. Thus,
More informationThe covariance is the two variable analog to the variance. The formula for the covariance between two variables is
Regresson Lectures So far we have talked only about statstcs that descrbe one varable. What we are gong to be dscussng for much of the remander of the course s relatonshps between two or more varables.
More informationCS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements
Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there
More informationWe are now ready to answer the question: What are the possible cardinalities for finite fields?
Chapter 3 Fnte felds We have seen, n the prevous chapters, some examples of fnte felds. For example, the resdue class rng Z/pZ (when p s a prme) forms a feld wth p elements whch may be dentfed wth the
More informationQuality Adjustment of Secondhand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index
Qualty Adustment of Secondhand Motor Vehcle Applcaton of Hedonc Approach n Hong Kong s Consumer Prce Index Prepared for the 14 th Meetng of the Ottawa Group on Prce Indces 20 22 May 2015, Tokyo, Japan
More informationYves Genin, Yurii Nesterov, Paul Van Dooren. CESAME, Universite Catholique de Louvain. B^atiment Euler, Avenue G. Lema^tre 46
Submtted to ECC 99 as a regular paper n Lnear Systems Postve transfer functons and convex optmzaton 1 Yves Genn, Yur Nesterov, Paul Van Dooren CESAME, Unverste Catholque de Louvan B^atment Euler, Avenue
More informationNew Approaches to Support Vector Ordinal Regression
New Approaches to Support Vector Ordnal Regresson We Chu chuwe@gatsby.ucl.ac.uk Gatsby Computatonal Neuroscence Unt, Unversty College London, London, WCN 3AR, UK S. Sathya Keerth selvarak@yahoonc.com
More informationOn the Interaction between Load Balancing and Speed Scaling
On the Interacton between Load Balancng and Speed Scalng Ljun Chen, Na L and Steven H. Low Engneerng & Appled Scence Dvson, Calforna Insttute of Technology, USA Abstract Speed scalng has been wdely adopted
More informationANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING
ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 6105194390,
More informationAryabhata s Root Extraction Methods. Abhishek Parakh Louisiana State University Aug 31 st 2006
Aryabhata s Root Extracton Methods Abhshek Parakh Lousana State Unversty Aug 1 st 1 Introducton Ths artcle presents an analyss of the root extracton algorthms of Aryabhata gven n hs book Āryabhatīya [1,
More informationForecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network
700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School
More informationPeriod and Deadline Selection for Schedulability in RealTime Systems
Perod and Deadlne Selecton for Schedulablty n RealTme Systems Thdapat Chantem, Xaofeng Wang, M.D. Lemmon, and X. Sharon Hu Department of Computer Scence and Engneerng, Department of Electrcal Engneerng
More informationCalculating the high frequency transmission line parameters of power cables
< ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,
More informationSection B9: Zener Diodes
Secton B9: Zener Dodes When we frst talked about practcal dodes, t was mentoned that a parameter assocated wth the dode n the reverse bas regon was the breakdown voltage, BR, also known as the peaknverse
More informationSingle and multiple stage classifiers implementing logistic discrimination
Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul  PUCRS Av. Ipranga,
More informationAnts Can Schedule Software Projects
Ants Can Schedule Software Proects Broderck Crawford 1,2, Rcardo Soto 1,3, Frankln Johnson 4, and Erc Monfroy 5 1 Pontfca Unversdad Católca de Valparaíso, Chle FrstName.Name@ucv.cl 2 Unversdad Fns Terrae,
More information"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *
Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC
More informationTHE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek
HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo
More informationLatent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006
Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model
More informationJ. Parallel Distrib. Comput.
J. Parallel Dstrb. Comput. 71 (2011) 62 76 Contents lsts avalable at ScenceDrect J. Parallel Dstrb. Comput. journal homepage: www.elsever.com/locate/jpdc Optmzng server placement n dstrbuted systems n
More informationLinear Regression, Regularization BiasVariance Tradeoff
HTF: Ch3, 7 B: Ch3 Lnear Regresson, Regularzaton BasVarance Tradeoff Thanks to C Guestrn, T Detterch, R Parr, N Ray 1 Outlne Lnear Regresson MLE = Least Squares! Bass functons Evaluatng Predctors Tranng
More informationAnswer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy
4.02 Quz Solutons Fall 2004 MultpleChoce Questons (30/00 ponts) Please, crcle the correct answer for each of the followng 0 multplechoce questons. For each queston, only one of the answers s correct.
More informationEE201 Circuit Theory I 2015 Spring. Dr. Yılmaz KALKAN
EE201 Crcut Theory I 2015 Sprng Dr. Yılmaz KALKAN 1. Basc Concepts (Chapter 1 of Nlsson  3 Hrs.) Introducton, Current and Voltage, Power and Energy 2. Basc Laws (Chapter 2&3 of Nlsson  6 Hrs.) Voltage
More informationAn MILP model for planning of batch plants operating in a campaignmode
An MILP model for plannng of batch plants operatng n a campagnmode Yanna Fumero Insttuto de Desarrollo y Dseño CONICET UTN yfumero@santafeconcet.gov.ar Gabrela Corsano Insttuto de Desarrollo y Dseño
More information1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP)
6.3 /  Communcaton Networks II (Görg) SS20  www.comnets.unbremen.de Communcaton Networks II Contents. Fundamentals of probablty theory 2. Emergence of communcaton traffc 3. Stochastc & Markovan Processes
More informationTo Fill or not to Fill: The Gas Station Problem
To Fll or not to Fll: The Gas Staton Problem Samr Khuller Azarakhsh Malekan Julán Mestre Abstract In ths paper we study several routng problems that generalze shortest paths and the Travelng Salesman Problem.
More informationAn Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services
An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsnyng Wu b a Professor (Management Scence), Natonal Chao
More informationData Broadcast on a MultiSystem Heterogeneous Overlayed Wireless Network *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819840 (2008) Data Broadcast on a MultSystem Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,
More informationVision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION
Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble
More informationx f(x) 1 0.25 1 0.75 x 1 0 1 1 0.04 0.01 0.20 1 0.12 0.03 0.60
BIVARIATE DISTRIBUTIONS Let be a varable that assumes the values { 1,,..., n }. Then, a functon that epresses the relatve frequenc of these values s called a unvarate frequenc functon. It must be true
More informationSimple Interest Loans (Section 5.1) :
Chapter 5 Fnance The frst part of ths revew wll explan the dfferent nterest and nvestment equatons you learned n secton 5.1 through 5.4 of your textbook and go through several examples. The second part
More informationCalculation of Sampling Weights
Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a twostage stratfed cluster desgn. 1 The frst stage conssted of a sample
More informationAn Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems
STANCS73355 I SUSE73013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part
More informationSection 5.4 Annuities, Present Value, and Amortization
Secton 5.4 Annutes, Present Value, and Amortzaton Present Value In Secton 5.2, we saw that the present value of A dollars at nterest rate per perod for n perods s the amount that must be deposted today
More information2008/8. An integrated model for warehouse and inventory planning. Géraldine Strack and Yves Pochet
2008/8 An ntegrated model for warehouse and nventory plannng Géraldne Strack and Yves Pochet CORE Voe du Roman Pays 34 B1348 LouvanlaNeuve, Belgum. Tel (32 10) 47 43 04 Fax (32 10) 47 43 01 Emal: corestatlbrary@uclouvan.be
More informationMANY machine learning and pattern recognition applications
1 Trace Rato Problem Revsted Yangqng Ja, Fepng Ne, and Changshu Zhang Abstract Dmensonalty reducton s an mportant ssue n many machne learnng and pattern recognton applcatons, and the trace rato problem
More informationA Secure PasswordAuthenticated Key Agreement Using Smart Cards
A Secure PasswordAuthentcated Key Agreement Usng Smart Cards Ka Chan 1, WenChung Kuo 2 and JnChou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,
More informationDiVA Digitala Vetenskapliga Arkivet
DVA Dgtala Vetenskaplga Arkvet http://umudvaportalorg Ths s a book chapter publshed n Hghperformance scentfc computng: algorthms and applcatons (ed Berry, MW; Gallvan, KA; Gallopoulos, E; Grama, A; Phlppe,
More informationPassive Filters. References: Barbow (pp 265275), Hayes & Horowitz (pp 3260), Rizzoni (Chap. 6)
Passve Flters eferences: Barbow (pp 6575), Hayes & Horowtz (pp 360), zzon (Chap. 6) Frequencyselectve or flter crcuts pass to the output only those nput sgnals that are n a desred range of frequences (called
More information行 政 院 國 家 科 學 委 員 會 補 助 專 題 研 究 計 畫 成 果 報 告 期 中 進 度 報 告
行 政 院 國 家 科 學 委 員 會 補 助 專 題 研 究 計 畫 成 果 報 告 期 中 進 度 報 告 畫 類 別 : 個 別 型 計 畫 半 導 體 產 業 大 型 廠 房 之 設 施 規 劃 計 畫 編 號 :NSC 962628E009026MY3 執 行 期 間 : 2007 年 8 月 1 日 至 2010 年 7 月 31 日 計 畫 主 持 人 : 巫 木 誠 共 同
More informationStatistical Methods to Develop Rating Models
Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and
More information