Multi-Class Deep Boosting


Vitaly Kuznetsov, Courant Institute, 251 Mercer Street, New York, NY 10012
Mehryar Mohri, Courant Institute & Google Research, 251 Mercer Street, New York, NY 10012
Umar Syed, Google Research, 76 Ninth Avenue, New York, NY 10011

Abstract

We present new ensemble learning algorithms for multi-class classification. Our algorithms can use as a base classifier set a family of deep decision trees or other rich or complex families and yet benefit from strong generalization guarantees. We give new data-dependent learning bounds for convex ensembles in the multi-class classification setting expressed in terms of the Rademacher complexities of the sub-families composing the base classifier set, and the mixture weight assigned to each sub-family. These bounds are finer than existing ones both thanks to an improved dependency on the number of classes and, more crucially, by virtue of a more favorable complexity term expressed as an average of the Rademacher complexities based on the ensemble's mixture weights. We introduce and discuss several new multi-class ensemble algorithms benefiting from these guarantees, prove positive results for the H-consistency of several of them, and report the results of experiments showing that their performance compares favorably with that of multi-class versions of AdaBoost and Logistic Regression and their L1-regularized counterparts.

1 Introduction

Devising ensembles of base predictors is a standard approach in machine learning which often helps improve performance in practice. Ensemble methods include the family of boosting meta-algorithms, among which the most notable and widely used one is AdaBoost [Freund and Schapire, 1997], also known as forward stagewise additive modeling [Friedman et al., 1998]. AdaBoost and its other variants learn convex combinations of predictors. They seek to greedily minimize a convex surrogate function upper bounding the misclassification loss by augmenting, at each iteration, the current ensemble with a new suitably weighted predictor. One key advantage of AdaBoost is that, since it is based on a stagewise procedure, it can learn an effective ensemble of base predictors chosen from a very large and potentially infinite family, provided that an efficient algorithm is available for selecting a good predictor at each stage. Furthermore, AdaBoost and its L1-regularized counterpart [Rätsch et al., 2001a] benefit from favorable learning guarantees, in particular theoretical margin bounds [Schapire et al., 1997, Koltchinskii and Panchenko, 2002]. However, those bounds depend not just on the margin and the sample size, but also on the complexity of the base hypothesis set, which suggests a risk of overfitting when using too complex base hypothesis sets. And indeed, overfitting has been reported in practice for AdaBoost in the past [Grove and Schuurmans, 1998, Schapire, 1999, Dietterich, 2000, Rätsch et al., 2001b].

Cortes, Mohri, and Syed [2014] introduced a new ensemble algorithm, DeepBoost, which they proved to benefit from finer learning guarantees, including favorable ones even when using as base classifier set relatively rich families, for example a family of very deep decision trees, or other similarly complex families. In DeepBoost, the decisions in each iteration of which classifier to add to the ensemble and which weight to assign to that classifier depend on the (data-dependent) complexity

of the sub-family to which the classifier belongs: one interpretation of DeepBoost is that it applies the principle of structural risk minimization to each iteration of boosting. Cortes, Mohri, and Syed [2014] further showed that empirically DeepBoost achieves a better performance than AdaBoost, Logistic Regression, and their L1-regularized variants. The main contribution of this paper is an extension of these theoretical, algorithmic, and empirical results to the multi-class setting.

Two distinct approaches have been considered in the past for the definition and the design of boosting algorithms in the multi-class setting. One approach consists of combining base classifiers mapping each example x to an output label y. This includes the SAMME algorithm [Zhu et al., 2009] as well as the algorithm of Mukherjee and Schapire [2013], which is shown to be, in a certain sense, optimal for this approach. An alternative approach, often more flexible and more widely used in applications, consists of combining base classifiers mapping each pair (x, y) formed by an example x and a label y to a real-valued score. This is the approach adopted in this paper, which is also the one used for the design of AdaBoost.MR [Schapire and Singer, 1999] and other variants of that algorithm.

In Section 2, we prove a novel generalization bound for multi-class classification ensembles that depends only on the Rademacher complexity of the hypothesis classes to which the classifiers in the ensemble belong. Our result generalizes the main result of Cortes et al. [2014] to the multi-class setting, and also represents an improvement on the multi-class generalization bound due to Koltchinskii and Panchenko [2002], even if we disregard our finer analysis related to Rademacher complexity. In Section 3, we present several multi-class surrogate losses that are motivated by our generalization bound, and discuss and compare their functional and consistency properties. In particular, we prove that our surrogate losses are realizable H-consistent, a hypothesis-set-specific notion of consistency that was recently introduced by Long and Servedio [2013]. Our results generalize those of Long and Servedio [2013] and admit simpler proofs. We also present a family of multi-class DeepBoost learning algorithms based on each of these surrogate losses, and prove a general convergence guarantee for them. In Section 4, we report the results of experiments demonstrating that multi-class DeepBoost outperforms AdaBoost.MR and multinomial (additive) logistic regression, as well as their L1-norm regularized variants, on several datasets.

2 Multi-class data-dependent learning guarantee for convex ensembles

In this section, we present a data-dependent learning bound in the multi-class setting for convex ensembles based on multiple base hypothesis sets. Let X denote the input space. We denote by Y = {1, ..., c} a set of c ≥ 2 classes. The label associated by a hypothesis f: X × Y → R to x ∈ X is given by argmax_{y ∈ Y} f(x, y). The margin ρ_f(x, y) of the function f for a labeled example (x, y) ∈ X × Y is defined by

$$\rho_f(x, y) = f(x, y) - \max_{y' \ne y} f(x, y'). \qquad (1)$$

Thus, f misclassifies (x, y) iff ρ_f(x, y) ≤ 0. We consider p families H_1, ..., H_p of functions mapping from X × Y to [0, 1] and the ensemble family F = conv(∪_{k=1}^p H_k), that is the family of functions f of the form f = Σ_{t=1}^T α_t h_t, where α = (α_1, ..., α_T) is in the simplex and where, for each t ∈ [1, T], h_t is in H_{k_t} for some k_t ∈ [1, p]. We assume that training and test points are drawn i.i.d.
according to some distribution D over X × Y and denote by S = ((x_1, y_1), ..., (x_m, y_m)) a training sample of size m drawn according to D^m. For any ρ > 0, the generalization error R(f), its ρ-margin error R_ρ(f) and its empirical margin error are defined as follows:

$$R(f) = \mathop{\mathbb{E}}_{(x,y) \sim D}\big[1_{\rho_f(x,y) \le 0}\big], \quad R_\rho(f) = \mathop{\mathbb{E}}_{(x,y) \sim D}\big[1_{\rho_f(x,y) \le \rho}\big], \quad \widehat{R}_{S,\rho}(f) = \mathop{\mathbb{E}}_{(x,y) \sim S}\big[1_{\rho_f(x,y) \le \rho}\big], \qquad (2)$$

where the notation (x, y) ∼ S indicates that (x, y) is drawn according to the empirical distribution defined by S. For any family of hypotheses G mapping X × Y to R, we define Π_1(G) by

$$\Pi_1(G) = \{x \mapsto h(x, y) \colon y \in Y,\ h \in G\}. \qquad (3)$$
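To make these definitions concrete, here is a small illustrative sketch (ours, not part of the paper) that computes the margins of (1) and the empirical margin error of (2) from a matrix of ensemble scores; representing f by an m × c score array is an assumption of the sketch.

```python
import numpy as np

def margins(scores: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Multi-class margins rho_f(x_i, y_i) = f(x_i, y_i) - max_{y' != y_i} f(x_i, y').

    scores: (m, c) array of scores f(x_i, y) for each example and class.
    y:      (m,) array of true labels in {0, ..., c-1}.
    """
    m = scores.shape[0]
    true_scores = scores[np.arange(m), y]
    runner_up = scores.copy()
    runner_up[np.arange(m), y] = -np.inf      # exclude the true class from the max
    return true_scores - runner_up.max(axis=1)

def empirical_margin_error(scores: np.ndarray, y: np.ndarray, rho: float) -> float:
    """Fraction of sample points with margin at most rho, i.e. hat R_{S,rho}(f)."""
    return float(np.mean(margins(scores, y) <= rho))
```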

3 Theore. Assue p > and let H,..., H p be p failies of functions apping fro X Y to [0, ]. Fix ρ > 0. Then, for any δ > 0, with probability at least δ over the choice of a saple S of size drawn i.i.d. according to D, the following inequality holds for all f = T α th t F: Rf) R S,ρ f)+ 8c ρ α t R Π H kt ))+ 2 log p cρ + 4 ρ log ) c 2 ρ 2 log p 2 4 log p + log 2 δ 2, Thus, Rf) R S,ρ f) + 8c T log p [ ] ) ρ α tr H kt ) + O ρ 2 log ρ 2 c 2 4 log p. The full proof of theore 3 is given in Appendix B. Even for p =, that is for the special case of a single hypothesis set, our analysis iproves upon the ulti-class argin bound of Koltchinskii and Panchenko [2002] since our bound adits only a linear dependency on the nuber of classes c instead of a quadratic one. However, the ain rearkable benefit of this learning bound is that its coplexity ter adits an explicit dependency on the ixture coefficients α t. It is a weighted average of Radeacher coplexities with ixture weights α t, t [, T ]. Thus, the second ter of the bound suggests that, while soe hypothesis sets H k used for learning could have a large Radeacher coplexity, this ay not negatively affect generalization if the corresponding total ixture weight su of α t s corresponding to that hypothesis set) is relatively sall. Using such potentially coplex failies could help achieve a better argin on the training saple. The theore cannot be proven via the standard Radeacher coplexity analysis of Koltchinskii and Panchenko [2002] since the coplexity ter of the bound would then be R conv p k= H k)) = R p k= H k) which does not adit an explicit dependency on the ixture weights and is lower bounded by T α tr H kt ). Thus, the theore provides a finer learning bound than the one obtained via a standard Radeacher coplexity analysis. 3 Algoriths In this section, we will use the learning guarantees just described to derive several new enseble algoriths for ulti-class classification. 3. Optiization proble Let H,..., H p be p disjoint failies of functions taking values in [0, ] with increasing Radeacher coplexities R H k ), k [, p]. For any hypothesis h p k= H k, we denote by dh) the index of the hypothesis set it belongs to, that is h H dh). The bound of Theore 3 holds uniforly for all ρ > 0 and functions f conv p k= H k). Since the last ter of the bound does not depend on α, it suggests selecting α that would iniize: Gα) = ρf x i,y i) ρ + 8c ρ α t r t, where r t = R H dht)) and α. Since for any ρ > 0, f and f/ρ adit the sae generalization error, we can instead search for α 0 with T α t /ρ, which leads to in α 0 ρf x i,y i) + 8c α t r t s.t. α t ρ. 4) The first ter of the objective is not a convex function of α and its iniization is known to be coputationally hard. Thus, we will consider instead a convex upper bound. Let u Φ u) be a non-increasing convex function upper-bounding u u 0 over R. Φ ay be selected to be The condition P T αt = of Theore 3 can be relaxed to P T a null hypothesis h t = 0 for soe t). αt. To see this, use for exaple 3

Let u ↦ Φ(−u) be a non-increasing convex function upper-bounding u ↦ 1_{u ≤ 0} over R. Φ may be selected to be, for example, the exponential function as in AdaBoost [Freund and Schapire, 1997] or the logistic function. Using such an upper bound, we obtain the following convex optimization problem:

$$\min_{\alpha \ge 0} \frac{1}{m} \sum_{i=1}^{m} \Phi\big(1 - \rho_f(x_i, y_i)\big) + \lambda \sum_{t=1}^{T} \alpha_t r_t \quad \text{s.t.} \quad \sum_{t=1}^{T} \alpha_t \le \frac{1}{\rho}, \qquad (5)$$

where we introduced a parameter λ ≥ 0 controlling the balance between the magnitude of the values taken by the function Φ and the second term.² Introducing a Lagrange variable β ≥ 0 associated to the constraint in (5), the problem can be equivalently written as

$$\min_{\alpha \ge 0} \frac{1}{m} \sum_{i=1}^{m} \Phi\left(1 - \min_{y \ne y_i} \left[\sum_{t=1}^{T} \alpha_t h_t(x_i, y_i) - \sum_{t=1}^{T} \alpha_t h_t(x_i, y)\right]\right) + \sum_{t=1}^{T} (\lambda r_t + \beta) \alpha_t.$$

Here, β is a parameter that can be freely selected by the algorithm since any choice of its value is equivalent to a choice of ρ in (5). Since Φ is a non-decreasing function, the problem can be equivalently written as

$$\min_{\alpha \ge 0} \frac{1}{m} \sum_{i=1}^{m} \max_{y \ne y_i} \Phi\left(1 - \sum_{t=1}^{T} \alpha_t \big[h_t(x_i, y_i) - h_t(x_i, y)\big]\right) + \sum_{t=1}^{T} (\lambda r_t + \beta) \alpha_t.$$

Let {h_1, ..., h_N} be the set of distinct base functions, and let F_max be the objective function based on that expression:

$$F_{\max}(\alpha) = \frac{1}{m} \sum_{i=1}^{m} \max_{y \ne y_i} \Phi\left(1 - \sum_{j=1}^{N} \alpha_j h_j(x_i, y_i, y)\right) + \sum_{j=1}^{N} \Lambda_j \alpha_j, \qquad (6)$$

with α = (α_1, ..., α_N) ∈ R^N, h_j(x_i, y_i, y) = h_j(x_i, y_i) − h_j(x_i, y), and Λ_j = λr_j + β for all j ∈ [1, N]. Then, our optimization problem can be rewritten as min_{α ≥ 0} F_max(α). This defines a convex optimization problem since the domain {α ≥ 0} is a convex set and since F_max is convex: each term of the sum in its definition is convex as a pointwise maximum of convex functions (composition of the convex function Φ with an affine function) and the second term is a linear function of α. In general, F_max is not differentiable even when Φ is, but, since it is convex, it admits a sub-differential at every point. Additionally, along each direction, F_max admits left and right derivatives (both non-decreasing) and a differential everywhere except for a set that is at most countable.

3.2 Alternative objective functions

We now consider the following three natural upper bounds on F_max which admit useful properties that we will discuss later, the third one valid when Φ can be written as the composition of two functions Φ_1 and Φ_2, with Φ_1 a non-decreasing function:

$$F_{\mathrm{sum}}(\alpha) = \frac{1}{m} \sum_{i=1}^{m} \sum_{y \ne y_i} \Phi\left(1 - \sum_{j=1}^{N} \alpha_j h_j(x_i, y_i, y)\right) + \sum_{j=1}^{N} \Lambda_j \alpha_j \qquad (7)$$

$$F_{\mathrm{maxsum}}(\alpha) = \frac{1}{m} \sum_{i=1}^{m} \Phi\left(1 - \sum_{j=1}^{N} \alpha_j \rho_{h_j}(x_i, y_i)\right) + \sum_{j=1}^{N} \Lambda_j \alpha_j \qquad (8)$$

$$F_{\mathrm{compsum}}(\alpha) = \frac{1}{m} \sum_{i=1}^{m} \Phi_1\left(\sum_{y \ne y_i} \Phi_2\left(1 - \sum_{j=1}^{N} \alpha_j h_j(x_i, y_i, y)\right)\right) + \sum_{j=1}^{N} \Lambda_j \alpha_j. \qquad (9)$$

F_sum is obtained from F_max simply by replacing the max operator in the definition of F_max by a sum. Clearly, function F_sum is convex and inherits the differentiability properties of Φ. A drawback of F_sum is that for problems with very large c, as in structured prediction, the computation of the sum may require resorting to approximations.

²Note that this is a standard practice in the field of optimization. The optimization problem in (4) is equivalent to a vector optimization problem, where $\big(\frac{1}{m}\sum_{i=1}^m 1_{\rho_f(x_i, y_i) \le 1}, \sum_{t=1}^T \alpha_t r_t\big)$ is minimized over α. The latter problem can be scalarized, leading to the introduction of the parameter λ in (5).
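As a concrete reading of (6) and (7), the following sketch (ours; it assumes the ensemble scores f(x_i, y) = Σ_j α_j h_j(x_i, y) have been accumulated into an m × c array) evaluates F_max and F_sum and makes the max-versus-sum distinction explicit:

```python
import numpy as np

def f_max(scores, y, alpha, Lam, Phi=np.exp):
    """F_max (Eq. 6): average over examples of the max over incorrect labels
    of Phi(1 - score gap), plus the penalty sum_j Lambda_j alpha_j."""
    m = scores.shape[0]
    gaps = scores[np.arange(m), y][:, None] - scores   # f(x_i,y_i) - f(x_i,y)
    loss = Phi(1.0 - gaps)
    loss[np.arange(m), y] = -np.inf                    # exclude y = y_i from the max
    return loss.max(axis=1).mean() + float(np.dot(Lam, alpha))

def f_sum(scores, y, alpha, Lam, Phi=np.exp):
    """F_sum (Eq. 7): identical except that the max is replaced by a sum."""
    m = scores.shape[0]
    gaps = scores[np.arange(m), y][:, None] - scores
    loss = Phi(1.0 - gaps)
    loss[np.arange(m), y] = 0.0                        # exclude y = y_i from the sum
    return loss.sum(axis=1).mean() + float(np.dot(Lam, alpha))
```

Since every summand is non-negative, f_sum never falls below f_max on the same inputs, which is the sense in which (7) upper bounds (6).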

F_maxsum is obtained from F_max by noticing that, by the sub-additivity of the max operator, the following inequality holds:

$$\max_{y \ne y_i} \left(-\sum_{j=1}^{N} \alpha_j h_j(x_i, y_i, y)\right) \le \sum_{j=1}^{N} \alpha_j \max_{y \ne y_i} \big(-h_j(x_i, y_i, y)\big) = -\sum_{j=1}^{N} \alpha_j \rho_{h_j}(x_i, y_i).$$

As with F_sum, function F_maxsum is convex and admits the same differentiability properties as Φ. Unlike F_sum, F_maxsum does not require computing a sum over the classes. Furthermore, note that the expressions ρ_{h_j}(x_i, y_i), i ∈ [1, m], can be pre-computed prior to the application of any optimization algorithm. Finally, for Φ = Φ_1 ∘ Φ_2 with Φ_1 non-decreasing, the max operator can be replaced by a sum before applying Φ_1, as follows:

$$\max_{y \ne y_i} \Phi\big(1 - f(x_i, y_i, y)\big) = \Phi_1\left(\max_{y \ne y_i} \Phi_2\big(1 - f(x_i, y_i, y)\big)\right) \le \Phi_1\left(\sum_{y \ne y_i} \Phi_2\big(1 - f(x_i, y_i, y)\big)\right),$$

where $f(x_i, y_i, y) = \sum_{j=1}^{N} \alpha_j h_j(x_i, y_i, y)$. This leads to the definition of F_compsum.

In Appendix C, we discuss the consistency properties of the loss functions just introduced. In particular, we prove that the loss functions associated to F_max and F_sum are realizable H-consistent (see Long and Servedio [2013]) in the common cases where the exponential or logistic losses are used and that, similarly, in the common case where Φ_1(u) = log(1 + u) and Φ_2(u) = exp(u + 1), the loss function associated to F_compsum is H-consistent. Furthermore, in Appendix D, we show that, under some mild assumptions, the objective functions we just discussed are essentially within a constant factor of each other. Moreover, in the case of binary classification all of these objectives coincide.

3.3 Multi-class DeepBoost algorithms

In this section, we discuss in detail a family of multi-class DeepBoost algorithms, which are derived by application of coordinate descent to the objective functions discussed in the previous paragraphs. We will assume that Φ is differentiable over R and that Φ′(u) ≠ 0 for all u. This condition is not necessary; in particular, our presentation can be extended to non-differentiable functions such as the hinge loss, but it simplifies the presentation. In the case of the objective function F_compsum, we will assume that both Φ_1 and Φ_2, where Φ = Φ_1 ∘ Φ_2, are differentiable. Under these assumptions, F_sum, F_maxsum, and F_compsum are differentiable. F_max is not differentiable due to the presence of the max operators in its definition, but it admits a sub-differential at every point.

For convenience, let α_t = (α_{t,1}, ..., α_{t,N}) denote the vector obtained after t iterations and let α_0 = 0. Let e_k denote the kth unit vector in R^N, k ∈ [1, N]. For a differentiable objective F, we denote by F′(α, e_j) the directional derivative of F along the direction e_j at α. Our coordinate descent algorithm consists of first determining the direction of maximal descent, that is k = argmax_{j ∈ [1, N]} |F′(α_{t−1}, e_j)|, next of determining the best step η along that direction that preserves the non-negativity of α, η = argmin_{α_{t−1} + ηe_k ≥ 0} F(α_{t−1} + ηe_k), and updating α_{t−1} to α_t = α_{t−1} + ηe_k. We will refer to this method as projected coordinate descent. The following theorem provides a convergence guarantee for our algorithms in that case.

Theorem 2. Assume that Φ is twice differentiable and that Φ″(u) > 0 for all u ∈ R. Then, the projected coordinate descent algorithm applied to F converges to the solution α* of the optimization min_{α ≥ 0} F(α) for F = F_sum, F = F_maxsum, or F = F_compsum. If additionally Φ is strongly convex over the path of the iterates α_t, then there exists τ > 0 and γ > 0 such that for all t > τ,

$$F(\alpha_{t+1}) - F(\alpha^*) \le \left(1 - \frac{1}{\gamma}\right)\big(F(\alpha_t) - F(\alpha^*)\big). \qquad (10)$$

The proof is given in Appendix I and is based on the results of Luo and Tseng [1992].
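For concreteness, here is a minimal, generic rendering of the projected coordinate descent scheme analyzed in Theorem 2 (our sketch: forward differences and a bounded ternary line search stand in for the exact directional derivatives and closed-form steps derived in the appendices, and the step cap is arbitrary):

```python
import numpy as np

def line_search(phi, lo, hi, iters=60):
    """Ternary search for the minimizer of a convex 1-D function on [lo, hi]."""
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if phi(a) < phi(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

def projected_coordinate_descent(F, N, rounds=100, step_cap=10.0, eps=1e-8):
    """Generic projected coordinate descent over {alpha >= 0} (Section 3.3).

    Each round picks the coordinate whose directional derivative has the
    largest magnitude (approximated here by forward differences, a
    simplification of the selection rule in the text) and takes the best
    step along it that keeps alpha non-negative.
    """
    alpha = np.zeros(N)
    I = np.eye(N)
    for _ in range(rounds):
        F0 = F(alpha)
        grads = np.array([(F(alpha + eps * I[j]) - F0) / eps for j in range(N)])
        k = int(np.argmax(np.abs(grads)))
        # Step constrained to eta >= -alpha_k so that alpha stays non-negative.
        eta = line_search(lambda e: F(alpha + e * I[k]), -alpha[k], step_cap)
        alpha[k] += eta
    return alpha
```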
The theorem can in fact be extended to the case where, instead of the best direction, the derivative for the direction selected at each round is within a constant threshold of the best [Luo and Tseng, 1992]. The conditions of Theorem 2 hold for many cases in practice, in particular in the case of the exponential loss (Φ = exp) or the logistic loss (Φ(x) = log₂(1 + e^x)). In particular, linear convergence is guaranteed in those cases since both the exponential and logistic losses are strongly convex over a compact set containing the converging sequence of α_t's.
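In the convention Φ(1 − margin) used throughout this section, these two admissible losses and the derivatives Φ′ needed for the distributions D_t below can be written as follows (our sketch; both satisfy Φ″ > 0):

```python
import numpy as np

# Exponential loss: Phi(u) = e^u, with Phi'(u) = Phi''(u) = e^u > 0.
exp_loss = {"Phi": np.exp, "dPhi": np.exp}

# Logistic loss: Phi(u) = log2(1 + e^u), Phi'(u) = e^u / ((1 + e^u) ln 2),
# and Phi''(u) > 0 as well.
logistic_loss = {
    "Phi":  lambda u: np.log2(1.0 + np.exp(u)),
    "dPhi": lambda u: np.exp(u) / ((1.0 + np.exp(u)) * np.log(2.0)),
}
```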

MDEEPBOOSTSUM(S = ((x_1, y_1), ..., (x_m, y_m)))
 1  for i ← 1 to m do
 2      for y ∈ Y − {y_i} do
 3          D_1(i, y) ← 1/(m(c − 1))
 4  for t ← 1 to T do
 5      k ← argmin_{j ∈ [1, N]} ε_{t,j} + Λ_j m / (2 S_t)
 6      if (1 − ε_{t,k}) e^{α_{t−1,k}} − ε_{t,k} e^{−α_{t−1,k}} < Λ_k m / S_t then
 7          η_t ← −α_{t−1,k}
 8      else η_t ← log[ −Λ_k m / (2 ε_{t,k} S_t) + sqrt( (Λ_k m / (2 ε_{t,k} S_t))² + (1 − ε_{t,k}) / ε_{t,k} ) ]
 9      α_t ← α_{t−1} + η_t e_k
10      S_{t+1} ← Σ_{i=1}^m Σ_{y ≠ y_i} Φ′(1 − Σ_{j=1}^N α_{t,j} h_j(x_i, y_i, y))
11      for i ← 1 to m do
12          for y ∈ Y − {y_i} do
13              D_{t+1}(i, y) ← Φ′(1 − Σ_{j=1}^N α_{t,j} h_j(x_i, y_i, y)) / S_{t+1}
14  f ← Σ_{j=1}^N α_{T,j} h_j
15  return f

Figure 1: Pseudocode of the MDeepBoostSum algorithm for both the exponential loss and the logistic loss. The expression of the weighted error ε_{t,j} is given in (12).

We will refer to the algorithm defined by projected coordinate descent applied to F_sum by MDeepBoostSum, to F_maxsum by MDeepBoostMaxSum, to F_compsum by MDeepBoostCompSum, and to F_max by MDeepBoostMax. In the following, we briefly describe MDeepBoostSum, including its pseudocode. We give a detailed description of all of these algorithms in the supplementary material: MDeepBoostSum (Appendix E), MDeepBoostMaxSum (Appendix F), MDeepBoostCompSum (Appendix G), MDeepBoostMax (Appendix H).

Define f_t = Σ_{j=1}^N α_{t,j} h_j. Then, F_sum(α_t) can be rewritten as follows:

$$F_{\mathrm{sum}}(\alpha_t) = \frac{1}{m} \sum_{i=1}^{m} \sum_{y \ne y_i} \Phi\big(1 - f_t(x_i, y_i, y)\big) + \sum_{j=1}^{N} \Lambda_j \alpha_{t,j}.$$

For any t ∈ [1, T], we denote by D_t the distribution over [1, m] × [1, c] defined for all i ∈ [1, m] and y ∈ Y − {y_i} by

$$D_t(i, y) = \frac{\Phi'\big(1 - f_{t-1}(x_i, y_i, y)\big)}{S_t}, \qquad (11)$$

where S_t is a normalization factor, $S_t = \sum_{i=1}^{m} \sum_{y \ne y_i} \Phi'\big(1 - f_{t-1}(x_i, y_i, y)\big)$. For any j ∈ [1, N] and s ∈ [1, T], we also define the weighted error ε_{s,j} as follows:

$$\epsilon_{s,j} = \frac{1}{2}\left[1 - \mathop{\mathbb{E}}_{(i,y) \sim D_s}\big[h_j(x_i, y_i, y)\big]\right]. \qquad (12)$$

Figure 1 gives the pseudocode of the MDeepBoostSum algorithm. The details of the derivation of the expressions are given in Appendix E. In the special cases of the exponential loss (Φ(−u) = exp(−u)) or the logistic loss (Φ(−u) = log₂(1 + exp(−u))), a closed-form expression is given for the step size (lines 6-8), which is the same in both cases (see Sections E.2.1 and E.2.2). In the generic case, the step size can be found using a line search or other numerical methods.

The algorithms presented above have several connections with other boosting algorithms, particularly in the absence of regularization. We discuss these connections in detail in Appendix K.
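The sketch below (ours, not the authors' implementation; storing the base-classifier scores as an (N, m, c) array is an assumption) traces one round of Figure 1 for the exponential loss, from the distribution D_t of (11) and the weighted errors of (12) to the closed-form step of lines 6-8:

```python
import numpy as np

def mdeepboost_sum_round(H, y, alpha, Lam):
    """One round of MDeepBoostSum (Figure 1) for the exponential loss Phi(u) = e^u.

    H:     (N, m, c) scores h_j(x_i, y) of the N base classifiers.
    y:     (m,) true labels; alpha: (N,) current weights; Lam: (N,) penalties.
    """
    N, m, c = H.shape
    # f_{t-1}(x_i, y_i, y) = sum_j alpha_j [h_j(x_i, y_i) - h_j(x_i, y)].
    scores = np.tensordot(alpha, H, axes=1)             # (m, c)
    gaps = scores[np.arange(m), y][:, None] - scores    # (m, c)
    w = np.exp(1.0 - gaps)                              # Phi'(1 - f_{t-1})
    w[np.arange(m), y] = 0.0                            # only y != y_i contribute
    S = w.sum()
    D = w / S                                           # distribution D_t, Eq. (11)
    # Weighted errors eps_{t,j}, Eq. (12).
    hdiff = H[:, np.arange(m), y][:, :, None] - H       # h_j(x_i, y_i, y), (N, m, c)
    eps = 0.5 * (1.0 - (D[None] * hdiff).sum(axis=(1, 2)))
    # Descent direction (line 5) and step size (lines 6-8).
    k = int(np.argmin(eps + Lam * m / (2.0 * S)))
    if (1 - eps[k]) * np.exp(alpha[k]) - eps[k] * np.exp(-alpha[k]) < Lam[k] * m / S:
        eta = -alpha[k]                                 # zero out coordinate k
    else:
        r = Lam[k] * m / (2.0 * eps[k] * S)
        eta = np.log(-r + np.sqrt(r ** 2 + (1 - eps[k]) / eps[k]))
    alpha = alpha.copy()
    alpha[k] += eta
    return alpha
```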

4 Experiments

The algorithms presented in the previous sections can be used with a variety of different base classifier sets. For our experiments, we used multi-class binary decision trees. A multi-class binary decision tree in dimension d can be defined by a pair (t, h), where t is a binary tree with a variable-threshold question at each internal node, e.g., X_j ≤ θ, j ∈ [1, d], and h = (h_l)_{l ∈ Leaves(t)} a vector of distributions over the leaves Leaves(t) of t. At any leaf l ∈ Leaves(t), h_l(y) ∈ [0, 1] for all y ∈ Y and Σ_{y ∈ Y} h_l(y) = 1. For convenience, we will denote by t(x) the leaf l ∈ Leaves(t) associated to x by t. Thus, the score associated by (t, h) to a pair (x, y) ∈ X × Y is h_l(y) where l = t(x).

Let T_n denote the family of all multi-class decision trees with n internal nodes in dimension d. In Appendix J, we derive the following upper bound on the Rademacher complexity of T_n:

$$\mathcal{R}_m\big(\Pi_1(T_n)\big) \le \sqrt{\frac{(4n + 2) \log_2(d + 2) \log(m + 1)}{m}}. \qquad (13)$$

All of the experiments in this section use T_n as the family of base hypothesis sets (parametrized by n). Since T_n is a very large hypothesis set when n is large, for the sake of computational efficiency we make a few approximations. First, although our MDeepBoost algorithms were derived in terms of Rademacher complexity, we use the upper bound in Eq. (13) in place of the Rademacher complexity (thus, in Algorithm 1 we let Λ_n = λB_n + β, where B_n is the bound given in Eq. (13)). Secondly, instead of exhaustively searching for the best decision tree in T_n for each possible size n, we use the following greedy procedure: given the best decision tree of size n (starting with n = 1), we find the best decision tree of size n + 1 that can be obtained by splitting one leaf, and continue this procedure until some maximum depth K. Decision trees are commonly learned in this manner, and so in this context our Rademacher-complexity-based bounds can be viewed as a novel stopping criterion for decision tree learning. Let H_K be the set of trees found by the greedy algorithm just described. In each iteration t of MDeepBoost, we select the best tree in the set H_K ∪ {h_1, ..., h_{t−1}}, where h_1, ..., h_{t−1} are the trees selected in previous iterations.

While we described many objective functions that can be used as the basis of a multi-class deep boosting algorithm, the experiments in this section focus on algorithms derived from F_sum. We also refer the reader to Table 3 in Appendix A for results of experiments with F_compsum objective functions. The F_sum and F_compsum objectives combine several advantages that suggest they will perform well empirically. F_sum is consistent and both F_sum and F_compsum are (by Theorem 4) H-consistent. Also, unlike F_max, both of these objectives are differentiable, and therefore the convergence guarantee in Theorem 2 applies. Our preliminary findings also indicate that algorithms based on the F_sum and F_compsum objectives perform better than those derived from F_max and F_maxsum.

All of our objective functions require a choice for Φ, the loss function. Since Cortes et al. [2014] reported comparable results for the exponential and logistic losses for the binary version of DeepBoost, we let Φ be the exponential loss in all of our experiments with MDeepBoostSum. For MDeepBoostCompSum we select Φ_1(u) = log₂(1 + u) and Φ_2(u) = exp(u).

In our experiments, we used 8 UCI data sets: abalone, handwritten, letters, pageblocks, pendigits, satimage, statlog and yeast (see more details on these datasets in Table 4, Appendix L).
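In code, the penalty used in place of the Rademacher complexity reads as follows (our sketch; B_n is the bound of Eq. (13) and Λ_n = λB_n + β as in the text):

```python
import numpy as np

def tree_complexity_bound(n: int, d: int, m: int) -> float:
    """B_n: upper bound (Eq. 13) on R_m(Pi_1(T_n)) for multi-class binary
    decision trees with n internal nodes in dimension d, sample size m."""
    return float(np.sqrt((4 * n + 2) * np.log2(d + 2) * np.log(m + 1) / m))

def tree_penalty(n: int, d: int, m: int, lam: float, beta: float) -> float:
    """Lambda_n = lam * B_n + beta, the per-tree penalty used in the experiments."""
    return lam * tree_complexity_bound(n, d, m) + beta
```

Because B_n grows with the number of internal nodes n, a deeper tree must buy a proportionally larger margin improvement to be selected, which is the stopping-criterion reading given above.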
In Appendix K, we explain that when λ = β = 0, MDeepBoostSum is equivalent to AdaBoost.MR. Also, if we set λ = 0 and β ≠ 0 then the resulting algorithm is an L1-norm regularized variant of AdaBoost.MR. We compared MDeepBoostSum to these two algorithms, with the results also reported in Table 1 and Table 2 in Appendix A. Likewise, we compared MDeepBoostCompSum with multinomial (additive) logistic regression, LogReg, and its L1-regularized version, LogReg-L1, which, as discussed in Appendix K, are equivalent to MDeepBoostCompSum when λ = β = 0 and λ = 0, β ≠ 0, respectively. Finally, we remark that it can be argued that the parameter optimization procedure (described below) significantly extends AdaBoost.MR since it effectively implements structural risk minimization: for each tree depth, the empirical error is minimized and we choose the depth to achieve the best generalization error.

All of these algorithms use the maximum tree depth K as a parameter. The L1-norm regularized versions admit two parameters: K and β ≥ 0. Deep boosting algorithms have a third parameter, λ ≥ 0. To set these parameters, we used the following parameter optimization procedure: we randomly partitioned each dataset into 4 folds and, for each tuple (λ, β, K) in the set of possible parameters (described below), we ran MDeepBoostSum with a different assignment of folds to the training set, validation set and test set for each run.

Table 1: Empirical results for MDeepBoostSum, Φ = exp. AB stands for AdaBoost. For each of the eight datasets (abalone, handwritten, letters, pageblocks, pendigits, satimage, statlog, yeast), the table reports the average test error and its standard deviation for AB.MR, AB.MR-L1, and MDeepBoost. [Numeric entries illegible in this transcription.]

Specifically, for each run i ∈ {0, 1, 2, 3}, fold i was used for testing, fold i + 1 (mod 4) was used for validation, and the remaining folds were used for training. For each run, we selected the parameters that had the lowest error on the validation set and then measured the error of those parameters on the test set. The average test error and the standard deviation of the test error over all 4 runs is reported in Table 1. Note that an alternative procedure to compare algorithms that is adopted in a number of previous studies of boosting [Li, 2009a,b, Sun et al., 2012] is to simply record the average test error of the best parameter tuples over all runs. While it is of course possible to overestimate the performance of a learning algorithm by optimizing hyperparameters on the test set, this concern is less valid when the size of the test set is large relative to the complexity of the hyperparameter space. We report results for this alternative procedure in Table 2 and Table 3, Appendix A.

For each dataset, the set of possible values for λ and β was initialized to {10^{-5}, 10^{-6}, ..., 10^{-10}}, and to {1, 2, 3, 4, 5} for the maximum tree depth K. However, if we found an optimal parameter value to be at the end point of these ranges, we extended the interval in that direction (by an order of magnitude for λ and β, and by 1 for the maximum tree depth K) and re-ran the experiments. We have also experimented with 200 and 500 iterations but we have observed that the errors do not change significantly and the ranking of the algorithms remains the same.

The results of our experiments show that, for each dataset, deep boosting algorithms outperform the other algorithms evaluated in our experiments. Let us point out that, even though not all of our results are statistically significant, MDeepBoostSum outperforms AdaBoost.MR and AdaBoost.MR-L1 (and, hence, effectively structural risk minimization) on each dataset. More importantly, for each dataset MDeepBoostSum outperforms the other algorithms on most of the individual runs. Moreover, results for some datasets presented here (namely pendigits) appear to be state-of-the-art. We also refer our reader to the experimental results summarized in Table 2 and Table 3 in Appendix A. These results provide further evidence in favor of DeepBoost algorithms. The consistent performance improvement by MDeepBoostSum over AdaBoost.MR or its L1-norm regularized variant shows the benefit of the new complexity-based regularization we introduced.

5 Conclusion

We presented new data-dependent learning guarantees for convex ensembles in the multi-class setting where the base classifier set is composed of increasingly complex sub-families, including very deep or complex ones. These learning bounds generalize to the multi-class setting the guarantees presented by Cortes et al. [2014] in the binary case.
We also introduced and discussed several new multi-class ensemble algorithms benefiting from these guarantees and proved positive results for the H-consistency and convergence of several of them. Finally, we reported the results of several experiments with DeepBoost algorithms, and compared their performance with that of AdaBoost.MR and additive multinomial Logistic Regression and their L1-regularized variants.

Acknowledgments

We thank Andres Muñoz Medina and Scott Yang for discussions and help with the experiments. This work was partly funded by the NSF award IIS-1117591 and supported by a NSERC PGS grant.

References

P. Bühlmann and B. Yu. Boosting with the L2 loss. J. of the Amer. Stat. Assoc., 98(462):324-339, 2003.
M. Collins, R. E. Schapire, and Y. Singer. Logistic regression, AdaBoost and Bregman distances. Machine Learning, 48:253-285, September 2002.
C. Cortes, M. Mohri, and U. Syed. Deep boosting. In ICML, pages 1179-1187, 2014.
T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2):139-157, 2000.
J. C. Duchi and Y. Singer. Boosting with structural sparsity. In ICML, page 38, 2009.
N. Duffy and D. P. Helmbold. Potential boosters? In NIPS, 1999.
Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.
J. H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29:1189-1232, 2001.
J. H. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. Annals of Statistics, 28:2000, 1998.
A. J. Grove and D. Schuurmans. Boosting in the limit: Maximizing the margin of learned ensembles. In AAAI/IAAI, pages 692-699, 1998.
J. Kivinen and M. K. Warmuth. Boosting as entropy projection. In COLT, pages 134-144, 1999.
V. Koltchinskii and D. Panchenko. Empirical margin distributions and bounding the generalization error of combined classifiers. Annals of Statistics, 30, 2002.
M. Ledoux and M. Talagrand. Probability in Banach Spaces: Isoperimetry and Processes. Springer, 1991.
P. Li. ABC-boost: adaptive base class boost for multi-class classification. In ICML, page 79, 2009a.
P. Li. ABC-logitboost for multi-class classification. Technical report, Rutgers University, 2009b.
P. M. Long and R. A. Servedio. Consistency versus realizable H-consistency for multiclass classification. In ICML (3), pages 801-809, 2013.
Z.-Q. Luo and P. Tseng. On the convergence of coordinate descent method for convex differentiable minimization. Journal of Optimization Theory and Applications, 72(1):7-35, 1992.
L. Mason, J. Baxter, P. L. Bartlett, and M. R. Frean. Boosting algorithms as gradient descent. In NIPS, 1999.
M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. The MIT Press, 2012.
I. Mukherjee and R. E. Schapire. A theory of multiclass boosting. JMLR, 14(1):437-497, 2013.
G. Rätsch and M. K. Warmuth. Maximizing the margin with boosting. In COLT, 2002.
G. Rätsch and M. K. Warmuth. Efficient margin maximizing with boosting. JMLR, 6:2131-2152, 2005.
G. Rätsch, S. Mika, and M. K. Warmuth. On the convergence of leveraging. In NIPS, 2001a.
G. Rätsch, T. Onoda, and K.-R. Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287-320, 2001b.
R. E. Schapire. Theoretical views of boosting and applications. In Proceedings of ALT 1999, volume 1720 of Lecture Notes in Computer Science, pages 13-25. Springer, 1999.
R. E. Schapire and Y. Freund. Boosting: Foundations and Algorithms. The MIT Press, 2012.
R. E. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3):297-336, 1999.
R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. In ICML, pages 322-330, 1997.
P. Sun, M. D. Reid, and J. Zhou. AOSO-LogitBoost: Adaptive one-vs-one LogitBoost for multi-class problem. In ICML, 2012.
A. Tewari and P. L. Bartlett. On the consistency of multiclass classification methods. JMLR, 8:1007-1025, 2007.
M. K. Warmuth, J. Liao, and G. Rätsch. Totally corrective boosting algorithms that maximize the margin. In ICML, 2006.
T. Zhang. Statistical analysis of some multi-category large margin classification methods. JMLR, 5:1225-1251, 2004a.
T. Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization. Annals of Statistics, 32(1):56-85, 2004b.
J. Zhu, H. Zou, S. Rosset, and T. Hastie. Multi-class AdaBoost. Statistics and Its Interface, 2:349-360, 2009.
H. Zou, J. Zhu, and T. Hastie. New multicategory boosting algorithms based on multicategory Fisher-consistent losses. Annals of Applied Statistics, 2(4):1290-1306, 2008.


Partitioned Elias-Fano Indexes

Partitioned Elias-Fano Indexes Partitioned Elias-ano Indexes Giuseppe Ottaviano ISTI-CNR, Pisa giuseppe.ottaviano@isti.cnr.it Rossano Venturini Dept. of Coputer Science, University of Pisa rossano@di.unipi.it ABSTRACT The Elias-ano

More information

Source. The Boosting Approach. Example: Spam Filtering. The Boosting Approach to Machine Learning

Source. The Boosting Approach. Example: Spam Filtering. The Boosting Approach to Machine Learning Source The Boosting Approach to Machine Learning Notes adapted from Rob Schapire www.cs.princeton.edu/~schapire CS 536: Machine Learning Littman (Wu, TA) Example: Spam Filtering problem: filter out spam

More information

The Virtual Spring Mass System

The Virtual Spring Mass System The Virtual Spring Mass Syste J. S. Freudenberg EECS 6 Ebedded Control Systes Huan Coputer Interaction A force feedbac syste, such as the haptic heel used in the EECS 6 lab, is capable of exhibiting a

More information

Modified Latin Hypercube Sampling Monte Carlo (MLHSMC) Estimation for Average Quality Index

Modified Latin Hypercube Sampling Monte Carlo (MLHSMC) Estimation for Average Quality Index Analog Integrated Circuits and Signal Processing, vol. 9, no., April 999. Abstract Modified Latin Hypercube Sapling Monte Carlo (MLHSMC) Estiation for Average Quality Index Mansour Keraat and Richard Kielbasa

More information

AdaBoost. Jiri Matas and Jan Šochman. Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz

AdaBoost. Jiri Matas and Jan Šochman. Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz AdaBoost Jiri Matas and Jan Šochman Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz Presentation Outline: AdaBoost algorithm Why is of interest? How it works? Why

More information

Equivalent Tapped Delay Line Channel Responses with Reduced Taps

Equivalent Tapped Delay Line Channel Responses with Reduced Taps Equivalent Tapped Delay Line Channel Responses with Reduced Taps Shweta Sagari, Wade Trappe, Larry Greenstein {shsagari, trappe, ljg}@winlab.rutgers.edu WINLAB, Rutgers University, North Brunswick, NJ

More information

Energy Proportionality for Disk Storage Using Replication

Energy Proportionality for Disk Storage Using Replication Energy Proportionality for Disk Storage Using Replication Jinoh Ki and Doron Rote Lawrence Berkeley National Laboratory University of California, Berkeley, CA 94720 {jinohki,d rote}@lbl.gov Abstract Energy

More information

ASIC Design Project Management Supported by Multi Agent Simulation

ASIC Design Project Management Supported by Multi Agent Simulation ASIC Design Project Manageent Supported by Multi Agent Siulation Jana Blaschke, Christian Sebeke, Wolfgang Rosenstiel Abstract The coplexity of Application Specific Integrated Circuits (ASICs) is continuously

More information

Foundations of Machine Learning On-Line Learning. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Foundations of Machine Learning On-Line Learning. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Foundations of Machine Learning On-Line Learning Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Motivation PAC learning: distribution fixed over time (training and test). IID assumption.

More information

Construction Economics & Finance. Module 3 Lecture-1

Construction Economics & Finance. Module 3 Lecture-1 Depreciation:- Construction Econoics & Finance Module 3 Lecture- It represents the reduction in arket value of an asset due to age, wear and tear and obsolescence. The physical deterioration of the asset

More information

Table 1: Summary of the settings and parameters employed by the additive PA algorithm for classification, regression, and uniclass.

Table 1: Summary of the settings and parameters employed by the additive PA algorithm for classification, regression, and uniclass. Online Passive-Aggressive Algorithms Koby Crammer Ofer Dekel Shai Shalev-Shwartz Yoram Singer School of Computer Science & Engineering The Hebrew University, Jerusalem 91904, Israel {kobics,oferd,shais,singer}@cs.huji.ac.il

More information

SAMPLING METHODS LEARNING OBJECTIVES

SAMPLING METHODS LEARNING OBJECTIVES 6 SAMPLING METHODS 6 Using Statistics 6-6 2 Nonprobability Sapling and Bias 6-6 Stratified Rando Sapling 6-2 6 4 Cluster Sapling 6-4 6 5 Systeatic Sapling 6-9 6 6 Nonresponse 6-2 6 7 Suary and Review of

More information

Performance Evaluation of Machine Learning Techniques using Software Cost Drivers

Performance Evaluation of Machine Learning Techniques using Software Cost Drivers Perforance Evaluation of Machine Learning Techniques using Software Cost Drivers Manas Gaur Departent of Coputer Engineering, Delhi Technological University Delhi, India ABSTRACT There is a treendous rise

More information

6. Time (or Space) Series Analysis

6. Time (or Space) Series Analysis ATM 55 otes: Tie Series Analysis - Section 6a Page 8 6. Tie (or Space) Series Analysis In this chapter we will consider soe coon aspects of tie series analysis including autocorrelation, statistical prediction,

More information

Managing Complex Network Operation with Predictive Analytics

Managing Complex Network Operation with Predictive Analytics Managing Coplex Network Operation with Predictive Analytics Zhenyu Huang, Pak Chung Wong, Patrick Mackey, Yousu Chen, Jian Ma, Kevin Schneider, and Frank L. Greitzer Pacific Northwest National Laboratory

More information

Pricing Asian Options using Monte Carlo Methods

Pricing Asian Options using Monte Carlo Methods U.U.D.M. Project Report 9:7 Pricing Asian Options using Monte Carlo Methods Hongbin Zhang Exaensarbete i ateatik, 3 hp Handledare och exainator: Johan Tysk Juni 9 Departent of Matheatics Uppsala University

More information

A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS

A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS Isaac Zafrany and Sa BenYaakov Departent of Electrical and Coputer Engineering BenGurion University of the Negev P. O. Box

More information

Lecture L26-3D Rigid Body Dynamics: The Inertia Tensor

Lecture L26-3D Rigid Body Dynamics: The Inertia Tensor J. Peraire, S. Widnall 16.07 Dynaics Fall 008 Lecture L6-3D Rigid Body Dynaics: The Inertia Tensor Version.1 In this lecture, we will derive an expression for the angular oentu of a 3D rigid body. We shall

More information

CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS

CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS 641 CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS Marketa Zajarosova 1* *Ph.D. VSB - Technical University of Ostrava, THE CZECH REPUBLIC arketa.zajarosova@vsb.cz Abstract Custoer relationship

More information

Near-Optimal Power Control in Wireless Networks: A Potential Game Approach

Near-Optimal Power Control in Wireless Networks: A Potential Game Approach Near-Optial Power Control in Wireless Networks: A Potential Gae Approach Utku Ozan Candogan, Ishai Menache, Asuan Ozdaglar and Pablo A. Parrilo Laboratory for Inforation and Decision Systes Massachusetts

More information

Endogenous Market Structure and the Cooperative Firm

Endogenous Market Structure and the Cooperative Firm Endogenous Market Structure and the Cooperative Fir Brent Hueth and GianCarlo Moschini Working Paper 14-WP 547 May 2014 Center for Agricultural and Rural Developent Iowa State University Aes, Iowa 50011-1070

More information

Markovian inventory policy with application to the paper industry

Markovian inventory policy with application to the paper industry Coputers and Cheical Engineering 26 (2002) 1399 1413 www.elsevier.co/locate/copcheeng Markovian inventory policy with application to the paper industry K. Karen Yin a, *, Hu Liu a,1, Neil E. Johnson b,2

More information

Position Auctions and Non-uniform Conversion Rates

Position Auctions and Non-uniform Conversion Rates Position Auctions and Non-unifor Conversion Rates Liad Blurosen Microsoft Research Mountain View, CA 944 liadbl@icrosoft.co Jason D. Hartline Shuzhen Nong Electrical Engineering and Microsoft AdCenter

More information

Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects

Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects Lucas Grèze Robert Pellerin Nathalie Perrier Patrice Leclaire February 2011 CIRRELT-2011-11 Bureaux

More information

5.7 Chebyshev Multi-section Matching Transformer

5.7 Chebyshev Multi-section Matching Transformer /9/ 5_7 Chebyshev Multisection Matching Transforers / 5.7 Chebyshev Multi-section Matching Transforer Reading Assignent: pp. 5-55 We can also build a ultisection atching network such that Γ f is a Chebyshev

More information