Dual unification of biclass Support Vector Machine formulations


L. González (a), C. Angulo (b), F. Velasco (a) and A. Català (b)
(a) COSDE Group, Depto. de Economía Aplicada I, Universidad de Sevilla, Sevilla, Spain
(b) GREC, Universitat Politècnica de Catalunya, E-08800 Vilanova i Geltrú, Spain

Abstract

Support Vector Machine (SVM) theory was originally developed on the basis of a linearly separable binary classification problem, and other approaches have later been introduced for this problem. In this paper it is demonstrated that all these approaches admit the same dual problem formulation in the linearly separable case, and that all the solutions are equivalent. For the nonlinearly separable case, all the approaches can also be formulated as a unique dual optimization problem; however, their solutions are not equivalent. Discussions and remarks in the article point to an in-depth comparison between SVM formulations and associated parameters.

Key words: SVM; large margin principle; biclassification; optimization; convex hull.

1 Introduction

Support Vector Machines are learning machines which implement the structural risk minimization inductive principle to obtain good generalization from a limited number of learning patterns. The theory was originally developed by V. Vapnik on the basis of a linearly separable binary classification problem with signed outputs ±1 [1]. SVMs present sound theoretical properties and behavior in binary classification problems [2]. Many papers exist generalizing the original biclass approach to multiclassification problems [3,4] through different algorithms, such as 1-versus-rest SVM or 1-versus-1 SVM. This paper unifies the known dual formulations for biclass SVM approaches and improves their generalization ability when the proposed approach is used for multiclassification problems.

Preprint submitted to Elsevier Science, 21st August 2005.
This paper is organized as follows: in Section 2, the standard SVM classification learning paradigm is briefly presented in order to introduce some notation. Several SVM approaches are shown, and it is demonstrated that all of them can be formulated as a unique dual optimization problem in the linearly separable case, where all the solutions are equivalent. Section 3 is devoted to the nonlinearly separable case: a theorem is derived indicating that all the approaches can again be formulated as a unique dual optimization problem, although their solutions are no longer equivalent. Finally, some concluding remarks are made.

2 Bi-Class SVM Learning

The SVM is an implementation of a more general regularization principle known as the large margin principle [2]. Let Z = ((x_1, y_1), ..., (x_n, y_n)) = (z_1, ..., z_n) ∈ (X × Y)^n be a training set, with X the input space and Y = {θ_1, θ_2} = {−1, +1} the output space. Let φ : X → F ⊆ R^d, with φ = (φ_1, ..., φ_d), be a feature mapping for the usual kernel trick; F is named the feature space. Let x := φ(x) ∈ F be the representation of x ∈ X. A (binary) linear classifier,

    f_w(x) = ⟨φ(x), w⟩ − b = ⟨x, w⟩ − b,

is sought in the space F, with f_w : X → R, and outputs are obtained by thresholding f_w: h_w(x) = sign(f_w(x)). The term b is called the bias. The separating hyperplane π_b identified by the linear classifier is given by {x ∈ X : ⟨φ(x), w⟩ = b}. It is in canonical form [5] w.r.t. Z when

    min_{z_i ∈ Z} |⟨x_i, w⟩ − b| = 1.    (1)

The exact definition of the margin between classes varies according to the authors. Usually the margin is defined [6] as 2/‖w‖, by assuming that the restriction in (1) is achieved at some point in both classes. However, the exact equality for both classes is not absolutely necessary; it could be attained in only one class when nonlinear problems are considered.
Hence, in [5] the margin is defined as the distance from the point of both classes which is closest to the hyperplane π_b, given by 1/‖w‖, whereas in [7] the margin is defined as twice this distance.

In this work, classes will initially be considered linearly separable in the feature space. Let w be the director vector of a separating hyperplane, which exists since the classes are linearly separable. Let β and α be the minimum and maximum values for each class in Y = {θ_1, θ_2} = {+1, −1}, effectively
attained for some patterns z ∈ Z_1 and z ∈ Z_2 since both sets are finite, with

    β = min_{z_i ∈ Z_1} ⟨x_i, w⟩ = min_{z_i ∈ Z_1} y_i ⟨x_i, w⟩,
    α = max_{z_j ∈ Z_2} ⟨x_j, w⟩ = − min_{z_j ∈ Z_2} y_j ⟨x_j, w⟩,    (2)

where Z_1 and Z_2 are respectively the patterns belonging to the classes labeled {θ_1, θ_2} = {+1, −1}. We can consider β > α; otherwise we can consider the vector −w. A natural choice for the bias, ensuring positive and negative outputs for the patterns in the respective classes, would be

    b = (α + β)/2.    (3)

In this paper, the margin is defined as the distance between the parallel hyperplanes π_α : ⟨w, x⟩ − α = 0 and π_β : ⟨w, x⟩ − β = 0,

    d(π_α, π_β) = (β − α)/‖w‖.    (4)

Furthermore, it is always possible to scale w in π_b such that ‖w‖ = 1; in this case, the difference β − α is called the geometrical margin.

2.1 Finding the large margin classifier

From (4), it follows that the classifier w with the largest margin on a given training sample Z is

    w_LM := argmax_{w ∈ F; α, β ∈ R} (β − α)/‖w‖.    (5)

From (2), it is derived that

    (β − α)/‖w‖ = (1/‖w‖) [ min_{z_i ∈ Z_1} y_i ⟨x_i, w⟩ + min_{z_j ∈ Z_2} y_j ⟨x_j, w⟩ ].    (6)

Many possibilities exist to translate this problem into an optimization problem. Several formulation approaches are presented below, and some remarks and improvements are made.
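As a quick numerical illustration of the quantities in (2), (3) and (4), the following self-contained Python sketch computes β, α, the natural bias b = (α + β)/2 and the margin (β − α)/‖w‖. The toy sample and the fixed direction w are invented for this example; the sketch only evaluates the definitions, it is not part of the paper's method:

```python
from math import sqrt

# Toy linearly separable sample in R^2 (illustrative values only).
Z1 = [(2.0, 0.0), (3.0, 1.0)]    # patterns of class theta_1 = +1
Z2 = [(0.0, 0.0), (-1.0, 1.0)]   # patterns of class theta_2 = -1
w = (1.0, 0.0)                   # some separating direction

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

beta = min(dot(x, w) for x in Z1)          # eq. (2): beta = min over Z1 of <x, w>
alpha = max(dot(x, w) for x in Z2)         # eq. (2): alpha = max over Z2 of <x, w>
b = (alpha + beta) / 2                     # eq. (3): natural bias choice
margin = (beta - alpha) / sqrt(dot(w, w))  # eq. (4): distance between pi_alpha and pi_beta

print(beta, alpha, b, margin)              # 2.0 0.0 1.0 2.0
```

For this data, every pattern of the positive class scores above b = 1 and every pattern of the negative class below it, as the bias choice (3) intends.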
2.1.1 Standard primal SVM norm formulation

The classifier w associated to (6) can be interpreted as [8]

    w_SVM := argmax_{w ∈ F; b ∈ R} (1/‖w‖) min_{z_i ∈ Z} y_i (⟨x_i, w⟩ − b).

A computationally straightforward method of casting problem (5) is to minimize the norm ‖w‖ while the margin is fixed to β − α = 2. The optimal separating hyperplane obtained must be in canonical form if the bias term is introduced by defining β = b + 1 and α = b − 1. Hence, the problem is translated into the optimization problem

    min_{w ∈ F; b ∈ R} (1/2)‖w‖²   s.t.  y_i(⟨x_i, w⟩ − b) ≥ 1, ∀ z_i ∈ Z,    (7)

and the usual formulation for SVMs [1] is obtained. The bias term is calculated a posteriori by using the Karush-Kuhn-Tucker (KKT) conditions or other more computationally robust expressions [9].

2.1.2 2-classes ordinal regression formulation

Alternatively, (5) can be solved by maximizing the difference β − α while ‖w‖ is fixed to unity, i.e. by maximizing the geometrical margin. This is a nonlinear nonconvex restriction and hence leads to an associated optimization problem which is harder to solve than (7). Recently, it has been proved in [10] that this nonconvex restriction can be replaced by the convex constraint ⟨w, w⟩ ≤ 1, since the optimal solution has unit magnitude in order to optimize the objective function. Hence, the resulting optimization problem can be expressed as

    min_{w ∈ F; α, β ∈ R} α − β
    s.t.  ⟨x_i, w⟩ ≥ β, ∀ z_i ∈ Z_1;  ⟨x_j, w⟩ ≤ α, ∀ z_j ∈ Z_2;  ⟨w, w⟩ ≤ 1,    (8)

which is as straightforward to solve as that in (7). The original formulation in [10] is designed for multiclass ordinal regression, and therefore an additional constraint, α ≤ β, appears to prevent overlapping between classes. We establish that this restriction is no longer necessary when only two linearly separable classes are considered, since the ordination only depends on the sign of the vector w.
2.1.3 C-Margin formulation

A geometrical approach to solving (5) is presented in [11]. The optimization problem can be written as

    min_{w ∈ F; α, β ∈ R} (1/2)‖w‖² + α − β
    s.t.  ⟨x_i, w⟩ ≥ β, ∀ z_i ∈ Z_1;  ⟨x_j, w⟩ ≤ α, ∀ z_j ∈ Z_2.    (9)

Moreover, it is also shown that the standard primal SVM norm formulation (7) can be derived from (9) by setting β − α = 2 and using (3) for the bias choice. In a similar direct way, it could now be shown that the problem leading to formulation (8) can be obtained from (9) by fixing ⟨w, w⟩ = 1. A more general approach would be to consider the objective function in (9) as

    (1/2)‖w‖² + C(α − β),    (10)

where a tunable trade-off exists between ‖w‖ and α − β by means of C > 0.

2.1.4 Exact margin maximization

The alternative geometrical framework introduced in [11] and developed in a more general form in [12] has been employed in [7] for Banach spaces. Hence, for Hilbert spaces the problem becomes

    min_{w ∈ F; α, β ∈ R} ‖w‖/(β − α)
    s.t.  ⟨x_i, w⟩ ≥ β, ∀ z_i ∈ Z_1;  ⟨x_j, w⟩ ≤ α, ∀ z_j ∈ Z_2,    (11)

which is the same as in (7) when a different bias, b = (α + β)/(β − α), is defined.

2.2 Dualization of all the approaches

The most important contribution of this article is the following demonstration that all the proposed Quadratic Programming problem formulations are the same when dualized. Firstly, a proposition will be demonstrated. Henceforth, the notation is simplified by omitting the ranges of the subscripts and writing xy = ⟨x, y⟩.
Proposition 1. Let the problem be

    min_{x_1 ∈ C_1, x_2 ∈ C_2} ‖x_1 − x_2‖,

provided that C_1 and C_2 are convex sets with C_1 ∩ C_2 = ∅. Let x_1* ∈ C_1 and x_2* ∈ C_2 be solution points; then, by defining w_0 = x_1* − x_2*, it is verified that¹

    x_1* w_0 = min_{x ∈ C_1} x w_0,    x_2* w_0 = max_{x ∈ C_2} x w_0.    (12)

PROOF. If we define β_0 = x_1* w_0 and α_0 = x_2* w_0, then β_0 − α_0 = ‖w_0‖² > 0 since C_1 ∩ C_2 = ∅. Let π_a : x w_0 − a = 0 be a hyperplane with a ∈ R. Let S = {x ∈ R^d : ‖x − x_2*‖ = ‖w_0‖} be the surface of the sphere centered at x_2*. Then two properties are verified:

Property 1: Let π_β be a hyperplane with β > β_0. Since x_2* ∈ π_{α_0} and d(π_{α_0}, π_β) = (β − α_0)/‖w_0‖ > (β_0 − α_0)/‖w_0‖ = ‖w_0‖, it follows that π_β ∩ S = ∅.

Property 2: S ∩ C_1 = {x_1*}. Proof: Obviously x_1* ∈ S, since ‖x_1* − x_2*‖ = ‖w_0‖. Let us suppose that some x ∈ S ∩ C_1 with x ≠ x_1* exists; hence for 0 ≤ λ ≤ 1, x_λ = λx + (1 − λ)x_1* ∈ C_1, since C_1 is a convex set, with ‖x_λ − x_2*‖ ≤ ‖w_0‖ (triangular inequality). Strict inequality is not possible: if some λ existed such that ‖x_λ − x_2*‖ < ‖w_0‖, then x_1* would not be a solution point. On the other hand, if no λ exists such that ‖x_λ − x_2*‖ < ‖w_0‖, then x_λ ∈ S ∩ C_1 for all 0 ≤ λ ≤ 1. However, {x_λ} is a line segment, and the intersection of a line with the surface of a sphere contains at most two points, which is a contradiction.

[Figure 1. Geometrical representation of Proposition 1 on R². Notation: P_1 = x_1*, P_2 = x_2*, r_1 = π_{β_0}, r_2 = π_{α_0} and Q = x_0.]

Let us see that x_1* w_0 = min_{x ∈ C_1} x w_0 = β_0. Let us suppose that min_{x ∈ C_1} x w_0 = β < β_0; then some x_0 ∈ C_1 exists such that x_0 w_0 = β < β_0 (Figure 1). We consider

¹ Note that if x_2* − x_1* is considered instead of x_1* − x_2*, then β_0 < α_0.
the line r : x_1* + λ(x_0 − x_1*) ⊆ C_1 for 0 ≤ λ ≤ 1, with x_1* ∈ r and r ∩ π_{β_0} ≠ ∅; therefore some 0 < λ_0 < 1 exists such that x_{λ_0} ∈ S. Hence, by Property 2, it follows that x_{λ_0} = x_1*, which is a contradiction. The proof that x_2* w_0 = max_{x ∈ C_2} x w_0 = α_0 is similar to the former one, but considers the surface of the sphere whose center is x_1*.

Let x_1* ∈ C_1 and x_2* ∈ C_2 be the points which provide the distance between C_1 and C_2. The norm of the vector w_0 is therefore the distance between C_1 and C_2, i.e. d(C_1, C_2) = d(x_1*, x_2*) = ‖x_1* − x_2*‖ = ‖w_0‖. On the other hand, by using β_0 − α_0 = ‖w_0‖², it follows that

    ‖w_0‖ = (β_0 − α_0)/‖w_0‖ = d(π_{α_0}, π_{β_0}).    (13)

The main result is demonstrated below.

Theorem 2. The dual expressions of the optimization problems (7), (8), (9) and (11) for linearly separable classes can be formulated as²

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖²
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j = 1;  u_i, v_j ≥ 0, ∀ z_i ∈ Z_1, z_j ∈ Z_2,    (14)

and the solutions for all the approaches are equivalent.

PROOF. Let us suppose that {u_{i0}}_{i=1}^{n_1} and {v_{j0}}_{j=1}^{n_2} are the solution of the unified dual problem (14). Hence, the vector

    w_0 = Σ_{i=1}^{n_1} u_{i0} x_i − Σ_{j=1}^{n_2} v_{j0} x_j    (15)

and the parameters α_0 = max_{z_j ∈ Z_2} w_0 x_j and β_0 = min_{z_i ∈ Z_1} w_0 x_i are considered. The bounds are certainly attained for at least one pattern in each class, since the sets Z_1 and Z_2 are finite.

The standard SVM primal QP problem (7) is considered first. The constraint y_i(x_i w − b) ≥ 1 can be written as x_i w − (b + 1) ≥ 0, ∀ z_i ∈ Z_1, and (b − 1) − x_j w ≥ 0, ∀ z_j ∈ Z_2. Hence, the associated Lagrangian is

    L(w, b) = (1/2) w w − (Σ_i u_i x_i − Σ_j v_j x_j) w + b (Σ_i u_i − Σ_j v_j) + (Σ_i u_i + Σ_j v_j)

² Note that u = (u_1, ..., u_{n_1}) ∈ R^{n_1} and v = (v_1, ..., v_{n_2}) ∈ R^{n_2} verify Σ_i u_i = Σ_j v_j = 1.
and its partial derivatives are

    ∂L/∂w = 0  ⇒  w − (Σ_i u_i x_i − Σ_j v_j x_j) = 0  ⇒  w = Σ_i u_i x_i − Σ_j v_j x_j,
    ∂L/∂b = 0  ⇒  Σ_i u_i − Σ_j v_j = 0  ⇒  Σ_i u_i = Σ_j v_j.

By substitution, L(u_i, v_j) = −(1/2)‖Σ_i u_i x_i − Σ_j v_j x_j‖² + (Σ_i u_i + Σ_j v_j), and since max(−f(x)) is the same problem as min f(x), the dual problem can be written as

    min_{u_i, v_j} (1/2)‖Σ_i u_i x_i − Σ_j v_j x_j‖² − (Σ_i u_i + Σ_j v_j)
    s.t.  Σ_i u_i = Σ_j v_j;  u_i, v_j ≥ 0.    (16)

Dual problem (16) is now transformed by defining λ = Σ_i u_i and the dual variables u_i' = u_i/λ, v_j' = v_j/λ. It can be supposed that λ > 0, since λ = 0 is the trivial solution. The new dual problem becomes

    min_{λ, u_i', v_j'} (λ²/2)‖Σ_i u_i' x_i − Σ_j v_j' x_j‖² − 2λ
    s.t.  Σ_i u_i' = 1;  Σ_j v_j' = 1;  u_i', v_j', λ ≥ 0,    (17)

where λ does not depend on u_i', v_j'. Let us define a = ‖Σ_i u_i' x_i − Σ_j v_j' x_j‖²; the function f(λ, a) = (λ²/2) a − 2λ is increasing in a, since it verifies ∂f/∂a = λ²/2 > 0. Furthermore, ∂f/∂λ = aλ − 2 = 0 and ∂²f/∂λ² = a > 0 imply that λ_0 = 2/a is a minimum. Therefore,

    min_{λ, u_i', v_j'} f(λ, a) = min_{u_i', v_j'} f(λ_0, a) = min_{u_i', v_j'} (−2/a),

whose minimizer coincides with that of a = ‖Σ_i u_i' x_i − Σ_j v_j' x_j‖², and the dual problem in terms of u_i', v_j' is identical to problem (14).

On the other hand, if w_0 is the vector defined in (15) on the solution of dual problem (14), then the solution of dual problem (17) can be expressed as w = Σ_i u_i x_i − Σ_j v_j x_j = (Σ_i u_i)(Σ_i u_i' x_i − Σ_j v_j' x_j) = λ_0 w_0. Moreover, using the constraints of primal problem (7) and the definition of the bounds in
(15) lead to λ_0 β_0 = b + 1 and λ_0 α_0 = b − 1, which is an equation system whose solution is λ_0 = 2/(β_0 − α_0) and b = (β_0 + α_0)/(β_0 − α_0).

The 2-classes ordinal regression formulation problem will now be solved.³ The Lagrangian of the primal problem (8) is

    L(w, α, β) = (α − β) − Σ_i u_i (x_i w − β) − Σ_j v_j (α − x_j w) − γ(1 − w w)

with partial derivatives

    ∂L/∂α = 0  ⇒  1 − Σ_j v_j = 0  ⇒  Σ_j v_j = 1,
    ∂L/∂β = 0  ⇒  −1 + Σ_i u_i = 0  ⇒  Σ_i u_i = 1,

hence Σ_i u_i = Σ_j v_j = 1 and, by considering that γ = 0 is not a valid solution,⁴

    ∂L/∂w = 0  ⇒  2γw − (Σ_i u_i x_i − Σ_j v_j x_j) = 0  ⇒  w = (1/2γ)(Σ_i u_i x_i − Σ_j v_j x_j).

Hence, the dual objective function is

    L(γ, u_i, v_j) = −(1/4γ)‖Σ_i u_i x_i − Σ_j v_j x_j‖² − γ

and, as max(−f(x)) is the same problem as min f(x), the dual problem becomes

    min_{γ, u_i, v_j} (1/4γ)‖Σ_i u_i x_i − Σ_j v_j x_j‖² + γ
    s.t.  Σ_i u_i = 1;  Σ_j v_j = 1;  u_i, v_j, γ ≥ 0,    (18)

where γ does not depend on u_i, v_j. Let us define a = ‖Σ_i u_i x_i − Σ_j v_j x_j‖²; hence the function f(γ, a) = a/(4γ) + γ verifies ∂f/∂a = 1/(4γ) and ∂f/∂γ = −a/(4γ²) + 1. Therefore, for a > 0, the function is increasing with respect to a, and min_{γ,a} f(γ, a) = min_a f(γ_0, a). The equation ∂f/∂γ = −a/(4γ²) + 1 = 0 has two solutions: γ_0 = −√a/2, which is not possible since γ > 0; and γ_0 = √a/2, which verifies that this value is a minimum, ∂²f/∂γ²(γ_0) = a/(2γ_0³) > 0. Hence, f(γ_0, a) = a/(4γ_0) + γ_0 = √a implies

    min_{γ, a} f(γ, a) = min_a f(γ_0, a) = min_{u_i, v_j} ‖Σ_i u_i x_i − Σ_j v_j x_j‖

and, as arg min f(x) = arg min f²(x), the final dual problem is problem (14).

³ A more general demonstration is displayed in [10].
⁴ The demonstration can be found in [10].
On the other hand, if w_0 is the solution of dual problem (14), then the solution of dual problem (18) is w = w_0/(2γ_0) = w_0/‖w_0‖, i.e. the norm of the solution vector is one, and, by using the constraints of primal problem (8), α = α_0/‖w_0‖ and β = β_0/‖w_0‖.

Thirdly, we solve the C-Margin formulation by using objective function (10) for C > 0:

    min_{w ∈ F; α, β ∈ R} (1/2)‖w‖² + C(α − β)
    s.t.  ⟨x_i, w⟩ ≥ β, ∀ z_i ∈ Z_1;  ⟨x_j, w⟩ ≤ α, ∀ z_j ∈ Z_2.    (19)

The Lagrangian is

    L(w, α, β) = (1/2) w w + C(α − β) − Σ_i u_i (x_i w − β) − Σ_j v_j (α − x_j w)

and its partial derivatives are

    ∂L/∂α = 0  ⇒  C − Σ_j v_j = 0  ⇒  Σ_j v_j = C,
    ∂L/∂β = 0  ⇒  −C + Σ_i u_i = 0  ⇒  Σ_i u_i = C,
    ∂L/∂w = 0  ⇒  w − (Σ_i u_i x_i − Σ_j v_j x_j) = 0  ⇒  w = Σ_i u_i x_i − Σ_j v_j x_j,

which leads to the dual objective function L(u_i, v_j) = −(1/2)‖Σ_i u_i x_i − Σ_j v_j x_j‖². By considering the dual variables u_i' = u_i/C, v_j' = v_j/C, the dual problem is therefore

    max_{u_i', v_j'} −(C²/2)‖Σ_i u_i' x_i − Σ_j v_j' x_j‖²
    s.t.  Σ_i u_i' = 1;  Σ_j v_j' = 1;  u_i', v_j' ≥ 0,    (20)

and, as max(−A f(x)) is the same problem as min f(x) for A > 0, the final dual problem is problem (14). By following the same line of reasoning as above, the solution for problem (19) is:⁵ w = C w_0, α = C α_0 and β = C β_0.

Finally, the exact margin maximization problem will be considered. The Lagrangian associated to the primal problem (11) is

    L(w, α, β) = ‖w‖/(β − α) − Σ_i u_i (x_i w − β) − Σ_j v_j (α − x_j w)

⁵ Problem (9) is obtained for C = 1.
and its partial derivatives are

    ∂L/∂α = 0  ⇒  Σ_j v_j = ‖w‖/(β − α)²,
    ∂L/∂β = 0  ⇒  Σ_i u_i = ‖w‖/(β − α)²,  hence  Σ_i u_i = Σ_j v_j = ‖w‖/(β − α)²,
    ∂L/∂w = 0  ⇒  w/‖w‖ = (β − α)(Σ_i u_i x_i − Σ_j v_j x_j),

which leads to the dual problem

    max_{u_i, v_j} (Σ_i u_i) / ‖Σ_i u_i x_i − Σ_j v_j x_j‖
    s.t.  Σ_i u_i = Σ_j v_j;  u_i, v_j ≥ 0.    (21)

By considering the dual variables u_i' = u_i/(Σ_i u_i) and v_j' = v_j/(Σ_i u_i) (it can be supposed that Σ_i u_i > 0, since otherwise the solution is trivial), the dual function is therefore

    max_{u_i', v_j'} 1/‖Σ_i u_i' x_i − Σ_j v_j' x_j‖
    s.t.  Σ_i u_i' = Σ_j v_j' = 1;  u_i', v_j' ≥ 0,

and, by applying arg max (1/f(x)) = arg min f(x) = arg min f²(x) when f(x) > 0, the final dual problem becomes problem (14).

On the other hand, the value Σ_i u_i is calculated by using Σ_i u_i = ‖w‖/(β − α)² together with w = ‖w‖(β − α)(Σ_i u_i) w_0, α = ‖w‖(β − α)(Σ_i u_i) α_0 and β = ‖w‖(β − α)(Σ_i u_i) β_0; hence ‖w_0‖² Σ_i u_i = β_0 − α_0 is obtained and, by applying (13), Σ_i u_i = 1. Hence, the solution to problem (11) is given in the form w = w_0/‖w_0‖³, α = α_0/‖w_0‖³ and β = β_0/‖w_0‖³. □

2.3 Discussion

Given Z_1 and Z_2, the convex hull of Z_k is defined in [11] as

    C_k = { Σ_{i=1}^{n_k} u_i x_i : 0 ≤ u_i ≤ 1, Σ_{i=1}^{n_k} u_i = 1, x_i ∈ Z_k },  k = 1, 2,

i.e. the smallest convex set which contains the set of points of each class. It is demonstrated there that, if Z_1 and Z_2 are linearly separable, then maximizing the margin of separation of the two sets is equivalent to minimizing the distance between the two closest points of the convex hulls. By using this approach, it is shown that optimization problem (9) leads to problems (7) and (8).
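This nearest-points-of-convex-hulls picture, and Proposition 1 with it, can be illustrated numerically. The pure-Python sketch below (the two generator sets are invented for the example) brute-forces the closest pair of points of two disjoint convex hulls over a grid of convex coefficients, then checks that the bounds in (12) are attained at those points and that β_0 − α_0 = ‖w_0‖², up to grid resolution. It is a toy check, not a practical solver for (14):

```python
# Toy check of Proposition 1 on two segments (1-parameter convex hulls) in R^2.
Z1 = [(2.0, 0.0), (2.0, 2.0)]   # generators of the convex hull C1 (invented data)
Z2 = [(0.0, 0.0), (0.0, 2.0)]   # generators of the convex hull C2 (invented data)

def point(gen, t):
    # Convex combination t*gen[0] + (1 - t)*gen[1].
    return tuple(t * a + (1 - t) * b for a, b in zip(*gen))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

grid = [k / 200 for k in range(201)]

# Brute-force the closest pair (x1*, x2*) over the grid of convex coefficients.
x1s, x2s = min(((point(Z1, t), point(Z2, s)) for t in grid for s in grid),
               key=lambda pq: sqdist(*pq))
w0 = tuple(a - b for a, b in zip(x1s, x2s))

beta0 = min(dot(point(Z1, t), w0) for t in grid)    # min over C1 of <x, w0>
alpha0 = max(dot(point(Z2, s), w0) for s in grid)   # max over C2 of <x, w0>

print(w0, beta0, alpha0)   # w0 close to (2, 0); beta0 - alpha0 equals ||w0||^2
```

The closest points realize the extrema of x w_0 over each hull, exactly as Proposition 1 states, and the separation identity (13) follows from β_0 − α_0 = ‖w_0‖².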
It has been demonstrated in Theorem 2 that this fact is more general: all the existing approaches have the same form when dualized. Moreover, the expression for each solution has been obtained; the result in (13) enables the comparison of the scale factor used by each SVM formulation studied. Given λ > 0, the solutions are similar, verifying w = λw_0, β = λβ_0 and α = λα_0; therefore

    ‖w_0‖ = (β − α)/‖w‖ = d(π_α, π_β).

Hence, it is deduced that the scale factor for all the problems can be obtained from w, β and α in the form

    λ = ‖w‖²/(β − α).    (22)

For the standard primal SVM problem (7), the scale is provided by β − α = 2; by using equality (22), λ = ‖w‖²/2 with ‖w‖ = λ‖w_0‖, whence λ = 2/‖w_0‖². Analogously, for problem (8) the scale is provided by ‖w‖ = 1, and therefore λ = 1/(β − α) = 1/(λ(β_0 − α_0)), whence λ = 1/‖w_0‖. This result also provides a meaning for the parameter C in problem (9) when objective function (10) is used: λ = C. For exact margin maximization, the scale is λ = 1/‖w_0‖³.

3 The nonlinearly separable case

Let us now suppose that the sets Z_1 and Z_2 are nonlinearly separable. By using slack variables, the constraints are relaxed in the form⁶

    ⟨x_i, w⟩ ≥ β − ξ_i, ∀ z_i ∈ Z_1;  ⟨x_j, w⟩ ≤ α + ξ_j, ∀ z_j ∈ Z_2;  ξ_i, ξ_j ≥ 0,    (23)

and a penalty term is added to the functional for the primal optimization problem,

    C ( Σ_{i=1}^{n_1} ξ_i + Σ_{j=1}^{n_2} ξ_j ),    (24)

where C > 0 quantifies the loss produced when ξ_i, ξ_j > 0. The constraints in the dual optimization problems for the primal expressions considered are the same as in (16), (18), (20) and (21), except that the dual variables are upper-bounded: 0 ≤ u_i, v_j ≤ C, ∀ i, j.

Theorem 3. The dual expressions of the optimization problems (7), (8), (9) and (11), with slack variables (23) in the constraints and penalty term (24) in the

⁶ For the standard SVM, α and β must be replaced with b − 1 and b + 1, respectively.
functional, for nonlinearly separable classes, can be formulated as

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖²
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j = 1;  0 ≤ u_i, v_j ≤ A, ∀ z_i ∈ Z_1, z_j ∈ Z_2,    (25)

where A depends on ‖Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j‖. However, the solutions for these approaches are not equivalent.

PROOF. The dual objective functions to be minimized for each formulation can easily be derived, in a similar form to their Theorem 2 counterparts, by using the parameter λ = Σ_i u_i when necessary:

    Standard primal SVM norm:  f(λ, u_i', v_j') = (λ²/2)‖Σ_i u_i' x_i − Σ_j v_j' x_j‖² − 2λ;
    2-classes ordinal regression and C-Margin:  f(λ, u_i', v_j') = λ²‖Σ_i u_i' x_i − Σ_j v_j' x_j‖²;
    Exact margin maximization:  f(u_i', v_j') = ‖Σ_i u_i' x_i − Σ_j v_j' x_j‖²,

where λ, u_i', v_j' are now dependent variables, in contrast with the linearly separable case. The set of constraints for all the problems is

    Σ_{i=1}^{n_1} u_i' = Σ_{j=1}^{n_2} v_j' = 1;  0 ≤ u_i', v_j' ≤ C/λ;  0 < λ ≤ NC,  ∀ z_i ∈ Z_1, z_j ∈ Z_2,

with N = min(n_1, n_2). Hence, by naming A = C/λ, the set of constraints in (25) is obtained for all the cases. However, as A = C/λ, the C parameter is implicitly used in the dualized form or, identically, a condition exists in the original problem on the λ parameter; therefore it is no longer possible to affirm that all the approaches lead to a similar solution. Let us examine this situation.
Let us suppose that {u_{i0}}_{i=1}^{n_1} and {v_{j0}}_{j=1}^{n_2} are the solution of (25), and let x_1* = Σ_{i=1}^{n_1} u_{i0} x_i, x_2* = Σ_{j=1}^{n_2} v_{j0} x_j and w_0 = x_1* − x_2* be considered together with the two sets⁷

    C_k(A) = { Σ_{i=1}^{n_k} u_i x_i : 0 ≤ u_i ≤ A, Σ_{i=1}^{n_k} u_i = 1, z_i ∈ Z_k },  k = 1, 2.

When C_1(A) and C_2(A) are disjoint sets, Proposition 1 verifies that

    α_0 = max_{x ∈ C_2(A)} w_0 x = x_2* w_0,  β_0 = min_{x ∈ C_1(A)} w_0 x = x_1* w_0,  β_0 − α_0 = ‖w_0‖².

In a similar form to the linear case, the solution for each problem verifies w = λw_0, β = λβ_0 and α = λα_0. Therefore, (β − α)/‖w‖ = ‖w_0‖ and λ = ‖w‖²/(β − α), so A = C/λ depends on ‖Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j‖.

On the other hand, the constraint β − α = 2 implies that λ = 2/‖w_0‖² is verified for the standard primal SVM norm; in the 2-classes ordinal regression, the vector accomplishes ‖w‖ = 1 and λ = 1/‖w_0‖; in the exact margin, λ = 1/‖w_0‖³; and for the C-Margin, λ = C.

3.1 Discussion

Two restrictions must be considered when selecting A. If N = min(n_1, n_2), then 1 = Σ_i u_i ≤ n_1 A and 1 = Σ_j v_j ≤ n_2 A, so 1 ≤ NA and therefore A ≥ 1/N; and, since Σ_i u_i = 1 and 0 ≤ u_i, any A > 1 imposes no further restriction, so A ≤ 1 can be assumed. Hence

    1/min(n_1, n_2) ≤ A ≤ 1.    (26)

On the other hand, it is known that the parameter C in the original primal optimization problem is a trade-off between the smoothness of the solution and the number of errors allowed in the classification problem. Nevertheless, by using (26), it also follows that it is necessary to impose a corresponding bound on C in the 2-classes ordinal regression formulation and in the C-Margin expression.

⁷ These sets are called soft convex hulls in [11].
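The effect of the box bound A on the soft (reduced) convex hulls C_k(A) can be seen on a small overlapping sample; the data below are invented for the illustration. With A = 1 the hulls intersect, so the nearest-point vector w_0 is trivial, while A = 1/2 shrinks each hull toward its class centroid and separates them:

```python
# Each class has two generators; with the cap 0 <= u_i <= A and sum u_i = 1,
# admissible coefficient pairs are (t, 1 - t) with max(t, 1 - t) <= A.
Z1 = [(0.0, 0.0), (2.0, 0.0)]   # class 1 generators (illustrative)
Z2 = [(1.0, 0.0), (3.0, 0.0)]   # class 2 generators; hulls overlap on [1, 2]

def reduced_hull_points(gen, A, steps=400):
    pts = []
    for k in range(steps + 1):
        t = k / steps
        if max(t, 1 - t) <= A + 1e-12:          # box constraint u_i <= A
            pts.append(tuple(t * a + (1 - t) * b for a, b in zip(*gen)))
    return pts

def min_distance_sq(A):
    P1 = reduced_hull_points(Z1, A)
    P2 = reduced_hull_points(Z2, A)
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for p in P1 for q in P2)

print(min_distance_sq(1.0))   # 0.0: plain hulls overlap, trivial w0
print(min_distance_sq(0.5))   # 1.0: reduced hulls are the centroids (1,0) and (2,0)
```

This is the geometric content of bound (26): decreasing A from 1 toward 1/N shrinks each soft hull, and a nontrivial w_0 exists exactly when the shrunken hulls become disjoint.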
4 Conclusions and Future Work

Several approaches which deal with the large margin principle, especially related to SVM theory, have been introduced. It has been demonstrated in this paper that all these approaches admit the same dual problem formulation in the linearly separable case, and that all the solutions are equivalent. For the nonlinearly separable case, all the approaches can also be formulated as a unique dual optimization problem; however, the solutions are not equivalent. Relations between all the approaches and the proposed dual unifying method indicate the convex hull formulation to be the most interesting approach in order to deal with the most flexible framework. From the proposed unified dual approach it will be possible to formulate a new SVM with greater power to interpret the geometry of the solution, the margin and the associated parameters.

5 Acknowledgements

This study was partially supported by Junta de Andalucía grant ACPAI 2003/4, and Spanish MCyT grant TIC C00.

References

[1] V. Vapnik, Statistical Learning Theory, John Wiley & Sons, Inc., 1998.

[2] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2000.

[3] U. Kressel, Pairwise classification and support vector machines, in: B. Schölkopf, C. Burges, A. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, 1999.

[4] E. Mayoraz, E. Alpaydin, Support vector machines for multi-class classification, in: IWANN, 1999. URL citeseer.ist.psu.edu/mayoraz98support.html

[5] B. Schölkopf, A. J. Smola, Learning with Kernels, The MIT Press, Cambridge, MA, 2002.

[6] B. Schölkopf, C. J. Burges, A. J. Smola, Introduction to support vector learning, in: B. Schölkopf, C. Burges, A. J. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, 1999.
[7] M. Hein, O. Bousquet, Maximal margin classification for metric spaces, in: B. Schölkopf, M. Warmuth (Eds.), Learning Theory and Kernel Machines, Springer-Verlag, Heidelberg, Germany, 2003.

[8] R. Herbrich, Learning Kernel Classifiers: Theory and Algorithms, The MIT Press, 2002.

[9] T. Joachims, Making large-scale support vector machine learning practical, in: Advances in Kernel Methods: Support Vector Learning, MIT Press, 1999.

[10] A. Shashua, A. Levin, Taxonomy of large margin principle algorithms for ordinal regression problems, 2002. URL citeseer.ist.psu.edu/shashua02taxonomy.html

[11] K. P. Bennett, E. J. Bredensteiner, Duality and geometry in SVM classifiers, in: Proceedings of the Seventeenth International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 2000.

[12] D. Zhou, B. Xiao, H. Zhou, R. Dai, Global geometry of SVM classifiers, Technical report, AI Lab, Institute of Automation, Chinese Academy of Sciences, 2002.
6 Some comments

I have worked out the primal problems of the four approaches in the nonseparable case (if you want, I can send it to you by fax) and I have arrived, again, at what is formulated in Theorem 3; but, thinking about the dual problem, I have found some things which I pass on to you, to see whether you agree with me. First of all, it must be taken into account that the value λ = Σ_i u_i satisfies 0 < λ ≤ NC.

Ordinal regression and C-Margin cases: The dual problem becomes

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖²
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j = 1;  0 ≤ u_i, v_j ≤ C, ∀ z_i ∈ Z_1, z_j ∈ Z_2,    (27)

in the C-Margin case with C = 1, and in the ordinal regression case it is

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j = 1;  0 ≤ u_i, v_j ≤ C, ∀ z_i ∈ Z_1, z_j ∈ Z_2,    (28)

which evidently provide the same solution. We already know that, if the original classification problem is nonseparable, then we must impose C < 1. Thus, when solving the original problem, we will have to give a value to C and, at the end, we must check whether we have obtained a nontrivial solution (w ≠ 0), since the convex hulls may not become separated with that value of C. On the other hand, we know that 1/N ≤ C, and it may happen that even in the extreme case C = 1/N the solution is the trivial one (it is enough that the arithmetic means of all the training values in each class coincide). We already knew this, but ugly things also happen in the following approach.

Exact margin: In this case, the dual problem for the nonseparable case becomes

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖² / (Σ_{i=1}^{n_1} u_i)
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j;  0 ≤ u_i, v_j ≤ C, ∀ z_i ∈ Z_1, z_j ∈ Z_2.    (29)
(LET US SEE WHETHER I EXPLAIN MYSELF WELL.) Since 0 < λ = Σ_i u_i ≤ NC and N > 1, I take the value λ = C and solve the previous optimization problem; since I am restricting the problem, the solution u, v yields a value of the objective function greater than or equal to that given by the solution when λ is variable. Now, if in this new problem one considers u_i' = u_i/λ and v_j' = v_j/λ, the problem (given in Theorem 2) is obtained:

    min_{u ∈ R^{n_1}, v ∈ R^{n_2}} ‖ Σ_{i=1}^{n_1} u_i x_i − Σ_{j=1}^{n_2} v_j x_j ‖²
    s.t.  Σ_{i=1}^{n_1} u_i = Σ_{j=1}^{n_2} v_j = 1;  u_i, v_j ≥ 0, ∀ z_i ∈ Z_1, z_j ∈ Z_2,    (30)

and, since the training set is nonseparable, the solution of this problem provides a trivial vector w (I HAVE CHECKED THIS IN MATHEMATICA WITH AN EXAMPLE); therefore its norm is zero and, since the objective functions are positive, this is the smallest value that the objective function can reach in problem (29).

Regarding the classical approach, the Standard SVM: in this approach, if the problem is nonseparable, one can achieve w = 0 as in the previous case, but it does not have to be the optimal solution, since the objective function has a second summand −2 Σ_i u_i, so that (as happens in practice) one can have another vector w ≠ 0 with Σ_i u_i large, which gives a smaller (negative) value of the objective function than in the previous case.

For all these reasons, the right approach is the standard one, since it avoids the problem of the solution being trivial, and it achieves this by imposing β − α = 2, because in this way it is ensured that the convex hulls are always disjoint. Of course, it is not necessary that β − α = 2; it is enough to impose β − α = γ with γ fixed a priori and positive.

Comments: I believe that the crux of the (non-standard) formulations is that Gaussian kernels are used and, therefore, by choosing the parameters of the optimization problem adequately, the problem becomes separable (recall that the Gaussian kernel generates an infinite-dimensional space), and hence all the approaches are equivalent.
On the other hand, since one works numerically, rounding errors mean that the trivial vector is not reached exactly.

Question: Are my reasonings wrong?
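The remark above, that in the extreme case C = 1/N the solution can be trivial, has a direct check: with n_k = 2 points per class, the cap u_i ≤ 1/2 together with Σ u_i = 1 forces u = (1/2, 1/2), so each reduced hull collapses to the class centroid, and w_0 = 0 exactly when the two class means coincide. The data below are invented for the illustration:

```python
# With two points per class and the bound u_i <= 1/2, sum u_i = 1 forces
# u = (1/2, 1/2): each reduced hull collapses to the class centroid.
def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

# Classes with identical means (1, 0): w0 = 0, the trivial solution.
Z1 = [(0.0, 0.0), (2.0, 0.0)]
Z2 = [(1.0, 1.0), (1.0, -1.0)]
w0 = tuple(a - b for a, b in zip(centroid(Z1), centroid(Z2)))
print(w0)                        # (0.0, 0.0): means coincide, trivial vector

# Shifting one class separates the centroids and w0 becomes nontrivial.
Z2_shifted = [(x + 3.0, y) for (x, y) in Z2]
w0s = tuple(a - b for a, b in zip(centroid(Z1), centroid(Z2_shifted)))
print(w0s)                       # (-3.0, 0.0)
```

Note that the two classes in the first case are genuinely nonseparable (their convex hulls intersect), so the trivial w_0 here is forced by the geometry, not by the bound alone.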
More informationSUPPORT vector machine (SVM) formulation of pattern
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 17, NO. 3, MAY 2006 671 A Geometric Approach to Support Vector Machine (SVM) Classification Michael E. Mavroforakis Sergios Theodoridis, Senior Member, IEEE Abstract
More informationSales Management Main Features
Sales Management Main Features Optional Subject (4 th Businesss Administration) Second Semester 4,5 ECTS Language: English Professor: Noelia Sánchez Casado email: noelia.sanchez@upct.es Objectives Description
More informationSPANISH MOOD SELECTION: Probablemente Subjunctive, Posiblemente Indicative
SPANISH MOOD SELECTION: Probablemente, Posiblemente Hilary Miller April 26, 2013 Spanish Mood Selection According to Spanish textbooks: = doubt, indicative = reality/certainty Es probable que/es posible
More informationA Simple Observation Concerning Contraction Mappings
Revista Colombiana de Matemáticas Volumen 46202)2, páginas 229233 A Simple Observation Concerning Contraction Mappings Una simple observación acerca de las contracciones German LozadaCruz a Universidade
More informationA simple application of the implicit function theorem
Boletín de la Asociación Matemática Venezolana, Vol. XIX, No. 1 (2012) 71 DIVULGACIÓN MATEMÁTICA A simple application of the implicit function theorem Germán LozadaCruz Abstract. In this note we show
More informationExtracting the roots of septics by polynomial decomposition
Lecturas Matemáticas Volumen 29 (2008), páginas 5 12 ISSN 0120 1980 Extracting the roots of septics by polynomial decomposition Raghavendra G. Kulkarni HMC Division, Bharat Electronics Ltd., Bangalore,
More informationCambridge IGCSE. www.cie.org.uk
Cambridge IGCSE About University of Cambridge International Examinations (CIE) Acerca de la Universidad de Cambridge Exámenes Internacionales. CIE examinations are taken in over 150 different countries
More informationENVIRONMENT: Collaborative Learning Environment
Guía Integrada de Actividades Contexto de la estrategia de aprendizaje a desarrollar en el curso: The activity focuses on the Task Based Language Learning (TBLL). The task is used by the student in order
More informationOn the existence of multiple principal eigenvalues for some indefinite linear eigenvalue problems
RACSAM Rev. R. Acad. Cien. Serie A. Mat. VOL. 97 (3), 2003, pp. 461 466 Matemática Aplicada / Applied Mathematics Comunicación Preliminar / Preliminary Communication On the existence of multiple principal
More information1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where.
Introduction Linear Programming Neil Laws TT 00 A general optimization problem is of the form: choose x to maximise f(x) subject to x S where x = (x,..., x n ) T, f : R n R is the objective function, S
More informationGeometrical Characterization of RNoperators between Locally Convex Vector Spaces
Geometrical Characterization of RNoperators between Locally Convex Vector Spaces OLEG REINOV St. Petersburg State University Dept. of Mathematics and Mechanics Universitetskii pr. 28, 198504 St, Petersburg
More informationA fast multiclass SVM learning method for huge databases
www.ijcsi.org 544 A fast multiclass SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and TalebAhmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,
More informationCopyright 2016123TeachMe.com 242ea 1
Sentence Match Quiz for Category: por_vs_para_1 1) Son las habitaciones accesibles para discapacitados?  A: Are the rooms handicapped accessible?  B: You must fill out this form in order to get work
More informationOnline learning of multiclass Support Vector Machines
IT 12 061 Examensarbete 30 hp November 2012 Online learning of multiclass Support Vector Machines Xuan Tuan Trinh Institutionen för informationsteknologi Department of Information Technology Abstract
More informationCHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,
More informationMATH 289 PROBLEM SET 1: INDUCTION. 1. The induction Principle The following property of the natural numbers is intuitively clear:
MATH 89 PROBLEM SET : INDUCTION The induction Principle The following property of the natural numbers is intuitively clear: Axiom Every nonempty subset of the set of nonnegative integers Z 0 = {0,,, 3,
More informationOn closedform solutions of a resource allocation problem in parallel funding of R&D projects
Operations Research Letters 27 (2000) 229 234 www.elsevier.com/locate/dsw On closedform solutions of a resource allocation problem in parallel funding of R&D proects Ulku Gurler, Mustafa. C. Pnar, Mohamed
More informationGeometry of Linear Programming
Chapter 2 Geometry of Linear Programming The intent of this chapter is to provide a geometric interpretation of linear programming problems. To conceive fundamental concepts and validity of different algorithms
More informationMultiplicative Relaxation with respect to Thompson s Metric
Revista Colombiana de Matemáticas Volumen 48(2042, páginas 227 Multiplicative Relaxation with respect to Thompson s Metric Relajamiento multiplicativo con respecto a la métrica de Thompson Gerd Herzog
More informationA Random Sampling Technique for Training Support Vector Machines (For PrimalForm MaximalMargin Classifiers)
A Random Sampling Technique for Training Support Vector Machines (For PrimalForm MaximalMargin Classifiers) Jose Balcázar 1, Yang Dai 2, and Osamu Watanabe 2 1 Dept. Llenguatges i Sistemes Informatics,
More informationLecture 2: August 29. Linear Programming (part I)
10725: Convex Optimization Fall 2013 Lecture 2: August 29 Lecturer: Barnabás Póczos Scribes: Samrachana Adhikari, Mattia Ciollaro, Fabrizio Lecci Note: LaTeX template courtesy of UC Berkeley EECS dept.
More informationSection 1.1. Introduction to R n
The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to
More informationAP SPANISH LANGUAGE 2011 PRESENTATIONAL WRITING SCORING GUIDELINES
AP SPANISH LANGUAGE 2011 PRESENTATIONAL WRITING SCORING GUIDELINES SCORE DESCRIPTION TASK COMPLETION TOPIC DEVELOPMENT LANGUAGE USE 5 Demonstrates excellence 4 Demonstrates command 3 Demonstrates competence
More informationDensity Level Detection is Classification
Density Level Detection is Classification Ingo Steinwart, Don Hush and Clint Scovel Modeling, Algorithms and Informatics Group, CCS3 Los Alamos National Laboratory {ingo,dhush,jcs}@lanl.gov Abstract We
More informationNonlinear Optimization: Algorithms 3: Interiorpoint methods
Nonlinear Optimization: Algorithms 3: Interiorpoint methods INSEAD, Spring 2006 JeanPhilippe Vert Ecole des Mines de Paris JeanPhilippe.Vert@mines.org Nonlinear optimization c 2006 JeanPhilippe Vert,
More informationNew words to remember
Finanza Toolbox Materials Credit Cards, Debit Cards and ATM Cards New words to remember charging minimum payment credit limit interest PIN check register What is a Credit Card? A credit card is a thin
More informationMachine Learning in Spam Filtering
Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov kt@ut.ee Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.
More informationSupport Vector Machines
Support Vector Machines Here we approach the twoclass classification problem in a direct way: We try and find a plane that separates the classes in feature space. If we cannot, we get creative in two
More informationMACHINE LEARNING. Introduction. Alessandro Moschitti
MACHINE LEARNING Introduction Alessandro Moschitti Department of Computer Science and Information Engineering University of Trento Email: moschitti@disi.unitn.it Course Schedule Lectures Tuesday, 14:0016:00
More informationA. Before you read the text, answer the following question: What should a family do before starting to look for a new home?
UNIT 1: A PLAN FOR BUYING English for Real Estate Materials A. Before you read the text, answer the following question: What should a family do before starting to look for a new home? Read the following
More informationNew words to remember
Finanza Toolbox Materials What is a Bank Loan? Borrowing money from the bank is called a bank loan. People borrow money from the bank for many reasons. One reason to get a bank loan might be to buy a car.
More informationLINIO COLOMBIA. StartingUp & Leading ECommerce. www.linio.com.co. Luca Ranaldi, CEO. Pedro Freire, VP Marketing and Business Development
LINIO COLOMBIA StartingUp & Leading ECommerce Luca Ranaldi, CEO Pedro Freire, VP Marketing and Business Development 22 de Agosto 2013 www.linio.com.co QUÉ ES LINIO? Linio es la tienda online #1 en Colombia
More informationmax cx s.t. Ax c where the matrix A, cost vector c and right hand side b are given and x is a vector of variables. For this example we have x
Linear Programming Linear programming refers to problems stated as maximization or minimization of a linear function subject to constraints that are linear equalities and inequalities. Although the study
More informationNumerical Analysis Lecture Notes
Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number
More information0530 SPANISH (FOREIGN LANGUAGE)
CAMBRIDGE INTERNATIONAL EXAMINATIONS International General Certificate of Secondary Education MARK SCHEME for the October/November 2012 series 0530 SPANISH (FOREIGN LANGUAGE) 0530/22 Paper 2 (Reading and
More informationNotes on Support Vector Machines
Notes on Support Vector Machines Fernando Mira da Silva Fernando.Silva@inesc.pt Neural Network Group I N E S C November 1998 Abstract This report describes an empirical study of Support Vector Machines
More informationConvex Programming Tools for Disjunctive Programs
Convex Programming Tools for Disjunctive Programs João Soares, Departamento de Matemática, Universidade de Coimbra, Portugal Abstract A Disjunctive Program (DP) is a mathematical program whose feasible
More informationApplication of Support Vector Machines to Fault Diagnosis and Automated Repair
Application of Support Vector Machines to Fault Diagnosis and Automated Repair C. Saunders and A. Gammerman Royal Holloway, University of London, Egham, Surrey, England {C.Saunders,A.Gammerman}@dcs.rhbnc.ac.uk
More informationSUPPORT VECTOR MACHINE (SVM) is the optimal
130 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008 Multiclass Posterior Probability Support Vector Machines Mehmet Gönen, Ayşe Gönül Tanuğur, and Ethem Alpaydın, Senior Member, IEEE
More informationMetric Spaces. Chapter 7. 7.1. Metrics
Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some
More informationDefinition of a Linear Program
Definition of a Linear Program Definition: A function f(x 1, x,..., x n ) of x 1, x,..., x n is a linear function if and only if for some set of constants c 1, c,..., c n, f(x 1, x,..., x n ) = c 1 x 1
More informationDOCUMENTS DE TREBALL DE LA FACULTAT D ECONOMIA I EMPRESA. Col.lecció d Economia
DOCUMENTS DE TREBALL DE LA FACULTAT D ECONOMIA I EMPRESA Col.lecció d Economia E13/89 On the nucleolus of x assignment games F. Javier Martínez de Albéniz Carles Rafels Neus Ybern F. Javier Martínez de
More informationMinimize subject to. x S R
Chapter 12 Lagrangian Relaxation This chapter is mostly inspired by Chapter 16 of [1]. In the previous chapters, we have succeeded to find efficient algorithms to solve several important problems such
More informationPráctica 1: PL 1a: Entorno de programación MathWorks: Simulink
Práctica 1: PL 1a: Entorno de programación MathWorks: Simulink 1 Objetivo... 3 Introducción Simulink... 3 Open the Simulink Library Browser... 3 Create a New Simulink Model... 4 Simulink Examples... 4
More informationSequences and Convergence in Metric Spaces
Sequences and Convergence in Metric Spaces Definition: A sequence in a set X (a sequence of elements of X) is a function s : N X. We usually denote s(n) by s n, called the nth term of s, and write {s
More informationMemorial Health Care System Catholic Health Initiatives Financial Assistance Application Form
B Please note  Memorial Hospital may access external validation resources to assist in determining whether a full application for assistance is required. Financial Assistance Application 1) Patient Name
More informationChapter 15 Introduction to Linear Programming
Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2014 WeiTa Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of
More informationAbsolute Value Programming
Computational Optimization and Aplications,, 1 11 (2006) c 2006 Springer Verlag, Boston. Manufactured in The Netherlands. Absolute Value Programming O. L. MANGASARIAN olvi@cs.wisc.edu Computer Sciences
More informationMultiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John ShaweTaylor
More informationOptimization of Design. Lecturer:DungAn Wang Lecture 12
Optimization of Design Lecturer:DungAn Wang Lecture 12 Lecture outline Reading: Ch12 of text Today s lecture 2 Constrained nonlinear programming problem Find x=(x1,..., xn), a design variable vector of
More informationComparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
More informationEE 1130 Freshman Eng. Design for Electrical and Computer Eng.
EE 1130 Freshman Eng. Design for Electrical and Computer Eng. Signal Processing Module (DSP). Module Project. Class 5 C2. Use knowledge, methods, processes and tools to create a design. I1. Identify and
More informationBig Data  Lecture 1 Optimization reminders
Big Data  Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Big Data  Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Schedule Introduction Major issues Examples Mathematics
More informationLecture 6: Logistic Regression
Lecture 6: CS 19410, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 13, 2011 Outline Outline Classification task Data : X = [x 1,..., x m]: a n m matrix of data points in R n. y { 1,
More informationResearch Article Stability Analysis for HigherOrder Adjacent Derivative in Parametrized Vector Optimization
Hindawi Publishing Corporation Journal of Inequalities and Applications Volume 2010, Article ID 510838, 15 pages doi:10.1155/2010/510838 Research Article Stability Analysis for HigherOrder Adjacent Derivative
More informationdemonstrates competence in
AP SPANISH LANGUAGE 2012 INTERPERSONAL WRITING SCORING GUIDELINES SCORE DESCRIPTION TASK COMPLETION/TOPIC DEVELOPMENT LANGUAGE USE 5 excellence 4 command 3 competence 2 Suggests lack of competence 1 lack
More informationCopyright 2016123TeachMe.com 4ea67 1
Sentence Match Quiz for Category: hacer_make_do_1 1) Nosotros hacemos todo lo posible para proporcionar un buen servicio.  A: We do our best to provide good service.  B: These chores are done each time.
More informationLinear Programming Notes V Problem Transformations
Linear Programming Notes V Problem Transformations 1 Introduction Any linear programming problem can be rewritten in either of two standard forms. In the first form, the objective is to maximize, the material
More informationLarge Margin DAGs for Multiclass Classification
S.A. Solla, T.K. Leen and K.R. Müller (eds.), 57 55, MIT Press (000) Large Margin DAGs for Multiclass Classification John C. Platt Microsoft Research Microsoft Way Redmond, WA 9805 jplatt@microsoft.com
More informationAP SPANISH LANGUAGE AND CULTURE EXAM 2015 SCORING GUIDELINES
AP SPANISH LANGUAGE AND CULTURE EXAM 2015 SCORING GUIDELINES Identical to Scoring Guidelines used for French, German, and Italian Language and Culture Exams Interpersonal Writing: Email Reply 5: STRONG
More informationIncreasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all.
1. Differentiation The first derivative of a function measures by how much changes in reaction to an infinitesimal shift in its argument. The largest the derivative (in absolute value), the faster is evolving.
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network?  Perceptron learners  Multilayer networks What is a Support
More informationME128 ComputerAided Mechanical Design Course Notes Introduction to Design Optimization
ME128 Computerided Mechanical Design Course Notes Introduction to Design Optimization 2. OPTIMIZTION Design optimization is rooted as a basic problem for design engineers. It is, of course, a rare situation
More informationA NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION
1 A NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION Dimitri Bertsekas M.I.T. FEBRUARY 2003 2 OUTLINE Convexity issues in optimization Historical remarks Our treatment of the subject Three unifying lines of
More informationAsk your child what he or she is learning to say in Spanish at school. Encourage your child to act as if he or she is your teacher.
Welcome to Descubre el español con Santillana! This year, your child will be learning Spanish by exploring the culture of eight Spanishspeaking countries. Please join us as we travel through each of the
More informationICT education and motivating elderly people
Ariadna; cultura, educación y tecnología. Vol. I, núm. 1, jul. 2013 htpp://ariadna.uji.es 3 RD International Conference on Elderly and New Technologies pp. 8892 DOI: http://dx.doi.org/10.6035/ariadna.2013.1.15
More information24. The Branch and Bound Method
24. The Branch and Bound Method It has serious practical consequences if it is known that a combinatorial problem is NPcomplete. Then one can conclude according to the present state of science that no
More informationFOR TEACHERS ONLY The University of the State of New York
FOR TEACHERS ONLY The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION S COMPREHENSIVE EXAMINATION IN SPANISH Wednesday, January 24, 2007 9:15 a.m. to 12:15 p.m., only SCORING KEY Updated
More informationThe Set Covering Machine
Journal of Machine Learning Research 3 (2002) 723746 Submitted 12/01; Published 12/02 The Set Covering Machine Mario Marchand School of Information Technology and Engineering University of Ottawa Ottawa,
More informationAdapting Codes and Embeddings for Polychotomies
Adapting Codes and Embeddings for Polychotomies Gunnar Rätsch, Alexander J. Smola RSISE, CSL, Machine Learning Group The Australian National University Canberra, 2 ACT, Australia Gunnar.Raetsch, Alex.Smola
More informationAn interval linear programming contractor
An interval linear programming contractor Introduction Milan Hladík Abstract. We consider linear programming with interval data. One of the most challenging problems in this topic is to determine or tight
More informationFurther Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1
Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1 J. Zhang Institute of Applied Mathematics, Chongqing University of Posts and Telecommunications, Chongqing
More informationIntesisBox PARC2xxx1 SANYO compatibilities
IntesisBox PARC2xxx1 SANYO compatibilities In this document the compatible SANYO models with the following IntesisBox RC2 interfaces are listed: / En éste documento se listan los modelos SANYO compatibles
More informationPUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5.
PUTNAM TRAINING POLYNOMIALS (Last updated: November 17, 2015) Remark. This is a list of exercises on polynomials. Miguel A. Lerma Exercises 1. Find a polynomial with integral coefficients whose zeros include
More informationBig Data: A Geometric Explanation of a Seemingly Counterintuitive Strategy
Big Data: A Geometric Explanation of a Seemingly Counterintuitive Strategy Olga Kosheleva and Vladik Kreinovich University of Texas at El Paso 500 W. University El Paso, TX 79968, USA olgak@utep.edu, vladik@utep.edu
More information