# Lecture notes 2, February 1, 2016: Convex optimization


## Notation

Matrices are written in uppercase: $A$; vectors are written in lowercase: $a$. $A_{ij}$ denotes the element of $A$ in position $(i,j)$, and $A_i$ denotes the $i$th column of $A$ (it's a vector!). Beware that $x_i$ may denote the $i$th entry of a vector $x$ or the $i$th vector in a list, depending on the context. $x_I$ denotes the subvector of $x$ that contains the entries listed in the set $I$. For example, $x_{1:n}$ contains the first $n$ entries of $x$.

## 1 Convexity

### 1.1 Convex sets

A set is convex if it contains all segments connecting points that belong to it.

**Definition 1.1** (Convex set). A convex set $S$ is any set such that for any $x, y \in S$ and $\theta \in (0,1)$,
$$\theta x + (1-\theta) y \in S. \tag{1}$$

Figure 1 shows a simple example of a convex and a nonconvex set. The following lemma establishes that the intersection of convex sets is convex.

**Lemma 1.2** (Intersection of convex sets). Let $S_1, \ldots, S_m$ be convex subsets of $\mathbb{R}^n$. Then $\cap_{i=1}^m S_i$ is convex.

*Proof.* Any $x, y \in \cap_{i=1}^m S_i$ belong to each $S_i$. By convexity of $S_i$, $\theta x + (1-\theta) y$ belongs to $S_i$ for any $\theta \in (0,1)$ and any $1 \leq i \leq m$, and therefore also to $\cap_{i=1}^m S_i$.

The following theorem shows that projection onto non-empty closed convex sets is unique. The proof is in Section B.1 of the appendix.

**Theorem 1.3** (Projection onto convex set). Let $S \subseteq \mathbb{R}^n$ be a non-empty closed convex set. The projection of any vector $x \in \mathbb{R}^n$ onto $S$,
$$P_S(x) := \arg\min_{s \in S} \|x - s\|_2, \tag{2}$$
exists and is unique.
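As a minimal illustration of Theorem 1.3 (not part of the original notes), here is a Python sketch of the projection onto the box $[\mathrm{lo}, \mathrm{hi}]^n$, a closed convex set whose projection has a simple closed form: clamp each coordinate independently. The helper name `project_box` is our own.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Unique Euclidean projection onto the closed convex box [lo, hi]^n.

    Because the squared l2 distance separates across coordinates, the
    minimizer clamps each entry of x to the interval [lo, hi]."""
    return [min(max(xi, lo), hi) for xi in x]

# The projection of a point already in the set is the point itself.
print(project_box([1.5, -0.2, 0.3]))  # [1.0, 0.0, 0.3]
```

Uniqueness here is visible coordinate by coordinate: each clamped value is the single closest point of the interval.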

Figure 1: An example of a nonconvex set (left) and a convex set (right).

A convex combination of $n$ points is any linear combination of the points with nonnegative coefficients that add up to one. In the case of two points, this is just the segment between the points.

**Definition 1.4** (Convex combination). Given $n$ vectors $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$,
$$x := \sum_{i=1}^n \theta_i x_i \tag{3}$$
is a convex combination of $x_1, x_2, \ldots, x_n$ as long as the real numbers $\theta_1, \theta_2, \ldots, \theta_n$ are nonnegative and add up to one,
$$\theta_i \geq 0, \quad 1 \leq i \leq n, \tag{4}$$
$$\sum_{i=1}^n \theta_i = 1. \tag{5}$$

The convex hull of a set $S$ contains all convex combinations of points in $S$. Intuitively, it is the smallest convex set that contains $S$.

**Definition 1.5** (Convex hull). The convex hull of a set $S$ is the set of all convex combinations of points in $S$.

A justification of why we penalize the $\ell_1$-norm to promote sparse structure is that the $\ell_1$-norm ball is the convex hull of the intersection between the $\ell_0$ "norm" ball and the $\ell_\infty$-norm ball. The lemma is illustrated in 2D in Figure 2 and proved in Section B.3 of the appendix.

**Lemma 1.6** ($\ell_1$-norm ball). The $\ell_1$-norm ball is the convex hull of the intersection between the $\ell_0$ "norm" ball and the $\ell_\infty$-norm ball.
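The construction behind Lemma 1.6 can be sketched in code (this example is ours, not from the notes): any $x$ with $\|x\|_1 \leq 1$ is a convex combination of signed standard basis vectors, which lie in the intersection of the $\ell_0$ and $\ell_\infty$ balls, plus the zero vector. The helper name `l1_ball_decomposition` is hypothetical.

```python
def l1_ball_decomposition(x):
    """Express x with ||x||_1 <= 1 as a convex combination of points in the
    intersection of the l0 'norm' ball and the l_infinity-norm ball:
    signed standard basis vectors plus the zero vector."""
    n = len(x)
    weights, points = [], []
    for i, xi in enumerate(x):
        e = [0.0] * n
        e[i] = 1.0 if xi >= 0 else -1.0  # signed basis vector, in both balls
        weights.append(abs(xi))
        points.append(e)
    weights.append(1.0 - sum(abs(xi) for xi in x))  # leftover weight on zero
    points.append([0.0] * n)
    return weights, points

w, p = l1_ball_decomposition([0.5, -0.25])
assert abs(sum(w) - 1.0) < 1e-12 and all(wi >= 0 for wi in w)
recon = [sum(wi * pi[j] for wi, pi in zip(w, p)) for j in range(2)]
print(recon)  # [0.5, -0.25]
```

The reconstructed vector equals the original, confirming that the weights form a valid convex combination.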

Figure 2: Illustration of Lemma 1.6. The $\ell_0$ "norm" ball is shown in black, the $\ell_\infty$-norm ball in blue, and the $\ell_1$-norm ball in red.

### 1.2 Convex functions

We now define convexity for functions.

**Definition 1.7** (Convex function). A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if for any $x, y \in \mathbb{R}^n$ and any $\theta \in (0,1)$,
$$\theta f(x) + (1-\theta) f(y) \geq f(\theta x + (1-\theta) y). \tag{6}$$
The function is strictly convex if the inequality is always strict, i.e. if $x \neq y$ implies that
$$\theta f(x) + (1-\theta) f(y) > f(\theta x + (1-\theta) y). \tag{7}$$
A concave function is a function $f$ such that $-f$ is convex.

**Remark 1.8** (Extended-value functions). We can also consider an arbitrary function $f$ that is only defined on a subset of $\mathbb{R}^n$. In that case $f$ is convex if and only if its extension $\tilde{f}$ is convex, where
$$\tilde{f}(x) := \begin{cases} f(x) & \text{if } x \in \operatorname{dom}(f), \\ \infty & \text{if } x \notin \operatorname{dom}(f). \end{cases} \tag{8}$$
Equivalently, $f$ is convex if and only if its domain $\operatorname{dom}(f)$ is convex and any two points in $\operatorname{dom}(f)$ satisfy (6).
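Definition 1.7 can be spot-checked numerically along chords. Below is a minimal sketch (ours, not from the notes; the helper names are our own) that evaluates the gap in inequality (6) for $f(x) = \|x\|_2^2$, which is convex.

```python
def sum_squares(x):
    """A convex test function: f(x) = ||x||_2^2."""
    return sum(xi * xi for xi in x)

def convexity_gap(f, x, y, theta):
    """theta*f(x) + (1 - theta)*f(y) - f(theta*x + (1 - theta)*y).
    Nonnegative on every chord iff f is convex (Definition 1.7)."""
    mix = [theta * a + (1 - theta) * b for a, b in zip(x, y)]
    return theta * f(x) + (1 - theta) * f(y) - f(mix)

# Spot-check inequality (6) on a grid of points along one chord.
gaps = [convexity_gap(sum_squares, [1.0, -2.0], [3.0, 0.5], k / 10)
        for k in range(1, 10)]
assert all(g >= -1e-12 for g in gaps)
```

A finite spot-check of course does not prove convexity, but a single negative gap disproves it, which makes this a handy sanity test.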

Figure 3: Illustration of condition (6) in Definition 1.7. The curve corresponding to the function must lie below any chord joining two of its points.

Condition (6) is illustrated in Figure 3: the curve corresponding to the function must lie below any chord joining two of its points. It is therefore not surprising that we can determine whether a function is convex by restricting our attention to its behavior along lines in $\mathbb{R}^n$. This is established by the following lemma, which is proved formally in Section B.2 of the appendix.

**Lemma 1.9** (Equivalent definition of convex functions). A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if for any two points $x, y \in \mathbb{R}^n$ the univariate function $g_{x,y} : [0,1] \to \mathbb{R}$ defined by
$$g_{x,y}(\alpha) := f(\alpha x + (1-\alpha) y) \tag{9}$$
is convex. Similarly, $f$ is strictly convex if and only if $g_{x,y}$ is strictly convex for any $x \neq y$.

Section A in the appendix provides a definition of the norm of a vector and lists the most common ones. It turns out that all norms are convex.

**Lemma 1.10** (Norms are convex). Any valid norm is a convex function.

*Proof.* By the triangle inequality and the homogeneity of the norm, for any $x, y \in \mathbb{R}^n$ and any $\theta \in (0,1)$,
$$\|\theta x + (1-\theta) y\| \leq \|\theta x\| + \|(1-\theta) y\| = \theta \|x\| + (1-\theta) \|y\|. \tag{10}$$

The $\ell_0$ "norm" is not really a norm, as explained in Section A, and it is not convex either.
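The failure of convexity for the $\ell_0$ "norm" can be demonstrated numerically (this snippet is ours, not from the notes): mixing two sparse vectors produces a *denser* vector, so the mixture violates inequality (6).

```python
def norm0(x):
    """Number of nonzero entries (the l0 'norm')."""
    return sum(1 for xi in x if xi != 0)

x, y, theta = [1.0, 0.0], [0.0, 1.0], 0.5
mix = [theta * a + (1 - theta) * b for a, b in zip(x, y)]
# The convexity inequality (6) fails: the mixture has MORE nonzeros
# than the convex combination of the individual counts.
print(norm0(mix), theta * norm0(x) + (1 - theta) * norm0(y))  # 2 1.0
```

This is exactly the counterexample used in the lemma that follows.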

**Lemma 1.11** ($\ell_0$ "norm"). The $\ell_0$ "norm" is not convex.

*Proof.* We provide a simple counterexample. Let $x := (1, 0)$ and $y := (0, 1)$. Then for any $\theta \in (0,1)$,
$$\|\theta x + (1-\theta) y\|_0 = 2 > 1 = \theta \|x\|_0 + (1-\theta) \|y\|_0. \tag{11}$$

We end the section by establishing a property of convex functions that is crucial in optimization.

**Theorem 1.12** (Local minima are global). Any local minimum of a convex function $f : \mathbb{R}^n \to \mathbb{R}$ is also a global minimum.

We defer the proof of the theorem to Section B.4 of the appendix.

### 1.3 Sublevel sets and epigraph

In this section we define two sets associated to a function that are very useful when reasoning geometrically about convex functions.

**Definition 1.13** (Sublevel set). The $\gamma$-sublevel set of a function $f : \mathbb{R}^n \to \mathbb{R}$, where $\gamma \in \mathbb{R}$, is the set of points in $\mathbb{R}^n$ at which the function is smaller than or equal to $\gamma$,
$$C_\gamma := \{x \mid f(x) \leq \gamma\}. \tag{12}$$

**Lemma 1.14** (Sublevel sets of convex functions). The sublevel sets of a convex function are convex.

*Proof.* If $x, y \in \mathbb{R}^n$ belong to the $\gamma$-sublevel set of a convex function $f$, then for any $\theta \in (0,1)$,
$$f(\theta x + (1-\theta) y) \leq \theta f(x) + (1-\theta) f(y) \quad \text{by convexity of } f \tag{13}$$
$$\leq \gamma \tag{14}$$
because both $x$ and $y$ belong to the $\gamma$-sublevel set. We conclude that any convex combination of $x$ and $y$ also belongs to the $\gamma$-sublevel set.

Recall that the graph of a function $f : \mathbb{R}^n \to \mathbb{R}$ is the curve in $\mathbb{R}^{n+1}$
$$\operatorname{graph}(f) := \{x \mid f(x_{1:n}) = x_{n+1}\}, \tag{15}$$
where $x_{1:n} \in \mathbb{R}^n$ contains the first $n$ entries of $x$. The epigraph of a function is the set in $\mathbb{R}^{n+1}$ that lies above the graph of the function. An example is shown in Figure 4.

Figure 4: Epigraph of a function.

**Definition 1.15** (Epigraph). The epigraph of a function $f : \mathbb{R}^n \to \mathbb{R}$ is defined as
$$\operatorname{epi}(f) := \{x \mid f(x_{1:n}) \leq x_{n+1}\}. \tag{16}$$

Epigraphs allow us to reason geometrically about convex functions. The following basic result is proved in Section B.5 of the appendix.

**Lemma 1.16** (Epigraphs of convex functions are convex). A function is convex if and only if its epigraph is convex.

### 1.4 Operations that preserve convexity

It may be challenging to determine whether a function of interest is convex by using the definition directly. Often, an easier alternative is to express the function in terms of simpler functions that are known to be convex. In this section we list some operations that preserve convexity.

**Lemma 1.17** (Composition of convex and affine function). If $f : \mathbb{R}^n \to \mathbb{R}$ is convex, then for any $A \in \mathbb{R}^{n \times m}$ and any $b \in \mathbb{R}^n$, the function
$$h(x) := f(Ax + b) \tag{17}$$
is convex.

*Proof.* By convexity of $f$, for any $x, y \in \mathbb{R}^m$ and any $\theta \in (0,1)$,
$$h(\theta x + (1-\theta) y) = f(\theta (Ax + b) + (1-\theta)(Ay + b)) \tag{18}$$
$$\leq \theta f(Ax + b) + (1-\theta) f(Ay + b) \tag{19}$$
$$= \theta h(x) + (1-\theta) h(y). \tag{20}$$

**Corollary 1.18** (Least squares). For any $A \in \mathbb{R}^{n \times m}$ and any $y \in \mathbb{R}^n$, the least-squares cost function
$$\|Ax - y\|_2 \tag{21}$$
is convex.

**Lemma 1.19** (Nonnegative weighted sums). The weighted sum of $m$ convex functions $f_1, \ldots, f_m$,
$$f := \sum_{i=1}^m \alpha_i f_i, \tag{22}$$
is convex as long as the weights $\alpha_1, \ldots, \alpha_m \in \mathbb{R}$ are nonnegative.

*Proof.* By convexity of $f_1, \ldots, f_m$, for any $x, y \in \mathbb{R}^n$ and any $\theta \in (0,1)$,
$$f(\theta x + (1-\theta) y) = \sum_{i=1}^m \alpha_i f_i(\theta x + (1-\theta) y) \tag{23}$$
$$\leq \sum_{i=1}^m \alpha_i \big( \theta f_i(x) + (1-\theta) f_i(y) \big) \tag{24}$$
$$= \theta f(x) + (1-\theta) f(y). \tag{25}$$

**Corollary 1.20** (Regularized least squares). Regularized least-squares cost functions of the form
$$\|Ax - y\|_2 + \|x\|, \tag{26}$$
where $\|\cdot\|$ is an arbitrary norm, are convex.

**Proposition 1.21** (Pointwise maximum/supremum of convex functions). The pointwise maximum of $m$ convex functions $f_1, \ldots, f_m$ is convex,
$$f_{\max}(x) := \max_{1 \leq i \leq m} f_i(x). \tag{27}$$
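Lemma 1.17 can be checked numerically on a small instance (this sketch is ours, not from the notes; the helper names are hypothetical): compose the convex function $f(z) = \|z\|_2^2$ with an affine map and verify inequality (6) along a chord.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def f(z):
    """A convex function: sum of squares."""
    return sum(zi * zi for zi in z)

def h(x, A, b):
    """Composition with an affine map, h(x) = f(Ax + b) (Lemma 1.17)."""
    return f([zi + bi for zi, bi in zip(matvec(A, x), b)])

A, b = [[1.0, 2.0], [0.0, -1.0]], [0.5, 1.0]
x, y, theta = [1.0, -1.0], [2.0, 3.0], 0.3
mix = [theta * a + (1 - theta) * c for a, c in zip(x, y)]
# Inequality (6) holds for the composed function h.
assert h(mix, A, b) <= theta * h(x, A, b) + (1 - theta) * h(y, A, b) + 1e-12
```

The same template also verifies nonnegative weighted sums (Lemma 1.19): replace `h` with a weighted sum of convex functions.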

The pointwise supremum of a family of convex functions $\{f_i\}_{i \in I}$ indexed by a set $I$ is convex,
$$f_{\sup}(x) := \sup_{i \in I} f_i(x). \tag{28}$$

*Proof.* We prove the result for the supremum, as it implies the result for the maximum. For any $0 \leq \theta \leq 1$ and any $x, y \in \mathbb{R}^n$,
$$f_{\sup}(\theta x + (1-\theta) y) = \sup_{i \in I} f_i(\theta x + (1-\theta) y) \tag{29}$$
$$\leq \sup_{i \in I} \big( \theta f_i(x) + (1-\theta) f_i(y) \big) \quad \text{by convexity of the } f_i \tag{30}$$
$$\leq \theta \sup_{i \in I} f_i(x) + (1-\theta) \sup_{j \in I} f_j(y) \tag{31}$$
$$= \theta f_{\sup}(x) + (1-\theta) f_{\sup}(y). \tag{32}$$

## 2 Differentiable functions

In this section we characterize the convexity of differentiable functions in terms of the behavior of their first- and second-order Taylor expansions, or equivalently in terms of their gradient and Hessian.

### 2.1 First-order conditions

Consider the first-order Taylor expansion of $f : \mathbb{R}^n \to \mathbb{R}$ at $x$,
$$f^1_x(y) := f(x) + \nabla f(x)^T (y - x). \tag{33}$$
Note that this first-order approximation is an affine function of $y$. The following proposition, proved in Section B.6 of the appendix, establishes that a function $f$ is convex if and only if $f^1_x$ is a lower bound for $f$ for any $x \in \mathbb{R}^n$. Figure 5 illustrates the condition with an example.

**Proposition 2.1** (First-order condition). A differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if for every $x, y \in \mathbb{R}^n$,
$$f(y) \geq f(x) + \nabla f(x)^T (y - x). \tag{34}$$
It is strictly convex if and only if for every $x \neq y$,
$$f(y) > f(x) + \nabla f(x)^T (y - x). \tag{35}$$

Figure 5: An example of the first-order condition for convexity. The first-order approximation at any point is a lower bound of the function.

An immediate corollary is that for a convex function, any point at which the gradient is zero is a global minimum. If the function is strictly convex, the minimum is unique.

**Corollary 2.2.** If a differentiable function $f$ is convex and $\nabla f(x) = 0$, then for any $y \in \mathbb{R}^n$,
$$f(y) \geq f(x). \tag{36}$$
If $f$ is strictly convex, then for any $y \neq x$,
$$f(y) > f(x). \tag{37}$$

For any differentiable function $f$ and any $x \in \mathbb{R}^n$, let us define the hyperplane $H_{f,x} \subseteq \mathbb{R}^{n+1}$ that corresponds to the first-order approximation of $f$ at $x$,
$$H_{f,x} := \left\{ y \mid y_{n+1} = f^1_x(y_{1:n}) \right\}. \tag{38}$$
Geometrically, Proposition 2.1 establishes that the epigraph of $f$ lies above $H_{f,x}$. In addition, the hyperplane and $\operatorname{epi}(f)$ intersect at the point corresponding to $x$. In convex-analysis jargon, $H_{f,x}$ is a supporting hyperplane of $\operatorname{epi}(f)$ at $x$.

**Definition 2.3** (Supporting hyperplane). A hyperplane $H$ is a supporting hyperplane of a set $S$ at $x$ if

- $H$ and $S$ intersect at $x$,
- $S$ is contained in one of the half-spaces bounded by $H$.

The optimality condition has a very intuitive geometric interpretation in terms of the supporting hyperplane $H_{f,x}$: $\nabla f(x) = 0$ implies that $H_{f,x}$ is horizontal, if the vertical dimension corresponds to the $(n+1)$th coordinate. Since the epigraph lies above the hyperplane, the point at which they intersect must be a minimum of the function.
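The first-order condition (34) can be spot-checked on a simple convex function (this snippet is ours, not from the notes): for $f(x) = x^2$, every tangent line is a global lower bound.

```python
def f(x):
    """A convex test function: f(x) = x^2."""
    return x * x

def grad_f(x):
    """Its derivative: f'(x) = 2x."""
    return 2 * x

# First-order condition (Proposition 2.1): the tangent line at any point x
# lies below the function everywhere.
points = [-2.0, -0.5, 0.0, 1.0, 3.0]
for x in points:
    for y in points:
        assert f(y) >= f(x) + grad_f(x) * (y - x) - 1e-12
```

At $x = 0$ the gradient vanishes and the tangent bound reads $f(y) \geq f(0)$, which is exactly Corollary 2.2.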

Figure 6: An example of the second-order condition for convexity. The second-order approximation at any point is convex.

### 2.2 Second-order conditions

For univariate functions that are twice differentiable, convexity is dictated by the curvature of the function. As you might recall from basic calculus, curvature is the rate of change of the slope of the function, and is consequently given by its second derivative. The following lemma, proved in Section B.8 of the appendix, establishes that univariate functions are convex if and only if their curvature is always nonnegative.

**Lemma 2.4.** A twice-differentiable function $g : \mathbb{R} \to \mathbb{R}$ is convex if and only if $g''(\alpha) \geq 0$ for all $\alpha \in \mathbb{R}$.

By Lemma 1.9, we can establish convexity in $\mathbb{R}^n$ by considering the restriction of the function along an arbitrary line. By multivariable calculus, the second directional derivative of $f$ at a point $x$ in the direction of a unit vector $u$ is equal to $u^T \nabla^2 f(x) u$. As a result, if the Hessian is positive semidefinite, the curvature is nonnegative in every direction and the function is convex.

**Corollary 2.5.** A twice-differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if for every $x \in \mathbb{R}^n$ the Hessian matrix $\nabla^2 f(x)$ is positive semidefinite.

*Proof.* By Lemma 1.9, we just need to show that the univariate function $g_{a,b}$ defined by (9) is convex for all $a, b \in \mathbb{R}^n$. By Lemma 2.4, this holds if and only if the second derivative of $g_{a,b}$ is nonnegative. Applying some basic multivariable calculus, we have that for any $\alpha \in (0,1)$,
$$g''_{a,b}(\alpha) = (a - b)^T \nabla^2 f(\alpha a + (1-\alpha) b)(a - b). \tag{39}$$
This quantity is nonnegative for all $a, b \in \mathbb{R}^n$ if and only if $\nabla^2 f(x)$ is positive semidefinite for any $x \in \mathbb{R}^n$.
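Corollary 2.5 reduces convexity checks to positive semidefiniteness of the Hessian. For a symmetric $2 \times 2$ matrix this can be tested in closed form: the matrix is PSD iff both diagonal entries and the determinant are nonnegative. A minimal sketch (ours, not from the notes; the helper name is our own):

```python
def is_psd_2x2(H):
    """A symmetric 2x2 matrix is positive semidefinite iff both diagonal
    entries and the determinant are nonnegative."""
    (a, b), (c, d) = H
    return a >= 0 and d >= 0 and a * d - b * c >= 0

# Hessian of f(x1, x2) = x1^2 + x1*x2 + x2^2 (constant, since f is quadratic).
H = [[2.0, 1.0], [1.0, 2.0]]
assert is_psd_2x2(H)  # so f is convex by Corollary 2.5

# Hessian of the saddle f(x1, x2) = x1^2 - x2^2: not PSD, so f is not convex.
assert not is_psd_2x2([[2.0, 0.0], [0.0, -2.0]])
```

For larger matrices one would check eigenvalues or attempt a Cholesky factorization instead of this closed-form criterion.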

Figure 7: Quadratic forms for which the Hessian is positive definite (left, convex), negative definite (center, concave), and neither positive nor negative definite (right, neither convex nor concave).

**Remark 2.6** (Strict convexity). If the Hessian is positive definite, then the function is strictly convex (the proof is essentially the same). However, there are strictly convex functions for which the Hessian may equal zero at some points. An example is the univariate function $f(x) = x^4$, for which $f''(0) = 0$.

We can interpret Corollary 2.5 in terms of the second-order Taylor expansion of $f : \mathbb{R}^n \to \mathbb{R}$ at $x$.

**Definition 2.7** (Second-order approximation). The second-order or quadratic approximation of $f$ at $x$ is
$$f^2_x(y) := f(x) + \nabla f(x)^T (y - x) + \frac{1}{2} (y - x)^T \nabla^2 f(x)(y - x). \tag{40}$$

$f^2_x$ is a quadratic form that coincides with $f$ at $x$. By the corollary, $f$ is convex if and only if this quadratic approximation is always convex. This is illustrated in Figure 6. Figure 7 shows some examples of quadratic forms in two dimensions.

## 3 Nondifferentiable functions

If a function is not differentiable, we cannot use its gradient to check whether it is convex. However, we can extend the first-order characterization derived in Section 2.1 by checking whether the function has a supporting hyperplane at every point. If a supporting hyperplane exists at $x$, the gradient of the hyperplane is called a subgradient of $f$ at $x$.

An important nondifferentiable convex function in optimization-based data analysis is the $\ell_1$ norm. The following proposition characterizes its subdifferential.

**Proposition 3.5** (Subdifferential of $\ell_1$ norm). The subdifferential of the $\ell_1$ norm at $x \in \mathbb{R}^n$ is the set of vectors $q \in \mathbb{R}^n$ that satisfy
$$q_i = \operatorname{sign}(x_i) \quad \text{if } x_i \neq 0, \tag{48}$$
$$|q_i| \leq 1 \quad \text{if } x_i = 0. \tag{49}$$

*Proof.* The proof relies on the following simple lemma, proved in Section B.10.

**Lemma 3.6.** $q$ is a subgradient of $\|\cdot\|_1 : \mathbb{R}^n \to \mathbb{R}$ at $x$ if and only if $q_i$ is a subgradient of $|\cdot| : \mathbb{R} \to \mathbb{R}$ at $x_i$ for all $1 \leq i \leq n$.

If $x_i \neq 0$, the absolute-value function is differentiable at $x_i$, so by Proposition 3.4, $q_i$ is equal to the derivative, $q_i = \operatorname{sign}(x_i)$. If $x_i = 0$, $q_i$ is a subgradient of the absolute-value function if and only if $|\alpha| \geq q_i \alpha$ for any $\alpha \in \mathbb{R}$, which holds if and only if $|q_i| \leq 1$.

## 4 Optimization problems

### 4.1 Definition

We start by defining a canonical optimization problem. The vast majority of optimization problems (certainly all of the ones that we will study in the course) can be cast in this form.

**Definition 4.1** (Optimization problem).
$$\text{minimize} \quad f_0(x) \tag{50}$$
$$\text{subject to} \quad f_i(x) \leq 0, \quad 1 \leq i \leq m, \tag{51}$$
$$\phantom{\text{subject to} \quad} h_i(x) = 0, \quad 1 \leq i \leq p, \tag{52}$$
where $f_0, f_1, \ldots, f_m, h_1, \ldots, h_p : \mathbb{R}^n \to \mathbb{R}$.

The problem consists of a cost function $f_0$, inequality constraints, and equality constraints. Any vector that satisfies all the constraints in the problem is said to be feasible. A solution to the problem is any vector $x^*$ such that for all feasible vectors $x$,
$$f_0(x) \geq f_0(x^*). \tag{53}$$
If a solution exists, $f_0(x^*)$ is the optimal value or optimum of the optimization problem.

An optimization problem is convex if it satisfies the following conditions:
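The characterization in Proposition 3.5 translates directly into code. The sketch below (ours, not from the notes; helper names are our own) returns one particular element of the subdifferential and verifies the subgradient inequality $\|y\|_1 \geq \|x\|_1 + q^T(y - x)$ on a few test points.

```python
def l1_subgradient(x):
    """One element of the subdifferential of the l1 norm at x
    (Proposition 3.5): sign(x_i) where x_i != 0, and 0 where x_i == 0
    (any value in [-1, 1] would also be valid at zero entries)."""
    return [(1.0 if xi > 0 else -1.0) if xi != 0 else 0.0 for xi in x]

def norm1(x):
    return sum(abs(xi) for xi in x)

x = [1.5, 0.0, -2.0]
q = l1_subgradient(x)
# Subgradient inequality: ||y||_1 >= ||x||_1 + q^T (y - x) for all y.
for y in ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [-3.0, 2.0, 0.5]):
    inner = sum(qi * (yi - xi) for qi, yi, xi in zip(q, y, x))
    assert norm1(y) >= norm1(x) + inner - 1e-12
```

The zero entries of `x` are where the subdifferential is set-valued; the choice of `0.0` there is what makes soft-thresholding-style algorithms well defined.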

- The cost function $f_0$ is convex.
- The functions that determine the inequality constraints, $f_1, \ldots, f_m$, are convex.
- The functions that determine the equality constraints, $h_1, \ldots, h_p$, are affine, i.e. $h_i(x) = a_i^T x + b_i$ for some $a_i \in \mathbb{R}^n$ and $b_i \in \mathbb{R}$.

Note that under these assumptions the feasibility set is convex. Indeed, it corresponds to the intersection of several convex sets: the 0-sublevel sets of $f_1, \ldots, f_m$, which are convex by Lemma 1.14, and the hyperplanes $h_i(x) = a_i^T x + b_i = 0$. The intersection is convex by Lemma 1.2.

If both the cost function and the constraint functions are all affine, the problem is a linear program (LP).

**Definition 4.2** (Linear program).
$$\text{minimize} \quad a^T x \tag{54}$$
$$\text{subject to} \quad c_i^T x \leq d_i, \quad 1 \leq i \leq m, \tag{55}$$
$$\phantom{\text{subject to} \quad} Ax = b. \tag{56}$$

It turns out that $\ell_1$-norm minimization can be cast as an LP. The theorem is proved in Section B.11 of the appendix.

**Theorem 4.3** ($\ell_1$-norm minimization as an LP). The optimization problem
$$\text{minimize} \quad \|x\|_1 \tag{57}$$
$$\text{subject to} \quad Ax = b \tag{58}$$
can be recast as the linear program
$$\text{minimize} \quad \sum_{i=1}^n t_i \tag{59}$$
$$\text{subject to} \quad t_i \geq x_i, \tag{60}$$
$$\phantom{\text{subject to} \quad} t_i \geq -x_i, \tag{61}$$
$$\phantom{\text{subject to} \quad} Ax = b. \tag{62}$$

If the cost function is a positive semidefinite quadratic form and the constraints are affine, the problem is a quadratic program (QP).

**Definition 4.4** (Quadratic program).
$$\text{minimize} \quad x^T Q x + a^T x \tag{63}$$
$$\text{subject to} \quad c_i^T x \leq d_i, \quad 1 \leq i \leq m, \tag{64}$$
$$\phantom{\text{subject to} \quad} Ax = b, \tag{65}$$
where $Q \in \mathbb{R}^{n \times n}$ is positive semidefinite.
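The recast (59)-(62) can be assembled mechanically over the stacked variable $z = (x, t)$. The sketch below (ours, not from the notes; the helper name `l1_lp_data` is hypothetical) builds the LP data in the standard "inequalities plus equalities" form that off-the-shelf LP solvers accept.

```python
def l1_lp_data(A, b):
    """Build the data of the LP recast (59)-(62) of l1-norm minimization
    over the stacked variable z = (x, t): minimize c^T z subject to
    G z <= h (encoding -t_i <= x_i <= t_i) and [A 0] z = b."""
    n = len(A[0])
    c = [0.0] * n + [1.0] * n                      # objective: sum of t_i
    G, h = [], []
    for i in range(n):
        row_pos = [0.0] * (2 * n)
        row_pos[i], row_pos[n + i] = 1.0, -1.0     # x_i - t_i <= 0
        row_neg = [0.0] * (2 * n)
        row_neg[i], row_neg[n + i] = -1.0, -1.0    # -x_i - t_i <= 0
        G += [row_pos, row_neg]
        h += [0.0, 0.0]
    A_eq = [row + [0.0] * n for row in A]          # equality constraints Ax = b
    return c, G, h, A_eq, b

c, G, h, A_eq, b_eq = l1_lp_data([[1.0, 1.0, 0.0]], [1.0])
assert c == [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
assert len(G) == 6 and A_eq == [[1.0, 1.0, 0.0, 0.0, 0.0, 0.0]]
```

These arrays could then be handed to a solver such as `scipy.optimize.linprog` (passing `c`, `A_ub=G`, `b_ub=h`, `A_eq`, `b_eq`), though solver availability is an assumption here.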

A corollary of Theorem 4.3 is that $\ell_1$-norm regularized least squares can be cast as a QP.

**Corollary 4.5** ($\ell_1$-norm regularized least squares as a QP). The optimization problem
$$\text{minimize} \quad \|Ax - y\|_2^2 + \lambda \|x\|_1 \tag{66}$$
can be recast as the quadratic program
$$\text{minimize} \quad x^T A^T A x - 2 y^T A x + \lambda \sum_{i=1}^n t_i \tag{67}$$
$$\text{subject to} \quad t_i \geq x_i, \tag{68}$$
$$\phantom{\text{subject to} \quad} t_i \geq -x_i. \tag{69}$$

We will discuss other types of convex optimization problems, such as semidefinite programs, later on in the course.

### 4.2 Duality

The Lagrangian of the optimization problem in Definition 4.1 is defined as the cost function augmented by a weighted linear combination of the constraint functions,
$$\mathcal{L}(x, \lambda, \nu) := f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{j=1}^p \nu_j h_j(x), \tag{70}$$
where the vectors $\lambda \in \mathbb{R}^m$ and $\nu \in \mathbb{R}^p$ are called Lagrange multipliers or dual variables. In contrast, $x$ is the primal variable.

Note that as long as $\lambda_i \geq 0$ for $1 \leq i \leq m$, the Lagrangian is a lower bound for the value of the cost function at any feasible point. Indeed, if $x$ is feasible and $\lambda_i \geq 0$ for $1 \leq i \leq m$, then
$$\lambda_i f_i(x) \leq 0, \tag{71}$$
$$\nu_j h_j(x) = 0, \tag{72}$$
for all $1 \leq i \leq m$ and $1 \leq j \leq p$. This immediately implies
$$\mathcal{L}(x, \lambda, \nu) \leq f_0(x). \tag{73}$$

The Lagrange dual function is the infimum of the Lagrangian over the primal variable $x$,
$$l(\lambda, \nu) := \inf_{x \in \mathbb{R}^n} \left( f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{j=1}^p \nu_j h_j(x) \right). \tag{74}$$

**Proposition 4.6** (Lagrange dual function as a lower bound of the primal optimum). Let $p^*$ denote the optimal value of the optimization problem in Definition 4.1. As long as $\lambda_i \geq 0$ for $1 \leq i \leq m$,
$$l(\lambda, \nu) \leq p^*. \tag{75}$$

*Proof.* The result follows directly from (73). For any solution $x^*$,
$$p^* = f_0(x^*) \tag{76}$$
$$\geq \mathcal{L}(x^*, \lambda, \nu) \tag{77}$$
$$\geq l(\lambda, \nu). \tag{78}$$

Optimizing this lower bound on the primal optimum over the Lagrange multipliers yields the dual problem of the original optimization problem, which is called the primal problem in this context.

**Definition 4.7** (Dual problem). The dual problem of the optimization problem from Definition 4.1 is
$$\text{maximize} \quad l(\lambda, \nu) \tag{79}$$
$$\text{subject to} \quad \lambda_i \geq 0, \quad 1 \leq i \leq m. \tag{80}$$

Note that $-l$ is a pointwise supremum of affine (and hence convex) functions of $(\lambda, \nu)$, so by Proposition 1.21 the dual problem is a convex optimization problem even if the primal is nonconvex!

The following result, an immediate corollary of Proposition 4.6, states that the optimum of the dual problem is a lower bound for the primal optimum. This is known as weak duality.

**Corollary 4.8** (Weak duality). Let $p^*$ denote an optimum of the optimization problem in Definition 4.1 and $d^*$ an optimum of the corresponding dual problem. Then
$$d^* \leq p^*. \tag{81}$$

In the case of convex problems, the optima of the primal and dual problems are often equal, i.e.
$$d^* = p^*. \tag{82}$$
This is known as strong duality. A simple condition that guarantees strong duality for convex optimization problems is Slater's condition.
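Weak and strong duality can be seen concretely on a toy problem (this worked example is ours, not from the notes): minimize $x^2$ subject to $1 - x \leq 0$. The Lagrangian $\mathcal{L}(x, \lambda) = x^2 + \lambda(1 - x)$ is minimized over $x$ at $x = \lambda/2$, giving the dual function $l(\lambda) = \lambda - \lambda^2/4$.

```python
def dual(lam):
    """Lagrange dual of: minimize x^2 subject to 1 - x <= 0.
    L(x, lam) = x^2 + lam*(1 - x) is minimized at x = lam/2, giving
    l(lam) = lam - lam**2 / 4."""
    return lam - lam ** 2 / 4

p_star = 1.0  # primal optimum, attained at x = 1

# Weak duality: l(lam) <= p* for every lam >= 0.
assert all(dual(k / 10) <= p_star + 1e-12 for k in range(0, 50))

# Strong duality holds here (e.g. x = 2 is strictly feasible):
# the dual is maximized at lam = 2, where l(2) = p* = 1.
assert abs(dual(2.0) - p_star) < 1e-12
```

The strictly feasible point $x = 2$ is exactly the kind of certificate required by Slater's condition, defined next.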

**Definition 4.9** (Slater's condition). A vector $x \in \mathbb{R}^n$ satisfies Slater's condition for a convex optimization problem if
$$f_i(x) < 0, \quad 1 \leq i \leq m, \tag{83}$$
$$Ax = b. \tag{84}$$

A proof of strong duality under Slater's condition can be found in [1].

## References

A very readable and exhaustive reference is Boyd and Vandenberghe's seminal book on convex optimization [1], which unfortunately does not cover subgradients. Nesterov's book [2] and Rockafellar's book [3] do cover subgradients.

[1] S. Boyd and L. Vandenberghe. *Convex Optimization*. Cambridge University Press, 2004.

[2] Y. Nesterov. *Introductory Lectures on Convex Optimization: A Basic Course*. Kluwer Academic Publishers, 2004.

[3] R. T. Rockafellar and R. J.-B. Wets. *Variational Analysis*. Springer Science & Business Media, 1998.

## A Norms

The norm of a vector is a generalization of the concept of length.

**Definition A.1** (Norm). Let $\mathcal{V}$ be a vector space. A norm is a function $\|\cdot\|$ from $\mathcal{V}$ to $\mathbb{R}$ that satisfies the following conditions.

- It is homogeneous: for all $\alpha \in \mathbb{R}$ and $x \in \mathcal{V}$,
$$\|\alpha x\| = |\alpha| \|x\|. \tag{85}$$
- It satisfies the triangle inequality,
$$\|x + y\| \leq \|x\| + \|y\|. \tag{86}$$
In particular, it is nonnegative (set $y = -x$).
- $\|x\| = 0$ implies that $x$ is the zero vector $0$.

The $\ell_2$ norm is induced by the inner product $\langle x, y \rangle = x^T y$,
$$\|x\|_2 := \sqrt{x^T x}. \tag{87}$$

**Definition A.2** ($\ell_2$ norm). The $\ell_2$ norm of a vector $x \in \mathbb{R}^n$ is defined as
$$\|x\|_2 := \sqrt{\sum_{i=1}^n x_i^2}. \tag{88}$$

**Definition A.3** ($\ell_1$ norm). The $\ell_1$ norm of a vector $x \in \mathbb{R}^n$ is defined as
$$\|x\|_1 := \sum_{i=1}^n |x_i|. \tag{89}$$

**Definition A.4** ($\ell_\infty$ norm). The $\ell_\infty$ norm of a vector $x \in \mathbb{R}^n$ is defined as
$$\|x\|_\infty := \max_{1 \leq i \leq n} |x_i|. \tag{90}$$

**Remark A.5.** The $\ell_0$ "norm" is not a norm, as it is not homogeneous. For example, if $x$ is not the zero vector,
$$\|2x\|_0 = \|x\|_0 \neq 2 \|x\|_0. \tag{91}$$

## B Proofs

### B.1 Proof of Theorem 1.3

**Existence**

Since $S$ is non-empty, we can choose an arbitrary point $s' \in S$. Minimizing $\|x - s\|_2$ over $S$ is equivalent to minimizing $\|x - s\|_2$ over $S \cap \{y \mid \|x - y\|_2 \leq \|x - s'\|_2\}$. Indeed, the solution cannot be a point that is farther away from $x$ than $s'$. By the Weierstrass extreme-value theorem, the optimization problem
$$\text{minimize} \quad \|x - s\|_2^2 \tag{92}$$
$$\text{subject to} \quad s \in S \cap \{y \mid \|x - y\|_2 \leq \|x - s'\|_2\} \tag{93}$$
has a solution because $\|x - s\|_2^2$ is a continuous function and the feasibility set is bounded and closed, and hence compact. Note that this argument also holds if $S$ is not convex.

**Uniqueness**

Assume that there are two distinct projections $s_1 \neq s_2$. Consider the point
$$s := \frac{s_1 + s_2}{2}, \tag{94}$$
which belongs to $S$ because $S$ is convex. The difference between $x$ and $s$ and the difference between $s_1$ and $s$ are orthogonal vectors,
$$\left\langle x - s, s_1 - s \right\rangle = \left\langle x - \frac{s_1 + s_2}{2}, \; s_1 - \frac{s_1 + s_2}{2} \right\rangle \tag{95}$$
$$= \left\langle \frac{(x - s_1) + (x - s_2)}{2}, \; \frac{(x - s_2) - (x - s_1)}{2} \right\rangle \tag{96}$$
$$= \frac{1}{4} \left( \|x - s_2\|_2^2 - \|x - s_1\|_2^2 \right) \tag{97}$$
$$= 0, \tag{98}$$
because $\|x - s_1\|_2 = \|x - s_2\|_2$ by assumption. By Pythagoras's theorem this implies
$$\|x - s_1\|_2^2 = \|x - s\|_2^2 + \|s_1 - s\|_2^2 \tag{99}$$
$$= \|x - s\|_2^2 + \left\| \frac{s_1 - s_2}{2} \right\|_2^2 \tag{100}$$
$$> \|x - s\|_2^2 \tag{101}$$
because $s_1 \neq s_2$ by assumption. Since $s$ belongs to $S$ and is strictly closer to $x$ than $s_1$, we have reached a contradiction, so the projection is unique.

### B.2 Proof of Lemma 1.9

The proof for strict convexity is exactly the same, replacing the inequalities by strict inequalities.

**$f$ being convex implies that $g_{x,y}$ is convex for any $x, y \in \mathbb{R}^n$**

For any $\alpha, \beta, \theta \in (0,1)$,
$$g_{x,y}(\theta \alpha + (1-\theta) \beta) = f\big( (\theta \alpha + (1-\theta) \beta) x + (1 - \theta \alpha - (1-\theta) \beta) y \big) \tag{102}$$
$$= f\big( \theta (\alpha x + (1-\alpha) y) + (1-\theta)(\beta x + (1-\beta) y) \big) \tag{103}$$
$$\leq \theta f(\alpha x + (1-\alpha) y) + (1-\theta) f(\beta x + (1-\beta) y) \quad \text{by convexity of } f$$
$$= \theta g_{x,y}(\alpha) + (1-\theta) g_{x,y}(\beta). \tag{104}$$

**$g_{x,y}$ being convex for any $x, y \in \mathbb{R}^n$ implies that $f$ is convex**

For any $\theta \in (0,1)$,
$$f(\theta x + (1-\theta) y) = g_{x,y}(\theta) \tag{105}$$
$$\leq \theta g_{x,y}(1) + (1-\theta) g_{x,y}(0) \quad \text{by convexity of } g_{x,y} \tag{106}$$
$$= \theta f(x) + (1-\theta) f(y). \tag{107}$$

### B.3 Proof of Lemma 1.6

We prove that the $\ell_1$-norm ball $\mathcal{B}_{\ell_1}$ is equal to the convex hull $\mathcal{C}(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$ of the intersection between the $\ell_0$ "norm" ball $\mathcal{B}_{\ell_0}$ and the $\ell_\infty$-norm ball $\mathcal{B}_{\ell_\infty}$ by showing that the sets contain each other.

**$\mathcal{B}_{\ell_1} \subseteq \mathcal{C}(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$**

Let $x$ be an $n$-dimensional vector in $\mathcal{B}_{\ell_1}$. If we set $\theta_i := |x_i|$ for $1 \leq i \leq n$, where $x_i$ is the $i$th entry of $x$, and $\theta_0 := 1 - \sum_{i=1}^n \theta_i$, we have
$$\sum_{i=0}^n \theta_i = 1, \tag{108}$$
$$\theta_i \geq 0 \quad \text{for } 1 \leq i \leq n \text{ by definition}, \tag{109}$$
$$\theta_0 = 1 - \sum_{i=1}^n \theta_i \tag{110}$$
$$= 1 - \|x\|_1 \tag{111}$$
$$\geq 0 \quad \text{because } x \in \mathcal{B}_{\ell_1}. \tag{112}$$
We can express $x$ as a convex combination of the signed standard basis vectors $\operatorname{sign}(x_1) e_1, \ldots, \operatorname{sign}(x_n) e_n$, which belong to $\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty}$ since they have a single nonzero entry with magnitude one, and the zero vector $e_0 := 0$, which also belongs to $\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty}$:
$$x = \sum_{i=1}^n \theta_i \operatorname{sign}(x_i) e_i + \theta_0 e_0. \tag{113}$$

**$\mathcal{C}(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty}) \subseteq \mathcal{B}_{\ell_1}$**

Let $x$ be an $n$-dimensional vector in $\mathcal{C}(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$. By the definition of convex hull, we can write
$$x = \sum_{i=1}^m \theta_i y_i, \tag{114}$$

where $m > 0$, the vectors $y_1, \ldots, y_m \in \mathbb{R}^n$ have a single nonzero entry with magnitude bounded by one, $\theta_i \geq 0$ for all $1 \leq i \leq m$, and $\sum_{i=1}^m \theta_i = 1$. This immediately implies $x \in \mathcal{B}_{\ell_1}$, since
$$\|x\|_1 \leq \sum_{i=1}^m \theta_i \|y_i\|_1 \quad \text{by the triangle inequality} \tag{115}$$
$$= \sum_{i=1}^m \theta_i \|y_i\|_\infty \quad \text{because each } y_i \text{ only has one nonzero entry} \tag{116}$$
$$\leq \sum_{i=1}^m \theta_i \tag{117}$$
$$= 1. \tag{118}$$

### B.4 Proof of Theorem 1.12

We prove the result by contradiction. Let $x_{\text{loc}}$ be a local minimum and $x_{\text{glob}}$ a global minimum such that $f(x_{\text{glob}}) < f(x_{\text{loc}})$. Since $x_{\text{loc}}$ is a local minimum, there exists $\gamma > 0$ for which $f(x_{\text{loc}}) \leq f(x)$ for all $x \in \mathbb{R}^n$ such that $\|x - x_{\text{loc}}\|_2 \leq \gamma$. If we choose $\theta \in (0,1)$ close enough to one, $x_\theta := \theta x_{\text{loc}} + (1-\theta) x_{\text{glob}}$ satisfies $\|x_\theta - x_{\text{loc}}\|_2 \leq \gamma$ and therefore
$$f(x_{\text{loc}}) \leq f(x_\theta) \tag{119}$$
$$\leq \theta f(x_{\text{loc}}) + (1-\theta) f(x_{\text{glob}}) \quad \text{by convexity of } f \tag{120}$$
$$< f(x_{\text{loc}}) \quad \text{because } f(x_{\text{glob}}) < f(x_{\text{loc}}). \tag{121}$$

### B.5 Proof of Lemma 1.16

**$f$ being convex implies that $\operatorname{epi}(f)$ is convex**

Let $x, y \in \operatorname{epi}(f) \subseteq \mathbb{R}^{n+1}$. Then for any $\theta \in (0,1)$,
$$f(\theta x_{1:n} + (1-\theta) y_{1:n}) \leq \theta f(x_{1:n}) + (1-\theta) f(y_{1:n}) \quad \text{by convexity of } f \tag{122}$$
$$\leq \theta x_{n+1} + (1-\theta) y_{n+1} \tag{123}$$
because $x, y \in \operatorname{epi}(f)$, so $f(x_{1:n}) \leq x_{n+1}$ and $f(y_{1:n}) \leq y_{n+1}$. This implies that $\theta x + (1-\theta) y \in \operatorname{epi}(f)$.

**$\operatorname{epi}(f)$ being convex implies that $f$ is convex**

For any $x, y \in \mathbb{R}^n$, let us define $\tilde{x}, \tilde{y} \in \mathbb{R}^{n+1}$ such that
$$\tilde{x}_{1:n} := x, \quad \tilde{x}_{n+1} := f(x), \tag{124}$$
$$\tilde{y}_{1:n} := y, \quad \tilde{y}_{n+1} := f(y). \tag{125}$$

By definition of $\operatorname{epi}(f)$, $\tilde{x}, \tilde{y} \in \operatorname{epi}(f)$. For any $\theta \in (0,1)$, $\theta \tilde{x} + (1-\theta) \tilde{y}$ belongs to $\operatorname{epi}(f)$ because $\operatorname{epi}(f)$ is convex. As a result,
$$f(\theta x + (1-\theta) y) = f(\theta \tilde{x}_{1:n} + (1-\theta) \tilde{y}_{1:n}) \tag{126}$$
$$\leq \theta \tilde{x}_{n+1} + (1-\theta) \tilde{y}_{n+1} \tag{127}$$
$$= \theta f(x) + (1-\theta) f(y). \tag{128}$$

### B.6 Proof of Proposition 2.1

The proof for strict convexity is almost exactly the same; we omit the details. The following lemma, proved in Section B.7 below, establishes that the result holds for univariate functions.

**Lemma B.1.** A univariate differentiable function $g : \mathbb{R} \to \mathbb{R}$ is convex if and only if for all $\alpha, \beta \in \mathbb{R}$,
$$g(\beta) \geq g(\alpha) + g'(\alpha)(\beta - \alpha), \tag{129}$$
and strictly convex if and only if for all $\alpha \neq \beta$,
$$g(\beta) > g(\alpha) + g'(\alpha)(\beta - \alpha). \tag{130}$$

To complete the proof, we extend the result to the multivariable case using Lemma 1.9.

**If $f(y) \geq f(x) + \nabla f(x)^T (y - x)$ for any $x, y \in \mathbb{R}^n$ then $f$ is convex**

By Lemma 1.9, we just need to show that the univariate function $g_{a,b}$ defined by (9) is convex for all $a, b \in \mathbb{R}^n$. Applying some basic multivariable calculus yields
$$g'_{a,b}(\alpha) = \nabla f(\alpha a + (1-\alpha) b)^T (a - b). \tag{131}$$
Let $\alpha, \beta \in \mathbb{R}$. Setting $x := \alpha a + (1-\alpha) b$ and $y := \beta a + (1-\beta) b$, so that $y - x = (\beta - \alpha)(a - b)$, we have
$$g_{a,b}(\beta) = f(y) \tag{132}$$
$$\geq f(x) + \nabla f(x)^T (y - x) \tag{133}$$
$$= f(\alpha a + (1-\alpha) b) + \nabla f(\alpha a + (1-\alpha) b)^T (a - b)(\beta - \alpha) \tag{134}$$
$$= g_{a,b}(\alpha) + g'_{a,b}(\alpha)(\beta - \alpha) \quad \text{by (131)}, \tag{135}$$
which establishes that $g_{a,b}$ is convex by Lemma B.1 above.

**If $f$ is convex then $f(y) \geq f(x) + \nabla f(x)^T (y - x)$ for any $x, y \in \mathbb{R}^n$**

By Lemma 1.9, $g_{x,y}$ is convex for any $x, y \in \mathbb{R}^n$, so
$$f(y) = g_{x,y}(0) \tag{136}$$
$$\geq g_{x,y}(1) + g'_{x,y}(1)(0 - 1) \quad \text{by convexity of } g_{x,y} \text{ and Lemma B.1} \tag{137}$$
$$= f(x) + \nabla f(x)^T (y - x) \quad \text{by (131)}. \tag{138}$$

### B.7 Proof of Lemma B.1

**$g$ being convex implies $g(\beta) \geq g(\alpha) + g'(\alpha)(\beta - \alpha)$ for all $\alpha, \beta \in \mathbb{R}$**

If $g$ is convex, then for any $\alpha, \beta \in \mathbb{R}$ and any $0 \leq \theta \leq 1$,
$$\theta \big( g(\beta) - g(\alpha) \big) + g(\alpha) \geq g(\alpha + \theta (\beta - \alpha)). \tag{139}$$
Rearranging the terms, we have
$$g(\beta) \geq \frac{g(\alpha + \theta (\beta - \alpha)) - g(\alpha)}{\theta} + g(\alpha). \tag{140}$$
Setting $h = \theta (\beta - \alpha)$, this implies
$$g(\beta) \geq \frac{g(\alpha + h) - g(\alpha)}{h} (\beta - \alpha) + g(\alpha). \tag{141}$$
Taking the limit as $h \to 0$ yields
$$g(\beta) \geq g(\alpha) + g'(\alpha)(\beta - \alpha). \tag{142}$$

**If $g(\beta) \geq g(\alpha) + g'(\alpha)(\beta - \alpha)$ for all $\alpha, \beta \in \mathbb{R}$ then $g$ is convex**

Let $z = \theta \alpha + (1-\theta) \beta$. Then, applying the hypothesis,
$$g(\alpha) \geq g'(z)(\alpha - z) + g(z) \tag{143}$$
$$= g'(z)(1-\theta)(\alpha - \beta) + g(z), \tag{144}$$
$$g(\beta) \geq g'(z)(\beta - z) + g(z) \tag{145}$$
$$= g'(z) \theta (\beta - \alpha) + g(z). \tag{146}$$
Multiplying (144) by $\theta$, (146) by $1-\theta$, and summing the inequalities, we obtain
$$\theta g(\alpha) + (1-\theta) g(\beta) \geq g(\theta \alpha + (1-\theta) \beta). \tag{147}$$

### B.8 Proof of Lemma 2.4

The second derivative of $g$ is nonnegative everywhere if and only if the first derivative is nondecreasing, because $g''$ is the derivative of $g'$.

**If $g$ is convex, $g'$ is nondecreasing**

By Lemma B.1, if the function is convex, then for any $\alpha, \beta \in \mathbb{R}$ such that $\beta > \alpha$,
$$g(\alpha) \geq g'(\beta)(\alpha - \beta) + g(\beta), \tag{148}$$
$$g(\beta) \geq g'(\alpha)(\beta - \alpha) + g(\alpha). \tag{149}$$
Rearranging, we obtain
$$g'(\beta)(\beta - \alpha) \geq g(\beta) - g(\alpha) \geq g'(\alpha)(\beta - \alpha). \tag{150}$$
Since $\beta - \alpha > 0$, we have $g'(\beta) \geq g'(\alpha)$.

**If $g'$ is nondecreasing, $g$ is convex**

For arbitrary $\alpha, \beta, \theta \in \mathbb{R}$ such that $\beta > \alpha$ and $0 < \theta < 1$, let $\eta = \theta \beta + (1-\theta) \alpha$. Since $\beta > \eta > \alpha$, by the mean-value theorem there exist $\gamma_1 \in [\alpha, \eta]$ and $\gamma_2 \in [\eta, \beta]$ such that
$$g'(\gamma_1) = \frac{g(\eta) - g(\alpha)}{\eta - \alpha}, \tag{151}$$
$$g'(\gamma_2) = \frac{g(\beta) - g(\eta)}{\beta - \eta}. \tag{152}$$
Since $\gamma_1 < \gamma_2$, if $g'$ is nondecreasing,
$$\frac{g(\beta) - g(\eta)}{\beta - \eta} \geq \frac{g(\eta) - g(\alpha)}{\eta - \alpha}, \tag{153}$$
which implies
$$\frac{\eta - \alpha}{\beta - \alpha} g(\beta) + \frac{\beta - \eta}{\beta - \alpha} g(\alpha) \geq g(\eta). \tag{154}$$
Recall that $\eta = \theta \beta + (1-\theta) \alpha$, so that $\theta = (\eta - \alpha)/(\beta - \alpha)$ and $1-\theta = (\beta - \eta)/(\beta - \alpha)$. (154) is consequently equivalent to
$$\theta g(\beta) + (1-\theta) g(\alpha) \geq g(\theta \beta + (1-\theta) \alpha). \tag{155}$$

### B.9 Proof of Theorem 3.2

For arbitrary $x, y \in \mathbb{R}^n$ and $\alpha \in (0,1)$, there exists a subgradient $q$ of $f$ at $\alpha x + (1-\alpha) y$. This implies
$$f(y) \geq f(\alpha x + (1-\alpha) y) + q^T \big( y - \alpha x - (1-\alpha) y \big) \tag{156}$$
$$= f(\alpha x + (1-\alpha) y) + \alpha q^T (y - x), \tag{157}$$
$$f(x) \geq f(\alpha x + (1-\alpha) y) + q^T \big( x - \alpha x - (1-\alpha) y \big) \tag{158}$$
$$= f(\alpha x + (1-\alpha) y) + (1-\alpha) q^T (x - y). \tag{159}$$
Multiplying (157) by $1-\alpha$, multiplying (159) by $\alpha$, and adding them together yields
$$\alpha f(x) + (1-\alpha) f(y) \geq f(\alpha x + (1-\alpha) y). \tag{160}$$

### B.10 Proof of Lemma 3.6

**If $q$ is a subgradient of $\|\cdot\|_1$ at $x$, then $q_i$ is a subgradient of $|\cdot|$ at $x_i$ for $1 \leq i \leq n$**

For any $y_i \in \mathbb{R}$,
$$|y_i| = \|x + (y_i - x_i) e_i\|_1 - \|x\|_1 + |x_i| \tag{161}$$
$$\geq |x_i| + q^T (y_i - x_i) e_i \tag{162}$$
$$= |x_i| + q_i (y_i - x_i). \tag{163}$$

**If $q_i$ is a subgradient of $|\cdot|$ at $x_i$ for $1 \leq i \leq n$, then $q$ is a subgradient of $\|\cdot\|_1$ at $x$**

For any $y \in \mathbb{R}^n$,
$$\|y\|_1 = \sum_{i=1}^n |y_i| \tag{164}$$
$$\geq \sum_{i=1}^n \big( |x_i| + q_i (y_i - x_i) \big) \tag{165}$$
$$= \|x\|_1 + q^T (y - x). \tag{166}$$

### B.11 Proof of Theorem 4.3

To show that the linear program and the $\ell_1$-norm minimization problem are equivalent, we show that they have the same set of solutions.

Let us denote an arbitrary solution of the LP by $(x^{\text{lp}}, t^{\text{lp}})$. For any solution $x^{\ell_1}$ of the $\ell_1$-norm minimization problem, we define $t^{\ell_1}_i := |x^{\ell_1}_i|$. $(x^{\ell_1}, t^{\ell_1})$ is feasible for the LP, so
$$\|x^{\ell_1}\|_1 = \sum_{i=1}^n t^{\ell_1}_i \tag{167}$$
$$\geq \sum_{i=1}^n t^{\text{lp}}_i \quad \text{by optimality of } t^{\text{lp}} \tag{168}$$
$$\geq \|x^{\text{lp}}\|_1 \quad \text{by constraints (60) and (61)}. \tag{169}$$
This implies that any solution of the LP is also a solution of the $\ell_1$-norm minimization problem.

To prove the converse, we fix a solution $x^{\ell_1}$ of the $\ell_1$-norm minimization problem. Setting $t^{\ell_1}_i := |x^{\ell_1}_i|$, we show that $(x^{\ell_1}, t^{\ell_1})$ is a solution of the LP. Indeed,
$$\sum_{i=1}^n t^{\ell_1}_i = \|x^{\ell_1}\|_1 \tag{170}$$
$$\leq \|x^{\text{lp}}\|_1 \quad \text{by optimality of } x^{\ell_1} \tag{171}$$
$$\leq \sum_{i=1}^n t^{\text{lp}}_i \quad \text{by constraints (60) and (61)}. \tag{172}$$


### 3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions

3. Convex functions Convex Optimization Boyd & Vandenberghe basic properties and examples operations that preserve convexity the conjugate function quasiconvex functions log-concave and log-convex functions

### Course 221: Analysis Academic year , First Semester

Course 221: Analysis Academic year 2007-08, First Semester David R. Wilkins Copyright c David R. Wilkins 1989 2007 Contents 1 Basic Theorems of Real Analysis 1 1.1 The Least Upper Bound Principle................

### Several Views of Support Vector Machines

Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min

### NONLINEAR AND DYNAMIC OPTIMIZATION From Theory to Practice

NONLINEAR AND DYNAMIC OPTIMIZATION From Theory to Practice IC-32: Winter Semester 2006/2007 Benoît C. CHACHUAT Laboratoire d Automatique, École Polytechnique Fédérale de Lausanne CONTENTS 1 Nonlinear

### Chapter 15 Introduction to Linear Programming

Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of

### Absolute Value Programming

Computational Optimization and Aplications,, 1 11 (2006) c 2006 Springer Verlag, Boston. Manufactured in The Netherlands. Absolute Value Programming O. L. MANGASARIAN olvi@cs.wisc.edu Computer Sciences

### Cost Minimization and the Cost Function

Cost Minimization and the Cost Function Juan Manuel Puerta October 5, 2009 So far we focused on profit maximization, we could look at a different problem, that is the cost minimization problem. This is

### Outline. Optimization scheme Linear search methods Gradient descent Conjugate gradient Newton method Quasi-Newton methods

Outline 1 Optimization without constraints Optimization scheme Linear search methods Gradient descent Conjugate gradient Newton method Quasi-Newton methods 2 Optimization under constraints Lagrange Equality

### Vector Spaces II: Finite Dimensional Linear Algebra 1

John Nachbar September 2, 2014 Vector Spaces II: Finite Dimensional Linear Algebra 1 1 Definitions and Basic Theorems. For basic properties and notation for R N, see the notes Vector Spaces I. Definition

### 10. Proximal point method

L. Vandenberghe EE236C Spring 2013-14) 10. Proximal point method proximal point method augmented Lagrangian method Moreau-Yosida smoothing 10-1 Proximal point method a conceptual algorithm for minimizing

### Mathematics for Economics (Part I) Note 5: Convex Sets and Concave Functions

Natalia Lazzati Mathematics for Economics (Part I) Note 5: Convex Sets and Concave Functions Note 5 is based on Madden (1986, Ch. 1, 2, 4 and 7) and Simon and Blume (1994, Ch. 13 and 21). Concave functions

### Weak topologies. David Lecomte. May 23, 2006

Weak topologies David Lecomte May 3, 006 1 Preliminaries from general topology In this section, we are given a set X, a collection of topological spaces (Y i ) i I and a collection of maps (f i ) i I such

### Tangent and normal lines to conics

4.B. Tangent and normal lines to conics Apollonius work on conics includes a study of tangent and normal lines to these curves. The purpose of this document is to relate his approaches to the modern viewpoints

### 1 Norms and Vector Spaces

008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)

### Modern Geometry Homework.

Modern Geometry Homework. 1. Rigid motions of the line. Let R be the real numbers. We define the distance between x, y R by where is the usual absolute value. distance between x and y = x y z = { z, z

### Mathematics Notes for Class 12 chapter 12. Linear Programming

1 P a g e Mathematics Notes for Class 12 chapter 12. Linear Programming Linear Programming It is an important optimization (maximization or minimization) technique used in decision making is business and

### On generalized gradients and optimization

On generalized gradients and optimization Erik J. Balder 1 Introduction There exists a calculus for general nondifferentiable functions that englobes a large part of the familiar subdifferential calculus

### Summary of week 8 (Lectures 22, 23 and 24)

WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry

### Minimize subject to. x S R

Chapter 12 Lagrangian Relaxation This chapter is mostly inspired by Chapter 16 of [1]. In the previous chapters, we have succeeded to find efficient algorithms to solve several important problems such

### Sequences and Convergence in Metric Spaces

Sequences and Convergence in Metric Spaces Definition: A sequence in a set X (a sequence of elements of X) is a function s : N X. We usually denote s(n) by s n, called the n-th term of s, and write {s

Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

### Section Notes 3. The Simplex Algorithm. Applied Math 121. Week of February 14, 2011

Section Notes 3 The Simplex Algorithm Applied Math 121 Week of February 14, 2011 Goals for the week understand how to get from an LP to a simplex tableau. be familiar with reduced costs, optimal solutions,

### Linear Programming, Lagrange Multipliers, and Duality Geoff Gordon

lp.nb 1 Linear Programming, Lagrange Multipliers, and Duality Geoff Gordon lp.nb 2 Overview This is a tutorial about some interesting math and geometry connected with constrained optimization. It is not

Quadratic Functions, Optimization, and Quadratic Forms Robert M. Freund February, 2004 2004 Massachusetts Institute of echnology. 1 2 1 Quadratic Optimization A quadratic optimization problem is an optimization

### (Basic definitions and properties; Separation theorems; Characterizations) 1.1 Definition, examples, inner description, algebraic properties

Lecture 1 Convex Sets (Basic definitions and properties; Separation theorems; Characterizations) 1.1 Definition, examples, inner description, algebraic properties 1.1.1 A convex set In the school geometry

### Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1

Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1 J. Zhang Institute of Applied Mathematics, Chongqing University of Posts and Telecommunications, Chongqing

### ARE211, Fall2012. Contents. 2. Linear Algebra Preliminary: Level Sets, upper and lower contour sets and Gradient vectors 1

ARE11, Fall1 LINALGEBRA1: THU, SEP 13, 1 PRINTED: SEPTEMBER 19, 1 (LEC# 7) Contents. Linear Algebra 1.1. Preliminary: Level Sets, upper and lower contour sets and Gradient vectors 1.. Vectors as arrows.

### We call this set an n-dimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.

Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to

### 3. Linear Programming and Polyhedral Combinatorics

Massachusetts Institute of Technology Handout 6 18.433: Combinatorial Optimization February 20th, 2009 Michel X. Goemans 3. Linear Programming and Polyhedral Combinatorics Summary of what was seen in the

### Lecture 5 Principal Minors and the Hessian

Lecture 5 Principal Minors and the Hessian Eivind Eriksen BI Norwegian School of Management Department of Economics October 01, 2010 Eivind Eriksen (BI Dept of Economics) Lecture 5 Principal Minors and

### Getting on For all the preceding functions, discuss, whenever possible, whether local min and max are global.

Problems Warming up Find the domain of the following functions, establish in which set they are differentiable, then compute critical points. Apply 2nd order sufficient conditions to qualify them as local

### Basics Inversion and related concepts Random vectors Matrix calculus. Matrix algebra. Patrick Breheny. January 20

Matrix algebra January 20 Introduction Basics The mathematics of multiple regression revolves around ordering and keeping track of large arrays of numbers and solving systems of equations The mathematical

### Walrasian Demand. u(x) where B(p, w) = {x R n + : p x w}.

Walrasian Demand Econ 2100 Fall 2015 Lecture 5, September 16 Outline 1 Walrasian Demand 2 Properties of Walrasian Demand 3 An Optimization Recipe 4 First and Second Order Conditions Definition Walrasian

### 1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where.

Introduction Linear Programming Neil Laws TT 00 A general optimization problem is of the form: choose x to maximise f(x) subject to x S where x = (x,..., x n ) T, f : R n R is the objective function, S

### Nonlinear Programming Methods.S2 Quadratic Programming

Nonlinear Programming Methods.S2 Quadratic Programming Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard A linearly constrained optimization problem with a quadratic objective

### Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### Number Theory Homework.

Number Theory Homework. 1. Pythagorean triples and rational points on quadratics and cubics. 1.1. Pythagorean triples. Recall the Pythagorean theorem which is that in a right triangle with legs of length

### Linear Codes. In the V[n,q] setting, the terms word and vector are interchangeable.

Linear Codes Linear Codes In the V[n,q] setting, an important class of codes are the linear codes, these codes are the ones whose code words form a sub-vector space of V[n,q]. If the subspace of V[n,q]

### SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA

SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA This handout presents the second derivative test for a local extrema of a Lagrange multiplier problem. The Section 1 presents a geometric motivation for the

### Convex Optimization SVM s and Kernel Machines

Convex Optimization SVM s and Kernel Machines S.V.N. Vishy Vishwanathan vishy@axiom.anu.edu.au National ICT of Australia and Australian National University Thanks to Alex Smola and Stéphane Canu S.V.N.

### Date: April 12, 2001. Contents

2 Lagrange Multipliers Date: April 12, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 12 2.3. Informative Lagrange Multipliers...........

### 4.6 Linear Programming duality

4.6 Linear Programming duality To any minimization (maximization) LP we can associate a closely related maximization (minimization) LP. Different spaces and objective functions but in general same optimal

### Lecture 7 - Linear Programming

COMPSCI 530: Design and Analysis of Algorithms DATE: 09/17/2013 Lecturer: Debmalya Panigrahi Lecture 7 - Linear Programming Scribe: Hieu Bui 1 Overview In this lecture, we cover basics of linear programming,

### Stanford University CS261: Optimization Handout 6 Luca Trevisan January 20, In which we introduce the theory of duality in linear programming.

Stanford University CS261: Optimization Handout 6 Luca Trevisan January 20, 2011 Lecture 6 In which we introduce the theory of duality in linear programming 1 The Dual of Linear Program Suppose that we

### Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques. My T. Thai

Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques My T. Thai 1 Overview An overview of LP relaxation and rounding method is as follows: 1. Formulate an optimization

### The distance of a curve to its control polygon J. M. Carnicer, M. S. Floater and J. M. Peña

The distance of a curve to its control polygon J. M. Carnicer, M. S. Floater and J. M. Peña Abstract. Recently, Nairn, Peters, and Lutterkort bounded the distance between a Bézier curve and its control

### NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

### Section 1.2. Angles and the Dot Product. The Calculus of Functions of Several Variables

The Calculus of Functions of Several Variables Section 1.2 Angles and the Dot Product Suppose x = (x 1, x 2 ) and y = (y 1, y 2 ) are two vectors in R 2, neither of which is the zero vector 0. Let α and

### MATH : HONORS CALCULUS-3 HOMEWORK 6: SOLUTIONS

MATH 16300-33: HONORS CALCULUS-3 HOMEWORK 6: SOLUTIONS 25-1 Find the absolute value and argument(s) of each of the following. (ii) (3 + 4i) 1 (iv) 7 3 + 4i (ii) Put z = 3 + 4i. From z 1 z = 1, we have

### 2.5 Gaussian Elimination

page 150 150 CHAPTER 2 Matrices and Systems of Linear Equations 37 10 the linear algebra package of Maple, the three elementary 20 23 1 row operations are 12 1 swaprow(a,i,j): permute rows i and j 3 3

### Linear Programming I

Linear Programming I November 30, 2003 1 Introduction In the VCR/guns/nuclear bombs/napkins/star wars/professors/butter/mice problem, the benevolent dictator, Bigus Piguinus, of south Antarctica penguins

### Problem 1 (10 pts) Find the radius of convergence and interval of convergence of the series

1 Problem 1 (10 pts) Find the radius of convergence and interval of convergence of the series a n n=1 n(x + 2) n 5 n 1. n(x + 2)n Solution: Do the ratio test for the absolute convergence. Let a n =. Then,

### Good luck, veel succes!

Final exam Advanced Linear Programming, May 7, 13.00-16.00 Switch off your mobile phone, PDA and any other mobile device and put it far away. No books or other reading materials are allowed. This exam

### Solution based on matrix technique Rewrite. ) = 8x 2 1 4x 1x 2 + 5x x1 2x 2 2x 1 + 5x 2

8.2 Quadratic Forms Example 1 Consider the function q(x 1, x 2 ) = 8x 2 1 4x 1x 2 + 5x 2 2 Determine whether q(0, 0) is the global minimum. Solution based on matrix technique Rewrite q( x1 x 2 = x1 ) =

### Introduction and message of the book

1 Introduction and message of the book 1.1 Why polynomial optimization? Consider the global optimization problem: P : for some feasible set f := inf x { f(x) : x K } (1.1) K := { x R n : g j (x) 0, j =

### CHAPTER 9. Integer Programming

CHAPTER 9 Integer Programming An integer linear program (ILP) is, by definition, a linear program with the additional constraint that all variables take integer values: (9.1) max c T x s t Ax b and x integral

### Week 5 Integral Polyhedra

Week 5 Integral Polyhedra We have seen some examples 1 of linear programming formulation that are integral, meaning that every basic feasible solution is an integral vector. This week we develop a theory

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

### 5.3 The Cross Product in R 3

53 The Cross Product in R 3 Definition 531 Let u = [u 1, u 2, u 3 ] and v = [v 1, v 2, v 3 ] Then the vector given by [u 2 v 3 u 3 v 2, u 3 v 1 u 1 v 3, u 1 v 2 u 2 v 1 ] is called the cross product (or

### Orthogonal Projections

Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

### Matrix Algebra LECTURE 1. Simultaneous Equations Consider a system of m linear equations in n unknowns: y 1 = a 11 x 1 + a 12 x 2 + +a 1n x n,

LECTURE 1 Matrix Algebra Simultaneous Equations Consider a system of m linear equations in n unknowns: y 1 a 11 x 1 + a 12 x 2 + +a 1n x n, (1) y 2 a 21 x 1 + a 22 x 2 + +a 2n x n, y m a m1 x 1 +a m2 x

### MOSEK modeling manual

MOSEK modeling manual August 12, 2014 Contents 1 Introduction 1 2 Linear optimization 3 2.1 Introduction....................................... 3 2.1.1 Basic notions.................................. 3

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### A Game Theoretic Approach for Solving Multiobjective Linear Programming Problems

LIBERTAS MATHEMATICA, vol XXX (2010) A Game Theoretic Approach for Solving Multiobjective Linear Programming Problems Irinel DRAGAN Abstract. The nucleolus is one of the important concepts of solution

### Two-Stage Stochastic Linear Programs

Two-Stage Stochastic Linear Programs Operations Research Anthony Papavasiliou 1 / 27 Two-Stage Stochastic Linear Programs 1 Short Reviews Probability Spaces and Random Variables Convex Analysis 2 Deterministic

### A FIRST COURSE IN OPTIMIZATION THEORY

A FIRST COURSE IN OPTIMIZATION THEORY RANGARAJAN K. SUNDARAM New York University CAMBRIDGE UNIVERSITY PRESS Contents Preface Acknowledgements page xiii xvii 1 Mathematical Preliminaries 1 1.1 Notation

### 1. Periodic Fourier series. The Fourier expansion of a 2π-periodic function f is:

CONVERGENCE OF FOURIER SERIES 1. Periodic Fourier series. The Fourier expansion of a 2π-periodic function f is: with coefficients given by: a n = 1 π f(x) a 0 2 + a n cos(nx) + b n sin(nx), n 1 f(x) cos(nx)dx

### 1. R In this and the next section we are going to study the properties of sequences of real numbers.

+a 1. R In this and the next section we are going to study the properties of sequences of real numbers. Definition 1.1. (Sequence) A sequence is a function with domain N. Example 1.2. A sequence of real

### Practical Guide to the Simplex Method of Linear Programming

Practical Guide to the Simplex Method of Linear Programming Marcel Oliver Revised: April, 0 The basic steps of the simplex algorithm Step : Write the linear programming problem in standard form Linear

### This chapter describes set theory, a mathematical theory that underlies all of modern mathematics.

Appendix A Set Theory This chapter describes set theory, a mathematical theory that underlies all of modern mathematics. A.1 Basic Definitions Definition A.1.1. A set is an unordered collection of elements.

### Optimisation et simulation numérique.

Optimisation et simulation numérique. Lecture 1 A. d Aspremont. M2 MathSV: Optimisation et simulation numérique. 1/106 Today Convex optimization: introduction Course organization and other gory details...

### Limits and convergence.

Chapter 2 Limits and convergence. 2.1 Limit points of a set of real numbers 2.1.1 Limit points of a set. DEFINITION: A point x R is a limit point of a set E R if for all ε > 0 the set (x ε,x + ε) E is

### CHAPTER 2. Inequalities

CHAPTER 2 Inequalities In this section we add the axioms describe the behavior of inequalities (the order axioms) to the list of axioms begun in Chapter 1. A thorough mastery of this section is essential

### Mathematical finance and linear programming (optimization)

Mathematical finance and linear programming (optimization) Geir Dahl September 15, 2009 1 Introduction The purpose of this short note is to explain how linear programming (LP) (=linear optimization) may

### Course 421: Algebraic Topology Section 1: Topological Spaces

Course 421: Algebraic Topology Section 1: Topological Spaces David R. Wilkins Copyright c David R. Wilkins 1988 2008 Contents 1 Topological Spaces 1 1.1 Continuity and Topological Spaces...............

### Bindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8

Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e

### 3. INNER PRODUCT SPACES

. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### Definition 12 An alternating bilinear form on a vector space V is a map B : V V F such that

4 Exterior algebra 4.1 Lines and 2-vectors The time has come now to develop some new linear algebra in order to handle the space of lines in a projective space P (V ). In the projective plane we have seen

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

### MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar

### LAGRANGIAN RELAXATION TECHNIQUES FOR LARGE SCALE OPTIMIZATION

LAGRANGIAN RELAXATION TECHNIQUES FOR LARGE SCALE OPTIMIZATION Kartik Sivaramakrishnan Department of Mathematics NC State University kksivara@ncsu.edu http://www4.ncsu.edu/ kksivara SIAM/MGSA Brown Bag

### Summer course on Convex Optimization. Fifth Lecture Interior-Point Methods (1) Michel Baes, K.U.Leuven Bharath Rangarajan, U.

Summer course on Convex Optimization Fifth Lecture Interior-Point Methods (1) Michel Baes, K.U.Leuven Bharath Rangarajan, U.Minnesota Interior-Point Methods: the rebirth of an old idea Suppose that f is