Semidefinite and Second Order Cone Programming Seminar
Fall 2012, Lecture 2
Instructor: Farid Alizadeh
Scribe: Wang Yao
9/17/2012

1 Overview

We gave a general overview of semidefinite programming (SDP) in Lecture 1; starting from this lecture we jump into the theory. Topics we will discuss in the next few lectures include duality theory, the notion of complementary slackness, at least one polynomial-time algorithm for solving SDP, and applications to integer programming and combinatorial optimization.

2 Definitions and General Settings

2.1 Basics of Topology

Throughout the semester we consider only vectors in the finite-dimensional space $\mathbb{R}^n$ unless explicitly stated otherwise. All vectors are column vectors, represented by lowercase bold letters such as $\mathbf{a}$, $\mathbf{b}$, etc.

Definitions: Let $S \subseteq \mathbb{R}^n$ be a set.

$S$ is an open set if for each $x \in S$ there is a sufficiently small ball centered at $x$ and contained in $S$; that is,
$$\forall x \in S, \ \exists \epsilon > 0 \text{ such that } \{y \in \mathbb{R}^n : \|y - x\| < \epsilon\} \subseteq S.$$

$S$ is a closed set if its complement $\mathbb{R}^n \setminus S$ is an open set.

The interior of $S$ is
$$\mathrm{Int}(S) = \bigcup_{O \subseteq S, \ O \text{ open}} O,$$
and $S$ is open if and only if $\mathrm{Int}(S) = S$.

The closure of $S$ is
$$\mathrm{cl}(S) = \bigcap_{C \supseteq S, \ C \text{ closed}} C,$$
and $S$ is closed if and only if $\mathrm{cl}(S) = S$.

The boundary of $S$ is defined to be $\mathrm{cl}(S) \setminus \mathrm{Int}(S)$.

We say that $x \in C$ is a relative interior point of $C$ if there exists a neighborhood $N$ of $x$ such that $N \cap \mathrm{aff}(C) \subseteq C$; in other words, $x$ is an interior point of $C$ relative to $\mathrm{aff}(C)$. The relative interior of $C$, denoted $\mathrm{rel.int}(C)$, is the set of all relative interior points of $C$.

Remark 1 If $C \subseteq \mathbb{R}^n$ is a closed set, then $C$ is also closed in any higher-dimensional metric space, possibly with a different boundary. Openness, however, does depend on the ambient space: the segment $(a, b)$ is an open set relative to $\mathbb{R}$, but it is not an open set in $\mathbb{R}^2$. For a convex optimization problem the optimal value of the objective is usually attained on the boundary of the feasible region, so the feasible region usually has to be closed for the problem to be well-defined.

Theorem 2 $C \subseteq \mathbb{R}^n$ is a closed set if and only if the limit of every convergent sequence $x_1, x_2, \ldots \in C$ is also in $C$.

2.2 General Settings

Definition 3 (Proper Cone) A proper cone $K \subseteq \mathbb{R}^n$ is a closed, pointed, convex, and full-dimensional cone. Full dimensionality is with respect to a given linear space. (Thus a cone may not be proper in a vector space, but be proper in a subspace.) [Figure omitted: an example of a cone that is not full-dimensional.]

Let $K \subseteq \mathbb{R}^n$ be a proper cone, so that $\mathrm{Int}(K) = \mathrm{rel.int}(K)$.

Theorem 4 Every proper cone $K$ induces a partial order, defined as follows for $x, y \in \mathbb{R}^n$:
$$x \succeq_K y \iff x - y \in K, \qquad x \succ_K y \iff x - y \in \mathrm{Int}(K).$$

Proof: First we prove reflexivity: $x \succeq_K x$, since $x - x = 0 \in K$. For anti-symmetry, if $x \succeq_K y$ and $y \succeq_K x$, then $x - y \in K$ and $y - x \in K$; since $K$ is a proper cone, thus a pointed cone, $K$ cannot contain both $x - y$ and $-(x - y)$ unless $x - y = 0$. Finally, for transitivity, if $x \succeq_K y$ and $y \succeq_K z$, then $x - z = (x - y) + (y - z) \in K$, i.e., $x \succeq_K z$.

Example 1 (Nonnegative orthant) Let $L^n$ denote the nonnegative orthant of $\mathbb{R}^n$: for every point $x \in L^n$, $x_i \geq 0$, $i = 1, 2, \ldots, n$. If $a \succeq_{L^n} b$, we have componentwise $a_i \geq b_i$.

Example 2 (Semidefinite cone) For the semidefinite cone, $X \succeq Y \iff X - Y$ is positive semidefinite. (A numerical sketch of both orders follows Example 3 below.)

Definitions: Let $K \subseteq \mathbb{R}^n$ be a proper cone.

$\mathrm{span}(K) = \bigcap_{L \supseteq K, \ L \text{ a linear space}} L$.

$F$ is said to be a face of $K$ if $F \subseteq K$ and, for all $x, y \in K$, $x + y \in F$ implies $x, y \in F$.

The dimension of a cone is $\dim(K) = \dim(\mathrm{span}(K))$. $K$ is in turn a face of itself, and is the only full-dimensional face of $K$. The definition of face implies that if a closed line segment in $K$ has a relative interior point in $F$, then both of its endpoints are in $F$.

The 0-dimensional faces of a convex set are called extreme points; the only extreme point of $K$ is $0$.

1-dimensional faces are called extreme rays. An extreme ray is a half-line emanating from the origin. The extreme rays of $K$ are in one-to-one correspondence with its extreme directions.

$(n-1)$-dimensional faces are called facets.

Example 3 (Extreme rays of the second order cone) Let $Q$ be the second order cone,
$$Q = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|\}.$$
The vectors $(\|\mathbf{x}\|, \mathbf{x})$ define the extreme rays of $Q$. Indeed, if we have $(b_0, \mathbf{b}) \in Q$, $(c_0, \mathbf{c}) \in Q$ and $(b_0 + c_0, \mathbf{b} + \mathbf{c}) = (\|\mathbf{x}\|, \mathbf{x})$, then the following equalities must hold:
$$\|\mathbf{b}\| + \|\mathbf{c}\| = \|\mathbf{b} + \mathbf{c}\| = b_0 + c_0 = \|\mathbf{x}\|,$$
which means the two vectors lie on the same half-line as $(\|\mathbf{x}\|, \mathbf{x})$.
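As promised after Example 2, here is a minimal numerical sketch of the two orders, assuming NumPy; the helper names are ours, and the semidefinite test uses the eigenvalue criterion with a small tolerance:

```python
import numpy as np

def geq_orthant(a, b):
    """a >=_{L^n} b for the nonnegative orthant: componentwise comparison."""
    return np.all(a - b >= 0)

def geq_psd(X, Y, tol=1e-10):
    """X >=_K Y for the semidefinite cone: the symmetric difference X - Y
    must have all eigenvalues nonnegative."""
    return np.all(np.linalg.eigvalsh(X - Y) >= -tol)

a, b = np.array([3.0, 2.0]), np.array([1.0, 2.0])
print(geq_orthant(a, b))        # True: a - b = (2, 0) lies in L^2

X = np.array([[2.0, 1.0], [1.0, 2.0]])
Y = np.eye(2)
print(geq_psd(X, Y))            # True: X - Y has eigenvalues 0 and 2
```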

An $n$-dimensional polyhedral cone has faces of every dimension $0, 1, \ldots, n$, while non-polyhedral cones may lack some of these.

Example 4 (Extreme rays of the nonnegative orthant) Let $L^n$ denote the nonnegative orthant; $L^n$ is a proper cone. The extreme rays of $L^n$ are generated by
$$e_1 = (1, 0, 0, \ldots, 0)^T, \quad e_2 = (0, 1, 0, \ldots, 0)^T, \quad e_3 = (0, 0, 1, \ldots, 0)^T, \quad \ldots, \quad e_n = (0, 0, 0, \ldots, 1)^T.$$

Definition 5 (Conic hull) Let $S \subseteq \mathbb{R}^n$ be a nonempty set. The conic hull of $S$ is defined as
$$\mathrm{cone}(S) = \bigcap_{K \supseteq S, \ K \text{ a cone}} K.$$
Every finite-dimensional proper cone is the conic hull of its extreme rays.

Theorem 6 (Caratheodory's Theorem) Every nonzero vector in a proper cone $K$ can be represented as a nonnegative combination of at most $n = \dim(K)$ linearly independent vectors $r_i$ from $K$, where each $r_i$ generates an extreme ray of $K$.

Definition 7 The Caratheodory number of a cone $K$, denoted $\kappa(K)$, is defined as the smallest integer such that every $x \in K$ can be written as a nonnegative linear combination of at most $\kappa(K)$ extreme rays $r_i \in K$.

Example 5 (Second order cone) Let $Q$ be a second order cone. Then $\kappa(Q) = 2$ regardless of dimension, because any vector in $Q$ can be represented as a nonnegative combination of at most two extreme rays.
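To make Example 5 concrete: for $(x_0, \mathbf{x}) \in Q$ with $\mathbf{x} \neq 0$ one can check the identity
$$(x_0, \mathbf{x}) = \frac{x_0 + \|\mathbf{x}\|}{2\|\mathbf{x}\|} (\|\mathbf{x}\|, \mathbf{x}) + \frac{x_0 - \|\mathbf{x}\|}{2\|\mathbf{x}\|} (\|\mathbf{x}\|, -\mathbf{x}),$$
where both coefficients are nonnegative precisely because $x_0 \geq \|\mathbf{x}\|$. A minimal sketch (NumPy assumed; the function name is ours):

```python
import numpy as np

def soc_two_ray_decomposition(x0, x):
    """Write (x0, x) in Q as alpha*r1 + beta*r2 with r1, r2 extreme rays of Q.
    Assumes x0 >= ||x|| > 0, so alpha and beta below are nonnegative."""
    nx = np.linalg.norm(x)
    r1 = np.concatenate(([nx], x))       # extreme ray (||x||,  x)
    r2 = np.concatenate(([nx], -x))      # extreme ray (||x||, -x)
    alpha = (x0 + nx) / (2 * nx)
    beta = (x0 - nx) / (2 * nx)
    return alpha, r1, beta, r2

x0, x = 3.0, np.array([1.0, 2.0])        # in Q since 3 >= sqrt(5)
alpha, r1, beta, r2 = soc_two_ray_decomposition(x0, x)
print(np.allclose(alpha * r1 + beta * r2, np.concatenate(([x0], x))))  # True
```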

Example 6 (Positive semidefinite cone) Consider the cone $S^n$ of $n \times n$ symmetric matrices; it is fairly easy to see that $\dim(S^n) = \frac{n(n+1)}{2}$. For the cone of positive semidefinite (P.S.D.) matrices, denoted $P^+_{n \times n}$, we want to find $\kappa(P^+_{n \times n})$ and $\mathrm{ext.ray}(P^+_{n \times n})$.

A matrix $X \in \mathrm{Int}(P^+_{n \times n})$ if and only if $X$ is invertible, that is to say all eigenvalues of $X$ are positive. Thus the interior of $P^+_{n \times n}$ is the cone of positive definite matrices in $P^+_{n \times n}$. Consequently, the boundary of $P^+_{n \times n}$ is the set of singular P.S.D. matrices.

Positive semidefinite matrices $uu^T$ of rank 1 form the extreme rays of $P^+_{n \times n}$. For any $X \in S^n_+$, by eigenvalue decomposition we have
$$X = Q \Lambda Q^T = (q_1, q_2, \ldots, q_n) \, \mathrm{diag}\{\lambda_1, \ldots, \lambda_n\} \, (q_1, q_2, \ldots, q_n)^T = \lambda_1 q_1 q_1^T + \lambda_2 q_2 q_2^T + \cdots + \lambda_n q_n q_n^T.$$
This shows that $\kappa(S^n_+) = n$ and that all extreme rays of $S^n_+$ must be among the matrices of the form $qq^T$.

Now we must show that each $uu^T$ of rank 1 generates an extreme ray. Let $uu^T = X + Y$, where $X, Y \succeq 0$, and let $v \in \mathbb{R}^n$ be orthogonal to $u$. Then
$$0 = v^T u u^T v = v^T X v + v^T Y v,$$
but since the summands are both non-negative and add up to zero, they are both zero. Thus $v^T X v = v^T Y v = 0$, which implies
$$X^{1/2} v = Y^{1/2} v = 0, \quad \text{and hence} \quad Xv = Yv = 0.$$
Thus both $X$ and $Y$ are at most rank 1 matrices, and the eigenvector corresponding to the single possibly nonzero eigenvalue must be a multiple of $u$. Thus both $X$ and $Y$ are multiples of $uu^T$.

On the other hand, for any $K \in S^n_+$, by using the Cholesky factorization we can write $K = u_1 u_1^T + \cdots + u_k u_k^T$, where $k$ is the rank of $K$. Clearly if $k \geq 2$, then $K$ cannot generate an extreme ray of $S^n_+$.
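The eigenvalue decomposition argument can be replayed numerically: build a P.S.D. matrix, split it into the rank-one terms $\lambda_i q_i q_i^T$, and confirm the reconstruction. A minimal sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T                              # X = A A^T is P.S.D. by construction

lam, Q = np.linalg.eigh(X)               # X = Q diag(lam) Q^T with lam >= 0
# Rebuild X as sum_i lam_i q_i q_i^T, a nonnegative combination of
# at most n rank-one extreme rays.
X_rebuilt = sum(l * np.outer(q, q) for l, q in zip(lam, Q.T))
print(np.allclose(X, X_rebuilt))         # True
print(np.all(lam >= -1e-10))             # True: all eigenvalues nonnegative
```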

3 Conic Linear Programming

3.1 The standard cone linear program (K-LP)

$$\min \ c^T x \quad \text{s.t.} \quad a_i^T x = b_i, \ i = 1, \ldots, m, \quad x \succeq_K 0,$$

where $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$ has rows $a_i^T \in \mathbb{R}^n$, $i = 1, \ldots, m$.

Observe that every convex optimization problem
$$\min_{x \in C} f(x),$$
where $C$ is a convex set and $f(x)$ is convex over $C$, can be turned into a cone-LP. First turn the problem into one with a linear objective:
$$\min \ z \quad \text{s.t.} \quad f(x) - z \leq 0, \ x \in C.$$
Since the set $B = \{(z, x) \mid x \in C \text{ and } f(x) - z \leq 0\}$ is convex, our problem is now equivalent to the cone-LP
$$\min \ z \quad \text{s.t.} \quad x_0 = 1, \quad (x_0, z, x) \succeq_K 0,$$
where $K = \{(x_0, z, x) \mid (z, x) \in x_0 B \text{ and } x_0 \geq 0\}$ is the cone generated by lifting $B$ to the level $x_0 = 1$.

Definition 8 (Dual Cone) The dual cone $K^*$ of a cone $K$ is the set
$$K^* = \{z : z^T x \geq 0, \ \forall x \in K\}.$$
It is easy to prove that $K^*$ is always convex (even if $K$ is non-convex!). Furthermore, if $K$ is full-dimensional and pointed then $K^*$ is a proper cone. The definition says that the angle between any vector of a cone and any vector of its dual is at most $90°$. [Figure 2 omitted: an example of a dual cone.]
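Definition 8 lets us refute membership in $K^*$ by exhibiting a witness $x \in K$ with $z^T x < 0$. A rough numerical sketch for $K = \mathbb{R}^n_+$ (NumPy assumed; sampling can refute membership but never certify it):

```python
import numpy as np

rng = np.random.default_rng(1)

def looks_dual_feasible(z, n_samples=10000):
    """Sample points x in K = R^n_+ and test z^T x >= 0.
    A single violation proves z is outside K*."""
    X = rng.random((n_samples, z.size))   # random points in the orthant
    return np.all(X @ z >= 0)

print(looks_dual_feasible(np.array([1.0, 2.0])))    # True: z is in R^2_+
print(looks_dual_feasible(np.array([1.0, -0.5])))   # False with high probability
```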

Example 7 (Non-negative orthant) Let $\mathbb{R}^n_+ = \{x \mid x_k \geq 0 \text{ for } k = 1, \ldots, n\}$. Its dual cone equals $\mathbb{R}^n_+$; that is, the non-negative orthant is self-dual.

We recall:

Lemma 9 A matrix $X$ is positive semidefinite if and only if it satisfies any one of the following equivalent conditions:

1. $a^T X a \geq 0$ for all $a \in \mathbb{R}^n$;
2. there exists $A \in \mathbb{R}^{n \times n}$ such that $AA^T = X$;
3. all eigenvalues of $X$ are non-negative.

Example 8 (The semidefinite cone) Let
$$P_{n \times n} = \{X \in \mathbb{R}^{n \times n} : X \text{ is symmetric positive semidefinite}\}.$$
Now we are interested in $P^*_{n \times n}$.

On one side, let $Z \in P^*_{n \times n}$, so that $Z \bullet X \geq 0$ for all $X \succeq 0$. Since such an $X$ is symmetric, from linear algebra it can be written as $X = Q \Lambda Q^T$, where $QQ^T = I$ (that is, $Q$ is an orthogonal matrix) and $\Lambda$ is diagonal with diagonal entries the eigenvalues of $X$. Write $Q = [q_1, \ldots, q_n]$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$; then $q_i$ is the eigenvector corresponding to $\lambda_i$, i.e., $q_i^T X q_i = \lambda_i$. In particular $X = AA^T$ for $A = Q\Lambda^{1/2}$, so
$$Z \bullet X = \mathrm{Tr}(ZX) = \mathrm{Tr}(Z A A^T) = \mathrm{Tr}(A^T Z A) \geq 0 \quad \text{for all } A \in \mathbb{R}^{n \times n}.$$
Let us choose $A_i = p_i \in \mathbb{R}^n$, where $p_i$ is the eigenvector of $Z$ corresponding to its eigenvalue $\gamma_i$ and $p_i^T p_i = 1$. Then
$$0 \leq \mathrm{Tr}(A_i^T Z A_i) = p_i^T Z p_i = \gamma_i.$$
So all the eigenvalues of $Z$ are non-negative, i.e., $Z \in P_{n \times n}$; hence $P^*_{n \times n} \subseteq P_{n \times n}$.

On the other hand, for every $Y \in P_{n \times n}$ there exists $B \in \mathbb{R}^{n \times n}$ such that $Y = BB^T$. For every $X \in P_{n \times n}$ with $X = AA^T$, we have
$$Y \bullet X = \mathrm{Tr}(YX) = \mathrm{Tr}(BB^T AA^T) = \mathrm{Tr}(A^T B B^T A) = \mathrm{Tr}[(B^T A)^T (B^T A)] \geq 0,$$
i.e., $Y \in P^*_{n \times n}$; hence $P_{n \times n} \subseteq P^*_{n \times n}$.

In conclusion, $P^*_{n \times n} = P_{n \times n}$.
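Both directions of Example 8 can be exercised numerically: $\mathrm{Tr}(YX) \geq 0$ for P.S.D. $Y, X$, while a symmetric $Z$ with a negative eigenvalue $\gamma$ is separated from the cone by the rank-one matrix $pp^T$ built from the corresponding eigenvector. A minimal sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
B, A = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
Y, X = B @ B.T, A @ A.T                  # two P.S.D. matrices

print(np.trace(Y @ X) >= 0)              # True: Tr(YX) = ||B^T A||_F^2 >= 0

Z = np.diag([1.0, -2.0, 3.0])            # symmetric, but not P.S.D.
gamma, P = np.linalg.eigh(Z)             # eigenvalues ascending: -2, 1, 3
p = P[:, 0]                              # unit eigenvector for eigenvalue -2
print(np.trace(Z @ np.outer(p, p)))      # ~ -2 < 0: witness that Z is not in P*
```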

From linear programming we know the pair of primal and dual problems takes the form
$$\text{(P)} \quad \min \ \langle c, x \rangle \quad \text{s.t.} \quad Ax = b, \quad x \succeq_K 0; \qquad \text{(D)} \quad \max \ \langle b, y \rangle \quad \text{s.t.} \quad A^T y + s = c, \quad s \succeq_{K^*} 0.$$
We just proved that the P.S.D. cone is self-dual; therefore, when $K$ is the P.S.D. cone, the dual slack $s$ lies in the same cone $K$.

Example 9 (The second order cone) Let $Q = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|\}$. $Q$ is a proper cone. What is $Q^*$?

On one side, if $z = (z_0, \mathbf{z}) \in Q$, then for every $(x_0, \mathbf{x}) \in Q$
$$(z_0, \mathbf{z}^T) \binom{x_0}{\mathbf{x}} = z_0 x_0 + \mathbf{z}^T \mathbf{x} \geq \|\mathbf{z}\| \, \|\mathbf{x}\| + \mathbf{z}^T \mathbf{x} \geq |\mathbf{z}^T \mathbf{x}| + \mathbf{z}^T \mathbf{x} \geq 0,$$
i.e., $Q \subseteq Q^*$. The middle inequality comes from the Cauchy-Schwarz inequality:
$$|\mathbf{z}^T \mathbf{x}| \leq \|\mathbf{z}\| \, \|\mathbf{x}\|.$$

On the other side, we note that $e = (1, \mathbf{0}) \in Q$, so for each element $z = (z_0, \mathbf{z}) \in Q^*$ we must have $z^T e = z_0 \geq 0$. We also note that each vector of the form $x = (\|\mathbf{z}\|, -\mathbf{z})$ belongs to $Q$, for all $\mathbf{z} \in \mathbb{R}^n$. Thus, in particular, for $z = (z_0, \mathbf{z}) \in Q^*$,
$$z^T x = z_0 \|\mathbf{z}\| - \|\mathbf{z}\|^2 \geq 0.$$
Since $\|\mathbf{z}\|$ is always non-negative, we get $z_0 \geq \|\mathbf{z}\|$, i.e., $Q^* \subseteq Q$. Therefore $Q^* = Q$.

Example 10 (p-norm cone) A generalization of the second order cone is
$$Q_p = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|_p\}, \quad p \geq 1, \qquad \text{where } \|\mathbf{x}\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p}.$$
If $p < 1$ then $Q_p$ is not convex. We claim that $Q_p^* = Q_q$, where $\frac{1}{p} + \frac{1}{q} = 1$. The proof is an application of Hölder's inequality, which states that
$$|\mathbf{x}^T \mathbf{y}| \leq \|\mathbf{x}\|_p \, \|\mathbf{y}\|_q \quad \text{for } \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \text{ with } \frac{1}{p} + \frac{1}{q} = 1.$$

We next give some properties of dual cones as propositions without proofs, since they are analogous to properties of the polar cone and can be found in any well-written convex analysis book.
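Here is a quick numerical illustration of the Hölder-based inclusion $Q_q \subseteq Q_p^*$ (the reverse inclusion is not checked); a sketch assuming NumPy, with the conjugate pair $p = 3$, $q = 3/2$ chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 3.0, 1.5                          # conjugate exponents: 1/3 + 1/1.5 = 1

def sample_Qp(n, p):
    """Return a random point (x0, x) of Q_p, i.e., with x0 >= ||x||_p."""
    x = rng.standard_normal(n)
    x0 = np.linalg.norm(x, ord=p) + rng.random()
    return x0, x

z = rng.standard_normal(5)
z0 = np.linalg.norm(z, ord=q)            # (z0, z) on the boundary of Q_q

# Holder: |z^T x| <= ||z||_q ||x||_p <= z0 * x0, so z0*x0 + z^T x >= 0.
ok = all(z0 * x0 + z @ x >= -1e-9
         for x0, x in (sample_Qp(5, p) for _ in range(1000)))
print(ok)                                # True
```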

Proposition 10 (Properties of dual cones) If $K_1 \subseteq \mathbb{R}^{n_1}, \ldots, K_m \subseteq \mathbb{R}^{n_m}$ are all proper cones, then $K_1 \times K_2 \times \cdots \times K_m$ is proper and
$$(K_1 \times K_2 \times \cdots \times K_m)^* = K_1^* \times K_2^* \times \cdots \times K_m^*.$$

The Minkowski sum of cones is defined as $K_1 + \cdots + K_m = \{x_1 + \cdots + x_m \mid x_i \in K_i, \ i = 1, \ldots, m\}$. If each $K_i$ is a proper cone, then $K_1 + \cdots + K_m$ is proper and
$$(K_1 + K_2 + \cdots + K_m)^* = K_1^* \cap K_2^* \cap \cdots \cap K_m^*.$$

In addition, if $\bigcap_i \mathrm{rel.int}(K_i) \neq \emptyset$, then
$$(K_1 \cap K_2 \cap \cdots \cap K_m)^* = K_1^* + K_2^* + \cdots + K_m^*.$$

3.2 Moment and positive polynomial cones: an example of a pair of dual cones which are not self-dual

In the examples above, we note that all the cones were self-dual. But there are cones that are not self-dual. Let $\mathcal{F}$ be the set of functions $F : \mathbb{R} \to \mathbb{R}$ with the following properties:

1. $F$ is right continuous,
2. $F$ is non-decreasing (i.e., if $x > y$ then $F(x) \geq F(y)$), and
3. $F$ has bounded variation, that is, $F(x) \to 0$ as $x \to -\infty$ and $F(x) \to u < \infty$ as $x \to \infty$.

First observe that functions in $\mathcal{F}$ are almost like probability distribution functions, except that their range is the interval $[0, u]$ rather than $[0, 1]$. Second, the set $\mathcal{F}$ itself is a convex cone, and in fact a pointed cone, in the space of right-continuous functions.

Now we define a particular kind of moment cone. First, let us define
$$u_x = (1, x, x^2, \ldots, x^n)^T.$$
The moment cone is defined as
$$M_{n+1} = \left\{ c = \int u_x \, dF(x) : F \in \mathcal{F} \right\},$$
that is, $M_{n+1}$ consists of vectors $c$ where, for each $j = 0, \ldots, n$, $c_j$ is the $j$-th moment of a distribution times a non-negative constant.
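For intuition about $M_{n+1}$ (a minimal sketch, NumPy assumed; the helper name is ours): a step function placing nonnegative mass $w_k$ at points $a_k$ belongs to $\mathcal{F}$, and its moment vector is simply $\sum_k w_k u_{a_k}$, which gives easy explicit members of the cone.

```python
import numpy as np

def moment_vector(weights, atoms, n):
    """Moments c_j = sum_k w_k * a_k^j, j = 0..n, of the measure
    sum_k w_k delta_{a_k}; its distribution function is a step
    function in the class F of the notes (total mass need not be 1)."""
    return np.array([np.sum(weights * atoms**j) for j in range(n + 1)])

w = np.array([0.5, 2.0])                 # nonnegative masses
a = np.array([-1.0, 3.0])                # atom locations
print(moment_vector(w, a, n=3))          # [ 2.5  5.5 18.5 53.5 ]
```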

Lemma 11 $M_{n+1}$ is a convex, pointed, full-dimensional cone.

Proof: Let us examine each of the properties we need to prove.

$c \in M_{n+1}$ and $\alpha \geq 0$ imply $\alpha c \in M_{n+1}$. To see this, observe that there exists $F \in \mathcal{F}$ such that $c = \int u_x \, dF(x)$. Now if $F$ is right-continuous, non-decreasing, and of bounded variation, then all of these properties also hold for $\alpha F$ for each $\alpha \geq 0$, and thus $\alpha F \in \mathcal{F}$. Therefore $\alpha c = \int u_x \, d(\alpha F(x)) \in M_{n+1}$. Thus $M_{n+1}$ is a cone.

If $c$ and $d$ are in $M_{n+1}$, then $c + d \in M_{n+1}$. Indeed, if $c = \int u_x \, dF_1(x)$ and $d = \int u_x \, dF_2(x)$, then
$$c + d = \int u_x \, d[F_1(x) + F_2(x)] \in M_{n+1}.$$
Thus $M_{n+1}$ is a convex cone.

If $c$ and $-c$ are in $M_{n+1}$, then $c = 0$. If $c = \int u_x \, dF_1(x) \in M_{n+1}$ and $-c \in M_{n+1}$, then $-c = \int u_x \, dF_2(x)$ for some $F_2 \in \mathcal{F}$, and
$$c + (-c) = 0 = \int u_x \, d[F_1(x) + F_2(x)].$$
In particular, $\int d[F_1(x) + F_2(x)] = 0$. Since $F_1 + F_2 \in \mathcal{F}$ is non-decreasing with $F_1(x) + F_2(x) \to 0$ as $x \to -\infty$, we get $F_1(x) + F_2(x) = 0$ almost everywhere, i.e., $F_i(x) = 0$, $i = 1, 2$, almost everywhere. It follows that $c = 0$, i.e., $M_{n+1} \cap -M_{n+1} = \{0\}$. Thus $M_{n+1}$ is a pointed cone.

$M_{n+1}$ is full-dimensional. Let
$$F_a(x) = \begin{cases} 0, & \text{if } x < a \\ 1, & \text{if } x \geq a. \end{cases}$$
Obviously $F_a \in \mathcal{F}$, and $u_a = \int u_x \, dF_a(x) \in M_{n+1}$ for all $a \in \mathbb{R}$. Choosing $n + 1$ distinct points $a_1, \ldots, a_{n+1}$,
$$\det[u_{a_1}, \ldots, u_{a_{n+1}}] = \prod_{i > j} (a_i - a_j) \neq 0.$$
Thus $M_{n+1}$ is a full-dimensional cone. (The determinant above is the well-known Vandermonde determinant.)

We need to point out that, as defined, $M_{n+1}$ is not a closed cone. For instance, in $M_3$, the vector $(1, \epsilon, 1/\epsilon^2)$ lies in the moment cone (it is the moment vector of a distribution with mean $\epsilon$ and second moment $1/\epsilon^2$), and hence $(\epsilon^2, \epsilon^3, 1) = \epsilon^2 (1, \epsilon, 1/\epsilon^2) \in M_3$. However, as $\epsilon \to 0$,
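The full-dimensionality step is easy to replicate numerically (sketch, NumPy assumed): stack the vectors $u_{a_i}$ for distinct points $a_i$ and check the Vandermonde determinant $\prod_{i > j}(a_i - a_j)$.

```python
import numpy as np

n = 3
a = np.array([0.0, 1.0, 2.0, 5.0])       # n+1 distinct points

# Column i is u_{a_i} = (1, a_i, a_i^2, ..., a_i^n)^T.
U = np.vander(a, N=n + 1, increasing=True).T

det = np.linalg.det(U)
expected = np.prod([a[i] - a[j] for i in range(len(a)) for j in range(i)])
print(np.isclose(det, expected))         # True: the Vandermonde determinant
print(det != 0)                          # True: the u_{a_i} span R^{n+1}
```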

the limit $(0, 0, 1)$ of $(\epsilon^2, \epsilon^3, 1)$ does not belong to the moment cone: its zeroth entry $c_0 = \int dF = 0$ forces $F \equiv 0$ and hence $c = 0$. But if we take the union of the ray
$$\Big\{ \alpha (\underbrace{0, 0, \ldots, 0}_{n}, 1)^T : \alpha \geq 0 \Big\}$$
and $M_{n+1}$, then this new cone will be closed, and thus proper.