Semidefinite and Second Order Cone Programming Seminar
Fall 2012, Lecture 2
Instructor: Farid Alizadeh
Scribe: Wang Yao
9/17/2012

1 Overview

We gave a general overview of semidefinite programming (SDP) in Lecture 1; starting from this lecture we jump into the theory. Topics we will discuss in the next few lectures include duality theory, the notion of complementary slackness, at least one polynomial-time algorithm for solving SDP, and applications to integer programming and combinatorial optimization.

2 Definitions and General Settings

2.1 Basics of Topology

Throughout the semester we consider only vectors in the finite-dimensional space $\mathbb{R}^n$ unless explicitly stated otherwise. All vectors are column vectors, represented by lowercase bold letters such as $\mathbf{a}$, $\mathbf{b}$, etc.

Definitions: Let $S \subseteq \mathbb{R}^n$ be a set.

$S$ is an open set if for each $x \in S$ there is a sufficiently small ball centered at $x$ and contained in $S$; that is,
$$\forall x \in S, \ \exists \epsilon > 0 \text{ such that } \{y \in \mathbb{R}^n : \|y - x\| < \epsilon\} \subseteq S.$$

$S$ is a closed set if its complement $\mathbb{R}^n \setminus S$ is an open set.

The interior of $S$ is
$$\mathrm{Int}(S) = \bigcup_{O \subseteq S, \ O \text{ open}} O,$$
and $S$ is open if and only if $\mathrm{Int}(S) = S$.

The closure of $S$ is
$$\mathrm{cl}(S) = \bigcap_{C \supseteq S, \ C \text{ closed}} C,$$
and $S$ is closed if and only if $\mathrm{cl}(S) = S$.

The boundary of $S$ is defined to be $\mathrm{cl}(S) \setminus \mathrm{Int}(S)$.

We say that $x \in C$ is a relative interior point of $C$ if there exists a neighborhood $N$ of $x$ such that $N \cap \mathrm{aff}(C) \subseteq C$; in other words, $x$ is an interior point of $C$ relative to $\mathrm{aff}(C)$. The relative interior of $C$, denoted $\mathrm{rel.int}(C)$, is the set of all relative interior points of $C$.

Remark 1 If $C \subseteq \mathbb{R}^n$ is a closed set, then $C$ is also closed in any higher-dimensional metric space, possibly with a different boundary. Openness, however, does depend on the ambient space: the segment $(a, b)$ is an open set relative to $\mathbb{R}$, but it is not an open set in $\mathbb{R}^2$. For a convex optimization problem the optimal value of the objective is usually attained on the boundary of the feasible region, so the feasible region usually has to be closed for the problem to be well-defined.

Theorem 2 $C \subseteq \mathbb{R}^n$ is a closed set if and only if the limit of every convergent sequence $x_1, x_2, \ldots \in C$ is also in $C$.

2.2 General Settings

Definition 3 (Proper Cone) A proper cone $K \subseteq \mathbb{R}^n$ is a closed, pointed, convex, and full-dimensional cone. Full dimensionality is with respect to a given linear space. (Thus a cone may not be proper in a vector space, but be proper in a subspace.) [Figure omitted: an example of a cone that is not full-dimensional.]

Let $K \subseteq \mathbb{R}^n$ be a proper cone, so that $\mathrm{Int}(K) = \mathrm{rel.int}(K)$.

Theorem 4 Every proper cone $K$ induces a partial order, defined as follows for $x, y \in \mathbb{R}^n$:
$$x \succeq_K y \iff x - y \in K, \qquad x \succ_K y \iff x - y \in \mathrm{Int}(K).$$

Proof: First we prove reflexivity: $x \succeq_K x$, since $x - x = 0 \in K$. For anti-symmetry, if $x \succeq_K y$ and $y \succeq_K x$, then $x - y \in K$ and $y - x \in K$; since $K$ is a proper cone, thus a pointed cone, $K$ cannot contain both $x - y$ and $-(x - y)$ unless $x - y = 0$. Finally, for transitivity, if $x \succeq_K y$ and $y \succeq_K z$, then $x - z = (x - y) + (y - z) \in K$, i.e., $x \succeq_K z$.

Example 1 (Nonnegative orthant) Let $L^n$ denote the nonnegative orthant of $\mathbb{R}^n$: for every point $x \in L^n$, $x_i \geq 0$, $i = 1, 2, \ldots, n$. If $a \succeq_{L^n} b$, we have componentwise $a_i \geq b_i$.

Example 2 (Semidefinite cone) For the semidefinite cone, $X \succeq Y \iff X - Y$ is positive semidefinite. (A numerical sketch of both orders follows Example 3 below.)

Definitions: Let $K \subseteq \mathbb{R}^n$ be a proper cone.

$\mathrm{span}(K) = \bigcap_{L \supseteq K, \ L \text{ a linear space}} L$.

$F$ is said to be a face of $K$ if $F \subseteq K$ and, for all $x, y \in K$, $x + y \in F$ implies $x, y \in F$.

The dimension of a cone is $\dim(K) = \dim(\mathrm{span}(K))$. $K$ is in turn a face of itself, and is the only full-dimensional face of $K$. The definition of face implies that if a closed line segment in $K$ has a relative interior point in $F$, then both of its endpoints are in $F$.

The 0-dimensional faces of a convex set are called extreme points; the only extreme point of $K$ is $0$.

1-dimensional faces are called extreme rays. An extreme ray is a half-line emanating from the origin. The extreme rays of $K$ are in one-to-one correspondence with its extreme directions.

$(n-1)$-dimensional faces are called facets.

Example 3 (Extreme rays of the second order cone) Let $Q$ be the second order cone,
$$Q = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|\}.$$
The vectors $(\|\mathbf{x}\|, \mathbf{x})$ define the extreme rays of $Q$. Indeed, if we have $(b_0, \mathbf{b}) \in Q$, $(c_0, \mathbf{c}) \in Q$ and $(b_0 + c_0, \mathbf{b} + \mathbf{c}) = (\|\mathbf{x}\|, \mathbf{x})$, then the following equalities must hold:
$$\|\mathbf{b}\| + \|\mathbf{c}\| = \|\mathbf{b} + \mathbf{c}\| = b_0 + c_0 = \|\mathbf{x}\|,$$
which means the two vectors lie on the same half-line as $(\|\mathbf{x}\|, \mathbf{x})$.
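As promised after Example 2, here is a minimal numerical sketch of the two orders, assuming NumPy; the helper names are ours, and the semidefinite test uses the eigenvalue criterion with a small tolerance:

```python
import numpy as np

def geq_orthant(a, b):
    """a >=_{L^n} b for the nonnegative orthant: componentwise comparison."""
    return np.all(a - b >= 0)

def geq_psd(X, Y, tol=1e-10):
    """X >=_K Y for the semidefinite cone: the symmetric difference X - Y
    must have all eigenvalues nonnegative."""
    return np.all(np.linalg.eigvalsh(X - Y) >= -tol)

a, b = np.array([3.0, 2.0]), np.array([1.0, 2.0])
print(geq_orthant(a, b))        # True: a - b = (2, 0) lies in L^2

X = np.array([[2.0, 1.0], [1.0, 2.0]])
Y = np.eye(2)
print(geq_psd(X, Y))            # True: X - Y has eigenvalues 0 and 2
```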

An $n$-dimensional polyhedral cone has faces of every dimension $0, 1, \ldots, n$, while non-polyhedral cones may lack some of these.

Example 4 (Extreme rays of the nonnegative orthant) Let $L^n$ denote the nonnegative orthant; $L^n$ is a proper cone. The extreme rays of $L^n$ are generated by
$$e_1 = (1, 0, 0, \ldots, 0)^T, \quad e_2 = (0, 1, 0, \ldots, 0)^T, \quad e_3 = (0, 0, 1, \ldots, 0)^T, \quad \ldots, \quad e_n = (0, 0, 0, \ldots, 1)^T.$$

Definition 5 (Conic hull) Let $S \subseteq \mathbb{R}^n$ be a nonempty set. The conic hull of $S$ is defined as
$$\mathrm{cone}(S) = \bigcap_{K \supseteq S, \ K \text{ a cone}} K.$$
Every finite-dimensional proper cone is the conic hull of its extreme rays.

Theorem 6 (Caratheodory's Theorem) Every nonzero vector in a proper cone $K$ can be represented as a nonnegative combination of at most $n = \dim(K)$ linearly independent vectors $r_i$ from $K$, where each $r_i$ generates an extreme ray of $K$.

Definition 7 The Caratheodory number of a cone $K$, denoted $\kappa(K)$, is defined as the smallest integer such that every $x \in K$ can be written as a nonnegative linear combination of at most $\kappa(K)$ extreme rays $r_i \in K$.

Example 5 (Second order cone) Let $Q$ be a second order cone. Then $\kappa(Q) = 2$ regardless of dimension, because any vector in $Q$ can be represented as a nonnegative combination of at most two extreme rays.
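To make Example 5 concrete: for $(x_0, \mathbf{x}) \in Q$ with $\mathbf{x} \neq 0$ one can check the identity
$$(x_0, \mathbf{x}) = \frac{x_0 + \|\mathbf{x}\|}{2\|\mathbf{x}\|} (\|\mathbf{x}\|, \mathbf{x}) + \frac{x_0 - \|\mathbf{x}\|}{2\|\mathbf{x}\|} (\|\mathbf{x}\|, -\mathbf{x}),$$
where both coefficients are nonnegative precisely because $x_0 \geq \|\mathbf{x}\|$. A minimal sketch (NumPy assumed; the function name is ours):

```python
import numpy as np

def soc_two_ray_decomposition(x0, x):
    """Write (x0, x) in Q as alpha*r1 + beta*r2 with r1, r2 extreme rays of Q.
    Assumes x0 >= ||x|| > 0, so alpha and beta below are nonnegative."""
    nx = np.linalg.norm(x)
    r1 = np.concatenate(([nx], x))       # extreme ray (||x||,  x)
    r2 = np.concatenate(([nx], -x))      # extreme ray (||x||, -x)
    alpha = (x0 + nx) / (2 * nx)
    beta = (x0 - nx) / (2 * nx)
    return alpha, r1, beta, r2

x0, x = 3.0, np.array([1.0, 2.0])        # in Q since 3 >= sqrt(5)
alpha, r1, beta, r2 = soc_two_ray_decomposition(x0, x)
print(np.allclose(alpha * r1 + beta * r2, np.concatenate(([x0], x))))  # True
```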

Example 6 (Positive semidefinite cone) Consider the cone $S^n$ of $n \times n$ symmetric matrices; it is fairly easy to see that $\dim(S^n) = \frac{n(n+1)}{2}$. For the cone of positive semidefinite (P.S.D.) matrices, denoted $P^+_{n \times n}$, we want to find $\kappa(P^+_{n \times n})$ and $\mathrm{ext.ray}(P^+_{n \times n})$.

A matrix $X \in \mathrm{Int}(P^+_{n \times n})$ if and only if $X$ is invertible, that is to say all eigenvalues of $X$ are positive. Thus the interior of $P^+_{n \times n}$ is the cone of positive definite matrices in $P^+_{n \times n}$. Consequently, the boundary of $P^+_{n \times n}$ is the set of singular P.S.D. matrices.

Positive semidefinite matrices $uu^T$ of rank 1 form the extreme rays of $P^+_{n \times n}$. For any $X \in S^n_+$, by eigenvalue decomposition we have
$$X = Q \Lambda Q^T = (q_1, q_2, \ldots, q_n) \, \mathrm{diag}\{\lambda_1, \ldots, \lambda_n\} \, (q_1, q_2, \ldots, q_n)^T = \lambda_1 q_1 q_1^T + \lambda_2 q_2 q_2^T + \cdots + \lambda_n q_n q_n^T.$$
This shows that $\kappa(S^n_+) = n$ and that all extreme rays of $S^n_+$ must be among the matrices of the form $qq^T$.

Now we must show that each $uu^T$ of rank 1 generates an extreme ray. Let $uu^T = X + Y$, where $X, Y \succeq 0$, and let $v \in \mathbb{R}^n$ be orthogonal to $u$. Then
$$0 = v^T u u^T v = v^T X v + v^T Y v,$$
but since the summands are both non-negative and add up to zero, they are both zero. Thus $v^T X v = v^T Y v = 0$, which implies
$$X^{1/2} v = Y^{1/2} v = 0, \quad \text{and hence} \quad Xv = Yv = 0.$$
Thus both $X$ and $Y$ are at most rank 1 matrices, and the eigenvector corresponding to the single possibly nonzero eigenvalue must be a multiple of $u$. Thus both $X$ and $Y$ are multiples of $uu^T$.

On the other hand, for any $K \in S^n_+$, by using the Cholesky factorization we can write $K = u_1 u_1^T + \cdots + u_k u_k^T$, where $k$ is the rank of $K$. Clearly if $k \geq 2$, then $K$ cannot generate an extreme ray of $S^n_+$.
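The eigenvalue decomposition argument can be replayed numerically: build a P.S.D. matrix, split it into the rank-one terms $\lambda_i q_i q_i^T$, and confirm the reconstruction. A minimal sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T                              # X = A A^T is P.S.D. by construction

lam, Q = np.linalg.eigh(X)               # X = Q diag(lam) Q^T with lam >= 0
# Rebuild X as sum_i lam_i q_i q_i^T, a nonnegative combination of
# at most n rank-one extreme rays.
X_rebuilt = sum(l * np.outer(q, q) for l, q in zip(lam, Q.T))
print(np.allclose(X, X_rebuilt))         # True
print(np.all(lam >= -1e-10))             # True: all eigenvalues nonnegative
```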

3 Conic Linear Programming

3.1 The standard cone linear program (K-LP)

$$\min \ c^T x \quad \text{s.t.} \quad a_i^T x = b_i, \ i = 1, \ldots, m, \quad x \succeq_K 0,$$

where $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$ has rows $a_i^T \in \mathbb{R}^n$, $i = 1, \ldots, m$.

Observe that every convex optimization problem
$$\min_{x \in C} f(x),$$
where $C$ is a convex set and $f(x)$ is convex over $C$, can be turned into a cone-LP. First turn the problem into one with a linear objective:
$$\min \ z \quad \text{s.t.} \quad f(x) - z \leq 0, \ x \in C.$$
Since the set $B = \{(z, x) \mid x \in C \text{ and } f(x) - z \leq 0\}$ is convex, our problem is now equivalent to the cone-LP
$$\min \ z \quad \text{s.t.} \quad x_0 = 1, \quad (x_0, z, x) \succeq_K 0,$$
where $K = \{(x_0, z, x) \mid (z, x) \in x_0 B \text{ and } x_0 \geq 0\}$ is the cone generated by lifting $B$ to the level $x_0 = 1$.

Definition 8 (Dual Cone) The dual cone $K^*$ of a cone $K$ is the set
$$K^* = \{z : z^T x \geq 0, \ \forall x \in K\}.$$
It is easy to prove that $K^*$ is always convex (even if $K$ is non-convex!). Furthermore, if $K$ is full-dimensional and pointed then $K^*$ is a proper cone. The definition says that the angle between any vector of a cone and any vector of its dual is at most $90°$. [Figure 2 omitted: an example of a dual cone.]
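Definition 8 lets us refute membership in $K^*$ by exhibiting a witness $x \in K$ with $z^T x < 0$. A rough numerical sketch for $K = \mathbb{R}^n_+$ (NumPy assumed; sampling can refute membership but never certify it):

```python
import numpy as np

rng = np.random.default_rng(1)

def looks_dual_feasible(z, n_samples=10000):
    """Sample points x in K = R^n_+ and test z^T x >= 0.
    A single violation proves z is outside K*."""
    X = rng.random((n_samples, z.size))   # random points in the orthant
    return np.all(X @ z >= 0)

print(looks_dual_feasible(np.array([1.0, 2.0])))    # True: z is in R^2_+
print(looks_dual_feasible(np.array([1.0, -0.5])))   # False with high probability
```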

Example 7 (Non-negative orthant) Let $\mathbb{R}^n_+ = \{x \mid x_k \geq 0 \text{ for } k = 1, \ldots, n\}$. Its dual cone equals $\mathbb{R}^n_+$; that is, the non-negative orthant is self-dual.

We recall:

Lemma 9 A matrix $X$ is positive semidefinite if and only if it satisfies any one of the following equivalent conditions:

1. $a^T X a \geq 0$ for all $a \in \mathbb{R}^n$;
2. there exists $A \in \mathbb{R}^{n \times n}$ such that $AA^T = X$;
3. all eigenvalues of $X$ are non-negative.

Example 8 (The semidefinite cone) Let
$$P_{n \times n} = \{X \in \mathbb{R}^{n \times n} : X \text{ is symmetric positive semidefinite}\}.$$
Now we are interested in $P^*_{n \times n}$.

On one side, let $Z \in P^*_{n \times n}$, so that $Z \bullet X \geq 0$ for all $X \succeq 0$. Since such an $X$ is symmetric, from linear algebra it can be written as $X = Q \Lambda Q^T$, where $QQ^T = I$ (that is, $Q$ is an orthogonal matrix) and $\Lambda$ is diagonal with diagonal entries the eigenvalues of $X$. Write $Q = [q_1, \ldots, q_n]$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$; then $q_i$ is the eigenvector corresponding to $\lambda_i$, i.e., $q_i^T X q_i = \lambda_i$. In particular $X = AA^T$ for $A = Q\Lambda^{1/2}$, so
$$Z \bullet X = \mathrm{Tr}(ZX) = \mathrm{Tr}(Z A A^T) = \mathrm{Tr}(A^T Z A) \geq 0 \quad \text{for all } A \in \mathbb{R}^{n \times n}.$$
Let us choose $A_i = p_i \in \mathbb{R}^n$, where $p_i$ is the eigenvector of $Z$ corresponding to its eigenvalue $\gamma_i$ and $p_i^T p_i = 1$. Then
$$0 \leq \mathrm{Tr}(A_i^T Z A_i) = p_i^T Z p_i = \gamma_i.$$
So all the eigenvalues of $Z$ are non-negative, i.e., $Z \in P_{n \times n}$; hence $P^*_{n \times n} \subseteq P_{n \times n}$.

On the other hand, for every $Y \in P_{n \times n}$ there exists $B \in \mathbb{R}^{n \times n}$ such that $Y = BB^T$. For every $X \in P_{n \times n}$ with $X = AA^T$, we have
$$Y \bullet X = \mathrm{Tr}(YX) = \mathrm{Tr}(BB^T AA^T) = \mathrm{Tr}(A^T B B^T A) = \mathrm{Tr}[(B^T A)^T (B^T A)] \geq 0,$$
i.e., $Y \in P^*_{n \times n}$; hence $P_{n \times n} \subseteq P^*_{n \times n}$.

In conclusion, $P^*_{n \times n} = P_{n \times n}$.
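Both directions of Example 8 can be exercised numerically: $\mathrm{Tr}(YX) \geq 0$ for P.S.D. $Y, X$, while a symmetric $Z$ with a negative eigenvalue $\gamma$ is separated from the cone by the rank-one matrix $pp^T$ built from the corresponding eigenvector. A minimal sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
B, A = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
Y, X = B @ B.T, A @ A.T                  # two P.S.D. matrices

print(np.trace(Y @ X) >= 0)              # True: Tr(YX) = ||B^T A||_F^2 >= 0

Z = np.diag([1.0, -2.0, 3.0])            # symmetric, but not P.S.D.
gamma, P = np.linalg.eigh(Z)             # eigenvalues ascending: -2, 1, 3
p = P[:, 0]                              # unit eigenvector for eigenvalue -2
print(np.trace(Z @ np.outer(p, p)))      # ~ -2 < 0: witness that Z is not in P*
```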

From linear programming we know the pair of primal and dual problems takes the form
$$\text{(P)} \quad \min \ \langle c, x \rangle \quad \text{s.t.} \quad Ax = b, \quad x \succeq_K 0; \qquad \text{(D)} \quad \max \ \langle b, y \rangle \quad \text{s.t.} \quad A^T y + s = c, \quad s \succeq_{K^*} 0.$$
We just proved that the P.S.D. cone is self-dual; therefore, when $K$ is the P.S.D. cone, the dual slack $s$ lies in the same cone $K$.

Example 9 (The second order cone) Let $Q = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|\}$. $Q$ is a proper cone. What is $Q^*$?

On one side, if $z = (z_0, \mathbf{z}) \in Q$, then for every $(x_0, \mathbf{x}) \in Q$
$$(z_0, \mathbf{z}^T) \binom{x_0}{\mathbf{x}} = z_0 x_0 + \mathbf{z}^T \mathbf{x} \geq \|\mathbf{z}\| \, \|\mathbf{x}\| + \mathbf{z}^T \mathbf{x} \geq |\mathbf{z}^T \mathbf{x}| + \mathbf{z}^T \mathbf{x} \geq 0,$$
i.e., $Q \subseteq Q^*$. The middle inequality comes from the Cauchy-Schwarz inequality:
$$|\mathbf{z}^T \mathbf{x}| \leq \|\mathbf{z}\| \, \|\mathbf{x}\|.$$

On the other side, we note that $e = (1, \mathbf{0}) \in Q$, so for each element $z = (z_0, \mathbf{z}) \in Q^*$ we must have $z^T e = z_0 \geq 0$. We also note that each vector of the form $x = (\|\mathbf{z}\|, -\mathbf{z})$ belongs to $Q$, for all $\mathbf{z} \in \mathbb{R}^n$. Thus, in particular, for $z = (z_0, \mathbf{z}) \in Q^*$,
$$z^T x = z_0 \|\mathbf{z}\| - \|\mathbf{z}\|^2 \geq 0.$$
Since $\|\mathbf{z}\|$ is always non-negative, we get $z_0 \geq \|\mathbf{z}\|$, i.e., $Q^* \subseteq Q$. Therefore $Q^* = Q$.

Example 10 (p-norm cone) A generalization of the second order cone is
$$Q_p = \{(x_0, \mathbf{x}) \mid x_0 \geq \|\mathbf{x}\|_p\}, \quad p \geq 1, \qquad \text{where } \|\mathbf{x}\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p}.$$
If $p < 1$ then $Q_p$ is not convex. We claim that $Q_p^* = Q_q$, where $\frac{1}{p} + \frac{1}{q} = 1$. The proof is an application of Hölder's inequality, which states that
$$|\mathbf{x}^T \mathbf{y}| \leq \|\mathbf{x}\|_p \, \|\mathbf{y}\|_q \quad \text{for } \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \text{ with } \frac{1}{p} + \frac{1}{q} = 1.$$

We next give some properties of dual cones as propositions without proofs, since they are analogous to properties of the polar cone and can be found in any well-written convex analysis book.
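Here is a quick numerical illustration of the Hölder-based inclusion $Q_q \subseteq Q_p^*$ (the reverse inclusion is not checked); a sketch assuming NumPy, with the conjugate pair $p = 3$, $q = 3/2$ chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 3.0, 1.5                          # conjugate exponents: 1/3 + 1/1.5 = 1

def sample_Qp(n, p):
    """Return a random point (x0, x) of Q_p, i.e., with x0 >= ||x||_p."""
    x = rng.standard_normal(n)
    x0 = np.linalg.norm(x, ord=p) + rng.random()
    return x0, x

z = rng.standard_normal(5)
z0 = np.linalg.norm(z, ord=q)            # (z0, z) on the boundary of Q_q

# Holder: |z^T x| <= ||z||_q ||x||_p <= z0 * x0, so z0*x0 + z^T x >= 0.
ok = all(z0 * x0 + z @ x >= -1e-9
         for x0, x in (sample_Qp(5, p) for _ in range(1000)))
print(ok)                                # True
```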

Proposition 10 (Properties of dual cones) If $K_1 \subseteq \mathbb{R}^{n_1}, \ldots, K_m \subseteq \mathbb{R}^{n_m}$ are all proper cones, then $K_1 \times K_2 \times \cdots \times K_m$ is proper and
$$(K_1 \times K_2 \times \cdots \times K_m)^* = K_1^* \times K_2^* \times \cdots \times K_m^*.$$

The Minkowski sum of cones is defined as $K_1 + \cdots + K_m = \{x_1 + \cdots + x_m \mid x_i \in K_i, \ i = 1, \ldots, m\}$. If each $K_i$ is a proper cone, then $K_1 + \cdots + K_m$ is proper and
$$(K_1 + K_2 + \cdots + K_m)^* = K_1^* \cap K_2^* \cap \cdots \cap K_m^*.$$

In addition, if $\bigcap_i \mathrm{rel.int}(K_i) \neq \emptyset$, then
$$(K_1 \cap K_2 \cap \cdots \cap K_m)^* = K_1^* + K_2^* + \cdots + K_m^*.$$

3.2 Moment and positive polynomial cones: an example of a pair of dual cones which are not self-dual

In the examples above, we note that all the cones were self-dual. But there are cones that are not self-dual. Let $\mathcal{F}$ be the set of functions $F : \mathbb{R} \to \mathbb{R}$ with the following properties:

1. $F$ is right continuous,
2. $F$ is non-decreasing (i.e., if $x > y$ then $F(x) \geq F(y)$), and
3. $F$ has bounded variation, that is, $F(x) \to 0$ as $x \to -\infty$ and $F(x) \to u < \infty$ as $x \to \infty$.

First observe that functions in $\mathcal{F}$ are almost like probability distribution functions, except that their range is the interval $[0, u]$ rather than $[0, 1]$. Second, the set $\mathcal{F}$ itself is a convex cone, and in fact a pointed cone, in the space of right-continuous functions.

Now we define a particular kind of moment cone. First, let us define
$$u_x = (1, x, x^2, \ldots, x^n)^T.$$
The moment cone is defined as
$$M_{n+1} = \left\{ c = \int u_x \, dF(x) : F \in \mathcal{F} \right\},$$
that is, $M_{n+1}$ consists of vectors $c$ where, for each $j = 0, \ldots, n$, $c_j$ is the $j$-th moment of a distribution times a non-negative constant.
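For intuition about $M_{n+1}$ (a minimal sketch, NumPy assumed; the helper name is ours): a step function placing nonnegative mass $w_k$ at points $a_k$ belongs to $\mathcal{F}$, and its moment vector is simply $\sum_k w_k u_{a_k}$, which gives easy explicit members of the cone.

```python
import numpy as np

def moment_vector(weights, atoms, n):
    """Moments c_j = sum_k w_k * a_k^j, j = 0..n, of the measure
    sum_k w_k delta_{a_k}; its distribution function is a step
    function in the class F of the notes (total mass need not be 1)."""
    return np.array([np.sum(weights * atoms**j) for j in range(n + 1)])

w = np.array([0.5, 2.0])                 # nonnegative masses
a = np.array([-1.0, 3.0])                # atom locations
print(moment_vector(w, a, n=3))          # [ 2.5  5.5 18.5 53.5 ]
```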

Lemma 11 $M_{n+1}$ is a convex, pointed, full-dimensional cone.

Proof: Let us examine each of the properties we need to prove.

$c \in M_{n+1}$ and $\alpha \geq 0$ imply $\alpha c \in M_{n+1}$. To see this, observe that there exists $F \in \mathcal{F}$ such that $c = \int u_x \, dF(x)$. Now if $F$ is right-continuous, non-decreasing, and of bounded variation, then all of these properties also hold for $\alpha F$ for each $\alpha \geq 0$, and thus $\alpha F \in \mathcal{F}$. Therefore $\alpha c = \int u_x \, d(\alpha F(x)) \in M_{n+1}$. Thus $M_{n+1}$ is a cone.

If $c$ and $d$ are in $M_{n+1}$, then $c + d \in M_{n+1}$. Indeed, if $c = \int u_x \, dF_1(x)$ and $d = \int u_x \, dF_2(x)$, then
$$c + d = \int u_x \, d[F_1(x) + F_2(x)] \in M_{n+1}.$$
Thus $M_{n+1}$ is a convex cone.

If $c$ and $-c$ are in $M_{n+1}$, then $c = 0$. If $c = \int u_x \, dF_1(x) \in M_{n+1}$ and $-c \in M_{n+1}$, then $-c = \int u_x \, dF_2(x)$ for some $F_2 \in \mathcal{F}$, and
$$c + (-c) = 0 = \int u_x \, d[F_1(x) + F_2(x)].$$
In particular, $\int d[F_1(x) + F_2(x)] = 0$. Since $F_1 + F_2 \in \mathcal{F}$ is non-decreasing with $F_1(x) + F_2(x) \to 0$ as $x \to -\infty$, we get $F_1(x) + F_2(x) = 0$ almost everywhere, i.e., $F_i(x) = 0$, $i = 1, 2$, almost everywhere. It follows that $c = 0$, i.e., $M_{n+1} \cap -M_{n+1} = \{0\}$. Thus $M_{n+1}$ is a pointed cone.

$M_{n+1}$ is full-dimensional. Let
$$F_a(x) = \begin{cases} 0, & \text{if } x < a \\ 1, & \text{if } x \geq a. \end{cases}$$
Obviously $F_a \in \mathcal{F}$, and $u_a = \int u_x \, dF_a(x) \in M_{n+1}$ for all $a \in \mathbb{R}$. Choosing $n + 1$ distinct points $a_1, \ldots, a_{n+1}$,
$$\det[u_{a_1}, \ldots, u_{a_{n+1}}] = \prod_{i > j} (a_i - a_j) \neq 0.$$
Thus $M_{n+1}$ is a full-dimensional cone. (The determinant above is the well-known Vandermonde determinant.)

We need to point out that, as defined, $M_{n+1}$ is not a closed cone. For instance, in $M_3$, the vector $(1, \epsilon, 1/\epsilon^2)$ lies in the moment cone (it is the moment vector of a distribution with mean $\epsilon$ and second moment $1/\epsilon^2$), and hence $(\epsilon^2, \epsilon^3, 1) = \epsilon^2 (1, \epsilon, 1/\epsilon^2) \in M_3$. However, as $\epsilon \to 0$,
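The full-dimensionality step is easy to replicate numerically (sketch, NumPy assumed): stack the vectors $u_{a_i}$ for distinct points $a_i$ and check the Vandermonde determinant $\prod_{i > j}(a_i - a_j)$.

```python
import numpy as np

n = 3
a = np.array([0.0, 1.0, 2.0, 5.0])       # n+1 distinct points

# Column i is u_{a_i} = (1, a_i, a_i^2, ..., a_i^n)^T.
U = np.vander(a, N=n + 1, increasing=True).T

det = np.linalg.det(U)
expected = np.prod([a[i] - a[j] for i in range(len(a)) for j in range(i)])
print(np.isclose(det, expected))         # True: the Vandermonde determinant
print(det != 0)                          # True: the u_{a_i} span R^{n+1}
```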

the limit $(0, 0, 1)$ of $(\epsilon^2, \epsilon^3, 1)$ does not belong to the moment cone: its zeroth entry $c_0 = \int dF = 0$ forces $F \equiv 0$ and hence $c = 0$. But if we take the union of the ray
$$\Big\{ \alpha (\underbrace{0, 0, \ldots, 0}_{n}, 1)^T : \alpha \geq 0 \Big\}$$
and $M_{n+1}$, then this new cone will be closed, and thus proper.