MS&E 310 Course Project III: First-Order Potential Reduction for Linear Programming

Yinyu Ye

November 15, 2016

1 Convex optimization over the simplex constraint

We consider the following optimization problem over the simplex:

    Minimize    f(x)
    Subject to  e^T x = n, x ≥ 0,                                          (1)

where e is the vector of all ones. Such a problem is considered in [7], where the function f does not need to be convex and an FPTAS was developed for computing an approximate KKT point of general quadratic programming. The following algorithm and analysis resemble those in [7].

We assume that f is a convex function in x ∈ R^n and f(x*) = 0, where x* is a minimizer of the problem. Furthermore, we make a standard Lipschitz assumption:

    f(x + d) ≤ f(x) + ∇f(x)^T d + (γ/2)‖d‖²,

where the positive constant γ is the Lipschitz parameter.

Note that any homogeneous linear feasibility problem, e.g., the canonical Karmarkar form in [2],

    Ax = 0;  e^T x = n;  x ≥ 0,

can be formulated as this model with f(x) = (1/2)‖Ax‖² and γ the largest eigenvalue of the matrix A^T A. Furthermore, any linear programming problem in standard form and its dual,

    Minimize    c^T x                  Maximize    b^T y
    Subject to  Ax = b, x ≥ 0;         Subject to  A^T y + s = c, s ≥ 0,

can be represented as a homogeneous linear feasibility problem (Ye et al. [5]):

    Ax − bτ = 0;
    −A^T y + cτ − s = 0;
    b^T y − c^T x − κ = 0;
    e^T x + e^T s + τ + κ = 2n + 2;
    x, s, τ, κ ≥ 0.
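For concreteness (and for Question 4 below), the coefficient matrix of the homogeneous self-dual system can be assembled mechanically from the data (A, b, c), after which the system is solved as (1) with f(u) = (1/2)‖Mu‖². The following Python sketch is ours, not part of the original formulation: the helper name hsd_matrix and the variable ordering u = (x, y, s, τ, κ) are arbitrary choices, and the free variable y is kept explicitly so that it can be eliminated later.

    import numpy as np

    def hsd_matrix(A, b, c):
        # Coefficient matrix M of the homogeneous self-dual system M u = 0,
        # u = (x, y, s, tau, kappa); the three row blocks are the three
        # equations displayed above.
        m, n = A.shape
        b = b.reshape(-1, 1)
        c = c.reshape(-1, 1)
        top = np.hstack([A, np.zeros((m, m)), np.zeros((m, n)), -b, np.zeros((m, 1))])
        mid = np.hstack([np.zeros((n, n)), -A.T, -np.eye(n), c, np.zeros((n, 1))])
        bot = np.hstack([-c.T, b.T, np.zeros((1, n)), np.zeros((1, 1)), -np.ones((1, 1))])
        return np.vstack([top, mid, bot])

With M = hsd_matrix(A, b, c), the function f(u) = (1/2)‖Mu‖² is convex with minimum value 0, and the normalization e^T x + e^T s + τ + κ = 2n + 2 plays the role of the simplex constraint once y is eliminated.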
We consider the potential function (e.g., see [2, 4, 1, 6])

    φ(x) = ρ ln f(x) − ∑_j ln x_j,  with ρ = n + √n,

over the simplex. Clearly, if we start from x⁰ = e, the analytic center of the simplex, and generate a sequence of points x^k, k = 1, 2, …, whose potential values strictly decrease, then as soon as

    φ(x^k) − φ(x⁰) ≤ −ρ ln(1/ε),

we must have

    ρ ln f(x^k) − ρ ln f(x⁰) ≤ −ρ ln(1/ε),  or  f(x^k)/f(x⁰) ≤ ε.

This is because on the simplex

    ∑_j ln x_j^k ≤ ∑_j ln x_j⁰ = 0,  k = 1, 2, …,

since x⁰ = e maximizes ∑_j ln x_j over the simplex. We now describe a first-order steepest-descent potential reduction algorithm in the next section.

2 Steepest-Descent Potential Reduction and Complexity Analysis

Note that the gradient vector of the potential function at x > 0 is

    ∇φ(x) = (ρ/f(x)) ∇f(x) − X^{-1} e,

where in this note X denotes the diagonal matrix whose diagonal entries are the elements of vector x. The following lemma is well known in the literature on interior-point algorithms [2, 1, 6]:

Lemma 1. Let x > 0 and ‖X^{-1} d‖ ≤ β < 1. Then

    ∑_j ln(x_j + d_j) ≥ ∑_j ln x_j + e^T X^{-1} d − β²/(2(1 − β)).
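Before proceeding, note that φ and ∇φ are straightforward to evaluate. A minimal Python sketch (ours; the callables f and grad_f are supplied by the user, as in the test problems of Section 3):

    import numpy as np

    def potential(f, grad_f, x, rho):
        # phi(x) = rho * ln f(x) - sum_j ln x_j, valid for x > 0 with f(x) > 0
        fx = f(x)
        phi = rho * np.log(fx) - np.sum(np.log(x))
        # grad phi(x) = (rho / f(x)) * grad f(x) - X^{-1} e
        gphi = (rho / fx) * grad_f(x) - 1.0 / x
        return phi, gphi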
Lemma 2. For any x > 0 with x ≠ x*, any matrix A ∈ R^{m×n} with Ax = Ax*, and any vector λ ∈ R^m, consider the vector

    p(x) = X(∇φ(x) − A^T λ).

Then ‖p(x)‖ ≥ 1.

Proof. First,

    p(x) = X((ρ/f(x)) ∇f(x) − X^{-1} e − A^T λ) = X((ρ/f(x)) ∇f(x) − A^T λ) − e.

If any entry of (ρ/f(x)) ∇f(x) − A^T λ is equal to or less than 0, then the corresponding entry of p(x) is no more than −1, so that ‖p(x)‖ ≥ ‖p(x)‖_∞ ≥ 1.

On the other hand, suppose (ρ/f(x)) ∇f(x) − A^T λ > 0. From convexity and f(x*) = 0,

    0 = f(x*) ≥ f(x) + ∇f(x)^T (x* − x),  so that  ∇f(x)^T (x − x*) ≥ f(x) > 0.

Thus, from Ax = Ax*,

    ((ρ/f(x)) ∇f(x) − A^T λ)^T (x − x*) = (ρ/f(x)) ∇f(x)^T (x − x*) ≥ ρ,

and, since (ρ/f(x)) ∇f(x) − A^T λ > 0 and x* ≥ 0,

    z := ((ρ/f(x)) ∇f(x) − A^T λ)^T x ≥ ρ.

Furthermore,

    ‖p(x)‖² = ‖X((ρ/f(x)) ∇f(x) − A^T λ)‖² − 2 ((ρ/f(x)) ∇f(x) − A^T λ)^T x + n
            ≥ z²/n − 2z + n = (1/n)(z − n)²,

where the inequality uses ‖Xg‖² ≥ (e^T Xg)²/n. The quadratic function (1/n)(z − n)² of z is increasing for z ≥ n, so that for z ≥ ρ = n + √n,

    (1/n)(z − n)² ≥ (1/n)(ρ − n)² = 1,

which completes the proof.
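Lemma 2 is easy to probe numerically. The sketch below (ours, with arbitrary instance sizes) builds a convex f with f(x*) = 0 at a known simplex point x*, takes A = e^T as in (1), and checks ‖p(x)‖ ≥ 1 at random feasible points for random multipliers λ:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 8, 3
    x_star = rng.random(n)
    x_star *= n / x_star.sum()                 # e^T x* = n, x* > 0
    B = rng.standard_normal((m, n))
    f = lambda x: 0.5 * np.sum((B @ (x - x_star)) ** 2)   # convex, f(x*) = 0
    gf = lambda x: B.T @ (B @ (x - x_star))
    rho = n + np.sqrt(n)
    for _ in range(1000):
        x = rng.random(n)
        x *= n / x.sum()                       # random point on the simplex
        lam = rng.standard_normal()            # arbitrary multiplier for A = e^T
        gphi = (rho / f(x)) * gf(x) - 1.0 / x
        p = x * (gphi - lam)                   # p(x) = X (grad phi(x) - e lam)
        assert np.linalg.norm(p) >= 1.0 - 1e-9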
For any given x > 0 in the simplex and any d with e^T d = 0,

    f(x + d) ≤ f(x) + ∇f(x)^T d + (γ/2)‖d‖² = f(x) + ∇f(x)^T d + (γ/2)‖X X^{-1} d‖²
             ≤ f(x) + ∇f(x)^T d + (γ/2)‖X^{-1} d‖²,

where the last inequality uses ‖X X^{-1} d‖ ≤ ‖X‖ ‖X^{-1} d‖ and absorbs the bounded factor ‖X‖² into the constant; that is, from here on γ denotes the Lipschitz parameter measured in the scaled norm ‖X^{-1} · ‖. Let ‖X^{-1} d‖ = β < 1 and x⁺ = x + d = X(e + X^{-1} d) > 0. Then, from Lemma 1,

    φ(x⁺) ≤ φ(x) + ρ ln(1 + (∇f(x)^T d + (γ/2)‖X^{-1} d‖²)/f(x)) − e^T X^{-1} d + β²/(2(1 − β))
          ≤ φ(x) + (ρ/f(x))(∇f(x)^T d + (γ/2)‖X^{-1} d‖²) − e^T X^{-1} d + β²/(2(1 − β))
          = φ(x) + ∇φ(x)^T d + (ργ/(2f(x))) β² + β²/(2(1 − β)),

where the second inequality uses ln(1 + t) ≤ t.

The first-order steepest-descent potential reduction algorithm updates x by solving

    Minimize    ∇φ(x)^T d
    Subject to  e^T d = 0, ‖X^{-1} d‖ ≤ β;

or, with the scaled variable d′ = X^{-1} d,

    Minimize    ∇φ(x)^T X d′
    Subject to  e^T X d′ = 0, ‖d′‖ ≤ β,                                   (2)

where the parameter β < 1 is yet to be determined.

Let the scaled gradient projection vector be

    p(x) = (I − (1/‖x‖²) X e e^T X) X ∇φ(x) = X(∇φ(x) − λ(x) e),  where  λ(x) = e^T X² ∇φ(x)/‖x‖².

Then the minimizer of problem (2) is

    d′ = −β p(x)/‖p(x)‖,  i.e.,  d = −β X p(x)/‖p(x)‖,

and, using e^T X p(x) = x^T p(x) = 0,

    ∇φ(x)^T d = −β ‖p(x)‖²/‖p(x)‖ = −β ‖p(x)‖ ≤ −β,

since ‖p(x)‖ ≥ 1 by Lemma 2 (applied with A = e^T). Thus,

    φ(x⁺) − φ(x) ≤ −β + (ργ/(2f(x))) β² + β²/(2(1 − β)).

For β ≤ 1/2, we have β²/(2(1 − β)) ≤ β², so the above quantity is less than

    −β + (2 + γρ/f(x)) β²/2.

Thus, one can choose β to minimize this quantity, namely

    β = 1/(2 + γρ/f(x)) ≤ 1/2,

so that

    φ(x⁺) − φ(x) ≤ −(1/2) · 1/(2 + γρ/f(x)).
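Putting the pieces together gives the following sketch of the whole method (ours; the function name potential_reduction and the stopping rule f(x^k) ≤ ε f(x⁰) are our choices, not prescribed by the handout):

    import numpy as np

    def potential_reduction(f, grad_f, x0, gamma, eps=1e-6, max_iter=100000):
        # Steepest-descent potential reduction over {x : e^T x = n, x > 0}.
        n = x0.size
        rho = n + np.sqrt(n)
        x = x0.copy()
        f0 = f(x0)
        for k in range(max_iter):
            fx = f(x)
            if fx <= eps * f0:
                return x, k
            gphi = (rho / fx) * grad_f(x) - 1.0 / x       # grad phi(x)
            g = x * gphi                                  # X grad phi(x)
            p = g - ((x @ g) / (x @ x)) * x               # project out the x-direction
            beta = 1.0 / (2.0 + rho * gamma / fx)         # step size from the analysis
            x = x - beta * (x * p) / np.linalg.norm(p)    # d = -beta X p / ||p||
            x *= n / x.sum()                              # guard against numerical drift
        return x, max_iter

Each iteration costs one gradient evaluation plus O(n) arithmetic, and x stays strictly positive because ‖X^{-1} d‖ = β < 1/2.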
One can see that the larger the value of f(x), the greater the reduction of the potential function. Starting from x⁰ = e, we iteratively generate x^k, k = 1, 2, …, such that

    φ(x^{k+1}) − φ(x^k) ≤ −f(x^k)/(2(2f(x^k) + ργ)) ≤ −f(x^k)/(2(2f(x⁰) + ργ)) ≤ −f(x^k)/(8 max{f(x⁰), ργ}).

The second inequality is due to f(x^k) < f(x⁰), which follows from φ(x^k) < φ(x⁰) for all k ≥ 1 and the fact that x⁰ is the analytic center of the simplex. Thus, if f(x^k)/f(x⁰) ≥ ε for 1 ≤ k ≤ K, we must have φ(x⁰) − φ(x^K) ≤ ρ ln(1/ε), so that

    ∑_{k=1}^K f(x^k)/(8 max{f(x⁰), ργ}) ≤ ρ ln(1/ε),

or

    K ε f(x⁰) ≤ 8 max{f(x⁰), ργ} · ρ ln(1/ε).

Note that ρ = n + √n ≤ 2n. We conclude:

Theorem 3. The steepest-descent potential reduction algorithm generates an x^k with f(x^k)/f(x⁰) ≤ ε in no more than

    (8(n + √n)/ε) max{1, (n + √n)γ/f(x⁰)} ln(1/ε)

steps.

3 Extension, Implementation and Further Analysis?

Question 1: Develop a similar analysis for solving

    Minimize    f(x)
    Subject to  0 ≤ x ≤ 2e,                                               (3)

where we start from x⁰ = e, the analytic center of the BOX constraint.

Question 2: Implement the algorithm and perform numerical tests to solve (1) for f(x) = (1/2)‖Ax‖²; a possible test setup is sketched below.
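For instance, one may generate a feasible Karmarkar-form instance by planting a positive simplex point x̂ in the null space of A, and then call the potential_reduction sketch given above (again, all names and sizes here are our own illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 20, 50
    xhat = rng.random(n)
    xhat *= n / xhat.sum()                                # hidden feasible point
    A = rng.standard_normal((m, n))
    A -= np.outer(A @ xhat, xhat) / (xhat @ xhat)         # now A @ xhat = 0

    f = lambda x: 0.5 * np.sum((A @ x) ** 2)
    gf = lambda x: A.T @ (A @ x)
    gamma = np.linalg.norm(A, 2) ** 2                     # largest eigenvalue of A^T A

    x, iters = potential_reduction(f, gf, np.ones(n), gamma, eps=1e-8)
    print(iters, f(x) / f(np.ones(n)))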
Question 3: Implement the algorithm and perform numerical tests to solve (1) for

    f(x) = (1/2)‖(AA^T)^{-1/2} Ax‖²,

and compare the performance with that in Question 2.

Question 4: Test your implementation on the homogeneous and self-dual LP model for various linear programs, feasible or infeasible, where you may eliminate the free variables y from the formulation.

References

[1] C. C. Gonzaga, Polynomial affine algorithms for linear programming, Math. Programming 49 (1990) 7–21.

[2] N. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica 4 (1984) 373–395.

[3] S. Mehrotra, On the implementation of a primal-dual interior point method, SIAM J. Optimization 2 (1992) 575–601.

[4] M. J. Todd and Y. Ye, A centered projective algorithm for linear programming, Math. Oper. Res. 15 (1990) 508–529.

[5] Y. Ye, M. J. Todd, and S. Mizuno, An O(√n L)-iteration homogeneous and self-dual linear programming algorithm, Math. Oper. Res. 19 (1994) 53–67.

[6] Y. Ye, An O(n³L) potential reduction algorithm for linear programming, Math. Programming 50 (1991) 239–258.

[7] Y. Ye, On the complexity of approximating a KKT point of quadratic programming, Math. Programming 80 (1998) 195–211.