Error Bound for Classes of Polynomial Systems and its Applications: A Variational Analysis Approach
The University of New South Wales, SPOM 2013
Joint work with V. Jeyakumar, B.S. Mordukhovich and T.S. Pham
Outline
1. Introduction
For $f:\mathbb{R}^n\to\mathbb{R}$, we consider the inequality system
$$(S)\qquad f(z)\le 0.$$
To judge whether $x$ is an approximate solution of $(S)$, we want to know
$$d(x,[f\le 0]) := \inf\{\|x-z\| : f(z)\le 0\}.$$
However, we often measure $[f(x)]_+ := \max\{f(x),0\}$. So, we seek an error bound: there exist $\tau,\delta>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\delta}\big),$$
either locally or globally.
Definition. We say $f$ has a
(1) global error bound with exponent $\delta$ if there exists $\tau>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\delta}\big) \quad \text{for all } x\in\mathbb{R}^n; \tag{1}$$
(2) local error bound with exponent $\delta$ around $\bar x$ if there exist $\tau,\epsilon>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\delta}\big) \quad \text{for all } x\in B(\bar x;\epsilon). \tag{2}$$
If $\delta=1$ in (1) (resp. (2)), we say $f$ has a Lipschitz-type global (resp. local) error bound.
Error bounds are useful in:
- analyzing the convergence properties of algorithms (e.g. Luo 2000, Tseng 2010 and Attouch et al. 2009);
- sensitivity analysis of optimization and variational inequality problems (e.g. Jourani 2000);
- identifying the active constraints (e.g. Facchinei et al. 1998 and Pang 1997).
Some Known Results
- A Lipschitz-type global error bound holds when $f$ is the maximum of finitely many affine functions (Hoffman 1952).
- A global error bound can fail even when $f$ is convex and continuous (e.g. $f(x_1,x_2) = -x_1 + \sqrt{x_1^2+x_2^2}$).
- Many further developments (e.g. Ioffe, Kruger, Lewis, Ng, Outrata, Pang, Robinson, Théra, etc.).
- A global error bound with exponent $1/2$ holds when $f$ is a convex quadratic function (Luo and Luo, 1994).
Motivating Example: going beyond quadratics
Consider $f(x)=x^2$. Then $[f\le 0]=\{0\}$, and so
$$d(x,[f\le 0]) = |x| = (x^2)^{1/2} = [f(x)]_+^{1/2}.$$
More generally, consider $f(x)=x^d$ with $d$ an even number. Then
$$d(x,[f\le 0]) = |x| = (x^d)^{1/d} = [f(x)]_+^{1/d}.$$
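The identity above is easy to verify numerically. A small sketch (the degree $d=4$ and the sample points are chosen here purely for illustration):

```python
# For f(x) = x**d with d even, [f <= 0] = {0}, so the distance to the
# solution set is |x|, which equals the residual [f(x)]_+ raised to 1/d.
d = 4
for x in [-2.0, -0.5, 0.5, 2.0]:
    dist = abs(x)                # d(x, [f <= 0]) = |x|
    residual = max(x**d, 0.0)    # [f(x)]_+
    assert abs(dist - residual**(1.0/d)) < 1e-12
```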
Main Problem
Can we extend the error bound results from convex quadratic functions to convex polynomials? If so, what about nonconvex cases involving polynomial structure?
What is special about convex polynomials?
Convex polynomial optimization problems can be solved via a sequential SDP approximation scheme; in some cases, a single SDP is enough (Lasserre 2010, and Jeyakumar and L. 2012).
For a convex polynomial $f$ on $\mathbb{R}^n$ with degree $d$, we have:
(1) $\inf f > -\infty \;\Longrightarrow\; \operatorname{argmin} f \ne \emptyset$ (Belousov & Klatte 2000);
(2) $d(0,\partial f(x_k)) \to 0 \;\Longrightarrow\; f(x_k)\to \inf f$ (L. 2010);
(3) if $f_\infty(v)=0$, then $f(x+tv)=f(x)$ for all $x\in\mathbb{R}^n$ and $t\in\mathbb{R}$ (Teboulle & Auslender, 2003).
Note: $f_\infty(v) = \sup_{t>0} \frac{f(x+tv)-f(x)}{t}$ for all $x\in\operatorname{dom} f$.
Let $\kappa(n,d) = (d-1)^n + 1$.
Theorem (L. 2010). Let $f$ be a convex polynomial on $\mathbb{R}^n$ with degree $d$. Then there exists $\tau>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\kappa(n,d)^{-1}}\big) \quad \text{for all } x\in\mathbb{R}^n. \tag{3}$$
- Convex quadratic: $d=2$, and so $\kappa(n,d)^{-1} = 1/2$.
- Previous example $x^d$: $n=1$, and so $\kappa(n,d)^{-1} = 1/d$.
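The exponent is explicit and easy to compute. A quick sanity check of the two special cases just mentioned:

```python
# kappa(n, d) = (d - 1)**n + 1; the error-bound exponent in (3) is 1/kappa(n, d).
def kappa(n, d):
    return (d - 1)**n + 1

assert kappa(3, 2) == 2   # convex quadratic (d = 2): exponent 1/2, for every n
assert kappa(1, 6) == 6   # f(x) = x**6 on the real line: exponent 1/6
```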
What is behind the proof? Łojasiewicz's inequality and its variants.
(Łojasiewicz's inequality) Let $f$ be an analytic function on $\mathbb{R}^n$ with $f(0)=0$. Then there exist a rational number $\rho\in(0,1]$ and $\beta,\delta>0$ such that
$$d(x, f^{-1}(0)) \le \beta\,|f(x)|^{\rho} \quad \text{for all } \|x\|\le\delta.$$
(Gwoździewicz 1999) If, in addition, $f$ is a polynomial with degree $d$ and $0$ is a strict local minimizer, then $\rho = \frac{1}{(d-1)^n+1} = \kappa(n,d)^{-1}$.
Further developments on dropping the strict-minimizer assumption in Gwoździewicz's result: Kurdyka 2012, and L., Mordukhovich and Pham 2013.
Outline of the proof: induction on the dimension $k$ of $[f\le 0]$.
(1) If $k=0$, then we have a strict minimizer, so Gwoździewicz's result can be applied.
(2) Suppose the result is true for $k=p$.
(3) For the case $k=p+1$, find a direction $v$ such that $f_\infty(v)=0$, so that $f(x+tv)=f(x)$ for all $x$ and all $t$; this reduces the case to $k=p$.
Maximum of finitely many convex polynomials?
The extension to the maximum of finitely many convex polynomials can fail in general (Shironin, 1986). Let $f_1,f_2:\mathbb{R}^4\to\mathbb{R}$ be defined by $f_1(x_1,x_2,x_3,x_4) = x_1$ and
$$f_2(x_1,x_2,x_3,x_4) = x_1^{16} + x_2^{8} + x_3^{6} + x_1 x_2^{3} x_3^{3} + x_1^{2} x_2^{4} x_3^{2} + x_2^{2} x_3^{4} + x_1^{4} x_3^{4} + x_1^{4} x_2^{6} + x_1^{2} x_2^{6} + x_1^{2} + x_2^{2} + x_3^{2} - x_4.$$
Define $f=\max\{f_1,f_2\}$. Then the global error bound fails for $f$.
Remark: the implication $f_\infty(v)=0 \Rightarrow f(x+tv)=f(x)$ for all $x\in\mathbb{R}^n$ fails here.
Corollary (L. 2010). Let $f_i$, $i=1,\dots,m$, be nonnegative convex polynomials on $\mathbb{R}^n$ with degree $d_i$ and let $d=\max_{1\le i\le m} d_i$. Let $f=\max_{1\le i\le m} f_i$. Then there exists a constant $\tau>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\kappa(n,d)^{-1}}\big) \quad \text{for all } x\in\mathbb{R}^n. \tag{4}$$
Classes of nonconvex systems involving polynomial structure Piecewise convex polynomials; Composite polynomial systems.
Piecewise convex polynomials
Definition. A function $f$ is said to be a piecewise convex polynomial on $\mathbb{R}^n$ with degree $d$ if it is continuous and there exist finitely many polyhedra $P_1,\dots,P_k$ with $\bigcup_{j=1}^k P_j = \mathbb{R}^n$ such that the restriction of $f$ to each $P_j$ is a convex polynomial with degree $d$.
Examples: piecewise affine functions; a convex polynomial plus $\alpha\,\|[Ax+b]_+\|^2$. Such functions can be nonconvex and nonsmooth (e.g. $\min\{x,1\}$).
Example. Consider the piecewise convex polynomial $f:\mathbb{R}\to\mathbb{R}$ defined by
$$f(x) = \begin{cases} 1 & \text{if } |x|\ge 1,\\ x^4 & \text{if } |x|<1. \end{cases}$$
Clearly, $[f\le 0]=\{0\}$. Now consider $x_k=k$. Then $d(x_k,[f\le 0])=k$ but $f(x_k)=1$. So the global error bound fails.
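A numeric sketch of this failure (the sample points $x_k$ are chosen for illustration):

```python
# The piecewise convex polynomial from the example: f(x) = 1 for |x| >= 1
# and f(x) = x**4 for |x| < 1, with solution set [f <= 0] = {0}.
def f(x):
    return 1.0 if abs(x) >= 1 else x**4

# Along x_k = k the distance d(x_k, {0}) = k grows without bound while the
# residual [f(x_k)]_+ stays at 1, so no global error bound can hold.
for k in [1, 10, 1000]:
    assert f(k) == 1.0    # residual is constant along the sequence
    assert abs(k) == k    # distance to {0} grows linearly
```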
Notably, in this example, the following implication fails:
$$d(x,[f\le 0]) \to +\infty \;\Longrightarrow\; f(x)\to +\infty.$$
Theorem (L. 2013). Let $f$ be a piecewise convex polynomial with degree $d$. Then the following statements are equivalent:
(1) $d(x,[f\le 0]) \to +\infty \;\Longrightarrow\; f(x)\to +\infty$;
(2) the global error bound holds with exponent $\kappa(n,d)^{-1}$, i.e., there exists $\tau>0$ such that
$$d(x,[f\le 0]) \le \tau\big([f(x)]_+ + [f(x)]_+^{\kappa(n,d)^{-1}}\big) \quad \text{for all } x\in\mathbb{R}^n. \tag{5}$$
Remark: (1) is satisfied when $f$ is coercive or when $f$ is convex.
Composite polynomial systems
Let $f(x) := (\psi\circ g)(x)$, where $\psi$ is a convex polynomial on $\mathbb{R}^n$ with degree $d$ and $g:\mathbb{R}^m\to\mathbb{R}^n$ is a continuously differentiable map.
Theorem (L. & Mordukhovich, 2012). Let $\bar x\in[f\le 0]$, and assume that $\nabla g(\bar x):\mathbb{R}^m\to\mathbb{R}^n$ is surjective. Then there exist positive numbers $\tau$ and $\epsilon$ such that
$$d(x,[f\le 0]) \le \tau\,[f(x)]_+^{\kappa(n,d)^{-1}} \quad \text{for all } x\in B(\bar x,\epsilon).$$
Applications: Proximal Point Algorithm
Consider the following proximal point algorithm (PPM) for solving $\min_{x\in\mathbb{R}^n} f(x)$:
$$x_{k+1} = \operatorname*{argmin}_{x\in\mathbb{R}^n}\Big\{f(x) + \frac{1}{2\epsilon_k}\|x-x_k\|^2\Big\}, \quad k=0,1,\dots \tag{6}$$
PPM converges to a solution of $\min_{x\in\mathbb{R}^n} f(x)$ (provided one exists) whenever $\sum_{k=0}^{\infty}\epsilon_k = +\infty$.
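As an illustration only (not from the slides), here is a minimal sketch of iteration (6) for the scalar convex polynomial $f(x)=x^4$, whose prox subproblem can be solved by bisection on its strictly increasing derivative:

```python
# Proximal step for f(x) = x**4: minimize x**4 + (x - xk)**2/(2*eps).
# The optimality condition 4*x**3 + (x - xk)/eps = 0 has a strictly
# increasing left-hand side, so bisection finds the unique root.
def prox_step(xk, eps, lo=-10.0, hi=10.0, iters=200):
    g = lambda x: 4.0*x**3 + (x - xk)/eps
    for _ in range(iters):
        mid = 0.5*(lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

x, eps = 1.0, 1.0
for k in range(50):
    x = prox_step(x, eps)
# x creeps toward argmin f = {0} only sublinearly, reflecting the slow
# decay one expects for degree d > 2.
```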
Theorem (L. & Mordukhovich, 2012). Let $f$ be a piecewise convex polynomial on $\mathbb{R}^n$ with degree $d$ ($d\ge 2$). Suppose that $f$ is convex and $\inf f > -\infty$. Let $\{x_k\}$ be generated by the proximal point method (6). Then there exists $\mu>0$ such that
$$d(x_k,\operatorname{argmin} f) = O\bigg(\Big(\sum_{i=0}^{k-1}\epsilon_i\Big)^{-\frac{1}{\kappa(n,d)-2}}\bigg) \ \text{ if } d>2, \qquad d(x_k,\operatorname{argmin} f) = O\bigg(\prod_{i=0}^{k-1}\frac{1}{\mu\epsilon_i+1}\bigg) \ \text{ if } d=2. \tag{7}$$
Remark: this can be extended to finding zeros of a maximal monotone operator $T$ in Hilbert spaces under a higher-order metric subregularity condition.
Conclusion
- Error bounds are an interesting research topic with many important applications.
- Variational analysis and semi-algebraic techniques can shed light on how to extend error bound results from the quadratic to the polynomial case.
Future Work
This is still a very preliminary development, with many interesting open questions, e.g.:
(1) Is the derived exponent sharp?
(2) Can we identify subclasses of convex polynomials such that the global error bound holds for the maximum of finitely many functions within the class?
(3) Local error bound results with explicit exponents for nonconvex polynomials (a partial answer was given in L., Mordukhovich and Pham 2013)?
(4) Any higher-order stability analysis for nonconvex polynomial optimization problems?
Want to know more?
(1) V. Jeyakumar and G. Li, Duality theory with SDP dual programs for SOS-convex programming via sums-of-squares representations, preprint, 2012.
(2) G. Li, On the asymptotically well behaved functions and global error bound for convex polynomials, SIAM J. Optim., 20 (2010), no. 4, 1923–1943.
(3) G. Li, Global error bounds for piecewise convex polynomials, Math. Program., 137 (2013), 37–64.
(4) G. Li and B.S. Mordukhovich, Hölder metric subregularity with applications to proximal point method, SIAM J. Optim., 22 (2012), no. 4, 1655–1684.
(5) G. Li, B.S. Mordukhovich and T.S. Pham, New fractional error bounds for nonconvex polynomial systems with applications to Hölderian stability in optimization, preprint, 2013.
Thanks!
Let $f:\mathbb{R}^2\to\mathbb{R}\cup\{+\infty\}$ be defined by
$$f(x_1,x_2) = \begin{cases} \dfrac{x_1^2}{2x_2} & \text{if } x_2>0,\\[2pt] 0 & \text{if } (x_1,x_2)=(0,0),\\[2pt] +\infty & \text{otherwise}. \end{cases} \tag{8}$$
It can be verified that $f$ is a proper, lower semicontinuous and convex function with $\inf f = 0$. Consider $x_n=(n,n^2)$. Then one has $f(x_n)=1/2$, while $\nabla f(x_n) = \big(1/n,\,-1/(2n^2)\big) \to 0$.
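A quick numeric check of this example (sample points $x_n=(n,n^2)$ as on the slide):

```python
# f(x1, x2) = x1**2/(2*x2) on {x2 > 0}: along x_n = (n, n**2) the gradient
# tends to zero while f(x_n) stays at 1/2 > inf f = 0, so the implication
# d(0, grad f(x_k)) -> 0  =>  f(x_k) -> inf f  fails for general convex f
# (unlike for convex polynomials).
def f(x1, x2):
    return x1**2 / (2.0*x2)

def grad_f(x1, x2):
    return (x1/x2, -x1**2/(2.0*x2**2))

for n in [1, 10, 1000]:
    assert abs(f(n, n**2) - 0.5) < 1e-12   # value stuck at 1/2
gx, gy = grad_f(1000, 1000**2)
assert abs(gx) < 1e-2 and abs(gy) < 1e-5   # gradient vanishes
```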
$f(x_1,x_2) = -x_1 + \sqrt{x_1^2+x_2^2}$. Here $[f\le 0] = \{(x_1,x_2): x_1\ge 0,\ x_2=0\}$. Consider $x_n=(n,1)$. Then $d(x_n,[f\le 0])=1$ and
$$f(x_n) = -n + \sqrt{n^2+1} = \frac{1}{\sqrt{n^2+1}+n} \to 0.$$
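Checking the failure numerically (the sample values of $n$ are chosen for illustration):

```python
import math

# f(x1, x2) = -x1 + sqrt(x1**2 + x2**2) is convex and continuous, yet along
# x_n = (n, 1) the residual f(x_n) -> 0 while d(x_n, [f <= 0]) = 1 is fixed.
def f(x1, x2):
    return -x1 + math.sqrt(x1**2 + x2**2)

residuals = [f(n, 1.0) for n in (1, 10, 1000)]
assert residuals[0] > residuals[1] > residuals[2]   # residual shrinks ...
assert residuals[2] < 1e-3                          # ... toward zero, while
                                                    # the distance stays at 1
```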