Advances in Convex Optimization: Interior-Point Methods, Cone Programming, and Applications

Stephen Boyd
Electrical Engineering Department, Stanford University
(joint work with Lieven Vandenberghe, UCLA)

CDC 02, Las Vegas, 12/11/02
Easy and Hard Problems
Least squares (LS)

minimize ||Ax − b||_2^2

A ∈ R^{m×n}, b ∈ R^m are parameters; x ∈ R^n is the variable

- complete theory (existence & uniqueness, sensitivity analysis, ...)
- several algorithms compute the (global) solution reliably
- can solve dense problems with n = 1000 variables, m = 10000 terms
- by exploiting structure (e.g., sparsity), can solve far larger problems

... LS is a (widely used) technology
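A minimal numerical sketch (not from the talk) of solving a dense LS instance with numpy and checking the normal-equations optimality condition; the random data A, b are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 50
A = rng.standard_normal((m, n))          # made-up dense data
b = rng.standard_normal(m)

# lstsq computes the global minimizer of ||Ax - b||_2^2 via a factorization.
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# Optimality (normal equations): the residual is orthogonal to range(A).
r = A @ x - b
print(np.linalg.norm(A.T @ r))  # numerically zero
```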
Linear program (LP)

minimize c^T x
subject to a_i^T x ≤ b_i, i = 1, ..., m

c, a_i ∈ R^n are parameters; x ∈ R^n is the variable

- nearly complete theory (existence & uniqueness, sensitivity analysis, ...)
- several algorithms compute the (global) solution reliably
- can solve dense problems with n = 1000 variables, m = 10000 constraints
- by exploiting structure (e.g., sparsity), can solve far larger problems

... LP is a (widely used) technology
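A small sketch of solving such an LP with scipy.optimize.linprog; the data below are a hypothetical example chosen so the answer is easy to verify by hand:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: minimize x1 + x2 subject to x >= 0 and x1 + x2 >= 1,
# written in the a_i^T x <= b_i form above.
c = np.array([1.0, 1.0])
A_ub = np.array([[-1.0, 0.0],            # -x1 <= 0
                 [0.0, -1.0],            # -x2 <= 0
                 [-1.0, -1.0]])          # -x1 - x2 <= -1
b_ub = np.array([0.0, 0.0, -1.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
print(res.fun)  # optimal value: 1.0, attained on the face x1 + x2 = 1
```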
Quadratic program (QP)

minimize ||Fx − g||_2^2
subject to a_i^T x ≤ b_i, i = 1, ..., m

- a combination of LS & LP
- same story ... QP is a technology
- solution methods reliable enough to be embedded in real-time control applications with little or no human oversight
- basis of model predictive control
The bad news

LS, LP, and QP are exceptions: most optimization problems, even some very simple-looking ones, are intractable
Polynomial minimization

minimize p(x)

p is a polynomial of degree d; x ∈ R^n is the variable

- except for special cases (e.g., d = 2) this is a very difficult problem
- even sparse problems of size n = 20, d = 10 are essentially intractable
- all algorithms known to solve this problem require effort exponential in n
What makes a problem easy or hard?

classical view:
- linear is easy
- nonlinear is hard(er)
What makes a problem easy or hard?

emerging (and correct) view:

"... the great watershed in optimization isn't between linearity and nonlinearity, but convexity and nonconvexity."
(R. Rockafellar, SIAM Review 1993)
Convex optimization

minimize f_0(x)
subject to f_1(x) ≤ 0, ..., f_m(x) ≤ 0

x ∈ R^n is the optimization variable; the f_i : R^n → R are convex:

f_i(λx + (1 − λ)y) ≤ λ f_i(x) + (1 − λ) f_i(y)

for all x, y, and 0 ≤ λ ≤ 1

- includes LS, LP, QP, and many others
- like LS, LP, and QP, convex problems are fundamentally tractable
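The defining inequality can be spot-checked numerically; a sketch using f(x) = ||x||_2^2 as an illustrative convex function (random samples, made up for this check):

```python
import numpy as np

# Illustrative convex function: f(x) = ||x||_2^2.
def f(x):
    return float(np.dot(x, x))

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    lam = rng.uniform()
    lhs = f(lam * x + (1 - lam) * y)
    rhs = lam * f(x) + (1 - lam) * f(y)
    assert lhs <= rhs + 1e-9             # the chord lies above the graph
print("convexity inequality held on all samples")
```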
Example: Robust LP

minimize c^T x
subject to Prob(a_i^T x ≤ b_i) ≥ η, i = 1, ..., m

coefficient vectors a_i IID, a_i ~ N(ā_i, Σ_i); η is the required reliability

- for fixed x, a_i^T x is N(ā_i^T x, x^T Σ_i x)
- so for η = 50%, the robust LP reduces to the LP

minimize c^T x
subject to ā_i^T x ≤ b_i, i = 1, ..., m

and so is easily solved

- what about other values of η, e.g., η = 10%? η = 90%?
Hint

[figure: the feasible set {x | Prob(a_i^T x ≤ b_i) ≥ η, i = 1, ..., m} for η = 10%, η = 50%, and η = 90%]
That's right

- robust LP with reliability η = 90% is convex, and very easily solved
- robust LP with reliability η = 10% is not convex, and extremely difficult

moral: very difficult and very easy problems can look quite similar (to the untrained eye)
Convex Analysis and Optimization
Convex analysis & optimization

nice properties of convex optimization problems known since the 1960s:
- local solutions are global
- duality theory, optimality conditions
- simple solution methods like alternating projections

convex analysis well developed by the 1970s (Rockafellar):
- separating & supporting hyperplanes
- subgradient calculus
What's new (since 1990 or so)

- primal-dual interior-point (IP) methods: extremely efficient, handle nonlinear large-scale problems, polynomial-time complexity results, software implementations
- new standard problem classes: generalizations of LP, with theory, algorithms, software
- extension to generalized inequalities: semidefinite programming, cone programming

... convex optimization is becoming a technology
Applications and uses

- lots of applications: control, combinatorial optimization, signal processing, circuit design, communications, ...
- robust optimization: robust versions of LP, LS, other problems
- relaxations and randomization: provide bounds, heuristics for solving hard problems
Recent history

- 1984–97: interior-point methods for LP
  - 1984: Karmarkar's interior-point LP method
  - theory: Ye, Renegar, Kojima, Todd, Monteiro, Roos, ...
  - practice: Wright, Mehrotra, Vanderbei, Shanno, Lustig, ...
- 1988: Nesterov & Nemirovsky's self-concordance analysis
- 1989–: LMIs and semidefinite programming in control
- 1990–: semidefinite programming in combinatorial optimization: Alizadeh, Goemans, Williamson, Lovasz & Schrijver, Parrilo, ...
- 1994: interior-point methods for nonlinear convex problems: Nesterov & Nemirovsky, Overton, Todd, Ye, Sturm, ...
- 1997–: robust optimization: Ben Tal, Nemirovsky, El Ghaoui, ...
New Standard Convex Problem Classes
Some new standard convex problem classes

- second-order cone program (SOCP)
- geometric program (GP) (and entropy problems)
- semidefinite program (SDP)

for these new problem classes we have
- complete duality theory, similar to LP
- good algorithms, and robust, reliable software
- a wide variety of new applications
Second-order cone program

a second-order cone program (SOCP) has the form

minimize c_0^T x
subject to ||A_i x + b_i||_2 ≤ c_i^T x + d_i, i = 1, ..., m

with variable x ∈ R^n

- includes LP and QP as special cases
- nondifferentiable when A_i x + b_i = 0
- new IP methods can solve (almost) as fast as LPs
Example: robust linear program

minimize c^T x
subject to Prob(a_i^T x ≤ b_i) ≥ η, i = 1, ..., m

where a_i ~ N(ā_i, Σ_i), is equivalent to

minimize c^T x
subject to ā_i^T x + Φ^{−1}(η) ||Σ_i^{1/2} x||_2 ≤ b_i, i = 1, ..., m

where Φ is the (unit) normal CDF

robust LP is an SOCP for η ≥ 0.5 (Φ^{−1}(η) ≥ 0)
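A sketch (with made-up random data) of verifying the reformulation of one chance constraint by Monte Carlo, using scipy.stats.norm for Φ^{−1}; all names below are illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, eta = 4, 0.9
abar = rng.standard_normal(n)            # mean of a_i (made up)
S = rng.standard_normal((n, n))
Sigma = S @ S.T                          # a random covariance matrix
x = rng.standard_normal(n)               # some candidate point

# For a ~ N(abar, Sigma), a^T x ~ N(abar^T x, x^T Sigma x), so the tightest
# b with Prob(a^T x <= b) = eta is abar^T x + Phi^{-1}(eta) * ||Sigma^{1/2} x||_2.
sigma = np.sqrt(x @ Sigma @ x)
b = abar @ x + norm.ppf(eta) * sigma

# Monte Carlo check of the probability.
a = rng.multivariate_normal(abar, Sigma, size=200_000)
p = np.mean(a @ x <= b)
print(p)  # close to eta = 0.9
```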
Geometric program (GP)

log-sum-exp function:

lse(x) = log(e^{x_1} + ... + e^{x_n})

... a smooth convex approximation of the max function

geometric program:

minimize lse(A_0 x + b_0)
subject to lse(A_i x + b_i) ≤ 0, i = 1, ..., m

A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}; variable x ∈ R^n
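A quick numerical illustration of the smooth-max property, using the standard bounds max(x) ≤ lse(x) ≤ max(x) + log n and scipy.special.logsumexp (random data for illustration):

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(3)
x = rng.standard_normal(10)

# lse is a smooth convex upper bound on the max:
#   max(x) <= lse(x) <= max(x) + log(n)
val = logsumexp(x)
assert x.max() <= val <= x.max() + np.log(x.size)

# Scaling sharpens the approximation: (1/t) * lse(t*x) -> max(x) as t grows.
for t in (1.0, 10.0, 100.0):
    print(t, logsumexp(t * x) / t - x.max())  # gap shrinks toward 0
```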
Entropy problems

unnormalized negative entropy is the convex function

entr(x) = Σ_{i=1}^n x_i log(x_i / 1^T x)

defined for x_i ≥ 0, 1^T x > 0

entropy problem:

minimize entr(A_0 x + b_0)
subject to entr(A_i x + b_i) ≤ 0, i = 1, ..., m

A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}
Solving GPs (and entropy problems)

- GP and entropy problems are duals (if we solve one, we solve the other)
- new IP methods can solve large-scale GPs (and entropy problems) almost as fast as LPs
- applications in many areas: information theory, statistics; communications, wireless power control; digital and analog circuit design
CMOS analog/mixed-signal circuit design via GP

given
- circuit cell: opamp, PLL, D/A, A/D, SC filter, ...
- specs: power, area, bandwidth, nonlinearity, settling time, ...
- IC fabrication process: TSMC 0.18µm mixed-signal, ...

find
- electronic design: device L & W, bias I & V, component values, ...
- physical design: placement, layout, routing, GDSII, ...
The challenges

- complex, multivariable, highly nonlinear problem
- dominating issue: robustness to model errors, parameter variation, unmodeled dynamics (sound familiar?)
Two-stage op-amp

[figure: two-stage op-amp schematic (devices M1–M8, bias I_bias, supplies V_dd/V_ss, compensation R_c, C_c, load C_L)]

design variables: device lengths & widths, component values

constraints/objectives: power, area, bandwidth, gain, noise, slew rate, output swing, ...
Op-amp design via GP

- express the design problem as a GP (using a change of variables, and a few good approximations ...)
- 10s of variables, 100s of constraints; solution time ≈ 1 sec
- robust version: take 10 (or so) different parameter values ("PVT corners"), and replicate all constraints for each parameter value
- get ≈ 100 variables, 1000 constraints; solution time ≈ 2 sec
Minimum noise versus power & BW 400 350 w c =30MHz w c =60MHz w c =90MHz Minimum noise in nv/hz 0.5 300 250 200 150 100 0 5 10 15 Power in mw CDC 02 Las Vegas 12/11/02 26
Cone Programming
Cone programming

the general cone program:

minimize c^T x
subject to Ax ⪯_K b

the generalized inequality Ax ⪯_K b means b − Ax ∈ K, where K is a proper convex cone

LP, QP, SOCP, GP can all be expressed as cone programs
Semidefinite program

semidefinite program (SDP):

minimize c^T x
subject to x_1 A_1 + ... + x_n A_n ⪯ B

B, A_i are symmetric matrices; the variable is x ∈ R^n

- the constraint is a linear matrix inequality (LMI)
- the inequality is a matrix inequality, i.e., K is the positive semidefinite cone
- SDP is a special case of cone programming
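A small sketch of checking an LMI numerically: the constraint holds exactly when the slack matrix B − Σ_i x_i A_i has no negative eigenvalues. The matrices below are random placeholders, and lmi_holds is a hypothetical helper name:

```python
import numpy as np

rng = np.random.default_rng(4)
def sym(M):
    return (M + M.T) / 2

# Random symmetric data, for illustration only.
A1 = sym(rng.standard_normal((3, 3)))
A2 = sym(rng.standard_normal((3, 3)))
B = 5.0 * np.eye(3)

def lmi_holds(x, tol=1e-9):
    """True if x1*A1 + x2*A2 <= B in the matrix (semidefinite) sense."""
    S = B - x[0] * A1 - x[1] * A2        # the "slack" matrix
    return bool(np.linalg.eigvalsh(S).min() >= -tol)

print(lmi_holds(np.zeros(2)))  # True: B = 5I is positive definite
```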
Early SDP applications (around 1990 on)

- control (many)
- combinatorial optimization & graph theory (many)
More recent SDP applications

- structural optimization: Ben-Tal, Nemirovsky, Kocvara, Bendsoe, ...
- signal processing: Vandenberghe, Stoica, Lorenz, Davidson, Shaked, Nguyen, Luo, Sturm, Balakrishnan, Saadat, Fu, de Souza, ...
- circuit design: El Gamal, Vandenberghe, Boyd, Yun, ...
- algebraic geometry: Parrilo, Sturmfels, Lasserre, de Klerk, Pressman, Pasechnik, ...
- communications and information theory: Rasmussen, Rains, Abdi, Moulines, ...
- quantum computing: Kitaev, Watrous, Doherty, Parrilo, Spedalieri, Rains, ...
- finance: Iyengar, Goldfarb, ...
Convex optimization hierarchy

[diagram: hierarchy of problem classes, ordered from more general (convex problems, cone problems, SDP) to more specific (SOCP, GP, QP, LP, LS)]
Relaxations & Randomization
Relaxations & randomization

convex optimization is increasingly used
- to find good bounds for hard (i.e., nonconvex) problems, via relaxation
- as a heuristic for finding good suboptimal points, often via randomization
Example: Boolean least-squares

Boolean least-squares problem:

minimize ||Ax − b||^2
subject to x_i^2 = 1, i = 1, ..., n

- basic problem in digital communications
- could check all 2^n possible values of x ...
- an NP-hard problem, and very hard in practice
- many heuristics for approximate solution
Boolean least-squares as matrix problem

||Ax − b||^2 = x^T A^T A x − 2 b^T A x + b^T b
            = Tr(A^T A X) − 2 b^T A x + b^T b

where X = xx^T

hence we can express BLS as

minimize Tr(A^T A X) − 2 b^T A x + b^T b
subject to X_ii = 1, X ⪰ xx^T, rank(X) = 1

... still a very hard problem
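The identity above is easy to verify numerically; a sketch with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 5))          # made-up data
b = rng.standard_normal(8)
x = rng.choice([-1.0, 1.0], size=5)      # a Boolean point: x_i^2 = 1
X = np.outer(x, x)                       # X = x x^T: rank one, X_ii = 1

# ||Ax - b||^2 = Tr(A^T A X) - 2 b^T A x + b^T b when X = x x^T
lhs = np.linalg.norm(A @ x - b) ** 2
rhs = np.trace(A.T @ A @ X) - 2 * b @ A @ x + b @ b
print(lhs, rhs)  # agree up to rounding
```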
SDP relaxation for BLS

ignore the rank-one constraint, and use

X ⪰ xx^T  ⟺  [X x; x^T 1] ⪰ 0

to obtain the SDP relaxation (with variables X, x)

minimize Tr(A^T A X) − 2 b^T A x + b^T b
subject to X_ii = 1, [X x; x^T 1] ⪰ 0

- the optimal value of the SDP gives a lower bound for BLS
- if the optimal matrix is rank one, we're done
Interpretation via randomization

- can think of the variables X, x in the SDP relaxation as defining a normal distribution z ~ N(x, X − xx^T), with E z_i^2 = 1
- the SDP objective is E ||Az − b||^2

this suggests a randomized method for BLS:
- find X, x optimal for the SDP relaxation
- generate z from N(x, X − xx^T)
- take x̂ = sgn(z) as an approximate solution of BLS

(can repeat many times and take the best one)
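A sketch of the sampling-and-rounding step only. Solving the relaxation itself requires an SDP solver, so instead of the SDP-optimal pair this uses the trivially feasible X = I, x = 0 (X_ii = 1, X − xx^T ⪰ 0); it illustrates the mechanics, not the quality of the SDP distribution. Data and names are made up:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 30, 20
A = rng.standard_normal((m, n))          # made-up problem data
b = rng.standard_normal(m)

# In a real run, (X, x) would come from the SDP relaxation; here we use the
# feasible pair X = I, x = 0 just to demonstrate sampling and rounding.
X, x = np.eye(n), np.zeros(n)

best_x, best_val = None, np.inf
for _ in range(100):
    z = rng.multivariate_normal(x, X - np.outer(x, x))  # z ~ N(x, X - xx^T)
    xhat = np.sign(z)                                   # round to {-1, +1}^n
    val = np.linalg.norm(A @ xhat - b) ** 2
    if val < best_val:                                  # keep the best sample
        best_x, best_val = xhat, val

print(best_val)
```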
Example

- (randomly chosen) parameters A ∈ R^{150×100}, b ∈ R^{150}
- x ∈ R^{100}, so the feasible set has 2^100 ≈ 10^30 points
- LS approximate solution: minimize ||Ax − b|| s.t. ||x||^2 = n, then round: yields objective 8.7% over the SDP relaxation bound
- randomized method (using the SDP optimal distribution):
  - best of 20 samples: 3.1% over SDP bound
  - best of 1000 samples: 2.6% over SDP bound
[figure: histogram of ||Ax − b|| / (SDP bound) over the randomized samples (ratios from 1 to 1.2, frequencies up to 0.5), with the SDP bound and the LS solution marked]
Interior-Point Methods
Interior-point methods

- handle linear and nonlinear convex problems (Nesterov & Nemirovsky)
- based on Newton's method applied to "barrier" functions that trap x in the interior of the feasible region (hence the name IP)
- worst-case complexity theory: # Newton steps ∝ √(problem size)
- in practice: # Newton steps between 10 & 50 (!) over a wide range of problem dimensions, types, and data
- 1000 variables, 10000 constraints feasible on a PC; far larger if structure is exploited
- readily available (commercial and noncommercial) packages
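A toy log-barrier method (a sketch, not production IP code, and a primal barrier method rather than the primal-dual variant the talk describes) on a hypothetical 2-variable LP, showing Newton centering and the barrier-parameter update:

```python
import numpy as np

# Toy problem (made up): minimize x1 + x2 over the box 0 <= x <= 1,
# written as minimize c^T x subject to Ax <= b; the optimum is 0 at the origin.
c = np.array([1.0, 1.0])
A = np.vstack([np.eye(2), -np.eye(2)])   # x <= 1 and -x <= 0
b = np.array([1.0, 1.0, 0.0, 0.0])

x = np.array([0.5, 0.5])                 # strictly feasible starting point
t, mu, m = 1.0, 10.0, len(b)

while m / t > 1e-5:                      # m/t bounds the duality gap
    # Newton's method on the centering problem:
    #   minimize t*c^T x - sum_i log(b_i - a_i^T x)
    for _ in range(50):
        s = b - A @ x                    # slacks, kept strictly positive
        grad = t * c + A.T @ (1.0 / s)
        hess = A.T @ np.diag(1.0 / s**2) @ A
        dx = -np.linalg.solve(hess, grad)
        if -grad @ dx < 1e-12:           # squared Newton decrement: centered
            break
        step = 1.0                       # backtrack to stay strictly feasible
        while np.any(b - A @ (x + step * dx) <= 0):
            step *= 0.5
        x = x + step * dx
    t *= mu                              # tighten the barrier

print(x)  # close to the optimizer (0, 0)
```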
Typical convergence of IP method

[figure: duality gap (from 10^2 down to 10^−6, log scale) versus # Newton steps (0 to 50) for LP, GP, SOCP, and SDP instances with 100 variables]
Typical effort versus problem dimensions

[figure: Newton steps (15 to 35) versus n (10^1 to 10^3, log scale) for LPs with n variables and 2n constraints; 100 instances for each of 20 problem sizes; average & standard deviation shown]
Computational effort per Newton step

- Newton step effort dominated by solving linear equations to find the primal-dual search direction
- the equations inherit structure from the underlying problem
- the equations are the same as for a least-squares problem of similar size and structure

conclusion: we can solve a convex problem with about the same effort as solving 30 least-squares problems
Problem structure

common types of structure:
- sparsity
- state structure
- Toeplitz, circulant, Hankel; displacement rank
- Kronecker, Lyapunov structure
- symmetry
Exploiting sparsity

- well developed, since the late 1970s
- direct methods (sparse factorizations) and iterative methods (CG, LSQR)
- standard in general-purpose LP, QP, GP, SOCP implementations
- can solve problems with 10^5, 10^6 variables and constraints (depending on the sparsity pattern)
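A sketch of both approaches on a made-up large sparse system with scipy.sparse: a direct sparse factorization (spsolve) and the iterative LSQR solver, which needs only matrix-vector products:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(7)
n = 100_000

# A well-conditioned sparse tridiagonal operator (made up for illustration).
main = 4.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], [-1, 0, 1], format="csr")
b = rng.standard_normal(n)

# Direct method: sparse factorization.
x_direct = spla.spsolve(A.tocsc(), b)

# Iterative method: LSQR touches A only through products with A and A^T.
x = spla.lsqr(A, b, atol=1e-10, btol=1e-10)[0]

print(np.linalg.norm(A @ x - b))  # small residual
```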
Exploiting structure in SDPs in combinatorial optimization

- major effort to exploit structure
- structure is mostly (extreme) sparsity
- IP methods and others (bundle methods) used
- problems with 10000 × 10000 LMIs, 10000 variables can be solved
- Ye, Wolkowicz, Burer, Monteiro, ...
Exploiting structure in SDPs in control

- structure includes sparsity, Kronecker/Lyapunov structure
- substantial improvements in order, for particular problem classes
- Balakrishnan & Vandenberghe, Hansson, Megretski, Parrilo, Rotea, Smith, Vandenberghe & Boyd, Van Dooren, ...
- ... but no general solution yet
Conclusions
Conclusions

- convex optimization theory is fairly mature; practice has advanced tremendously in the last decade
- qualitatively different from general nonlinear programming
- becoming a technology like LS, LP (esp. the new problem classes), reliable enough for embedded applications
- costs only about 30× more than least-squares, but is far more expressive
- lots of applications still to be discovered
Some references

- Semidefinite Programming, SIAM Review, 1996
- Applications of Second-order Cone Programming, LAA, 1999
- Linear Matrix Inequalities in System and Control Theory, SIAM, 1994
- Interior-point Polynomial Algorithms in Convex Programming, SIAM, 1994, Nesterov & Nemirovsky
- Lectures on Modern Convex Optimization, SIAM, 2001, Ben Tal & Nemirovsky
Shameless promotion

Convex Optimization, Boyd & Vandenberghe, to be published 2003

a good draft is available at the Stanford EE364 (UCLA EE236B) class web site, as the course reader