Analytic Center Cutting-Plane Method

Similar documents
Nonlinear Optimization: Algorithms 3: Interior-point methods

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method

Linear Programming for Optimization. Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc.

Duality in General Programs. Ryan Tibshirani Convex Optimization /36-725

4.6 Linear Programming duality

10. Proximal point method

Summer course on Convex Optimization. Fifth Lecture Interior-Point Methods (1) Michel Baes, K.U.Leuven Bharath Rangarajan, U.

Module1. x y 800.

Convex Programming Tools for Disjunctive Programs

Linear Programming Notes V Problem Transformations

What is Linear Programming?

Nonlinear Programming Methods.S2 Quadratic Programming

Practical Guide to the Simplex Method of Linear Programming

Mathematical finance and linear programming (optimization)

Convex analysis and profit/cost/support functions

14. Nonlinear least-squares

3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions

Two-Stage Stochastic Linear Programs

Optimizing Call Center Staffing using Simulation and Analytic Center Cutting Plane Methods

CHAPTER 9. Integer Programming

Solutions Of Some Non-Linear Programming Problems BIJAN KUMAR PATEL. Master of Science in Mathematics. Prof. ANIL KUMAR

Walrasian Demand. u(x) where B(p, w) = {x R n + : p x w}.

Lecture 3. Linear Programming. 3B1B Optimization Michaelmas 2015 A. Zisserman. Extreme solutions. Simplex method. Interior point method

7 Gaussian Elimination and LU Factorization

Integrating Benders decomposition within Constraint Programming

Solving Linear Programs

Largest Fixed-Aspect, Axis-Aligned Rectangle

Big Data - Lecture 1 Optimization reminders

Optimizing Call Center Staffing using Simulation and Analytic Center Cutting Plane Methods

Duality of linear conic problems

Distributed Machine Learning and Big Data

1 Solving LPs: The Simplex Algorithm of George Dantzig

3. Linear Programming and Polyhedral Combinatorics

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Online Convex Optimization

Increasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all.

Discrete Optimization

Duality in Linear Programming

24. The Branch and Bound Method

Computing a Nearest Correlation Matrix with Factor Structure

The Graphical Method: An Example

Notes V General Equilibrium: Positive Theory. 1 Walrasian Equilibrium and Excess Demand

Linear Programming in Matrix Form

Date: April 12, Contents

Dual Methods for Total Variation-Based Image Restoration

Some representability and duality results for convex mixed-integer programs.

Lecture 2: August 29. Linear Programming (part I)

1 if 1 x 0 1 if 0 x 1

Chapter 2 Solving Linear Programs

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

2.3 Convex Constrained Optimization Problems

. P. 4.3 Basic feasible solutions and vertices of polyhedra. x 1. x 2

Inner Product Spaces

General Framework for an Iterative Solution of Ax b. Jacobi s Method

NOTES ON LINEAR TRANSFORMATIONS

Optimisation et simulation numérique.

Arrangements And Duality

(Quasi-)Newton methods

5.1 Bipartite Matching

A three point formula for finding roots of equations by the method of least squares

Lecture 5 Least-squares

Linear Programming. Solving LP Models Using MS Excel, 18

Metric Spaces. Chapter Metrics

Numerical methods for American options

26 Linear Programming

Linear Programming Notes VII Sensitivity Analysis

Solutions to Math 51 First Exam January 29, 2015

Principles of demand management Airline yield management Determining the booking limits. » A simple problem» Stochastic gradients for general problems

Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen

A NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION

Adaptive Online Gradient Descent

MINIMIZATION OF ENTROPY FUNCTIONALS UNDER MOMENT CONSTRAINTS. denote the family of probability density functions g on X satisfying

Special Situations in the Simplex Algorithm

Several Views of Support Vector Machines

Roots of Equations (Chapters 5 and 6)

Chapter 6. Cuboids. and. vol(conv(p ))

Nonlinear Algebraic Equations. Lectures INF2320 p. 1/88

Chapter 13: Binary and Mixed-Integer Programming

Linear Programming: Theory and Applications

3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

OPRE 6201 : 2. Simplex Method

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

6.4 Logarithmic Equations and Inequalities

Interior Point Methods and Linear Programming

Lecture 7: Finding Lyapunov Functions 1

Robust Geometric Programming is co-np hard

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

1. Prove that the empty set is a subset of every set.

Week 1: Functions and Equations

Matrix Representations of Linear Transformations and Changes of Coordinates

Factorization Theorems

Equations, Inequalities & Partial Fractions

Error Bound for Classes of Polynomial Systems and its Applications: A Variational Analysis Approach

1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where.

Understanding Basic Calculus

Solving Mass Balances using Matrix Algebra

4.5 Linear Dependence and Linear Independence

The last three chapters introduced three major proof techniques: direct,

6. Cholesky factorization

Transcription:

Analytic Center Cutting-Plane Method S. Boyd, L. Vandenberghe, and J. Skaf April 13, 2008 Contents 1 Analytic center cutting-plane method 2 2 Computing the analytic center 3 3 Pruning constraints 5 4 Lower bound and stopping criterion 5 5 Convergence proof 7 6 Numerical examples 10 6.1 Basic ACCPM................................... 10 6.2 ACCPM with constraint dropping........................ 10 6.3 Epigraph ACCPM................................ 13 1

In these notes we describe in more detail the analytic center cutting-plane method (AC- CPM) for non-differentiable convex optimization, prove its convergence, and give some numerical examples. ACCPM was developed by Goffin and Vial [GV93] and analyzed by Nesterov [Nes95] and Atkinson and Vaidya [AV95]. These notes assume a basic familiarity with convex optimization (see [BV04]), cuttingplane methods (see the EE364b notes Localization and Cutting-Plane Methods), and subgradients (see the EE364b notes Subgradients). 1 Analytic center cutting-plane method The basic ACCPM algorithm is: Analytic center cutting-plane method (ACCPM) given an initial polyhedron P 0 known to contain X. k := 0. repeat Compute x (k+1), the analytic center of P k. Query the cutting-plane oracle at x (k+1). If the oracle determines that x (k+1) X, quit. Else, add the returned cutting-plane inequality to P. P k+1 := P k {z a T z b} If P k+1 =, quit. k := k + 1. There are several variations on this basic algorithm. For example, at each step we can add multiple cuts, instead of just one. We can also prune or drop constraints, for example, after computing the analytic center of P k. Later we will see a simple but non-heuristic stopping criterion. We can construct a cutting-plane a T z b at x (k), for the standard convex problem minimize f 0 (x) subject to f i (x) 0, i = 1,...,m, as follows. If x (k) violates the ith constraint, i.e., f i (x (k) ) > 0, we can take where g i f i (x (k) ). If x (k) is feasible, we can take a = g 0, a = g i, b = g T i x (k) f i (x (k) ), (1) b = g T 0 x(k) f 0 (x (k) ) + f (k) best, (2) where g 0 f 0 (x (k) ), and f (k) best is the best (lowest) objective value encountered for a feasible iterate. 2

2 Computing the analytic center Each iteration of ACCPM requires computing the analytic center of a set of linear inequalities (and, possibly, determining whether the set of linear inequalities is feasible), a T i x b i, i = 1,...,m, that define the current localization polyhedron P. In this section we describe some methods that can be used to do this. We note that the inequalities defined by a i and b i, as well as their number m, can change at each iteration of ACCPM, as we add new cutting-planes and possibly prune others. In addition, these inequalities can include some of the original inequalities that define P 0. To find the analytic center, we must solve the problem minimize Φ(x) = m log(b i a T i x). (3) This is an unconstrained problem, but the domain of the objective function is the open polyhedron dom Φ = {x a T i x < b i, i = 1,...,m}, i.e., the interior of the polyhedron. Part of the challenge of computing the analytic center is that we are not given a point in the domain. One simple approach is to use a phase I optimization method (see [BV04, 11.4]) to find a point in dom Φ (or determine that dom Φ = ). Once we find such a point, we can use a standard Newton method to compute the analytic center (see [BV04, 9.5]). A simple alternative is to use an infeasible start Newton method (see [BV04, 10.3]) to solve (3), starting from a point that is outside dom Φ, as suggested by Goffin and Sharifi- Mokhtarian in [GSM99]. We reformulate the problem as minimize m log y i subject to y = b Ax, (4) with variables x and y. The infeasible start Newton method can be started from any x and any y 0. We can, for example, take the inital x to be the previous point x prev, and choose y as { bi a y i = T i x if b i a T i x > 0 1 otherwise. In the basic form of ACCPM x prev strictly satisfies all of the inequalities except the last one added; in this case, y i = b i a T i x holds for all but one index. We define the primal and dual residuals for (4) as r p = y + Ax b, r d = [ A T ν g + ν ], (5) 3

where g = diag(1/y i )1 is the gradient of the objective. We also define r to be (r d, r p ). The Newton step at a point (x, y, ν) is defined by the system of linear equations 0 0 A T 0 H I A I 0 x y ν = where H = diag(1/yi 2 ) is the Hessian of the objective. We can solve this system by block elimination (see [BV04, 10.4]), using the expressions [ rd x = (A T HA) 1 (A T g A T Hr p ), r p y = A x r p, (6) ν = H y g ν. We can compute x from the first equation in several ways. We can, for example, form A T HA, then compute its Cholesky factorization, then carry out backward and forward substitution. Another option is to compute x by solving the least-squares problem H x = argmin 1/2 z Az H 1/2 r p + H 1/2 g. The infeasible start Newton method is: Infeasible start Newton method. ], given starting point x, y 0, tolerance ǫ > 0, α (0, 1/2), β (0, 1). ν := 0. Compute residuals from (5). repeat 1. Compute Newton step ( x, y, ν) using (6). 2. Backtracking line search on r 2. t := 1. while y + t y 0, t := βt. while r(x + t x, y + t y, ν + t ν) 2 > (1 αt) r(x, y, ν) 2, 3. Update. x := x + t x, y := y + t y, ν := ν + t ν. until y = b Ax and r(x, y, ν) 2 ǫ. t := βt. This method works quite well, unless the polyhedron is empty (or, in practice, very small), in which case the algorithm does not converge. To guard against this, we fix a maximum number of iterations. Typical parameter values are β = 0.5, α = 0.01, with maximum iterations set to 50. 4

3 Pruning constraints There is a simple method for ranking the relevance of the inequalities a T i x b i, i = 1,...,m that define a polyhedron P, once we have computed the analytic center x. Let Then the ellipsoid lies inside P, and the ellipsoid m H = 2 Φ(x ) = (b i a T i x) 2 a i a T i. E in = {z (z x ) T H(z x ) 1} E out = {z (z x ) T H(z x ) m 2 }, which is E in scaled by a factor m about its center, contains P. Thus the ellipsoid E in at least grossly (within a factor m) approximates the shape of P. This suggests the (ir)relevance measure η i = b i a T i x H 1/2 a i = b i a T i x a T i H 1 a i for the inequality a T i x b i. This factor is always at least one; if it is m or larger, then the inequality is certainly redundant. These factors (which are easily computed from the computations involved in the Newton method) can be used to decide which constraints to drop or prune. We simply drop constraints with the large values of η i ; we keep constraints with smaller values. One typical scheme is to keep some fixed number N of constraints, where N is usually chosen to be between 3n and 5n. When this is done, the computational effort per iteration (i.e., centering) does not grow as ACCPM proceeds, as it does when no pruning is done. 4 Lower bound and stopping criterion In 7 of the notes Localization and Cutting-Plane Methods we described a general method for constructing a lower bound on the optimal value p of the convex problem minimize f 0 (x) subject to f 1 (x) 0, Cx d, assuming we have evaluated the value and a subgradient of its objective and constraint functions at some points. For notational simplicity we lump multiple constraints into one by forming the maximum of the constraint functions. We will re-order the iterates so that at x (1),...,x (q), we have evaluated the value and a subgradient of the objective function f 0. This gives us the piecewise-linear underestimator ˆf 0 of f 0, defined as ( ˆf 0 (z) = max f0 (x (i) ) + g (i)t 0 (z x (i) ) ) f 0 (z).,...,q 5 (7)

We assume that at x (q+1),...,x (k), we have evaluated the value and a subgradient of the constraint function f 1. This gives us the piecewise-linear underestimator ˆf 1, given by ( ˆf 1 (z) = f1 (x (i) ) + g (i)t 1 (z x (i) ) ) f 1 (z). Now we form the problem max i=q+1,...,k minimize ˆf0 (x) subject to ˆf1 (x) 0, Cx d, whose optimal value is a lower bound on p. In ACCPM we can easily construct a lower bound on the problem (8), as a by-product of the analytic centering computation, which in turn gives a lower bound on the original problem (7). We first write (8) as the LP minimize t subject to f 0 (x (i) ) + g (i)t 0 (x x (i) ) t, i = 1,...,q f 1 (x (i) ) + g (i)t 1 (x x (i) ) 0, i = q + 1,..., k Cx d, with variables x and t. The dual problem is maximize q λ i (f 0 (x (i) ) g (i)t 0 x (i) ) + k i=q+1 λ i (f 1 (x (i) ) g (i)t 1 x (i) ) d T µ subject to q λ i g (i) 0 + k i=q+1 λ i g (i) 1 + C T µ = 0 µ 0, λ 0, q λ i = 1, with variables λ, µ. We can compute at modest cost a lower bound to the optimal value of (9), and hence to p, by finding a dual feasible point for (10. The optimality condition for x (k+1), the analytic center of the inequalities is q f 0 (x (i) ) + g (i)t 0 (x x (i) ) f (i) best, i = 1,...,q, f 1 (x (i) ) + g (i)t 1 (x x (i) ) 0, i = q + 1,..., k, c T i x d i, i = 1,..., m, g (i) 0 f (i) best f 0(x (i) ) g (i)t 0 (x (k+1) x (i) ) + k i=q+1 g (i) 1 f 1 (x (i) ) g (i)t 1 (x (k+1) x (i) ) + m c i (8) (9) (10) = 0. d i c T i x (k+1) (11) Let τ i = 1/(f (i) best f 0(x (i) ) g (i)t 0 (x (k+1) x (i) )) for i = 1,...,q. Comparing (11) with the equality constraint in (10), we can construct a dual feasible point by taking λ i = { τi /1 T τ for i = 1,...,q 1/( f 1 (x (i) ) g (i)t 1 (x (k+1) x (i) ))(1 T τ) for i = q + 1,...,k, µ i = 1/(d i c T i x(k+1) )(1 T τ) i = 1,...,m. 6

Using these values of λ and µ, we conclude that p l (k+1), where l (k+1) = q λ i (f 0 (x (i) ) g (i)t 0 x (i) ) + k i=q+1 λ i (f 1 (x (i) ) g (i)t 1 x (i) ) d T µ. Let l (k) best be the best lower bound found after k iterations. The ACCPM algorithm can be stopped once the gap f (k) best l(k) best is less than a desired value ǫ > 0. This guarantees that x (k) is, at most, ǫ-suboptimal. 5 Convergence proof In this section we give a convergence proof for ACCPM, adapted from Ye [Ye97, Chap. 6]. We take the initial polyhedron as the unit box, centered at the origin, with unit length sides, i.e., the initial set of linear inequalities is (1/2)1 z (1/2)1, so the first analytic center is x (1) = 0. We assume the target set X contains a ball with radius r < 1/2, and show that the number of iterations is no more than a constant times n 2 /r 2. Assuming the algorithm has not terminated, the set of inequalities after k iterations is (1/2)1 z (1/2)1, a T i z b i, i = 1,...,k. (12) We assume the cuts are neutral, so b i = a T i x(i) for i = 1,...,k. Without loss of generality we normalize the vectors a i so that a i 2 = 1. We will let φ k : R n R be the logarithmic barrier function associated with the inequalities (12), n n k φ k (z) = log(1/2 + z i ) log(1/2 z i ) log(b i a T i x). The iterate x (k+1) is the minimizer of this logarithmic barrier function. Since the algorithm has not terminated, the polyhedron P k defined by (12) still contains the target set X, and hence also a ball with radius r and (unknown) center x c. We have ( 1/2 + r)1 x c (1/2 r)1, and the slacks of the inequalities a T i z b i evaluated at x c also exceed r: b i sup a T i (x c + rv) = b i a T i x c r a i 2 = b i a T i x c r 0. v 2 1 Therefore φ k (x c ) (2n + k) log r and, since x (k) is the minimizer of φ k, φ k (x (k) ) = inf z φ k(z) φ k (x c ) (2n + k) log(1/r). (13) 7

We can also derive a lower bound on φ k (x (k) ) by noting that the functions φ j are selfconcordant for j = 1,..., k. Using the inequality (9.48), [BV04, p.502], we have φ j (x) φ j (x (j) ) + (x x (j) ) T H j (x x (j) ) log(1 + (x x (j) ) T H j (x x (j) )) for all x domφ j, where H j is the Hessian of φ j at x (j). If we apply this inequality to φ k 1 we obtain φ k (x (k) ) = inf φ k(x) x ( = inf φk 1 (x) log( a T x k (x x (k 1) )) ) inf v ( ) φ k 1 (x (k 1) ) + v T H k 1 v log(1 + v T H k 1 v) log( a T k v). By setting the gradient of the righthand side equal to zero, we find that it is minimized at which yields φ k (x (k) ) φ k 1 (x (k 1) ) + ˆv = 1 + 5 2 a T k H 1 k 1 a k H 1 k 1 a k, ˆv T H k 1ˆv log(1 + ˆv T H k 1ˆv) log( a T k ˆv) = φ (k 1) (x k 1 ) + 0.1744 1 2 log(at k H 1 k 1 a k) 0.1744k 1 2 k 0.1744k k 2 log ( 1 k k 2 log ( 1 k k log(a T i H 1 i 1 a i) + 2n log 2 k ) a T i H 1 i 1 a i a T i H 1 i 1 a i ) + 2n log 2 + 2n log 2 (14) because φ 0 (x (0) ) = 2n log 2. We can further bound the second term on the righthand side by noting that H i = 4diag(1 + 2x (i) ) 2 + 4diag(1 2x (i) ) 2 + because (1/2)1 x (i) (1/2)1 and i j=1 1 (b j a T j x (i) ) 2a ja T j I + 1 n b i a T i x (k) = a T i (x (i 1) x (k) ) a i 2 x (i 1) x (k) 2 n. Define B 0 = I and B i = I + (1/n) i j=1 a j a T j for i 1. Then i a j a T j j=1 n log(1 + k/n 2 ) = n log(trb k /n) log det B k 8

= log det B k 1 + log(1 + 1 n at k B 1 k 1 a k) log det B k 1 + 1 2n at k B 1 1 k a T i B 1 2n i 1a i. k 1 a k (The second inequality follows from the fact that a T k B 1 k 1 a k 1, and log(1+x) (log 2)x x/2 for 0 x 1.) Therefore k and we can simplify (14) as a T i H 1 i 1a i k a T i B 1 i 1a i 2n 2 log(1 + k n 2), φ k (x (k) ) k ( 2 log 2 log(1 + ) k/n2 ) + 2n log 2 k/n 2 = (2n k 2 ) log 2 + k ( 2 log k/n 2 log(1 + k/n 2 ) Combining this lower bound with the upper bound (13) we find ). (15) k 2 log 2 + k ( 2 log k/n 2 ) k log(1/r) + 2n log(1/(2r)). (16) log(1 + k/n 2 ) From this it is clear that the algorithm terminates after a finite k: since the ratio (k/n 2 )/ log(1+ k/n 2 ) goes to infinity, the left hand side grows faster than linearly as k increases. We can derive an explicit bound on k as follows. Let α(r) be the solution of the nonlinear equation α/ log(1 + α) = 1/(2r 4 ). Suppose k > max{2n, n 2 α(r)}. Then we have a contradiction in (16): k log(1/(2r 2 )) k 2 log 2 + k ( 2 log k/n 2 ) k log(1/r) + 2n log(1/(2r)), log(1 + k/n 2 ) i.e., k log(1/(2r)) 2n log(1/(2r)). We conclude that max{2n, n 2 α(r)} is an upper bound on the number of iterations. 9

6 Numerical examples We consider the problem of minimizing a piecewise linear function: minimize f(x) = max,...,m (a T i x + b i ), with variable x R n. The particular problem instance we use to illustrate the different methods has n = 20 variables and m = 100 terms, with problem data a i and b i generated from a unit normal distribution. Its optimal value (which is easily computed via linear programming) is f 1.1. 6.1 Basic ACCPM We use the basic ACCPM algorithm described in 1, with the infeasible start Newton method used to carry out the centering steps. We take P 0 to be the unit box {z z 1}. We keep track of f best, the best objective value found, and use this to generate deep objective cuts. Figure 1 shows convergence of f (k) f, which is nearly linear (on a semi-log scale). Figure 2 shows the convergence of the true suboptimality f (k) best f (which is not available as the algorithm is running), along with the upper bound on suboptimality f (k) best l(k) best (which is available as the algorithm runs). Figure 3 shows f (k) best f versus the cumulative number of Newton steps required by the infeasible start Newton method in the centering steps. This plots shows that around 10 Newton steps are needed, on average, to carry out the centering. We can see that as the algorithm progresses (and P (k) gets small), there is a small increase in the number of Newton steps required to achieve the same factor increase in accuracy. 6.2 ACCPM with constraint dropping We illustrate ACCPM with constraint dropping, keeping at most N = 3n constraints, using the constraint dropping scheme described in 3. Figure 4 shows convergence of f (k) f with and without constraint dropping. The plots show that keeping only 3n constraints has almost no effect on the progress of the algorithm, as measured in iterations. At k = 200 iterations, the pruned polyhedron has 60 constraints, whereas the unpruned polyhedron has 240 constraints. The number of flops per Newton step is, to first order, m k n 2, where m k is the number of constraints at iteration k, so the total flop count of iteration k can be estimated as N k m k n 2, where N k is the number of Newton steps required in iteration k. Figure 5 shows the convergence of f (k) best f versus the (estimated) cummulative flop count. 10

10 1 10 0 f (k) f 10 1 10 2 10 3 10 4 0 50 100 150 200 k Figure 1: The value of f (k) f versus iteration number k, for the basic ACCPM. 10 4 f (k) best f f (k) best l(k) best 10 2 10 0 10 2 10 4 0 50 100 150 200 k Figure 2: The value of f (k) best f (in blue) and the value of f (k) versus iteration number k for the basic ACCPM. best l(k) best (in red) 11

10 1 10 0 f (k) f 10 1 10 2 10 3 10 4 0 500 1000 1500 2000 2500 Newton iterations Figure 3: The value of f (k) best f versus the cumulative number of Newton steps, for the basic ACCPM. 10 1 10 0 no dropping keeping 3n f (k) f 10 1 10 2 10 3 10 4 0 50 100 150 200 k Figure 4: The value of f (k) f versus iteration number k in the case where all constraints are kept (blue) and only 3n constraints are kept (red). 12

10 1 10 0 no dropping keeping 3n f (k) best f 10 1 10 2 10 3 10 4 0 50 100 150 200 Cumulative flops/10 6 Figure 5: The value of f (k) best f versus estimated cummulative flop count in the case where all constraints are kept (blue) and only 3n constraints are kept (red). 6.3 Epigraph ACCPM Figure 6 shows convergence of f (k) f for the epigraph ACCPM. (See 6 of the EE364b notes Localization and Cutting-Plane Methods for details.) Epigraph ACCPM requires only 50 iterations to reach the same accuracy that was reached by basic ACCPM in 200 iterations. The convergence of f (k) best f versus the cumulative number of Newton steps is shown in figure 7. We see that in epigraph ACCPM the average number of Newton steps per iteration is a bit higher than for basic ACCPM, but a substantial advantage remains. 13

10 1 10 0 f (k) f 10 1 10 2 10 3 10 4 0 10 20 30 40 50 k Figure 6: The value of f (k) f versus iteration number k, for epigraph ACCPM. 10 1 10 0 f (k) best f 10 1 10 2 10 3 10 4 0 200 400 600 800 1000 1200 Newton iterations Figure 7: The value of f (k) best f versus the cumulative number of Newton steps, for epigraph ACCPM. 14

References [AV95] [BV04] D. Atkinson and P. Vaidya. A cutting plane algorithm for convex programming that uses analytic centers. Mathematical Programming, 69:1 43, 1995. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. [GSM99] J. Goffin and F. Sharifi-Mokhtarian. Using the primal-dual infeasible newton method in the analytic center method for problems defined by deep cutting planes. Journal of Optimization Theory and Applications, 101:35 58, April 1999. [GV93] [Nes95] [Ye97] J. Goffin and J. Vial. On the computation of weighted analytic centers and dual ellipsoids with the projective algorithm. Mathematical Programming, 60:81 92, 1993. Y. Nesterov. Cutting-plane algorithms from analytic centers: efficiency estimates. Mathematical Programming, 69:149 176, 1995. Yinyu Ye. Complexity analysis of the analytic center cutting plane method that uses multiple cuts. Mathematical Programming, 78:85 104, 1997. 15