A NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION
Dimitri Bertsekas, M.I.T.
February 2003
OUTLINE

- Convexity issues in optimization
- Historical remarks
- Our treatment of the subject
- Three unifying lines of analysis:
  - Common geometrical framework for duality and minimax
  - Unifying framework for existence of solutions and duality gap analysis
  - Unification of Lagrange multiplier theory using an enhanced Fritz John theory and the notion of pseudonormality
WHY IS CONVEXITY IMPORTANT IN OPTIMIZATION I

- A convex function has no local minima that are not global
- A nonconvex function can be convexified while maintaining the optimality of its global minima
- A convex set has nonempty relative interior
- A convex set has feasible directions at any point

[Figure: epigraphs of a convex function and a nonconvex function f(x); feasible directions at a point of a convex set vs. a nonconvex set]
WHY IS CONVEXITY IMPORTANT IN OPTIMIZATION II

- The existence of minima of convex functions is conveniently characterized using directions of recession
- A polyhedral convex set is characterized by its extreme points and extreme directions

[Figure: level sets of a convex function f with recession cone R_f; a polyhedron P with extreme points v1, v2, v3]
WHY IS CONVEXITY IMPORTANT IN OPTIMIZATION III

- A convex function is continuous (on the relative interior of its domain) and has nice differentiability properties
- Convex functions arise prominently in duality
- Convex, lower semicontinuous functions are self-dual with respect to conjugacy

[Figure: a convex function f(z), the point (x, f(x)), and a supporting hyperplane with normal (-d, 1)]
SOME HISTORY

- Late 19th-early 20th century: Caratheodory, Minkowski, Steinitz, Farkas
- 1940s-50s: The big turning point
  - Game theory: von Neumann
  - Duality: Fenchel, Gale, Kuhn, Tucker
  - Optimization-related convexity: Fenchel
- 1960s-70s: Consolidation: Rockafellar
- 1980s-90s: Extensions to nonconvex optimization and nonsmooth analysis: Clarke, Mordukhovich, Rockafellar-Wets
ABOUT THE BOOK

"Convex Analysis and Optimization," by D. P. Bertsekas, with A. Nedic and A. Ozdaglar (March 2003)

- Aims to make the subject accessible through unification and geometric visualization
- Unification is achieved through several new lines of analysis
NEW LINES OF ANALYSIS

I. A unified geometrical approach to convex programming duality and minimax theory
   Basis: Duality between two elementary geometrical problems

II. A unified view of the theory of existence of solutions and absence of duality gap
   Basis: Reduction to basic questions on intersections of closed sets

III. A unified view of the theory of existence of Lagrange multipliers/constraint qualifications
   Basis: The notion of constraint pseudonormality, motivated by a new set of enhanced Fritz John conditions
OUTLINE OF THE BOOK

Chapter 1: Basic Convexity Concepts
- Convex sets and functions; semicontinuity and epigraphs
- Convex and affine hulls; relative interior
- Directions of recession; closed set intersections

Chapters 2-4:
- Existence of optimal solutions
- Min common/max crossing duality; minimax and saddle point theory
- Polyhedral convexity: representation theorems, extreme points, linear and integer programming
- Subdifferentiability and the calculus of subgradients; conical approximations; optimality conditions

Chapters 5-8:
- Constrained optimization duality: geometric multipliers, strong duality results
- Fenchel duality and conjugacy; exact penalty functions
- Lagrange multiplier theory: pseudonormality, constraint qualifications, exact penalty functions, Fritz John theory
- Dual methods: subgradient and incremental methods
OUTLINE OF DUALITY ANALYSIS

[Diagram relating the following components:]
- Closed set intersection results
- Min common/max crossing theory
- Enhanced Fritz John conditions
- Preservation of closedness under partial minimization: p(u) = inf_x F(x,u)
- Minimax theory
- Nonlinear Farkas lemma
- Absence of duality gap
- Fenchel duality theory
- Existence of geometric multipliers
- Existence of informative geometric multipliers
I. MIN COMMON/MAX CROSSING DUALITY
GEOMETRICAL VIEW OF DUALITY

[Figure: three panels (a), (b), (c) showing a set M (or its extension M-bar) in the (u,w) plane, the min common point w* on the vertical axis, and the max crossing point q* obtained from supporting hyperplanes]
APPROACH

- Prove theorems about the geometry of M:
  - Conditions on M that guarantee that w* = q*
  - Conditions on M that guarantee existence of a max crossing hyperplane
- Choose M to reduce the constrained optimization problem, the minimax problem, and others, to special cases of the min common/max crossing framework
- Specialize the min common/max crossing theorems to duality and minimax theorems
CONVEX PROGRAMMING DUALITY

Primal problem: min f(x) subject to x ∈ X and g_j(x) ≤ 0, j = 1,...,r

Dual problem: max q(µ) subject to µ ≥ 0, where the dual function is
    q(µ) = inf_{x ∈ X} L(x,µ) = inf_{x ∈ X} { f(x) + µ'g(x) }

Optimal primal value = inf_x sup_{µ ≥ 0} L(x,µ)
Optimal dual value = sup_{µ ≥ 0} inf_x L(x,µ)

Min common/max crossing framework: M = epi(p), where
    p(u) = inf_{x ∈ X, g_j(x) ≤ u_j} f(x)
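The primal/dual pair can be checked numerically on a tiny instance. The problem below (min x² subject to 1 - x ≤ 0) is a hypothetical illustration, not an example from the slides; its dual function has the closed form q(µ) = µ - µ²/4, so f* = q* = 1 and there is no duality gap. A minimal sketch:

```python
def f(x):
    return x * x          # convex cost

def g(x):
    return 1.0 - x        # constraint g(x) <= 0, i.e. x >= 1

def q(mu):
    # dual function q(mu) = inf_x { f(x) + mu*g(x) }; the unconstrained
    # minimizer of x^2 + mu*(1 - x) is x = mu/2
    x = mu / 2.0
    return f(x) + mu * g(x)

# primal optimal value f* by grid search over the feasible set x >= 1
f_star = min(f(1.0 + i * 1e-3) for i in range(5001))

# dual optimal value q* by grid search over mu >= 0
q_star = max(q(i * 1e-3) for i in range(5001))

print(f_star, q_star)   # both near 1: weak duality q* <= f* holds with equality
```

Weak duality q(µ) ≤ f* holds for every µ ≥ 0 regardless of convexity; it is the equality q* = f* that relies on the convexity of f and g here.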
VISUALIZATION

[Figure: the set M = {(g(x), f(x)) : x ∈ X} and the function p(u) in the (u,w) plane; a hyperplane with normal (µ,1) crosses the vertical axis at q(µ) = inf_u { p(u) + µ'u }, and q* ≤ f*]
MINIMAX / ZERO SUM GAME THEORY ISSUES

Given a function Φ(x,z), where x ∈ X and z ∈ Z, under what conditions do we have
    inf_x sup_z Φ(x,z) = sup_z inf_x Φ(x,z)?

Assume convexity/concavity and semicontinuity of Φ

Min common/max crossing framework: M = epigraph of the function
    p(u) = inf_x sup_z { Φ(x,z) - u'z }

inf_x sup_z Φ = min common value
sup_z inf_x Φ = max crossing value
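As a hedged numerical sketch (the function below is an illustrative choice, not from the slides): Φ(x,z) = x² - z² on [-1,1] × [-1,1] is convex in x and concave in z, so the minimax equality should hold; a grid evaluation confirms both sides equal 0:

```python
def phi(x, z):
    return x * x - z * z   # convex in x, concave in z

grid = [i / 100.0 for i in range(-100, 101)]   # discretizes [-1, 1]

# inf_x sup_z and sup_z inf_x, approximated on the grid
minimax = min(max(phi(x, z) for z in grid) for x in grid)
maximin = max(min(phi(x, z) for x in grid) for z in grid)

print(minimax, maximin)   # both 0.0: inf sup = sup inf here
```

The inequality sup inf ≤ inf sup always holds (weak duality); equality is the content of the minimax theorems the framework is built to prove.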
VISUALIZATION

[Figure: two panels (a), (b) showing M = epi(p) in the (u,w) plane; the min common value w* equals inf_x sup_z Φ(x,z), and the max crossing value q* equals sup_z inf_x Φ(x,z), obtained from hyperplanes with normal (µ,1) crossing the vertical axis at q(µ)]
TWO ISSUES IN CONVEX PROGRAMMING AND MINIMAX

- When is there no duality gap (in convex programming), or inf sup = sup inf (in minimax)?
- When does an optimal dual solution exist (in convex programming), or when is the sup attained (in minimax)?

The min common/max crossing framework shows that:
- The 1st question is a lower semicontinuity issue
- The 2nd question is an issue of existence of a nonvertical supporting hyperplane (or subgradient) at the origin

Further analysis is needed for more specific answers
II. UNIFICATION OF EXISTENCE AND NO DUALITY GAP ISSUES
INTERSECTIONS OF NESTED FAMILIES OF CLOSED SETS

Two basic problems in convex optimization:
- Attainment of a minimum of a function f over a set X
- Existence of a duality gap

The 1st question is a set intersection issue: the set of minima is the intersection of the nonempty level sets {x ∈ X : f(x) ≤ γ}

The 2nd question is a lower semicontinuity issue: when is the function p(u) = inf_x F(x,u) lower semicontinuous, assuming F(x,u) is convex and lower semicontinuous?
PRESERVATION OF SEMICONTINUITY UNDER PARTIAL MINIMIZATION

The 2nd question also involves set intersection. Key observation: for p(u) = inf_x F(x,u), we have
    P(epi(F)) ⊂ epi(p) ⊂ closure(P(epi(F)))
where P(·) is the projection onto the space of u. So if the projection preserves closedness, p is lower semicontinuous.

Given a closed set C, when is P(C) closed? If {y_k} is a sequence in P(C) that converges to y, we must show that the intersection of the corresponding nested sets C_k is nonempty.

[Figure: a set C, nested sets C_k ⊃ C_{k+1}, and the projection P(C) with points y_k converging to y]
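The failure mode can be made concrete with the classic counterexample F(x,u) = e^{-√(xu)} on x, u ≥ 0, which is convex and closed, yet its partial minimum p(u) = inf_x F(x,u) equals 0 for u > 0 and jumps to 1 at u = 0, so lower semicontinuity is lost. A rough grid sketch (the grid bounds x_max and n are arbitrary choices):

```python
import math

def F(x, u):
    # closed convex on the nonnegative orthant, yet partial minimization
    # over x destroys lower semicontinuity of the result at u = 0
    return math.exp(-math.sqrt(x * u))

def p(u, x_max=1e6, n=10_000):
    # approximate p(u) = inf_{x >= 0} F(x,u) by a grid search over [0, x_max]
    return min(F(x_max * i / n, u) for i in range(n + 1))

print(p(0.0))    # 1.0: F(x,0) = 1 for every x
print(p(0.01))   # ~0: p jumps at u = 0, so p is not lower semicontinuous there
```

Geometrically, the projection of epi(F) onto the (u,w) space is not closed: it contains points (u, ~0) for every u > 0 but not their limit (0, 0).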
UNIFIED TREATMENT OF EXISTENCE OF SOLUTIONS AND DUALITY GAP ISSUES

Results on nonemptiness of the intersection of a nested family of closed sets yield:
- No duality gap results: in convex programming, inf sup Φ = sup inf Φ
- Existence of minima of f over X
THE ROLE OF POLYHEDRAL ASSUMPTIONS: AN EXAMPLE

The min is attained if X is polyhedral and f is constant along the common directions of recession of f and X

[Figure: two panels comparing level sets of a convex function f over a polyhedral constraint set X (minimum attained) and over a nonpolyhedral constraint set X (minimum not attained, with f(x) = e^{x_1})]
THE ROLE OF QUADRATIC FUNCTIONS

Results on nonemptiness of the intersection of sets defined by quadratic inequalities yield:
- If f is bounded below over X, the min of f over X is attained
- If the optimal value is finite, there is no duality gap

Applications: linear programming, quadratic programming, semidefinite programming
III. LAGRANGE MULTIPLIER THEORY / PSEUDONORMALITY
LAGRANGE MULTIPLIERS

Problem (smooth, nonconvex): min f(x) subject to x ∈ X, h_i(x) = 0, i = 1,...,m

Necessary condition for optimality of x* (case X = R^n): under some constraint qualification, we have
    ∇f(x*) + Σ_i λ_i ∇h_i(x*) = 0
for some Lagrange multipliers λ_i

Basic analytical issue: what is the fundamental structure of the constraint set that guarantees the existence of a Lagrange multiplier?

Standard constraint qualifications (case X = R^n):
- The gradients ∇h_i(x*) are linearly independent
- The functions h_i are affine
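As a concrete (hypothetical) instance of the necessary condition: for min x₁ + x₂ subject to h(x) = x₁² + x₂² - 2 = 0, the optimum is x* = (-1, -1); the constraint gradient there is nonzero (so the linear independence qualification holds), and λ = 1/2 satisfies the stationarity condition:

```python
x_star = (-1.0, -1.0)                        # optimal solution of the example
h_val = x_star[0]**2 + x_star[1]**2 - 2.0    # feasibility check: h(x*) = 0

grad_f = (1.0, 1.0)                          # gradient of f(x) = x1 + x2
grad_h = (2.0 * x_star[0], 2.0 * x_star[1])  # gradient of h at x*, here (-2, -2)

lam = 0.5   # the Lagrange multiplier for this problem
residual = [grad_f[i] + lam * grad_h[i] for i in range(2)]

print(h_val, residual)   # 0.0 [0.0, 0.0]: grad f(x*) + lambda*grad h(x*) = 0
```

When ∇h(x*) = 0 (the qualification fails), no λ can cancel a nonzero ∇f(x*), which is exactly the situation the Fritz John and pseudonormality machinery below is designed to analyze.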
ENHANCED FRITZ JOHN CONDITIONS

If x* is optimal, there exist µ_0 ≥ 0 and λ_i, not all 0, such that
    µ_0 ∇f(x*) + Σ_i λ_i ∇h_i(x*) = 0,
and a sequence {x^k} with x^k → x* such that for all k,
    f(x^k) < f(x*),    λ_i h_i(x^k) > 0 for all i with λ_i ≠ 0

NOTE: If µ_0 > 0, the λ_i are Lagrange multipliers with a sensitivity property (we call them informative)
PSEUDONORMALITY

Definition: A feasible point x* is pseudonormal if one cannot find λ_i and a sequence {x^k} with x^k → x* such that
    Σ_i λ_i ∇h_i(x*) = 0,    Σ_i λ_i h_i(x^k) > 0 for all k

Pseudonormality at x* guarantees that, if x* is optimal, we can take µ_0 = 1 in the Fritz John conditions (so there exists an informative Lagrange multiplier)

[Figure: the image T_ε = {h(x) : ||x - x*|| < ε} of an ε-ball around x* in the constraint space, illustrating a pseudonormal case (affine h_i) and a non-pseudonormal case]
INFORMATIVE LAGRANGE MULTIPLIERS

The Lagrange multipliers obtained from the enhanced Fritz John conditions have a special sensitivity property: they indicate the constraints to violate in order to improve the cost. We call such multipliers informative.

Proposition: If there exists at least one Lagrange multiplier vector, there exists one that is informative.
EXTENSIONS/CONNECTIONS TO NONSMOOTH ANALYSIS

Fritz John conditions for an additional constraint x ∈ X: the stationarity condition becomes
    -( ∇f(x*) + Σ_i λ_i ∇h_i(x*) ) ∈ (normal cone of X at x*)

X is called regular at x* if the normal cone is equal to the polar of its tangent cone at x* (example: X convex)

If X is not regular at x*, the Lagrangian may have negative slope along some feasible directions

Regularity is the fault line beyond which there is no satisfactory Lagrange multiplier theory
THE STRUCTURE OF THE THEORY FOR X = R^n

[Diagram: independent constraint gradients and the Mangasarian-Fromovitz condition, as well as linear constraints, imply pseudonormality, which implies existence of informative multipliers and hence existence of Lagrange multipliers]
THE STRUCTURE OF THE THEORY FOR X REGULAR

[Diagram: a new Mangasarian-Fromovitz-like condition and the Slater condition imply pseudonormality, which implies existence of informative multipliers and hence existence of Lagrange multipliers]
EXTENSIONS

- Enhanced Fritz John conditions and pseudonormality for convex problems, when existence of a primal optimal solution is not assumed
- Connection of pseudonormality and exact penalty functions
- Connection of pseudonormality and the classical notion of quasiregularity