ISyE 6661: Topics Covered


1. Optimization fundamentals: 1.5 lectures
2. LP geometry (Chpt. 2): 5 lectures
3. The Simplex method (Chpt. 3): 4 lectures
4. LP duality (Chpt. 4): 4 lectures
5. Sensitivity analysis (Chpt. 5): 3 lectures
6. Large-scale LP (Chpt. 6): 1.5 lectures
7. Computational complexity and the Ellipsoid method (Chpt. 8): 2 lectures
8. Interior point algorithms (Chpt. 9): 5 lectures

1. Fundamentals of Optimization

The generic optimization problem: (P): min{f(x) : x ∈ X}.

Weierstrass Theorem: If f is continuous and X is compact, then problem (P) has an optimal solution.

If f is a convex function and X is a convex set, then (P) is a convex program.

Theorem: If x* is a local optimal solution of the convex program (P), then it is also a global optimal solution.

2. Linear Programming Geometry

LP in standard form: (P): min{c^T x : Ax = b, x ≥ 0}.

LP involves minimizing a linear function over the polyhedral set X = {x : Ax = b, x ≥ 0}.

Basic building blocks of a polyhedral set: extreme points and extreme rays.

Theorem (algebraic characterization of extreme points): A vector x is an extreme point of X iff it is a Basic Feasible Solution, i.e., there is a partitioning A = [B N] (with B square and nonsingular) such that x_B = B^{-1} b ≥ 0 and x_N = 0.

Theorem (algebraic characterization of extreme rays): A vector d ≠ 0 is an extreme ray of X iff it is a non-negative basic direction, i.e., there is a partitioning A = [B N] (with B square and nonsingular) such that

    d = α [ -B^{-1} A_j ; e_j ] ≥ 0   for some column A_j of N and some α > 0.
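As a sanity check on the BFS characterization, the sketch below enumerates every candidate basis of a small hypothetical standard-form instance (two constraints, two slacks; the instance is illustrative, not taken from the notes) and keeps those with B nonsingular and B^{-1} b ≥ 0:

```python
from fractions import Fraction as F
from itertools import combinations

# Standard form of min{-x1 - x2 : x1 + 2*x2 <= 4, 3*x1 + x2 <= 6, x >= 0}
# after adding slacks s1, s2 (a hypothetical toy instance):
A = [[1, 2, 1, 0],
     [3, 1, 0, 1]]
b = [4, 6]

def solve2(B, rhs):
    """Solve the 2x2 system B z = rhs exactly; return None if B is singular."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        return None
    z1 = F(rhs[0] * B[1][1] - rhs[1] * B[0][1], det)   # Cramer's rule
    z2 = F(B[0][0] * rhs[1] - B[1][0] * rhs[0], det)
    return [z1, z2]

bfs = set()
for cols in combinations(range(4), 2):          # candidate basis columns of A = [B N]
    B = [[A[i][j] for j in cols] for i in range(2)]
    xB = solve2(B, b)
    if xB is None or min(xB) < 0:               # singular B, or x_B = B^{-1}b not >= 0
        continue
    x = [F(0)] * 4
    x[cols[0]], x[cols[1]] = xB
    bfs.add((x[0], x[1]))                       # record the (x1, x2) coordinates

print(sorted(bfs))
```

The four surviving points are exactly the vertices of the feasible polygon in the (x1, x2) plane, matching the theorem: BFS = extreme point.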

2. Linear Programming Geometry (contd.)

The Representation Theorem: Let x^1, ..., x^k and d^1, ..., d^l be the extreme points and extreme rays of X, respectively. Then

    X = { x : x = Σ_{i=1}^k λ_i x^i + Σ_{j=1}^l μ_j d^j,  Σ_{i=1}^k λ_i = 1,  λ_i ≥ 0 ∀i,  μ_j ≥ 0 ∀j }.

To prove the above result, we used:

The Separation Theorem: Let S be a non-empty closed convex set, and x* ∉ S. Then there exists a vector c such that c^T x* < c^T x for all x ∈ S.

Theorem (corollary of the Representation Theorem):
(a) An LP min{c^T x : x ∈ X} has an optimal solution iff c^T d^j ≥ 0 for all extreme rays d^j, j = 1, ..., l.
(b) Extreme point optimality: If an LP has an optimal solution, then there exists an extreme point that is optimal.

3. The Simplex Method

Basic idea: Move from one extreme point (bfs) to another while improving the objective.

Given a bfs x^k with basis B, move along one of the j-th basic directions (j ∈ N):

    d^j = [ -B^{-1} A_j ; e_j ].

If x^k is non-degenerate, then d^j is a feasible direction, i.e., it allows a positive step move.

If c^T d^j < 0, then d^j is an improving direction. Note c^T d^j = c_j - c_B^T B^{-1} A_j = c̄_j (the reduced cost).

If no improving direction exists, i.e., c̄_j ≥ 0 for all j ∈ N, the current solution is optimal: Stop.

Choose an improving basic direction d^j, j ∈ N, and move to x^{k+1} = x^k + α d^j, where α ≥ 0 is the largest step such that x^{k+1} ≥ 0.

If d^j ≥ 0, then α = +∞, implying that the problem is unbounded: Stop.
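The iteration above can be sketched as a minimal tableau implementation. The instance is hypothetical (not from the notes); the code assumes a starting bfs whose basic columns already form an identity block (e.g. a slack basis), and uses a smallest-index entering rule:

```python
from fractions import Fraction as F

def simplex(A, b, c, basis):
    """Minimal tableau simplex for min c^T x s.t. Ax = b, x >= 0.

    Assumes `basis` indexes columns that already form an identity block
    (e.g. slack columns), so [A | b] is already a valid starting tableau.
    """
    m, n = len(A), len(A[0])
    T = [[F(v) for v in row] + [F(b[i])] for i, row in enumerate(A)]
    while True:
        # Reduced costs: cbar_j = c_j - sum_i c_{B(i)} * T[i][j].
        cbar = [F(c[j]) - sum(F(c[basis[i]]) * T[i][j] for i in range(m))
                for j in range(n)]
        # Smallest-index entering rule (Bland-style, helps against cycling).
        entering = next((j for j in range(n) if cbar[j] < 0), None)
        if entering is None:                       # all cbar_j >= 0: optimal
            x = [F(0)] * n
            for i in range(m):
                x[basis[i]] = T[i][n]
            return x
        rows = [i for i in range(m) if T[i][entering] > 0]
        if not rows:                               # d^j >= 0: improving extreme ray
            raise ValueError("LP is unbounded")
        # Ratio test picks the leaving row (largest feasible step alpha).
        leaving = min(rows, key=lambda i: T[i][n] / T[i][entering])
        piv = T[leaving][entering]
        T[leaving] = [v / piv for v in T[leaving]]
        for i in range(m):
            if i != leaving:
                f = T[i][entering]
                T[i] = [T[i][k] - f * T[leaving][k] for k in range(n + 1)]
        basis[leaving] = entering                  # pivot: A_entering replaces A_leaving

# min -x1 - x2 s.t. x1 + 2x2 + s1 = 4, 3x1 + x2 + s2 = 6, starting from the slack basis
x = simplex([[1, 2, 1, 0], [3, 1, 0, 1]], [4, 6], [-1, -1, 0, 0], [2, 3])
print(x)    # [Fraction(8, 5), Fraction(6, 5), Fraction(0, 1), Fraction(0, 1)]
```

Exact rational arithmetic (`Fraction`) keeps the pivots free of round-off, which makes the small example easy to check by hand.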

3. The Simplex Method (contd.)

Theorem: x^{k+1} is an adjacent bfs to x^k, with basis B̂ = B + {A_j} - {A_l}, where l is some basic variable that becomes nonbasic.

Degeneracy, i.e., a basic variable having a value of zero, is a problem. If x^k is degenerate, α could be zero: the basis changes from B to B̂ but x^{k+1} = x^k, which can cause stalling or cycling. This can be dealt with by properly choosing j and l (e.g., the lexicographic rule).

Theorem: The Simplex method (with proper pivot rules) solves LP in a finite number of iterations.

3. The Simplex Method (contd.)

Revised Simplex and Tableau implementations.

Initializing the Simplex method:
- Two-phase Simplex
- Big-M method

4. Duality

Standard form primal-dual LP pairs:

    v_P = min c^T x          v_D = max b^T y
    s.t.  Ax = b             s.t.  A^T y ≤ c
          x ≥ 0

Recipe for writing the dual problem of a general LP.

Weak Duality Theorem: v_D ≤ v_P.
Proof of WD: By construction of the dual problem.

Strong Duality Theorem: If either problem has a finite optimal value, then v_D = v_P.
Proof 1 of SD: From the Simplex method (c_B^T B^{-1} gives the optimal dual variables).
Proof 2 of SD: From the theorems of alternatives (Farkas' Lemma).

4. Duality (contd.)

Farkas' Lemma: Let A ∈ R^{m×n} and b ∈ R^m. Then exactly one of the following two systems is feasible:

    (a) Ax = b, x ≥ 0        (b) A^T y ≥ 0, b^T y < 0.

Proof: Use the Separating Hyperplane theorem. See also the different forms of Farkas' Lemma.

From duality to polyhedral theory:
- An immediate proof of Farkas' Lemma.
- A simple proof of the Representation Theorem.
- Converse to the Representation Theorem: the convex hull of a finite number of points is a polytope.

4. Duality (contd.)

LP Optimality Conditions (corollary of SD): A pair (x*, y*) is primal-dual optimal iff

    Ax* = b, x* ≥ 0                        (primal feasibility)
    A^T y* ≤ c                             (dual feasibility)
    x*_j (c_j - A_j^T y*) = 0 for all j    (complementary slackness).

Relation between non-degeneracy and uniqueness amongst primal and dual optimal solutions.

The Dual Simplex Algorithm: A basis B is primal feasible (PF) if B^{-1} b ≥ 0, and dual feasible (DF) if c^T - c_B^T B^{-1} A ≥ 0. Start with a basis that is DF but not PF. Select a basic variable (with negative value) to leave the basis (move towards PF). Select an entering variable so as to maintain DF.
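These conditions are easy to verify numerically. The sketch below checks primal feasibility, dual feasibility, equal objectives, and complementary slackness for a hypothetical toy pair (the claimed x* and y* were worked out by hand for this illustrative instance, not taken from the notes):

```python
from fractions import Fraction as F

# Toy standard-form pair (a hypothetical instance):
#   (P) min c^T x  s.t. Ax = b, x >= 0      (D) max b^T y  s.t. A^T y <= c
A = [[1, 2, 1, 0], [3, 1, 0, 1]]
b = [4, 6]
c = [-1, -1, 0, 0]
x = [F(8, 5), F(6, 5), F(0), F(0)]   # claimed primal optimal solution
y = [F(-2, 5), F(-1, 5)]             # claimed dual optimal (y^T = c_B^T B^{-1})

# Primal feasibility: Ax = b, x >= 0.
assert all(sum(A[i][j] * x[j] for j in range(4)) == b[i] for i in range(2))
assert all(v >= 0 for v in x)

# Dual feasibility: A^T y <= c, i.e. reduced costs c_j - A_j^T y >= 0.
reduced = [c[j] - sum(A[i][j] * y[i] for i in range(2)) for j in range(4)]
assert all(r >= 0 for r in reduced)

# Strong duality: equal objectives (weak duality gives b^T y <= c^T x always).
assert sum(c[j] * x[j] for j in range(4)) == sum(b[i] * y[i] for i in range(2))

# Complementary slackness: x_j * (c_j - A_j^T y) = 0 for every j.
assert all(x[j] * reduced[j] == 0 for j in range(4))
print("primal-dual optimal pair verified")
```

Note that either certificate alone (a feasible x with c^T x equal to b^T y for some feasible y) already proves optimality of both, which is the practical content of strong duality.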

4. Duality (contd.)

Dual Simplex is not analogous to applying Primal Simplex to the dual problem. When should one use Dual Simplex over Primal Simplex?

Generalized duality: The dual of v_P = min{c^T x : Ax ≥ b, x ∈ X} is v_D = max{L(y) : y ≥ 0}, where L(y) := min{c^T x + y^T (b - Ax) : x ∈ X}.

5. Sensitivity Analysis

Consider the LP z* = min{c^T x : Ax = b, x ≥ 0}. An instance of the LP is given by the data (n, m, c, A, b).

If the optimal solution x* is non-degenerate, then the i-th dual variable represents

    y*_i = ∂z*/∂b_i,  i = 1, ..., m.

Local sensitivity analysis:
(a) How do the optimal solution x* and the optimal value z* behave under small perturbations of the problem data (n, m, c, A, b)?
(b) How can we efficiently recover the new optimal solution and optimal value after the perturbation?
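A minimal illustration of y*_i = ∂z*/∂b_i on a hypothetical toy LP (not from the notes): with the optimal basis held fixed, the optimal value is linear in b_1 with slope y*_1 over the whole range of perturbations for which the basis stays primal feasible.

```python
from fractions import Fraction as F

# Toy LP (hypothetical): min{-x1 - x2 : x1 + 2*x2 <= 4, 3*x1 + x2 <= 6, x >= 0}.
# At the optimum the basis is {x1, x2} with B = [[1, 2], [3, 1]], and the
# dual solution is y* = (-2/5, -1/5).  While B^{-1}(b1, 6) >= 0 the basis stays
# optimal, so z*(b1 + delta) = z* + delta * y1*.

def z_star(b1):
    """Optimal value with the basis {x1, x2} held fixed: x_B = B^{-1} (b1, 6)."""
    det = 1 * 1 - 2 * 3             # det B = -5
    x1 = F(b1 * 1 - 6 * 2, det)     # Cramer's rule on B x_B = (b1, 6)
    x2 = F(1 * 6 - 3 * b1, det)
    assert x1 >= 0 and x2 >= 0      # basis must remain primal feasible
    return -x1 - x2                 # objective c^T x with c = (-1, -1, 0, 0)

z0, y1 = z_star(4), F(-2, 5)
for delta in [F(1, 2), F(-1, 2), 1, -1, 2]:
    assert z_star(4 + delta) == z0 + delta * y1
print("dz*/db1 =", y1)
```

Outside the computable range (here -2 ≤ delta ≤ 8) the feasibility assertion fails: the basis is no longer primal feasible, and (per the slide) Dual Simplex would be used to re-optimize.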

5. Sensitivity Analysis (contd.)

Adding a new variable: The current basis remains PF. So check DF (the reduced cost of the new variable) and use Primal Simplex to re-optimize if needed.

Adding a new constraint: The current basis remains DF. Check PF, and use Dual Simplex to re-optimize if needed.

Perturbing b → b + δd: The current basis remains DF and PF over a computable range of δ. Outside this range, we have DF but not PF, so use Dual Simplex to re-optimize.

Perturbing c → c + δd: The current basis remains DF and PF over a computable range of δ. Outside this range, we have PF but not DF, so use Primal Simplex to re-optimize.

5. Sensitivity Analysis (contd.)

Perturbing A_j → A_j + δd where j ∈ N: The current basis remains DF and PF over a computable range of δ. Outside this range, we have PF but not DF, so use Primal Simplex to re-optimize.

Perturbing A_j → A_j + δd where j ∈ B: The current basis remains DF and PF over a computable range of δ. Outside this range, both PF and DF may be affected.

Global behavior of the value functions:
(a) F(b) = min{c^T x : Ax = b, x ≥ 0} is a convex function of b, and the dual solution y* is a subgradient of F at b.
(b) G(c) = min{c^T x : Ax = b, x ≥ 0} is a concave function of c, and x* is a subgradient of G at c.

6. Large-Scale LP

Column generation: the Cutting Stock Problem; Dantzig-Wolfe decomposition.

Row generation: Benders decomposition.

7. Computational Complexity of LP

A problem (class) is easy if there exists an algorithm whose computational effort to solve any instance of the problem is bounded by a polynomial in the size of that instance, i.e., if there exists a polynomial-time algorithm for the problem.

Is LP easy? The Simplex method may require an exponential number of iterations (in the number of variables): Klee-Minty (1972).

Yudin and Nemirovskii (1977) developed the Ellipsoid method and showed that general convex programs are easy, and Khachiyan (1979) used it to show that LP is indeed easy.

7. The Ellipsoid Method for LP

The Ellipsoid method answers the following question: Is X = {x ∈ R^n : Ax ≤ b} = ∅?

Assume: if X ≠ ∅, then 0 < v ≤ vol(X) ≤ V.

We have a Separation Oracle S(x, X), which returns 0 if x ∈ X; otherwise it returns a vector a ≠ 0 such that a^T y > a^T x for all y ∈ X.

0. Find an ellipsoid E_0(x^0) ⊇ X. Set k = 0.
1. If S(x^k, X) = 0, stop: X ≠ ∅. If vol(E_k(x^k)) < v, stop: X = ∅.
2. If S(x^k, X) = a_k, then X ⊆ H_k := {x : a_k^T x ≥ a_k^T x^k}. Find an ellipsoid E_{k+1}(x^{k+1}) such that

       E_{k+1}(x^{k+1}) ⊇ E_k(x^k) ∩ H_k ⊇ X   and   vol(E_{k+1}(x^{k+1})) / vol(E_k(x^k)) < e^{-1/(2(n+1))}.

3. Set k ← k + 1 and go to step 1.
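The loop above can be sketched with the standard central-cut update formulas. The system below is a hypothetical toy instance in the plane; note the oracle here returns the violated row of Ax ≤ b, which is the negative of the vector in the convention above (the cut is the same half-space either way).

```python
import math

# Central-cut ellipsoid sketch for n = 2: find a point in X = {x : Ax <= b}
# (a hypothetical triangle), starting from a big ball E_0 = {x : x^T P^{-1} x <= 1}.
A = [[-1, 0], [0, -1], [1, 1]]          # encodes x1 >= 1, x2 >= 1, x1 + x2 <= 3
b = [-1, -1, 3]
n = 2
x = [0.0, 0.0]                          # center of E_0
P = [[100.0, 0.0], [0.0, 100.0]]        # E_0 = ball of radius 10 around x

def violated(x):
    """Separation oracle: return a violated row of Ax <= b, or None if x in X."""
    for ai, bi in zip(A, b):
        if ai[0] * x[0] + ai[1] * x[1] > bi:
            return ai
    return None

for _ in range(200):                    # the volume bound needs far fewer here
    a = violated(x)
    if a is None:
        break                           # current center is feasible
    Pa = [P[0][0] * a[0] + P[0][1] * a[1], P[1][0] * a[0] + P[1][1] * a[1]]
    g = math.sqrt(a[0] * Pa[0] + a[1] * Pa[1])
    # New center steps against the violated cut; the new ellipsoid contains
    # E_k intersected with the half-space, with volume ratio < e^{-1/(2(n+1))}.
    x = [x[i] - Pa[i] / ((n + 1) * g) for i in range(n)]
    P = [[(n * n / (n * n - 1.0)) * (P[i][j] - 2.0 * Pa[i] * Pa[j] / ((n + 1) * g * g))
          for j in range(n)] for i in range(n)]

assert violated(x) is None
print("feasible point:", x)
```

Since the feasible triangle has positive area, the volume argument guarantees a feasible center is found long before the iteration cap.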

7. The Ellipsoid Method for LP (contd.)

The numbers v and V depend on n and U (the largest number in the data (A, b)).

Theorem: The Ellipsoid method answers the question "Is X = {x ∈ R^n : Ax ≤ b} = ∅?" in O(n^6 log(nU)) iterations.

7. The Ellipsoid Method for LP (contd.)

Easily modified for optimization of a linear function over polyhedra; polynomial complexity is preserved. Note that the complexity does not depend on the number of constraints defining X.

Equivalence of separation and optimization: The description of X may involve an exponential number of constraints. However, as long as we have a polynomial-time Separation Oracle, the Ellipsoid algorithm guarantees that optimization of a linear function over X is still polynomial time!

8. Interior Point Methods

min{c^T x : x ∈ X}

Basic idea: Given x^k ∈ int(X), find a direction d^k and a step size α_k such that x^k + α_k d^k =: x^{k+1} ∈ int(X) and c^T x^{k+1} < c^T x^k. Continue until some termination criterion is met.

The algorithms differ with respect to the choice of d^k, α_k, and the termination criterion.

Some preprocessing may be needed to guarantee that an optimal solution exists.

The algorithms are convergent: lim_{k→∞} x^k = x*. A good criterion for finite termination is needed.

8. Interior Point Methods: The Affine Scaling Method

Basic idea: Given x^k ∈ int(X), construct an ellipsoid E_k(x^k) ⊆ int(X). Choose x^{k+1} = argmin{c^T x : x ∈ E_k}.

Based on the fact that the minimizer of a linear form over an ellipsoid can be found analytically.

Not proven to be polynomial time.

8. Interior Point Methods: The Primal Path Following (Barrier) Method

We want to solve P: min{c^T x : Ax = b, x ≥ 0}.

Use a penalty function to prevent iterates from approaching the boundary of the polyhedron, and reduce the penalty as the iterates approach an optimal solution (on the boundary).

Given μ > 0, the barrier problem is

    P(μ): min{ f_μ(x) := c^T x - μ Σ_{j=1}^n log(x_j) : Ax = b }.

8. Interior Point Methods: The Barrier Method

For any μ > 0, the function f_μ(x) is strictly convex, so the problem P(μ) has a unique optimal solution x(μ).

For any μ > 0, x(μ) ∈ int(X), where X = {x : Ax = b, x ≥ 0}.

As μ → +∞, x(μ) approaches the analytic center of X. As μ → 0, x(μ) → x*.

The set of solutions {x(μ) : μ ∈ (0, ∞)} is known as the Central Path.

How do we find x(μ) (at least approximately)?
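For one hypothetical toy problem the central path can be written in closed form, which makes the two limits above easy to see:

```python
import math

# Central path for a hypothetical toy barrier problem:
#   min{ x1 : x1 + x2 = 1, x >= 0 },  with optimum x* = (0, 1).
# Eliminating x2 = 1 - x1, the barrier problem P(mu) minimizes
#   g(x1) = x1 - mu*log(x1) - mu*log(1 - x1)  over 0 < x1 < 1,
# and g'(x1) = 0 reduces to the quadratic x1^2 - (1 + 2*mu)*x1 + mu = 0.

def x_of_mu(mu):
    """The point x(mu) on the central path (the root of g' inside (0, 1))."""
    t = 1.0 + 2.0 * mu
    x1 = (t - math.sqrt(t * t - 4.0 * mu)) / 2.0
    return x1, 1.0 - x1

for mu in [1.0, 0.1, 0.01, 1e-4, 1e-8]:
    x1, x2 = x_of_mu(mu)
    print(f"mu = {mu:8.0e}   x(mu) = ({x1:.6f}, {x2:.6f})")
# As mu -> 0 the path converges to the optimal vertex (0, 1);
# as mu -> +infinity it approaches the analytic center (1/2, 1/2).
```

Here x1(μ) ≈ μ for small μ, so the path approaches the boundary x1 = 0 only as the penalty weight vanishes, exactly as the slide describes.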

Aside: NLP Optimality Conditions

    NLP:     min{ f(x) : Ax = b, x ≥ 0 }
    LP(x*):  min{ ∇f(x*)^T x : Ax = b, x ≥ 0 }

Theorem: If x* is an optimal solution of NLP, then x* is an optimal solution of LP(x*).

Theorem: If f is convex, then x* is an optimal solution of NLP iff x* is an optimal solution of LP(x*).

Theorem: If x* is an optimal solution of NLP, then x* solves the KKT system

    Ax = b, x ≥ 0
    A^T y + s = ∇f(x*), s ≥ 0
    x_j s_j = 0,  j = 1, ..., n.

Theorem: If f is convex, then x* is an optimal solution of NLP iff x* solves the KKT system above.

8. Interior Point Methods: The Barrier Method (contd.)

x(μ) is the solution of the KKT system for the barrier problem:

    Ax = b, x > 0
    A^T y + s = c, s > 0
    x_j s_j = μ,  j = 1, ..., n.

The system is nonlinear and difficult to solve exactly. We are content with β-approximate solutions (0 < β < 1):

    Ax = b, x > 0
    A^T y + s = c, s > 0
    Σ_{j=1}^n ( x_j s_j / μ - 1 )^2 ≤ β^2.

For fixed β, lim_{μ→0} x_β(μ) = lim_{μ→0} x(μ) = x*.

8. Interior Point Methods: The Barrier Method (contd.)

Let β = 1/2. Start with some μ^k > 0 and a β-approximation x^k of x(μ^k).

Linearize the KKT system around x^k and solve it to get the new solution x^{k+1}. It can be shown that x^{k+1} is a β-approximation of x(μ^{k+1}) with

    μ^{k+1} = ( 1 - 1/(2 + 4√n) ) μ^k.

Continue until the duality gap (x^k)^T s^k ≤ ε.

8. Interior Point Methods: The Barrier Method (contd.)

Theorem: The barrier algorithm reduces the duality gap from ε_0 to ε in O(√n log(ε_0/ε)) iterations.

Not Covered: Network Flow Problems

A very important class of problems: the constraint matrix has a very special structure, called a network matrix. Specialized Simplex-type algorithms are strongly polynomial time. Examples: transportation and assignment problems.

What's Next? Optimization Courses in Spring 2004

- ISyE 6662: Optimization II. Ph.D.-level class on integer programming and network flows. Offered by Prof. Ergun.
- ISyE 8871: Integer Programming. Advanced Ph.D.-level class on integer programming. Offered by Prof. Nemhauser.
- ISyE 6663: Optimization III. Nonlinear programming theory for Ph.D. students. Offered by Prof. Nemirovskii.
- ISyE 8813: Advanced Ph.D. class on interior point methods. Offered by Prof. Nemirovskii.
- ISyE 6669: Deterministic Optimization (MS level).
- ISyE 6673: Financial Optimization Models (MS level). Offered by Prof. Sokol.
- ISyE 6679: Computational Methods in Optimization. Offered by Prof. Barnes.