Math Camp Notes: Constrained Optimization

Consider the following general constrained optimization problem:

$$\max f(x_1, \ldots, x_n) \quad \text{subject to}$$
$$g_1(x_1, \ldots, x_n) \le b_1, \; \ldots, \; g_k(x_1, \ldots, x_n) \le b_k,$$
$$h_1(x_1, \ldots, x_n) = c_1, \; \ldots, \; h_m(x_1, \ldots, x_n) = c_m.$$

The function $f(x)$ is called the objective function, each $g_i(x)$ is called an inequality constraint, and each $h_j(x)$ is called an equality constraint. Notice that this problem differs from the regular unconstrained optimization problem: instead of finding the extrema of $f(x)$ over its entire domain, we find the extrema of $f(x)$ only at points which satisfy the constraints.

Example. Maximize $f(x) = x^2$ subject to $0 \le x \le 1$.

Solution: We know that $f(x)$ is strictly monotonically increasing over the domain, therefore the maximum (if it exists) must lie at the largest number in the domain. Since we are optimizing over a compact set, the point $x = 1$ is the largest number in the domain, and therefore it is the maximum.

This problem was easy because we could visualize the graph of $f(x)$ in our minds and see that it was strictly monotonically increasing over the domain. However, we now develop a method to find constrained extrema of functions even when we cannot picture them in our minds.

Equality Constraints

One Constraint

Consider a simple optimization problem with only one constraint:

$$\max_{x \in \mathbb{R}^n} f(x_1, \ldots, x_n) \quad \text{subject to} \quad h(x_1, \ldots, x_n) = c.$$

Now draw the level sets of the function $f(x_1, \ldots, x_n)$. Since we might not be able to achieve the unconstrained extrema of the function due to our constraint, we seek the value of $x$ which gets us onto the highest level curve of $f(x)$ while remaining on the constraint curve $h(x) = c$. Notice that at this point the constraint curve is tangent to a level curve of $f(x)$. Call the point which solves the optimization problem $x^*$. Since at $x^*$ the level curve of $f(x)$ is tangent to the curve $h(x) = c$, the gradient of $f$ at $x^*$ must point in the same direction as the gradient of $h$ at $x^*$:

$$\nabla f(x^*) = \lambda \nabla h(x^*),$$

where $\lambda$ is merely a constant. Expanding this formula coordinate by coordinate, we have

$$\frac{\partial f}{\partial x_1} = \lambda \frac{\partial h}{\partial x_1}, \; \ldots, \; \frac{\partial f}{\partial x_n} = \lambda \frac{\partial h}{\partial x_n},$$

or equivalently $\frac{\partial f}{\partial x_i} - \lambda \frac{\partial h}{\partial x_i} = 0$ for each $i$.
Integrating each of these equations gives

$$f - \lambda h - C = 0$$

for some constant of integration $C$.
Therefore, when we differentiate this equation, we recover the stated condition on the gradients. Since this equation yields the proper first-order conditions for any value of $C$, it yields the proper first-order conditions when $C = \lambda c$. Call the resulting function the Lagrangean:

$$L(x_1, \ldots, x_n, \lambda) = f(x_1, \ldots, x_n) - \lambda \left[ h(x_1, \ldots, x_n) - c \right].$$

Taking the partial derivatives of this function yields the first-order conditions for an extremum. The reason we choose $C = \lambda c$ is so that taking the derivative of the Lagrangean with respect to $\lambda$ yields our equality constraint.

Example. Maximize $f(x_1, x_2) = x_1 x_2$ subject to $h(x_1, x_2) \equiv x_1 + 4x_2 = 16$.

Solution: Form the Lagrangean

$$L(x_1, x_2, \lambda) = x_1 x_2 - \lambda (x_1 + 4x_2 - 16).$$

The first-order conditions are

$$\frac{\partial L}{\partial x_1} = x_2 - \lambda = 0, \quad \frac{\partial L}{\partial x_2} = x_1 - 4\lambda = 0, \quad \frac{\partial L}{\partial \lambda} = -(x_1 + 4x_2 - 16) = 0.$$

From the first two equations we have $x_2 = \lambda$ and $x_1 = 4\lambda$, so $x_1 = 4x_2$. Plugging this into the last equation we have $x_2 = 2$, which in turn implies that $x_1 = 8$ and $\lambda = 2$. This states that the only candidate for the solution is $(x_1, x_2, \lambda) = (8, 2, 2)$.

Remember that points obtained using this method may or may not be a maximum or minimum, since the first-order conditions are only necessary conditions. They only give us candidate solutions.

There is another, more subtle way that this process may fail, however. Consider the case where $\nabla h(x^*) = 0$, or in other words, the point which maximizes $f(x)$ is also a critical point of $h(x)$. Remember our necessary condition for a maximum,

$$\nabla f(x^*) = \lambda \nabla h(x^*).$$

Since $\nabla h(x^*) = 0$, this implies that $\nabla f(x^*) = 0$. However, this is the necessary condition for an unconstrained optimization problem, not a constrained one!
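The first worked example above (maximize $f(x_1, x_2) = x_1 x_2$ subject to $x_1 + 4x_2 = 16$) can be sanity-checked numerically. The sketch below is plain Python with no outside libraries; the grid comparison at the end is just an illustration, not part of the Lagrangean method itself:

```python
# First-order conditions of L = x1*x2 - lam*(x1 + 4*x2 - 16),
# from the worked example: candidate (x1, x2, lam) = (8, 2, 2).

def foc(x1, x2, lam):
    """Return (dL/dx1, dL/dx2, dL/dlam) evaluated at the point."""
    return (x2 - lam,             # dL/dx1 = x2 - lam
            x1 - 4 * lam,         # dL/dx2 = x1 - 4*lam
            -(x1 + 4 * x2 - 16))  # dL/dlam recovers the constraint

assert foc(8, 2, 2) == (0, 0, 0)  # the candidate satisfies all FOCs

# Along the constraint, parametrize x2 = t and x1 = 16 - 4*t, and
# confirm that t = 2 maximizes f = x1*x2 on a coarse grid over [0, 4].
def f(t):
    return (16 - 4 * t) * t

assert all(f(2) >= f(t / 10) for t in range(0, 41))
print("candidate (8, 2, 2) verified")
```

Note that the grid comparison only shows that $(8, 2)$ beats nearby feasible points; it does not replace checking second-order conditions.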
In effect, when $\nabla h(x^*) = 0$, the constraint is no longer taken into account in the problem, and therefore we arrive at the wrong solution.

Many Constraints

Consider the problem

$$\max f(x_1, \ldots, x_n) \quad \text{subject to} \quad h_1(x_1, \ldots, x_n) = c_1, \; \ldots, \; h_m(x_1, \ldots, x_n) = c_m.$$

Let's first talk about how this might fail. As we saw for one constraint, if $\nabla h(x^*) = 0$, then the constraint drops out of the problem. Now consider the Jacobian matrix, the matrix whose rows are the gradients of the different $h_i(x^*)$:

$$Dh(x^*) = \begin{pmatrix} \nabla h_1(x^*) \\ \vdots \\ \nabla h_m(x^*) \end{pmatrix} = \begin{pmatrix} \frac{\partial h_1(x^*)}{\partial x_1} & \cdots & \frac{\partial h_1(x^*)}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial h_m(x^*)}{\partial x_1} & \cdots & \frac{\partial h_m(x^*)}{\partial x_n} \end{pmatrix}.$$

Notice that if any of the $\nabla h_i(x^*)$'s is zero, then that constraint will not be taken into account in the analysis. Also, there will be a row of zeros in the Jacobian, and therefore the Jacobian will not be of full rank. The generalization of the condition $\nabla h(x^*) \neq 0$ from the case $m = 1$ is that the Jacobian matrix must
be of full rank. Otherwise, one of the constraints is not being taken into account, and the analysis fails. We call this condition the non-degenerate constraint qualification (NDCQ).

The Lagrangean for the multi-constraint optimization problem is

$$L(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_m) = f(x_1, \ldots, x_n) - \sum_{i=1}^{m} \lambda_i \left[ h_i(x_1, \ldots, x_n) - c_i \right],$$

and therefore the necessary conditions for a maximum are

$$\frac{\partial L}{\partial x_1} = 0, \; \ldots, \; \frac{\partial L}{\partial x_n} = 0, \quad \frac{\partial L}{\partial \lambda_1} = 0, \; \ldots, \; \frac{\partial L}{\partial \lambda_m} = 0.$$

Example. Maximize $f(x, y, z) = xyz$ subject to $h_1(x, y, z) \equiv x^2 + y^2 = 1$ and $h_2(x, y, z) \equiv x + z = 1$.

Solution: First let us form the Jacobian matrix

$$Dh(x, y, z) = \begin{pmatrix} 2x & 2y & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$

Notice that this matrix fails to have full rank only if $x = y = 0$. However, if this is the case, then our first constraint fails to hold. Therefore, the matrix has full rank at all points satisfying the constraints, and so we don't need to worry about the NDCQ. Form the Lagrangean

$$L(x, y, z, \lambda_1, \lambda_2) = xyz - \lambda_1 \left( x^2 + y^2 - 1 \right) - \lambda_2 \left( x + z - 1 \right).$$

The first-order conditions are

$$\frac{\partial L}{\partial x} = yz - 2\lambda_1 x - \lambda_2 = 0$$
$$\frac{\partial L}{\partial y} = xz - 2\lambda_1 y = 0$$
$$\frac{\partial L}{\partial z} = xy - \lambda_2 = 0$$
$$\frac{\partial L}{\partial \lambda_1} = -(x^2 + y^2 - 1) = 0$$
$$\frac{\partial L}{\partial \lambda_2} = -(x + z - 1) = 0$$

We can solve the second and third equations for $\lambda_1$ and $\lambda_2$,

$$\lambda_1 = \frac{xz}{2y}, \quad \lambda_2 = xy,$$

and plug these into the first equation to get

$$yz - \frac{x^2 z}{y} - xy = 0.$$

Multiply both sides by $y$:

$$y^2 z - x^2 z - x y^2 = 0.$$

Now we want to solve the two constraints for $y$ and $z$ in terms of $x$ (so $y^2 = 1 - x^2$ and $z = 1 - x$), plug them into this equation, and get one equation in terms of $x$:

$$(1 - x^2)(1 - x) - x^2 (1 - x) - x(1 - x^2) = 0$$
$$(1 - x)\left[ (1 - x^2) - x^2 - x(1 + x) \right] = 0$$
$$(1 - x)\left[ 1 - 3x^2 - x \right] = 0$$

This yields

$$x \in \left\{ 1, \; \frac{-1 + \sqrt{13}}{6}, \; \frac{-1 - \sqrt{13}}{6} \right\}.$$

Let's analyze $x = 1$ first. From the second constraint we have $z = 0$, and from the first constraint we have $y = 0$ (since $x^2 + y^2 = 1$). Note, however, that we divided by $y$ when solving for $\lambda_1$, so this case must be checked directly against the first-order conditions: with $y = z = 0$ they are all satisfied when $\lambda_1 = \lambda_2 = 0$, so $(1, 0, 0)$ is a candidate as well, though it gives the objective value $f(1, 0, 0) = 0$. Plugging in the other two values of $x$, we get four more candidate solutions:

$$x = 0.4343, \quad y = \pm 0.9008, \quad z = 0.5657$$
$$x = -0.7676, \quad y = \pm 0.6409, \quad z = 1.7676$$

Inequality Constraints

One Constraint

Consider the problem

$$\max f(x_1, \ldots, x_n) \quad \text{subject to} \quad g_1(x_1, \ldots, x_n) \le b_1.$$

In this case the solution is not constrained to a curve, but merely bounded by it. Solving a constrained optimization problem with inequality constraints is the same as solving one with equality constraints, but with more conditions. In order to understand the new conditions, imagine the graph of the level sets which we talked about before. Instead of being constrained to the curve $g(x) = b$, the domain is now bounded by it. However, the boundary is still the same curve as before, and there is still a point where the boundary is tangent to the highest attainable level set of $f(x)$. The question now is whether the constraint is binding or not binding.

Case 1: Binding. This is the case where the unconstrained extremum lies outside of the domain; in other words, the inequality constrains us from reaching the extremum of $f(x)$. In this case, the gradient of $f(x)$ points in the steepest direction up the graph, and the gradient of $g(x)$ points toward the set $g(x) \ge b$ (since it points in the direction of increasing $g(x)$). Therefore, if the constraint is binding, then $\nabla g(x)$ points in the same direction as $\nabla f(x)$, which implies that $\lambda \ge 0$.

Case 2: Not Binding. This is the case where the unconstrained extremum lies inside the domain; in other words, the inequality does not constrain us from reaching the extremum of $f(x)$. Remember again that the gradient of $f(x)$ points in the steepest direction up the graph, and the gradient of $g(x)$ points toward the set $g(x) \ge b$
(since it points in the direction of increasing $g(x)$). This time, however, the gradient of $g(x)$ along the boundary points away from the gradient of $f(x)$: having passed the optimum, the gradient of $f(x)$ now points back into the interior of the region. The constraint thus plays no role at the optimum. The extremum is an interior, unconstrained one, where $\nabla f(x^*) = 0$, so the condition $\nabla f(x^*) = \lambda \nabla g(x^*)$ holds with $\lambda = 0$.

As we can see, whether or not the constraint binds, the Lagrange multiplier must always be greater than or equal to zero. Therefore, we have a new set of conditions:

$$\lambda_i \ge 0 \quad \text{for all } i.$$

If the constraint binds, then the solution will lie somewhere on the constraint. Therefore, we can replace the inequality constraint with an equality constraint:

$$g_1(x_1, \ldots, x_n) = b_1.$$
This implies that $g_1(x_1, \ldots, x_n) - b_1 = 0$. If the constraint is not binding, then there is no use in having it in the analysis; notice that if $\lambda = 0$, then the constraint drops out just as we wish. Therefore, we have a new set of conditions, called the complementary slackness conditions:

$$\left[ g_1(x_1, \ldots, x_n) - b_1 \right] \lambda = 0.$$

This works because if the constraint is binding, then $g_1(x_1, \ldots, x_n) - b_1 = 0$, and if the constraint is not binding, then $\lambda = 0$.

Many Constraints

Consider the problem:

$$\max f(x_1, \ldots, x_n) \quad \text{subject to} \quad g_1(x_1, \ldots, x_n) \le b_1, \; \ldots, \; g_m(x_1, \ldots, x_n) \le b_m.$$

In order to understand the new NDCQ, we must realize that if a constraint does not bind, we don't care if it drops out of the equation. The point of the NDCQ was to ensure that binding constraints do not drop out. Therefore, the NDCQ with inequality constraints is the same as with equality constraints, except that we only care about the Jacobian matrix of the binding constraints, i.e., the Jacobian of the constraints that hold with equality at the candidate point.

The following first-order conditions tell us the candidate points for extrema:

$$\frac{\partial L}{\partial x_1} = 0, \; \ldots, \; \frac{\partial L}{\partial x_n} = 0$$
$$g_1(x_1, \ldots, x_n) \le b_1, \; \ldots, \; g_m(x_1, \ldots, x_n) \le b_m$$
$$\lambda_1 \ge 0, \; \ldots, \; \lambda_m \ge 0$$
$$\left[ g_1(x_1, \ldots, x_n) - b_1 \right] \lambda_1 = 0, \; \ldots, \; \left[ g_m(x_1, \ldots, x_n) - b_m \right] \lambda_m = 0$$

Homework

1. In chapter 18 of Simon and Blume, numbers 3, 6, 7, 10, 11, and 12.
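The complementary slackness logic can be made concrete by enumerating the binding/non-binding cases on a small made-up problem (my own example, not from the notes): maximize $f(x) = -(x-3)^2$ subject to $x \le 2$. A minimal sketch in plain Python:

```python
# Hypothetical illustration of complementary slackness:
# maximize f(x) = -(x - 3)**2 subject to g(x) = x <= 2.
# Lagrangean: L = -(x - 3)**2 - lam*(x - 2).
# FOC: dL/dx = -2*(x - 3) - lam = 0, with lam >= 0 and lam*(x - 2) = 0.

candidates = []

# Case 1: constraint not binding -> lam = 0, so the FOC gives x = 3.
x, lam = 3.0, 0.0
if x <= 2:                 # feasibility check fails here: 3 > 2
    candidates.append((x, lam))

# Case 2: constraint binding -> x = 2, so the FOC gives lam = -2*(x - 3).
x = 2.0
lam = -2 * (x - 3)         # lam = 2
if lam >= 0:               # sign condition on the multiplier holds
    candidates.append((x, lam))

assert candidates == [(2.0, 2.0)]
print("solution: x = %g with multiplier lam = %g" % candidates[0])
```

Case enumeration like this grows as $2^m$ for $m$ constraints, but it is exactly what complementary slackness licenses: at a candidate point, each constraint either holds with equality or has a zero multiplier.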