Lecture: Cost functional
Consider functions $L(t,x,u)$ and $K(t,x)$, where $L : \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ and $K : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ are sufficiently regular. (To be consistent with calculus of variations we would have to take $L(t,x,\dot x)$, not $L(t,x,u)$. However, we can always go from $(t,x,\dot x)$ to $(t,x,u)$ by plugging in $\dot x = f(t,x,u)$, but not necessarily back, so the latter choice of variables is preferable.) Unlike with $f$ and $u$, there are no a priori conditions here. All derivatives that arise will be assumed to exist, so depending on the analysis method we will see at the end what differentiability assumptions on $K$ and $L$ we need.

$$J := \int_{t_0}^{t_f} L(t, x(t), u(t))\, dt + K(t_f, x_f)$$

Here $J = J(u)$ or, in more detail, $J(t_0, x_0, u, t_f, x_f)$, where:

- $t_0$ and $x_0 = x(t_0)$ are the initial time and state;
- $t_f$ and $x_f = x(t_f)$ are the final time and state ($x_f$ depends on $u$);
- $L$ is the running cost (or Lagrangian);
- $K$ is the terminal cost.

The above form of $J$ is the Bolza form. Two special cases:

- Lagrange form: $J = \int_{t_0}^{t_f} L(t,x,u)\, dt$ (no terminal cost). This comes from calculus of variations (the Lagrange problem).
- Mayer form: $J = K(t_f, x_f)$ (no running cost).

We can pass back and forth between these.

Mayer → Lagrange:

$$K(t_f, x_f) = \underbrace{K(t_0, x_0)}_{\text{constant, can ignore}} + \int_{t_0}^{t_f} \frac{d}{dt} K(t, x(t))\, dt = K(t_0, x_0) + \int_{t_0}^{t_f} \big( K_t + K_x \cdot f(t,x,u) \big)\, dt$$

Lagrange → Mayer: introduce an additional state $x^0$ with

$$\dot x^0 = L(t,x,u), \qquad x^0(t_0) = 0;$$

then

$$\int_{t_0}^{t_f} L(t,x,u)\, dt = x^0(t_f),$$

which is a terminal cost. (The value of $x^0(t_0)$ actually doesn't matter; it's just a constant and we can subtract it.)
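To make the Lagrange-to-Mayer reduction concrete, here is a minimal numerical sketch (my illustration, not part of the notes): append the extra state $x^0$ to the dynamics and read off $\int_{t_0}^{t_f} L\, dt$ as the terminal value $x^0(t_f)$. The dynamics $f$, cost $L$, and control below are arbitrary placeholders.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative placeholders (not from the notes): scalar dynamics and cost.
def f(t, x, u):
    return -x + u            # xdot = f(t, x, u)

def L(t, x, u):
    return x**2 + u**2       # running cost

def u_of_t(t):
    return np.sin(t)         # some fixed admissible control

def augmented_rhs(t, z):
    # z = (x, x0): x0 is the extra state with x0' = L and x0(t0) = 0,
    # so that x0(tf) equals the integral of L from t0 to tf.
    x, x0 = z
    u = u_of_t(t)
    return [f(t, x, u), L(t, x, u)]

t0, tf = 0.0, 2.0
sol = solve_ivp(augmented_rhs, (t0, tf), [1.0, 0.0], rtol=1e-9, atol=1e-12)
print("x0(tf) =", sol.y[1, -1])   # the Lagrange cost, read off as a terminal (Mayer) cost
```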
Note that a similar trick can be used to eliminate time-dependence (of $f$ and $L$): let $x_{n+1} := t$, so $\dot x_{n+1} = 1$, and treat it as an additional state. (In this case we need $f$ to be $C^1$ in $t$ for existence and uniqueness of solutions.)

Target set

$(t_0, x_0)$ are given, but what about $(t_f, x_f)$? One or both can be free or fixed, or may need to belong to some set. All these possibilities are captured by introducing a target set $S \subset [t_0, \infty) \times \mathbb{R}^n$ (or use $\mathbb{R}$ for time) such that $(t_f, x_f) \in S$.

Examples (see [A-F, Sect. 5-12]):

- $S = [t_0, \infty) \times \{x_1\}$, where $x_1$ is fixed. This gives a fixed-endpoint, free-time problem.
- $S = \{t_1\} \times \mathbb{R}^n$, where $t_1$ is fixed. This gives a free-endpoint, fixed-time problem.
- $S = [t_0, \infty) \times S_1$, where $S_1$ is a surface (manifold) in $\mathbb{R}^n$. Note: this is what we get if we start with a free-endpoint, fixed-time problem and consider the auxiliary state $x_{n+1}$ as before, with $\dot x_{n+1} = 1$; then $S_1 \subset \mathbb{R}^{n+1}$ is $\{x : x_{n+1} = t_1\}$.
- More generally, $S = T \times S_1$, where $S_1$ is as in the previous item and $T$ is some subset of $[t_0, \infty)$. As a special case of this, we have $S = \{t_1\} \times \{x_1\}$ (the fixed-endpoint, fixed-time problem, the most restrictive).
- $S = \{(t, g(t)) : t \in [t_0, \infty)\}$ for some $g : \mathbb{R} \to \mathbb{R}^n$. This is trajectory tracking (a moving target). Or $g(t)$ can be generalized to a set, i.e., $g$ can be set-valued.

Finally, the Problem: find $u(\cdot)$ that minimizes $J$.
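As a tiny side illustration (mine, not from the notes), each of these target sets can be encoded as a membership predicate on $(t, x)$; all names below are hypothetical.

```python
import numpy as np

# Target sets as membership predicates S(t, x) -> bool (illustrative encodings).
def fixed_endpoint_free_time(x1, t0):
    return lambda t, x: t >= t0 and np.allclose(x, x1)

def free_endpoint_fixed_time(t1):
    return lambda t, x: np.isclose(t, t1)

def surface_free_time(h, t0):
    # S1 = {x : h(x) = 0} for a function h: R^n -> R^(n-k)
    return lambda t, x: t >= t0 and np.allclose(h(x), 0.0)

def moving_target(g):
    return lambda t, x: np.allclose(x, g(t))

# Example: has (t, x) hit the fixed endpoint x1 = (0, 0)?
S = fixed_endpoint_free_time(np.zeros(2), t0=0.0)
print(S(1.5, np.array([0.0, 0.0])))   # True
```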
3.4 A variational approach to the optimal control problem

We'll first try to see how far we can get with variational techniques. The problems we'll encounter will motivate us to modify our approach, and will lead to the Maximum Principle. We'll actually also arrive at the correct statement of the Maximum Principle.

References: [A-F, Sect. 5-7, 5-9, 5-10] (5-8 is sufficient conditions); [ECE 515 Class Notes, Sect. 11.2].

Free-endpoint, fixed-time case

$$\dot x = f(t,x,u), \qquad x(t_0) = x_0,$$

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ (unconstrained),

$$J(u) = \int_{t_0}^{t_1} L(t,x,u)\, dt + K(x(t_1)) \tag{3.2}$$

(no direct dependence of the terminal cost on $t_1$), $t_1$ fixed, $x(t_1)$ free ($S = \{t_1\} \times \mathbb{R}^n$).

Let $u^*(t)$ be an optimal control (assumed to exist), and let $x^*(t)$ be the corresponding optimal trajectory. Goal: derive necessary conditions. To simplify matters, assume that $u^*$ is a global minimum:

$$J(u^*) \le J(u)$$

for all admissible (i.e., piecewise continuous) $u$.

Begin optional: If we want to work with local optimality, then it can be shown that the right topology for which the Maximum Principle works is uniform convergence of $x$ and $L^1$ convergence of $u$; this is Pontryagin's topology [Milyutin and Osmolovskii, Calculus of Variations and Optimal Control, AMS, 1998, p. 23]. On the other hand, the variational approach works with respect to the $0$-norm topology on $U$. This is close to weak minima: if $\dot x = u$ then this is exactly a weak minimum, with perturbations $(\alpha\eta, \alpha\xi)$ for $\alpha$ small. End optional.

Right now, we ignore these distinctions as we work with a global minimum.

Perturbation: We cannot perturb $x^*(t)$ directly, only through the control. This is new, compared to calculus of variations with constraints, where we had perturbations of the curve subject to $\delta C(\eta) = 0$, $C$ the constraint. So, consider the control

$$u = u^* + \alpha\eta.$$

It gives the trajectory

$$x = x^* + \alpha\xi,$$

where we have $\xi(t_0) = 0$. How are $\eta$ and $\xi$ related? On one hand,

$$\dot x = \dot x^* + \alpha\dot\xi = f(t, x^*, u^*) + \alpha\dot\xi.$$

On the other hand, using a Taylor expansion with respect to $\alpha$ around $\alpha = 0$,

$$\dot x = f(t,x,u) = f(t, x^* + \alpha\xi, u^* + \alpha\eta) = f(t, x^*, u^*) + f_x|_*\, \xi\, \alpha + f_u|_*\, \eta\, \alpha + o(\alpha),$$

where $|_*$ always means evaluation along the optimal trajectory. Hence

$$\dot\xi = f_x|_*\, \xi + f_u|_*\, \eta + \frac{o(\alpha)}{\alpha}. \tag{3.3}$$

Note: this is a forced linear time-varying system, of the form

$$\dot\xi = A(t)\xi + B(t)\eta + \text{h.o.t.} \tag{3.4}$$

($f_x$ is a matrix!) This is nothing but the linearization of the original system around the optimal trajectory $x^*(t)$.
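As a quick numerical sanity check of (3.3)-(3.4) (my own sketch, not from the notes; the dynamics and signals below are placeholders), integrate the nonlinear system under $u^* + \alpha\eta$, integrate the linearized system for $\xi$, and observe that the residual $x - (x^* + \alpha\xi)$ shrinks like $\alpha^2$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder dynamics xdot = f(t, x, u), with its Jacobians along a trajectory.
f = lambda t, x, u: np.array([-x[0] ** 3 + u])
fx = lambda t, x, u: np.array([[-3.0 * x[0] ** 2]])   # A(t) = f_x evaluated on x*
fu = lambda t, x, u: np.array([[1.0]])                # B(t) = f_u evaluated on x*
u_star = lambda t: np.cos(t)                          # nominal control u*
eta = lambda t: np.sin(3 * t)                         # perturbation direction

t0, t1, x0 = 0.0, 2.0, np.array([0.5])
ts = np.linspace(t0, t1, 201)

# Nominal trajectory x*(t).
xs = solve_ivp(lambda t, x: f(t, x, u_star(t)), (t0, t1), x0,
               t_eval=ts, rtol=1e-10, atol=1e-12).y

def perturbed(alpha):
    """Trajectory under the perturbed control u = u* + alpha * eta."""
    return solve_ivp(lambda t, x: f(t, x, u_star(t) + alpha * eta(t)),
                     (t0, t1), x0, t_eval=ts, rtol=1e-10, atol=1e-12).y

# First-order perturbation xi from (3.4): xi' = A(t) xi + B(t) eta, xi(t0) = 0.
def xi_rhs(t, xi):
    x = np.interp(t, ts, xs[0])                       # x*(t) by interpolation
    return fx(t, [x], u_star(t)) @ xi + fu(t, [x], u_star(t)) @ np.atleast_1d(eta(t))

xi = solve_ivp(xi_rhs, (t0, t1), [0.0], t_eval=ts, rtol=1e-10, atol=1e-12).y

for alpha in [0.1, 0.01, 0.001]:
    err = np.max(np.abs(perturbed(alpha) - (xs + alpha * xi)))
    print(f"alpha={alpha}: max residual = {err:.2e}")  # shrinks like alpha**2
```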
We have $J$ given by (3.2), and we have the constraint $\dot x(t) - f(t,x,u) = 0$. Recall Lagrange's idea for treating non-integral constraints: for an arbitrary function $p(t)$, rewrite the cost as

$$J(u) = K(x(t_1)) + \int_{t_0}^{t_1} \big( L(t,x,u) + p(t) \cdot (\dot x - f(t,x,u)) \big)\, dt.$$

Here $p(t)$ is a Lagrange multiplier function (what we called $\lambda(x)$ in calculus of variations; $\lambda$ is also often used), unspecified for now (it can be anything). (The relation with Lagrange multipliers will become clearer in Exercise 10 below.) Also note that $p$, $f$, $\dot x$ are $n$-vectors, so we are dealing with inner products. We will write this more explicitly from now on: $\langle p, \dot x - f(t,x,u) \rangle$.

Let's introduce the Hamiltonian:

$$H(t,x,u,p) := \langle p, f(t,x,u) \rangle - L(t,x,u).$$

In calculus of variations we had $H = \langle p, y' \rangle - L$, which matches the current definition (since in the new notation, $y' \leftrightarrow \dot x$). We have

$$J(u) = K(x(t_1)) + \int_{t_0}^{t_1} \big( \langle p, \dot x \rangle - H(t,x,u,p) \big)\, dt.$$

Similarly,

$$J(u^*) = K(x^*(t_1)) + \int_{t_0}^{t_1} \big( \langle p, \dot x^* \rangle - H(t,x^*,u^*,p) \big)\, dt.$$

We know: $J(u^*) \le J(u)$. Let us analyze $J(u) - J(u^*)$ by Taylor expansion:

$$K(x(t_1)) - K(x^*(t_1)) \approx \langle K_x(x^*(t_1)), x(t_1) - x^*(t_1) \rangle = \langle K_x(x^*(t_1)), \xi(t_1) \rangle\, \alpha.$$

Here and below, $\approx$ stands for equality up to $o(\alpha)$.

$$H(t,x,u,p) - H(t,x^*,u^*,p) \approx \langle H_x|_*, \xi \rangle\, \alpha + \langle H_u|_*, \eta \rangle\, \alpha.$$

Hence,

$$J(u) - J(u^*) \approx \alpha \langle K_x(x^*(t_1)), \xi(t_1) \rangle + \alpha \int_{t_0}^{t_1} \big( \langle p, \dot\xi \rangle - \langle H_x|_*, \xi \rangle - \langle H_u|_*, \eta \rangle \big)\, dt.$$

We have just computed the first variation $\delta J|_{u^*}(\eta)$! It is all the first-order terms in $\alpha$. As in calculus of variations, let us simplify the term containing $\dot\xi$ by integrating it by parts:

$$\int_{t_0}^{t_1} \langle p, \dot\xi \rangle\, dt = \langle p, \xi \rangle \Big|_{t_0}^{t_1} - \int_{t_0}^{t_1} \langle \dot p, \xi \rangle\, dt.$$

Since $x = x^* + \alpha\xi$ and $x(t_0)$ is fixed, we must have $\xi(t_0) = 0$. So,

$$\int_{t_0}^{t_1} \langle p, \dot\xi \rangle\, dt = \langle p(t_1), \xi(t_1) \rangle - \int_{t_0}^{t_1} \langle \dot p, \xi \rangle\, dt,$$

and we see that

$$\delta J|_{u^*}(\eta) = \langle K_x(x^*(t_1)) + p(t_1), \xi(t_1) \rangle - \int_{t_0}^{t_1} \big( \langle H_x|_* + \dot p, \xi \rangle + \langle H_u|_*, \eta \rangle \big)\, dt.$$
The familiar first-order necessary condition for optimality (Section 1.3.2) says that we must have $\delta J|_{u^*}(\eta) = 0$. But this makes the choice of $p(\cdot)$ clear:

$$\dot p = -H_x|_* = -(f_x|_*)^T p + L_x|_*$$

(recalling that $H = \langle p, f \rangle - L$), with the boundary condition

$$p(t_1) = -K_x(x^*(t_1)).$$

(If $K = 0$, i.e., no terminal cost, then $p(t_1) = 0$.) Note: we have a final (not initial) condition for $p$.

Note that the ODE for $p(t)$ is also linear (affine, or forced):

$$\dot p = -A^T(t)\, p + L_x|_*,$$

where $A$ is the same as in the ODE (3.4) for the state perturbation $\xi$. Two linear systems with matrices $A$ and $-A^T$ are called adjoint. For this reason, $p$ is called the adjoint vector; call it $p^*$ from now on. The significance of the concept of adjoint will become clear in the proof of the Maximum Principle ($\langle p, x \rangle = \text{const}$). $p^*$ is also called the costate, because it acts on $x$, $\dot x$ by $p^T \dot x$ (it is a dual object, a covector). It is also a (distributed) Lagrange multiplier.

Note that $x^*$ satisfies $\dot x = f = H_p$ (recall again that $H = \langle p, f \rangle - L$), so $x^*$ and $p^*$ satisfy Hamilton's canonical equations:

$$\dot x = H_p|_*, \qquad \dot p = -H_x|_*. \tag{3.5}$$

(From now on, $|_*$ means evaluation at $(x^*, p^*)$; only then are the equations valid.)

If we had a fixed-endpoint problem, then the term $\langle p(t_1), \xi(t_1) \rangle$ would disappear and the boundary condition on $p(t_1)$ would not be enforced; we would use the endpoint condition on $x$ instead. More on this case later (problems arise).

With the above definition of $p^*$, we are left with

$$\delta J|_{u^*}(\eta) = -\int_{t_0}^{t_1} \langle H_u|_*, \eta \rangle\, dt = 0 \qquad \forall \eta.$$

We know from calculus of variations ("DuBois-Reymond") that this means $H_u|_* = 0$, or

$$H_u(t, x^*(t), u^*(t), p^*(t)) \equiv 0 \tag{3.6}$$

(we already saw this in calculus of variations). This means that $H(u)$ has an extremum at $u^*$.
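To see (3.5)-(3.6) in action on a concrete instance (a sketch of mine, not from the notes), take the scalar system $\dot x = ax + bu$ with $L = qx^2 + ru^2$ and $K = 0$, form $H = pf - L$ symbolically, and grind out the adjoint equation and the stationarity condition:

```python
import sympy as sp

a, b, q, r = sp.symbols('a b q r', positive=True)
x, u, p = sp.symbols('x u p')

f = a * x + b * u                 # scalar dynamics xdot = f(x, u)
L = q * x**2 + r * u**2           # running cost
H = p * f - L                     # Hamiltonian H = <p, f> - L

# Canonical equations (3.5): xdot = H_p, pdot = -H_x.
print("xdot =", sp.diff(H, p))    # a*x + b*u
print("pdot =", -sp.diff(H, x))   # -a*p + 2*q*x

# Stationarity (3.6): H_u = 0 gives the candidate optimal control.
u_star = sp.solve(sp.diff(H, u), u)[0]
print("u* =", u_star)             # b*p/(2*r)

# Also note H_uu = -2r < 0 here, anticipating the second-variation discussion below.
print("H_uu =", sp.diff(H, u, 2))
```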
Lecture 12

Exercise 10 (due Mar 7) The purpose of this exercise is to recover the earlier conditions from calculus of variations, namely, the Euler-Lagrange equation and the Lagrange multiplier condition (for the case of multiple degrees of freedom and several constraints), from the preliminary necessary conditions for optimal controls derived so far, namely, (3.5) and (3.6).

a) The standard (unconstrained) problem from calculus of variations can be rewritten in the new optimal control language if we consider the control system $\dot x_i = u_i$, $i = 1, \dots, n$, together with the cost $J(u) = \int_{t_0}^{t_1} L(t, x, \dot x)\, dt$. Use (3.5) and (3.6) for this system to prove that $L$ must satisfy the Euler-Lagrange equation, $\frac{d}{dt} L_{\dot x_i} = L_{x_i}$, along the optimal trajectory.

b) Now suppose that we have $m < n$ non-integral constraints. We can model this by the control system

$$\dot x_i = f_i(t, x_1, \dots, x_n, u_1, \dots, u_{n-m}), \quad i = 1, \dots, m,$$
$$\dot x_{m+i} = u_i, \quad i = 1, \dots, n-m,$$

and the same cost as in a). Use (3.5) and (3.6) for this system to prove that there exist functions $\lambda_i(t)$, $i = 1, \dots, m$, such that the augmented Lagrangian $L + \sum_{i=1}^m \lambda_i(t)(\dot x_i - f_i)$ satisfies the Euler-Lagrange equation along the optimal trajectory.

Hint: part b) is a bit tricky. The Lagrange multipliers $\lambda_i$ are related to the components of the adjoint vector $p$, but they are not the same. Also note that the Lagrangian has the $\dot x_i$ as arguments (since the $L_{\dot x_i}$ appear in the Euler-Lagrange equation), whereas the Hamiltonian that you will construct should not (it is a function of $t, x, u, p$), so you will need to use the differential equations to put $H$ in the right form. A consequence of this, in part b), is that $x_i$ will appear inside $H$ in two different places.

In order to see whether the extremum of $H$ as a function of $u$ along the optimal trajectory is a minimum or a maximum, let us bring in the second variation. For $\delta^2 J|_{u^*}$, going back to the earlier expression for $J(u) - J(u^*)$ but this time expanding up to second-order terms in $\alpha$, we'll get second partial derivatives of $K$ outside the integral and second partial derivatives of $H$ inside the integral. Let us look at the latter (since we care about $H$). With $x = x^* + \alpha\xi$ and $u = u^* + \alpha\eta$,

$$H(t,x,u,p) - H(t,x^*,u^*,p) = \langle H_x|_*, \xi \rangle\, \alpha + \langle H_u|_*, \eta \rangle\, \alpha + \begin{pmatrix} \xi \\ \eta \end{pmatrix}^T \begin{pmatrix} H_{xx}|_* & H_{xu}|_* \\ H_{xu}|_*^T & H_{uu}|_* \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} \frac{\alpha^2}{2} + o(\alpha^2).$$

Which term in the above Hessian matrix is more important? As in calculus of variations (Section 2.6), we can see that $\eta$ is more important: we may have a large $\eta$ producing a small $\xi$, but not vice versa; see Figure 3.4. (In calculus of variations, $\eta'$ was more important than $\eta$.) Since $H$ appears in $J$ with a minus sign, the second-order necessary condition is

$$\eta^T H_{uu}|_*\, \eta \le 0 \qquad \forall \eta,$$

i.e., the Hessian of $H$ as a function of $u$ along the optimal trajectory should be negative semidefinite.
[Figure 3.4: Large η produces small ξ (sketch of a perturbation η(t) and the resulting ξ(t) generated by ξ̇ = Aξ + Bη).]

Therefore, the extremum of $H$ at $u^*$ is actually a maximum. We arrive at the following necessary condition for optimality: if $u^*(t)$ is an optimal control and $x^*(t)$ is the corresponding optimal trajectory, then there exists an adjoint vector $p^*(t)$ (we use the superscript $*$ to emphasize that it is associated with the optimal trajectory) such that:

1) $x^*(t)$, $p^*(t)$ satisfy the canonical equations (3.5), where $|_*$ means evaluation at $x^*, p^*, u^*$, with boundary conditions $x^*(t_0) = x_0$, $p^*(t_1) = -K_x(x^*(t_1))$.

2) $H(t, x^*(t), u, p^*(t))$, viewed as a function of $u$ for each $t$ (here we treat $u$ as a variable independent of $x$ and $p$, ignoring the system dynamics), has a (possibly local) maximum at $u = u^*(t)$, for each $t \in [t_0, t_1]$:

$$H(t, x^*(t), u^*(t), p^*(t)) \ge H(t, x^*(t), u, p^*(t)) \qquad \text{for } u \text{ near } u^*(t),\ \forall t.$$

Comments:

- Sign difference: we could define a different Hamiltonian and a different costate:

$$\hat p := -p, \qquad \hat H := \langle \hat p, f \rangle + L = -H.$$

So $\hat H$ would have a minimum at $u^*$, and $\hat p$ would still satisfy

$$\dot{\hat p} = -\dot p = H_x|_* = (f_x)^T p - L_x = -(f_x)^T \hat p - L_x = -\hat H_x|_*.$$

This alternative convention is the one followed in [A-F], and it might seem a bit more natural: $\min_u J \leftrightarrow \min_u \hat H$. But the convention we follow is consistent with the definition of the Hamiltonian in calculus of variations and its mechanical interpretation as the total energy of the system. Note that if the original problem is maximization, then it can also be reduced to minimization via $L \to -L$. So, whether we get a minimum principle or a maximum principle has nothing to do with max or min in the problem; it is determined just by the sign convention.

- Note that our calculus of variations definition of the momentum ($p = L_{\dot x}$) has disappeared and is replaced by the existential statement. This is more suitable for the Maximum Principle, but in the present variational setting it can be recovered: if $\dot x = u$ (the unconstrained case in calculus of variations) then $H = \langle p, u \rangle - L$ and we have

$$0 = H_u = p - L_u,$$

so $p = L_u$.
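The momentum recovery can also be checked symbolically (my sketch, not from the notes): for $\dot x = u$ and a generic $L(t, x, u)$, the stationarity condition returns $p = L_u$, and substituting it into the adjoint equation reproduces the Euler-Lagrange equation.

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')(t)
u = sp.Function('u')(t)
p = sp.Function('p')(t)

# Generic running cost L(t, x, u); with xdot = u this is L(t, x, xdot).
L = sp.Function('L')(t, x, u)
H = p * u - L                      # H = <p, f> - L with f = u

# H_u = 0 gives p = L_u (the momentum).
p_of_u = sp.solve(sp.Eq(sp.diff(H, u), 0), p)[0]
print("p =", p_of_u)               # the partial derivative of L with respect to u

# Adjoint equation pdot = -H_x reads pdot = L_x; with p = L_u this is
# d/dt L_u = L_x: the Euler-Lagrange equation, with u playing the role of xdot.
euler_lagrange = sp.Eq(sp.diff(p_of_u, t), -sp.diff(H, x))
print(euler_lagrange)
```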
Using the canonical equations, we have (note that in general $H = H(t,x,u,p)$, i.e., it explicitly depends on $t$):

$$\frac{dH}{dt} = H_t + \langle H_x, \dot x \rangle + \langle H_p, \dot p \rangle + \langle H_u, \dot u \rangle = H_t,$$

because the two middle terms cancel each other (by (3.5)) and the last term is zero (by (3.6)). (This assumes that $u^*$ is differentiable.) In particular, if the problem is time-invariant (no $t$ in $f$ and $L$), then $H$ is constant along the optimal trajectory.

Making suitable assumptions so that $\delta^2 J|_{u^*} > 0$ and dominates the $o(\alpha^2)$ terms, we can get sufficient conditions for optimality [A-F, Sect. 5-8]. They apply, e.g., to LQR (same section). The following problem is on this last point, and also illustrates the necessary conditions.

Exercise 11 (due Mar 7) Consider $\dot x = Ax + Bu$ (LTI or LTV, doesn't matter),

$$J = \int_{t_0}^{t_1} \big( x^T Q x + u^T R u \big)\, dt, \qquad Q \ge 0,\ R > 0 \quad (\text{or } Q(t) \ge 0,\ R(t) > 0\ \forall t),$$

free endpoint, fixed time as above.

a) Write down the above necessary conditions for this case, and find a formula for the control satisfying them.

b) By analyzing the second variation, show that this control is indeed optimal (in what sense?).
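For a feel of what Exercise 11 involves in its simplest instance, here is a numerical sketch of mine (a scalar LTI case with placeholder data; not a substitute for the exercise): eliminating $u$ via $H_u = 0$ turns the canonical equations (3.5) into a two-point boundary value problem, which we solve with scipy and then check that $H$ is constant along the extremal, as derived above.

```python
import numpy as np
from scipy.integrate import solve_bvp

a, b, q, r = -1.0, 1.0, 1.0, 0.5      # placeholder problem data
t0, t1, x0 = 0.0, 2.0, 1.0

def rhs(t, y):
    # y = (x, p); u has been eliminated via H_u = 0  =>  u = b*p / (2*r).
    x, p = y
    u = b * p / (2 * r)
    return np.vstack([a * x + b * u,          # xdot = H_p
                      -a * p + 2 * q * x])    # pdot = -H_x

def bc(ya, yb):
    # x(t0) = x0 (initial state), p(t1) = -K_x = 0 (no terminal cost).
    return np.array([ya[0] - x0, yb[1]])

ts = np.linspace(t0, t1, 50)
sol = solve_bvp(rhs, bc, ts, np.zeros((2, ts.size)))

x, p = sol.sol(ts)
u = b * p / (2 * r)
H = p * (a * x + b * u) - (q * x**2 + r * u**2)
print("H along the extremal (should be constant):", H.min(), H.max())
```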
Critique of the variational approach and preview of the Maximum Principle

(We'll completely trash it now.) Summary of the above approach: We took a perturbation

$$u = u^* + \alpha\eta, \tag{3.7}$$

where $\alpha$ is a small number. We showed (using the first variation) that

$$\int_{t_0}^{t_1} \langle H_u|_*, \eta \rangle\, dt = 0. \tag{3.8}$$

We deduced that $H_u|_* \equiv 0$, hence $H$ has an extremum at $u^*$. We then showed (using the second variation) that $H_{uu}|_* \le 0$, hence the extremum is a maximum.

The good thing about this approach is that it leads, quite quickly, to the correct form of the necessary conditions (canonical equations plus Hamiltonian maximization). However, it has several severe limitations, which we now discuss.

- Considering perturbations of the form (3.7) is OK for the unconstrained case $u \in \mathbb{R}^m$. However, in control applications we usually have constraints on the control range: $u \in U \subset \mathbb{R}^m$, a bounded set. $U$ may even be finite (think of car gears)! We'll see in the Maximum Principle that the statement that $H(u)$ has a maximum at $u^*$ (with all other arguments evaluated along the optimal trajectory) is still correct! In fact, $u^*$ is a global maximum of $H(u)$. But the variational approach doesn't let us show this, because $H_u = 0$ need not hold for $u^*$ on the boundary of $U$. ([A-F, Sect. 5-9])

- We treated the free-endpoint case, but in practice we want to hit some target set $S$. Take the simplest case: $S = \{t_1\} \times \{x_1\}$ (fixed endpoint). Then the control perturbation $\eta$ is no longer arbitrary. It gives a state perturbation $\xi$, and we must have $\xi(t_1) = 0$. We know that $\xi$ and $\eta$ are related by $\dot\xi = A\xi + B\eta$, so we need

$$0 = \xi(t_1) = \int_{t_0}^{t_1} \Phi(t_1, t) B(t) \eta(t)\, dt,$$

where $\Phi(\cdot,\cdot)$ is the transition matrix for $A(\cdot)$. The equation (3.8) holds only for these perturbations, and not for all $\eta$. So we cannot conclude that $H_u = 0$. (Lemma 5-1 in [A-F, p. 248] doesn't help.)

- Writing conditions like $H_u = 0$, we are assuming that $H$ is differentiable with respect to $u$. But recall that $H = \langle p, f \rangle - L$. So first, we need $L_u$ to exist. This rules out very reasonable cost functionals like

$$\int_{t_0}^{t_1} |u(t)|\, dt.$$

Second, we need $f_u$ to exist. Was this part of the standing assumptions needed for existence and uniqueness of solutions? No! So, the variational approach requires extra regularity assumptions on the system. The second variation analysis (needed to conclude that $u^*$ is a maximum) involves $H_{uu}$, which is even worse! We want to establish the maximization claim directly, not by working with derivatives.

- $u = u^* + \alpha\eta$, $x = x^* + \alpha\xi$: this means that we are allowing only small deviations in both $x$ and $u$. If we had $\dot x = u$, this would correspond exactly to the notion of weak minimum in calculus of variations; $\dot x = f(t,x,u)$ with $\|x - x^*\|_0$ and $\|u - u^*\|_0$ small is the appropriate generalization of that. However, we want optimality with respect to control perturbations that may be large, as long as the corresponding trajectories are close (as discussed some time ago); see Figure 3.5. This is important for bang-bang controls ($U$ bounded). We want to incorporate such perturbations; this would correspond to the notion of strong minima from calculus of variations (actually, we get minima with respect to Pontryagin's topology, as discussed before).
[Figure 3.5: Control perturbation. The perturbed control u(t) deviates from u*(t) by a large amount, but only on a small interval [τ, τ + ε], so the corresponding trajectories stay close.]

A richer perturbation family is crucial for obtaining a better result, i.e., the sharper necessary conditions that we want (the Maximum Principle). See a nice quote on p. 154 of Stochastic Controls by Zhou and Yong ("cutting the umbilical cord").

So, while the basic statement of the result (canonical equations plus Hamiltonian maximization) in the Maximum Principle will be very similar to the one obtained using the variational approach, we want to incorporate:

- constraints on the control set;
- constraints on the final state;
- a less restrictive notion of closeness of controls (this one is key for the others);
- fewer differentiability assumptions;

and for these reasons, the proof will be quite different and much more involved. The Maximum Principle was proved many years later than all the basic calculus of variations results, and it is a very nontrivial extension. The proof of the Maximum Principle is also much more geometric.
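The kind of perturbation Figure 3.5 depicts, often called a needle (or spike) variation, is easy to sketch numerically (my illustration, not from the notes): replace $u^*$ by an arbitrary admissible value on a short interval $[\tau, \tau + \varepsilon]$ and watch the terminal state move only by $O(\varepsilon)$, even though the control deviation is large.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder system and nominal control (illustrative only).
f = lambda t, x, u: -x + u
u_star = lambda t: 0.0

def needle(u_nom, tau, eps, v):
    """u equals u_nom except on [tau, tau+eps], where it takes the value v."""
    return lambda t: v if tau <= t < tau + eps else u_nom(t)

def terminal_state(u, t0=0.0, t1=2.0, x0=1.0):
    sol = solve_ivp(lambda t, x: f(t, x, u(t)), (t0, t1), [x0],
                    rtol=1e-10, atol=1e-12, max_step=1e-3)  # small steps resolve the needle
    return sol.y[0, -1]

x1_nominal = terminal_state(u_star)
for eps in [0.1, 0.01, 0.001]:
    x1 = terminal_state(needle(u_star, tau=1.0, eps=eps, v=5.0))  # large deviation, short time
    print(f"eps={eps}: |x(t1) - x*(t1)| = {abs(x1 - x1_nominal):.2e}")  # shrinks like eps
```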
Chapter 4: The Maximum Principle

4.1 The statement of the Maximum Principle

For the Maximum Principle, [A-F] is a primary reference. Supplementary references: the original book by Pontryagin et al.; Sussmann's lecture notes (a more rigorous and modern treatment). All standing assumptions on the general control problem still hold; we'll just make them more specific. We will state and prove the Maximum Principle for two special problems, and later discuss how all other cases can be deduced from this.

Special Problem 1

- System: $\dot x = f(x,u)$, $x(t_0) = x_0$ (given), no $t$-dependence. $f$ is $C^1$ in $x$ (no differentiability with respect to $u$!); $f$ and $f_x$ are $C^0$ in $u$ (these are the conditions for existence and uniqueness we had earlier). $x \in \mathbb{R}^n$, $u \in U$, some subset of $\mathbb{R}^m$. $U$ can be the whole $\mathbb{R}^m$; it is usually closed (Sussmann assumes this, but A-F and Pontryagin don't).
- $L = L(x,u)$, no $t$-dependence. Assume that $L$ satisfies the same regularity assumptions as $f$.
- $K = 0$: no terminal cost.
- Target set: $S = \mathbb{R} \times \{x_1\}$: a fixed-endpoint, free-time problem.
- $J(u) = \int_{t_0}^{t_1} L(x,u)\, dt$, where $t_1$ is the first time at which $x(t) = x_1$.

Theorem 4 (Maximum Principle for Special Problem 1; Theorem 5-5P in [A-F, p. 305]). Let $u^*(t)$ be an optimal control and let $x^*(t)$ be the corresponding trajectory. Then there exist a function $p^*(t)$ and a constant $p_0^* \le 0$, satisfying $(p_0^*, p^*(t)) \ne (0, 0)$ for all $t$, such that:

1) $x^*(t)$ and $p^*(t)$ satisfy the canonical equations

$$\dot x^* = H_p(x^*, u^*, p^*, p_0^*), \qquad \dot p^* = -H_x(x^*, u^*, p^*, p_0^*)$$

with boundary conditions $x^*(t_0) = x_0$, $x^*(t_1) = x_1$, where the Hamiltonian is defined as

$$H(x, u, p, p_0) := \langle p, f(x,u) \rangle + p_0 L(x,u).$$
2) $H(x^*(t), u^*(t), p^*(t), p_0^*) \ge H(x^*(t), u, p^*(t), p_0^*)$ for all $u \in U$ and all $t \in [t_0, t_1]$, i.e., $u^*$ is a global maximum of $H(x^*(t), \cdot, p^*(t))$ for each $t$.

3) $H(x^*(t), u^*(t), p^*(t), p_0^*) \equiv 0$ for all $t \in [t_0, t_1]$.

Comments:

- "$u^*$ is optimal": by this we mean that for any other control $u$ taking values in $U$ which transfers $x$ from $x_0$ at $t = t_0$ to $x_1$ at some (unspecified) time $t = t_1$, we have $J(u^*) \le J(u)$. (We could relax the assumption of a global minimum to a local minimum, but we would need to know what topology to use, so we won't do that. The proof will make the topology more clear.)

- $p_0^*$ is the abnormal multiplier. We saw its analog in calculus of variations; it handles pathological cases. If $p_0^* \ne 0$, then we can always normalize it to $p_0^* = -1$ and forget it, which is what one does when applying the Maximum Principle (almost always). For this reason, we will also suppress the argument $p_0$ of $H$ from now on.

- We saw in the variational approach (just a little while ago) that $H = \text{const}$, but $H \equiv 0$ may seem surprising. This is a feature of the free-time problem (earlier we were looking at the fixed-time case). More on this later.

How restrictive is Special Problem 1?

- Time-independence: no problem. If time-dependent, we can always set $x_{n+1} = t$ and eliminate $t$ (but we need $f$ to be $C^1$ with respect to $t$).
- No terminal cost: no problem. If we have a terminal cost, we can write

$$K(t_f, x_f) = K(t_0, x_0) + \int_{t_0}^{t_f} \big( K_t + K_x \cdot f(t,x,u) \big)\, dt$$

and define $\hat L := L + K_t + K_x \cdot f$.

- $S = \mathbb{R} \times \{x_1\}$: this is not very general; we need a more general target set.

Special Problem 2

Same as Special Problem 1, except $S = \mathbb{R} \times S_1$, where $S_1$ is a $k$-dimensional surface in $\mathbb{R}^n$, $1 \le k \le n$. This is defined as follows:

$$S_1 = \{x \in \mathbb{R}^n : h_1(x) = h_2(x) = \dots = h_{n-k}(x) = 0\}.$$

Here the $h_i$ are smooth functions from $\mathbb{R}^n$ to $\mathbb{R}$. We also assume that each $x \in S_1$ is a regular point, in the sense discussed before: the gradients $(h_i)_x$ are linearly independent at $x$. The special case $k = n$ means $S_1 = \mathbb{R}^n$.
Special Problem 1 can be thought of as the special case with $k = 0$.

Does the case $S_1 = \mathbb{R}^n$ (i.e., both $x_f$ and $t_f$ free) make sense? When do we stop? Why even start moving? Answer: yes, it makes sense, because $L$ may be negative. (Also if we have $K$, which translates to the same thing: moving decreases the cost.) When we say "cost," we may think implicitly that $L \ge 0$, but this is not necessarily true.

The difference between the next theorem and the previous one is only in the boundary conditions for the canonical system.

Theorem 5 (Maximum Principle for Special Problem 2; Theorem 5-6P in [A-F, p. 306]). Same as the Maximum Principle for Special Problem 1, except: the final boundary condition in the canonical equations is

$$x^*(t_1) \in S_1,$$

and there is one more necessary condition:

4) (Transversality condition) The vector $p^*(t_1)$ is orthogonal to the tangent space to $S_1$ at $x^*(t_1)$:

$$\langle p^*(t_1), d \rangle = 0 \qquad \forall d \in T_{x^*(t_1)} S_1.$$

We know this means that $p^*(t_1)$ is a linear combination of the gradients $(h_i)_x(x^*(t_1))$, since

$$T_{x^*(t_1)} S_1 = \{d \in \mathbb{R}^n : \langle (h_i)_x(x^*(t_1)), d \rangle = 0,\ i = 1, 2, \dots, n-k\}.$$

If $k = n$, then $S_1 = \mathbb{R}^n$. In this case, the transversality condition says (or can be extended to say) $p^*(t_1) = 0$ ($p^*$ must be orthogonal to all $d \in \mathbb{R}^n$, since there are no $h_i$).

Note that we still have $n$ boundary conditions at $t = t_1$: the $k$ degrees of freedom for $x^*(t_1) \in S_1$ correspond to $n-k$ equations, and the $n-k$ degrees of freedom for $p^*(t_1) \perp S_1$ correspond to $k$ equations/constraints. The freer the state, the less free the costate: each additional degree of freedom for $x^*(t_1)$ kills one degree of freedom for $p^*(t_1)$.
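As a small numerical illustration of the transversality condition (mine, not from the notes), take $S_1$ to be the unit circle in $\mathbb{R}^2$, defined by $h(x) = x_1^2 + x_2^2 - 1$ ($n = 2$, $k = 1$). Condition 4 then forces $p^*(t_1)$ to be a multiple of the gradient $h_x$, i.e., radial:

```python
import numpy as np

# S1 = {x in R^2 : h(x) = 0}, the unit circle (one function h, so n - k = 1).
h_x = lambda x: 2 * x                         # gradient of h(x) = x1^2 + x2^2 - 1

x_t1 = np.array([np.cos(0.7), np.sin(0.7)])   # a point on S1 (hypothetical x*(t1))
tangent = np.array([-x_t1[1], x_t1[0]])       # T_{x*(t1)} S1 is spanned by this vector

# A costate satisfying transversality must be a multiple of h_x(x*(t1)):
p_t1 = 3.0 * h_x(x_t1)                        # hypothetical p*(t1)
print("<p(t1), d> =", np.dot(p_t1, tangent))  # ~0: orthogonal to the tangent space

# A generic vector fails the condition:
print("<p_bad, d> =", np.dot(np.array([1.0, 0.0]), tangent))  # nonzero in general
```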