Min-Max Approximate Dynamic Programming
2011 IEEE International Symposium on Computer-Aided Control System Design (CACSD)
Part of 2011 IEEE Multi-Conference on Systems and Control
Denver, CO, USA, September 28-30, 2011

Min-Max Approximate Dynamic Programming

Brendan O'Donoghue, Yang Wang, Stephen Boyd

Abstract—In this paper we describe an approximate dynamic programming policy for a discrete-time dynamical system perturbed by noise. The approximate value function is the pointwise supremum of a family of lower bounds on the value function of the stochastic control problem; evaluating the control policy involves the solution of a min-max (or saddle-point) problem. For a quadratically constrained linear quadratic control problem, evaluating the policy amounts to solving a semidefinite program at each time step. By evaluating the policy, we obtain a lower bound on the value function, which can be used to evaluate performance: when the lower bound and the achieved performance of the policy are close, we can conclude that the policy is nearly optimal. We describe several numerical examples where this is indeed the case.

I. INTRODUCTION

We consider an infinite horizon stochastic control problem with discounted objective and full state information. In the general case this problem is difficult to solve, but exact solutions can be found for certain special cases. When the state and action spaces are finite, for example, the problem is readily solved. Another case for which the problem can be solved exactly is when the state and action spaces are finite-dimensional real vector spaces, the system dynamics are linear, the cost function is convex quadratic, and there are no constraints on the action or the state. In this case the optimal control policy is affine in the state variable, with coefficients that are readily computable [1], [2], [3], [4]. One general method for finding the optimal policy is to use dynamic programming (DP).
DP represents the optimal policy in terms of an optimization problem involving the value function of the stochastic control problem [3], [4], [5]. However, due to the curse of dimensionality, even representing the value function can be intractable when the state or action spaces are infinite, or, as a practical matter, when the number of states or actions is very large. Even when the value function can be represented, evaluating the optimal policy can still be intractable. As a result, approximate dynamic programming (ADP) has been developed as a general method for finding suboptimal control policies [6], [7], [8]. In ADP we substitute an approximate value function for the value function in the expression for the optimal policy. The goal is to choose the approximate value function (also known as a control-Lyapunov function) so that the performance of the resulting policy is close to optimal, or at least, good.

In this paper we develop a control policy which we call the min-max approximate dynamic programming policy. We first parameterize a family of lower bounds on the true value function; then we perform control, taking the pointwise supremum over this family as our approximate value function. The condition we use to parameterize our family of bounds is related to the linear programming approach to dynamic programming, which was first introduced in [9], and extended to approximate dynamic programming in [10], [11]. The basic idea is that any function which satisfies the Bellman inequality is a lower bound on the true value function [3], [4]. It was shown in [12] that a better lower bound can be attained via an iterated chain of Bellman inequalities, which we use here. We relate this chain of inequalities to a forward look-ahead in time, in a similar sense to that of model predictive control (MPC) [13], [14]. Indeed, many types of MPC can be thought of as performing min-max ADP with a particular (generally affine) family of underestimator functions.
In cases where we have a finite number of states and inputs, evaluating our policy requires solving a linear program at every time step. For problems with an infinite number of states and inputs, the method requires the solution of a semi-infinite linear program, with a finite number of variables but an infinite number of constraints (one for every state-control pair). For these problems we can obtain a tractable semidefinite program (SDP) approximation using methods such as the S-procedure [8], [12]. Evaluating our policy then requires solving an SDP at each time step [15], [16]. Much progress has been made in solving structured convex programs efficiently; see, e.g., [17], [18], [19], [20]. These fast optimization methods make our policies practical, even for large problems, or those requiring fast sampling rates.

II. STOCHASTIC CONTROL

Consider a discrete time-invariant dynamical system, with dynamics described by

    x_{t+1} = f(x_t, u_t, w_t),  t = 0, 1, ...,        (1)
where x_t ∈ X is the system state, u_t ∈ U is the control input or action, and w_t ∈ W is an exogenous noise or disturbance, at time t, and f : X × U × W → X is the state transition function. The noise terms w_t are independent identically distributed (IID), with known distribution. The initial state x_0 is also random with known distribution, and is independent of w_t. We consider causal, time-invariant state feedback control policies

    u_t = φ(x_t),  t = 0, 1, ...,

where φ : X → U is the control policy or state feedback function. The stage cost is given by ℓ : X × U → R ∪ {+∞}, where the infinite values of ℓ encode constraints on the states and inputs: the state-action constraint set C ⊆ X × U is

    C = {(z, v) | ℓ(z, v) < ∞}.

The problem is unconstrained if C = X × U. The stochastic control problem is to choose φ in order to minimize the infinite horizon discounted cost

    J_φ = E Σ_{t=0}^∞ γ^t ℓ(x_t, φ(x_t)),        (2)

where γ ∈ (0, 1) is a discount factor. The expectations are over the noise terms w_t, t = 0, 1, ..., and the initial state x_0. We assume here that the expectation and limits exist, which is the case under various technical assumptions [3], [4]. We denote by J* the optimal value of the stochastic control problem, i.e., the infimum of J_φ over all policies φ : X → U.

A. Dynamic programming

In this section we briefly review the dynamic programming characterization of the solution to the stochastic control problem. For more details, see [3], [4]. The value function of the stochastic control problem, V* : X → R ∪ {+∞}, is given by

    V*(z) = inf_φ E Σ_{t=0}^∞ γ^t ℓ(x_t, φ(x_t)),

subject to the dynamics (1) and x_0 = z; the infimum is over all policies φ : X → U, and the expectation is over w_t for t = 0, 1, .... The quantity V*(z) is the cost incurred by an optimal policy, when the system is started from state z. The optimal total discounted cost is given by J* = E_{x_0} V*(x_0). It can be shown that the value function is the unique fixed point of the Bellman equation

    V*(z) = inf_v ( ℓ(z, v) + γ E V*(f(z, v, w)) ),  for all z ∈ X.
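The fixed-point characterization can be illustrated numerically. The following sketch (our own illustration, not from the paper) runs value iteration on an arbitrary small finite MDP, and checks that the limit satisfies the Bellman equation:

```python
import numpy as np

# Small finite MDP, chosen arbitrarily for illustration: 3 states, 2 actions.
np.random.seed(0)
n_states, n_actions, gamma = 3, 2, 0.9
# P[a][s, s'] = transition probability, ell[s, a] = stage cost.
P = np.random.rand(n_actions, n_states, n_states)
P /= P.sum(axis=2, keepdims=True)
ell = np.random.rand(n_states, n_actions)

def bellman(V):
    # (T V)(s) = min_a ( ell(s, a) + gamma * E V(s') )
    Q = ell + gamma * np.einsum('ast,t->sa', P, V)
    return Q.min(axis=1), Q.argmin(axis=1)

V = np.zeros(n_states)
for _ in range(2000):            # value iteration: V <- T V
    V, policy = bellman(V)

TV, _ = bellman(V)
assert np.allclose(V, TV, atol=1e-8)   # the limit is a fixed point of T
```

Since T is a γ-contraction in the sup norm, the iteration converges to V* from any starting point, and the greedy `policy` at the fixed point is optimal.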
We can write the Bellman equation in the form

    V* = T V*,        (3)

where we define the Bellman operator T as

    (T g)(z) = inf_v ( ℓ(z, v) + γ E g(f(z, v, w)) ),

for any g : X → R ∪ {+∞}. An optimal policy for the stochastic control problem is given by

    φ*(z) = argmin_v ( ℓ(z, v) + γ E V*(f(z, v, w)) ),        (4)

for all z ∈ X.

B. Approximate dynamic programming

In many cases of interest, it is intractable to compute or even represent the value function V*, let alone carry out the minimization required to evaluate the optimal policy (4). In such cases, a common alternative is to replace the value function with an approximate value function V̂ [6], [7], [8]. The resulting policy, given by

    φ̂(z) = argmin_v ( ℓ(z, v) + γ E V̂(f(z, v, w)) ),  for all z ∈ X,

is called an approximate dynamic programming (ADP) policy. Clearly, when V̂ = V*, the ADP policy is optimal. The goal of approximate dynamic programming is to find a V̂ for which the ADP policy can be easily evaluated (for instance, by solving a convex optimization problem), and which also attains near-optimal performance.

III. MIN-MAX APPROXIMATE DYNAMIC PROGRAMMING

We consider a family of linearly parameterized candidate value functions V_α : X → R,

    V_α = Σ_{i=1}^K α_i V_i,

where α ∈ R^K is a vector of coefficients and V_i : X → R are fixed basis functions. Now suppose we have a set A ⊆ R^K for which

    V_α(z) ≤ V*(z),  for all z ∈ X,  α ∈ A.

Thus {V_α | α ∈ A} is a parameterized family of underestimators of the value function. (We will discuss how to obtain such a family later.) For any ᾱ ∈ A we have

    V_ᾱ(z) ≤ sup_{α ∈ A} V_α(z) ≤ V*(z),  for all z ∈ X,
i.e., the pointwise supremum over the family of underestimators must be at least as good an approximation of V* as any single function from the family. This suggests the ADP control policy

    φ(z) = argmin_v ( ℓ(z, v) + γ E sup_{α ∈ A} V_α(f(z, v, w)) ),

where we use sup_{α ∈ A} V_α as an approximate value function. Unfortunately, this policy may be difficult to evaluate, since evaluating the expectation of the supremum can be hard, even when evaluating E V_α(f(z, v, w)) for a particular α can be done. Our last step is to exchange expectation and supremum to obtain the min-max control policy

    φ^mm(z) = argmin_v sup_{α ∈ A} ( ℓ(z, v) + γ E V_α(f(z, v, w)) ),        (5)

for all z ∈ X. Computing this policy involves the solution of a min-max (or saddle-point) problem, which we will see is tractable in certain cases. One such case is where the function ℓ(z, v) + γ E V_α(f(z, v, w)) is convex in v for each z and α, and the set A is convex.

A. Bounds

The optimal value of the optimization problem in the min-max policy (5) is a lower bound on the value function at every state. To see this we note that

    inf_v sup_{α ∈ A} ( ℓ(z, v) + γ E V_α(f(z, v, w)) )
      ≤ inf_v ( ℓ(z, v) + γ E sup_{α ∈ A} V_α(f(z, v, w)) )
      ≤ inf_v ( ℓ(z, v) + γ E V*(f(z, v, w)) )
      = (T V*)(z) = V*(z),

where the first inequality is due to Fatou's lemma [21], the second inequality follows from the monotonicity of expectation, and the equality comes from the fact that V* is the unique fixed point of the Bellman operator. Using the pointwise bounds, we can evaluate a lower bound on the optimal cost J* via Monte Carlo simulation:

    J_lb = (1/N) Σ_{j=1}^N V_lb(z_j),

where z_1, ..., z_N are drawn from the same distribution as x_0, and V_lb(z_j) is the lower bound we get from evaluating the min-max policy at z_j. The performance of the min-max policy can also be evaluated using Monte Carlo simulation, and provides an upper bound J_ub on the optimal cost. Ignoring Monte Carlo error, we have J_lb ≤ J* ≤ J_ub. These upper and lower bounds on the optimal value of the stochastic control problem are readily evaluated numerically, through simulation of the min-max control policy. When J_lb and J_ub are close, we can conclude that the min-max policy is almost optimal.
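The exchange of expectation and supremum used to obtain the min-max policy only ever gives up value, never gains it. A quick numerical illustration (ours, with an arbitrary toy family of underestimators) of sup_α E V_α ≤ E sup_α V_α:

```python
import numpy as np

# Check that swapping sup and E preserves a lower bound:
#   sup_alpha E[ V_alpha(X) ] <= E[ sup_alpha V_alpha(X) ].
np.random.seed(1)
X = np.random.randn(10000)                    # samples of the random state
alphas = np.linspace(-1.0, 1.0, 21)           # finite family of coefficients
V = np.array([a * X - a**2 for a in alphas])  # V_alpha(x) = a*x - a^2, one row per alpha

sup_of_mean = V.mean(axis=1).max()   # sup_alpha E V_alpha
mean_of_sup = V.max(axis=0).mean()   # E sup_alpha V_alpha
assert sup_of_mean <= mean_of_sup + 1e-12
```

The gap between the two sides is exactly what the min-max policy loses relative to using the pointwise supremum directly; the bounds J_lb and J_ub quantify the total loss.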
We will use this technique to evaluate the performance of the min-max policy for our numerical examples.

B. Evaluating the min-max control policy

Evaluating the min-max control policy often requires exchanging the order of minimization and maximization. For any function f : R^p × R^q → R and sets W ⊆ R^p, Z ⊆ R^q, the max-min inequality states that

    sup_{z ∈ Z} inf_{w ∈ W} f(w, z) ≤ inf_{w ∈ W} sup_{z ∈ Z} f(w, z).        (6)

In the context of the min-max control policy, this means we can swap the order of minimization and maximization in (5) and maintain the lower bound property. To evaluate the policy, we solve the optimization problem

    maximize    inf_v ( ℓ(z, v) + γ E V_α(f(z, v, w)) )
    subject to  α ∈ A        (7)

with variable α. If A is a convex set, (7) is a convex optimization problem, since the objective is the infimum over a family of affine functions in α, and is therefore concave. In practice, solving (7) is often much easier than evaluating the min-max control policy directly. In addition, if there exist w* ∈ W and z* ∈ Z such that

    f(w*, z) ≤ f(w*, z*) ≤ f(w, z*),  for all w ∈ W and z ∈ Z,

then we have the strong max-min property (or saddle-point property), and (6) holds with equality. In such cases the problems (5) and (7) are equivalent, and we can use Newton's method or duality considerations to solve (5) or (7) [16], [22].

IV. ITERATED BELLMAN INEQUALITIES

In this section we describe how to parameterize a family of underestimators of the true value function. The idea is based on the Bellman inequality [10], [8], [12], and results in a convex condition on the coefficients α that guarantees V_α ≤ V*.
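The max-min inequality (6) above holds for any function and any feasible sets; a minimal numerical check (our own, on an arbitrary finite payoff matrix) makes the direction of the inequality concrete:

```python
import numpy as np

# Numeric check of the max-min inequality (6) on a finite grid:
#   sup_z inf_w f(w, z) <= inf_w sup_z f(w, z),
# here f(w, z) = F[w, z] for an arbitrary payoff matrix F.
np.random.seed(2)
F = np.random.randn(50, 40)           # rows index w choices, columns index z choices
max_min = F.min(axis=0).max()         # sup_z inf_w F[w, z]
min_max = F.max(axis=1).min()         # inf_w sup_z F[w, z]
assert max_min <= min_max
```

For a generic random matrix the inequality is strict; equality (the strong max-min property) requires a saddle point, as in the quadratic problems treated in the two examples later in the paper.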
A. Basic Bellman inequality

The basic condition works as follows. Suppose we have a function V : X → R which satisfies the Bellman inequality

    V ≤ T V.        (8)

Then by the monotonicity of the Bellman operator, we have

    V ≤ lim_{k→∞} T^k V = V*,

so any function that satisfies the Bellman inequality must be a value function underestimator. Applying this condition to V_α and expanding (8) we get

    V_α(z) ≤ inf_v ( ℓ(z, v) + γ E V_α(f(z, v, w)) ),  for all z ∈ X.

For each z, the left hand side is linear in α, and the right hand side is a concave function of α, since it is the infimum over a family of affine functions. Hence, the Bellman inequality leads to a convex constraint on α.

B. Iterated Bellman inequalities

We can obtain better (i.e., larger) lower bounds on the value function by considering an iterated form of the Bellman inequality [12]. Suppose we have a sequence of functions V_i : X → R, i = 0, ..., M, that satisfy a chain of Bellman inequalities

    V_0 ≤ T V_1,  V_1 ≤ T V_2,  ...,  V_{M-1} ≤ T V_M,        (9)

with V_M = V_{M-1}. Then, using similar arguments as before, we can show V_0 ≤ V*. Restricting each function to lie in the same subspace,

    V_i = Σ_{j=1}^K α_{ij} V_j,

we see that the iterated chain of Bellman inequalities also results in a convex constraint on the coefficients α_{ij}. Hence the condition on α_{0j}, j = 1, ..., K, which parameterizes our underestimator V_0, is convex. It is easy to see that the iterated Bellman condition must give bounds that are at least as good as the basic Bellman inequality, since any function that satisfies (8) must be feasible for (9) [12].

V. BOX CONSTRAINED LINEAR QUADRATIC CONTROL

This section follows a similar example presented in [12]. We have X = R^n, U = R^m, with linear dynamics

    x_{t+1} = A x_t + B u_t + w_t,

where A ∈ R^{n×n} and B ∈ R^{n×m}. The noise has zero mean, E w_t = 0, and covariance E w_t w_t^T = W. Our bounds and policy will only depend on the first and second moments of w_t. The stage cost is given by

    ℓ(z, v) = { z^T Q z + v^T R v,   ‖v‖_∞ ≤ 1,
              { +∞,                  ‖v‖_∞ > 1,

where R = R^T ≻ 0 and Q = Q^T ⪰ 0.
A. Iterated Bellman inequalities

We look for convex quadratic approximate value functions

    V_i(z) = z^T P_i z + 2 p_i^T z + s_i,  i = 0, ..., M,

where P_i = P_i^T ∈ R^{n×n}, p_i ∈ R^n, and s_i ∈ R are the coefficients of our linear parameterization. The iterated Bellman inequalities are

    V_{i-1}(z) ≤ ℓ(z, v) + γ E V_i(A z + B v + w),

for all z ∈ R^n and v ∈ R^m, for i = 1, ..., M. Defining

    S_i = [ P_i    p_i ]
          [ p_i^T  s_i ],

    L = [ R  0  0 ]
        [ 0  Q  0 ]
        [ 0  0  0 ],

    G_i = [ B^T P_i B   B^T P_i A   B^T p_i ]
          [ A^T P_i B   A^T P_i A   A^T p_i ]
          [ p_i^T B     p_i^T A     Tr(P_i W) + s_i ],

for i = 0, ..., M, we can write the Bellman inequalities as a quadratic form in (v, z, 1):

    (v, z, 1)^T ( L + γ G_i − S̄_{i-1} ) (v, z, 1) ≥ 0,        (10)

for all z ∈ R^n and all v with ‖v‖_∞ ≤ 1, for i = 1, ..., M, where S̄_{i-1} denotes S_{i-1} padded with zero blocks so that it acts only on (z, 1). We will obtain a tractable sufficient condition for this using the S-procedure [15], [12]. The constraint ‖v‖_∞ ≤ 1 can be written in terms of the quadratic inequalities

    1 − v^T e_i e_i^T v ≥ 0,  i = 1, ..., m,

where e_i denotes the ith unit vector. Using the S-procedure, a sufficient condition for (10) is the existence of diagonal matrices D_i ⪰ 0, i = 1, ..., M, for which

    L + γ G_i − S̄_{i-1} + Λ_i ⪰ 0,  i = 1, ..., M,        (11)

where

    Λ_i = [ D_i  0  0       ]
          [ 0    0  0       ]
          [ 0    0  −Tr D_i ].        (12)

Finally, we have the terminal constraint S_M = S_{M-1}.
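In the unconstrained linear-quadratic special case, the Bellman inequality on V(z) = z^T P z + s reduces (for a suitable constant s) to the matrix inequality [[Q + γA'PA − P, γA'PB], [γB'PA, R + γB'PB]] ⪰ 0, and the Riccati solution satisfies it with equality. The sketch below (our own illustration, not the paper's SDP) checks numerically that scaling the Riccati solution down keeps the inequality feasible, i.e., yields a value-function underestimator:

```python
import numpy as np

# Bellman-inequality feasibility check for unconstrained discounted LQR.
np.random.seed(3)
n, m, g = 3, 2, 0.9
A = np.random.randn(n, n); A /= 1.1 * max(abs(np.linalg.eigvals(A)))  # stable A
B = np.random.randn(n, m)
Q = np.eye(n); R = np.eye(m)

P = np.eye(n)
for _ in range(5000):   # fixed-point (Riccati) iteration for discounted LQR
    K = np.linalg.solve(R + g * B.T @ P @ B, g * B.T @ P @ A)
    P = Q + g * A.T @ P @ A - g * A.T @ P @ B @ K

def bellman_lmi(P):
    # Quadratic-form matrix whose PSD-ness encodes V <= T V for V(z) = z'Pz + s.
    return np.block([[Q + g * A.T @ P @ A - P, g * A.T @ P @ B],
                     [g * B.T @ P @ A,         R + g * B.T @ P @ B]])

for theta in (0.5, 0.9, 1.0):   # scaled-down Riccati solutions remain feasible
    assert np.linalg.eigvalsh(bellman_lmi(theta * P)).min() >= -1e-6
```

The SDP in the paper searches over all feasible (P_i, p_i, s_i) rather than just scalings of the Riccati solution, and adds the box-constraint multipliers D_i; the feasibility structure, however, is the same.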
B. Min-max control policy

For this problem, it is easy to show that the strong max-min property holds, and therefore problems (5) and (7) are equivalent. To evaluate the min-max control policy we solve problem (7), which we can write as

    maximize    inf_v ( ℓ(z, v) + γ E V_0(A z + B v + w) )
    subject to  (11),  S_M = S_{M-1},
                P_i ⪰ 0,  i = 0, ..., M,  D_i ⪰ 0,  i = 1, ..., M,

with variables P_i, p_i, s_i, i = 0, ..., M, and diagonal D_i, i = 1, ..., M. We will convert this max-min problem to a max-max problem by forming the dual of the minimization part. Introducing a diagonal matrix D_0 ⪰ 0 as the dual variable for the box constraints, we obtain the dual function

    inf_v (v, z, 1)^T ( L + γ G_0 + Λ_0 ) (v, z, 1),

where Λ_0 has the form given in (12). We can minimize over v analytically. If we block out the matrix L + γ G_0 + Λ_0 as

    L + γ G_0 + Λ_0 = [ M_11    M_12 ]
                      [ M_12^T  M_22 ],        (13)

where M_11 ∈ R^{m×m}, then the minimizer is

    v* = −M_11^{-1} M_12 (z, 1).

Thus our problem becomes

    maximize    (z, 1)^T ( M_22 − M_12^T M_11^{-1} M_12 ) (z, 1)
    subject to  (11),  S_M = S_{M-1},
                P_i ⪰ 0,  D_i ⪰ 0,  i = 0, ..., M,

which is a convex optimization problem in the variables P_i, p_i, s_i, D_i, i = 0, ..., M, and can be solved as an SDP. To implement the min-max control policy, at each time t we solve the above problem with z = x_t, and let

    u_t = −(M_11*)^{-1} M_12* (x_t, 1),

where M_11* and M_12* denote the matrices M_11 and M_12 computed from the optimal P_0, p_0, s_0, D_0.

C. Interpretations

We can easily verify that the dual of the above optimization problem is a variant of model predictive control that uses both the first and second moments of the state. In this context, the number of iterations, M, is the length of the prediction horizon, and we can interpret our lower bound as a finite horizon approximation to an infinite horizon problem, which underestimates the optimal infinite horizon cost. The S-procedure relaxation also has a natural interpretation: in [23], the author obtains similar LMIs by relaxing almost sure constraints into constraints that are only required to hold in expectation.

TABLE I: Performance comparison, box constrained example.

    Policy / Bound     Value
    MPC policy         11.347
    Min-max policy     11.345
    Lower bound        11.307
D. Numerical instance

We consider a numerical example with n = 8, m = 3, and γ = 0.9. The parameters Q, R, A, and B are randomly generated; we scale A so that max_i |λ_i(A)| = 1, i.e., so that the system is marginally stable. The initial state x_0 is Gaussian, with zero mean. Table I shows the performance of the min-max policy and certainty equivalent MPC, both with horizons of M = 5 steps, as well as the lower bound on the optimal cost. In this case, both the min-max policy and MPC are no more than around 1% suboptimal, modulo Monte Carlo error.

VI. DYNAMIC PORTFOLIO OPTIMIZATION

In this example, we manage a portfolio of n assets over time. Our state x_t ∈ R^n is the vector of dollar values of the assets at the beginning of investment period t. Our action u_t ∈ R^n represents buying or selling assets at the beginning of each investment period: (u_t)_i > 0 means we are buying asset i, for dollar value (u_t)_i, and (u_t)_i < 0 means we sell asset i. The post-trade portfolio is then given by x_t + u_t, and the total gross cash put in is 1^T u_t, where 1 is the vector with all components one. The portfolio propagates over time (i.e., over the investment period) according to

    x_{t+1} = A_t (x_t + u_t),

where A_t = diag(ρ_t), and (ρ_t)_i is the total return of asset i in investment period t. The return vectors ρ_t are IID, with first and second moments

    E ρ_t = µ,  E ρ_t ρ_t^T = Σ.
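The analytic minimization over the input used in the min-max policy above (and again in the portfolio policy below) is a standard Schur-complement elimination: for a positive definite matrix partitioned as N = [[M_11, M_12], [M_12^T, M_22]], the partial minimum inf_v (v, r)^T N (v, r) equals r^T (M_22 − M_12^T M_11^{-1} M_12) r, attained at v* = −M_11^{-1} M_12 r. A small check of this identity (our own, on random data):

```python
import numpy as np

# Verify the Schur-complement block elimination used for the min-max policies.
np.random.seed(4)
m, k = 3, 5
T = np.random.randn(m + k, m + k)
N = T @ T.T + 1e-3 * np.eye(m + k)      # random positive definite matrix
M11, M12, M22 = N[:m, :m], N[:m, m:], N[m:, m:]
r = np.random.randn(k)

v_star = -np.linalg.solve(M11, M12 @ r)
analytic = r @ (M22 - M12.T @ np.linalg.solve(M11, M12)) @ r

obj = lambda v: np.concatenate([v, r]) @ N @ np.concatenate([v, r])
assert abs(obj(v_star) - analytic) < 1e-9   # minimizer attains the Schur value
for _ in range(100):                        # random perturbations do no better
    assert obj(v_star + 0.1 * np.random.randn(m)) >= analytic - 1e-9
```

Because the Schur complement M_22 − M_12^T M_11^{-1} M_12 is a concave matrix function of N for M_11 ≻ 0, the eliminated problem stays convex, which is what lets the policy be solved as an SDP.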
Here too, our bounds and policy will only depend on the first and second moments of ρ_t. We let Σ̂ = Σ − µµ^T denote the return covariance. We now describe the constraints and objective. We constrain the risk of our post-trade portfolio, which we quantify as the portfolio return variance over the period:

    (x_t + u_t)^T Σ̂ (x_t + u_t) ≤ l,

where l ≥ 0 is the maximum variance (risk) allowed. Our action (buying and selling) u_t incurs a transaction cost with an absolute value and a quadratic component, given by

    κ^T |u| + u^T R u,

where κ ∈ R^n_+ is the vector of linear transaction cost rates (|u| means elementwise absolute value), and R ∈ R^{n×n}, which is diagonal with positive entries, represents the quadratic transaction cost coefficients. Linear transaction costs model effects such as crossing the bid-ask spread, while quadratic transaction costs model effects such as price impact. Thus at time t, we put into our portfolio the net cash amount

    g(u_t) = 1^T u_t + κ^T |u_t| + u_t^T R u_t.

When this is negative, it represents revenue. The first term is the gross cash in from purchases and sales; the second and third terms are the transaction fees. The stage cost, including the risk limit, is

    ℓ(z, v) = { g(v),  (z + v)^T Σ̂ (z + v) ≤ l,
              { +∞,    otherwise.

Our goal is to minimize the discounted cost or, equivalently, to maximize the discounted revenue. In this example, the discount factor has a natural interpretation as reflecting the time value of money.

A. Iterated Bellman inequalities

We incorporate another variable, y ∈ R^n, to remove the absolute value term from the stage cost function, and add the constraints

    −(y_t)_i ≤ (u_t)_i ≤ (y_t)_i,  i = 1, ..., n.        (14)

We define the stage cost with these new variables to be

    ℓ(z, v, y) = 1^T v + κ^T y + (1/2) v^T R v + (1/2) y^T R y

for z, v, y that satisfy

    −y ≤ v ≤ y,  (z + v)^T Σ̂ (z + v) ≤ l,        (15)

and +∞ otherwise. Here, the first set of inequalities is interpreted elementwise. We look for convex quadratic candidate value functions, i.e.,

    V_i(z) = z^T P_i z + 2 p_i^T z + s_i,  i = 0, ..., M,

where P_i ⪰ 0, p_i ∈ R^n, s_i ∈ R are the coefficients of our linear parameterization.
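The epigraph variable y removes the absolute value exactly, since for κ ≥ 0 and diagonal R ≻ 0 the reformulated stage cost is increasing in y, so y = |v| is optimal and recovers g(v). A quick numerical confirmation (our own illustration, with arbitrary random data):

```python
import numpy as np

# Check that min over { y : -y <= v <= y } of
#   1'v + kappa'y + 0.5 v'Rv + 0.5 y'Ry
# is attained at y = |v| and equals g(v) = 1'v + kappa'|v| + v'Rv.
np.random.seed(5)
n = 4
v = np.random.randn(n)
kappa = np.random.rand(n)            # linear transaction cost rates, kappa >= 0
Rdiag = np.random.rand(n) + 0.1      # diagonal of R, positive entries

def cost(y):
    return v.sum() + kappa @ y + 0.5 * v @ (Rdiag * v) + 0.5 * y @ (Rdiag * y)

best = cost(np.abs(v))
assert np.isclose(best, v.sum() + kappa @ np.abs(v) + v @ (Rdiag * v))
for _ in range(200):                 # any feasible y >= |v| does no better
    y = np.abs(v) + np.random.rand(n)
    assert cost(y) >= best - 1e-12
```

The split 0.5 v'Rv + 0.5 y'Ry is what makes the two halves recombine into v'Rv at y = |v|; with a non-diagonal R this identity would fail, which is why the model takes R diagonal.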
Defining

    L = [ R/2     0       0   1/2 ]
        [ 0       R/2     0   κ/2 ]
        [ 0       0       0   0   ]
        [ 1^T/2   κ^T/2   0   0   ],

    G_i = [ P_i ∘ Σ        0   P_i ∘ Σ        p_i ∘ µ ]
          [ 0              0   0              0       ]
          [ P_i ∘ Σ        0   P_i ∘ Σ        p_i ∘ µ ]
          [ (p_i ∘ µ)^T    0   (p_i ∘ µ)^T    s_i     ],

    S̄_i = [ 0  0  0      0   ]
           [ 0  0  0      0   ]
           [ 0  0  P_i    p_i ]
           [ 0  0  p_i^T  s_i ],   i = 0, ..., M,

where ∘ denotes the Hadamard (elementwise) product, we can write the iterated Bellman inequalities as

    (v, y, z, 1)^T ( L + γ G_i − S̄_{i-1} ) (v, y, z, 1) ≥ 0,

for all (v, y, z) that satisfy (15), for i = 1, ..., M. A tractable sufficient condition for the Bellman inequalities is (by the S-procedure) the existence of λ_i ≥ 0, ν_i ∈ R^n_+, τ_i ∈ R^n_+, i = 1, ..., M, such that

    L + γ G_i − S̄_{i-1} + Λ_i ⪰ 0,  i = 1, ..., M,        (16)

where

    Λ_i = [ λ_i Σ̂             0                   λ_i Σ̂   (ν_i − τ_i)/2  ]
          [ 0                 0                   0        −(ν_i + τ_i)/2 ]
          [ λ_i Σ̂             0                   λ_i Σ̂   0              ]
          [ (ν_i − τ_i)^T/2   −(ν_i + τ_i)^T/2    0        −λ_i l         ].        (17)

Lastly, we have the terminal constraint S_M = S_{M-1}.

B. Min-max control policy

The discussion here follows almost exactly the one presented for the previous example. It is easy to show that in this case we have the strong max-min property. At each step, we solve problem (7) by converting the max-min problem into a max-max problem, using Lagrangian duality. We can write problem (7) as

    maximize    inf_{v,y} ( ℓ(z, v, y) + γ E_ρ V_0(A(z + v)) )
    subject to  (16),  S_M = S_{M-1},
                P_i ⪰ 0,  i = 0, ..., M,
with variables P_i, p_i, s_i, i = 0, ..., M, and λ_i ∈ R_+, ν_i ∈ R^n_+, τ_i ∈ R^n_+, i = 1, ..., M. Next, we derive the dual function of the minimization part. We introduce variables λ_0 ∈ R_+, ν_0 ∈ R^n_+, τ_0 ∈ R^n_+, which are dual variables corresponding to the constraints (15) and (14). The dual function is given by

    inf_{v,y} (v, y, z, 1)^T ( L + γ G_0 + Λ_0 ) (v, y, z, 1),

where Λ_0 has the form given in (17). If we define

    L + γ G_0 + Λ_0 = [ M_11    M_12 ]
                      [ M_12^T  M_22 ],        (18)

where M_11 ∈ R^{2n×2n}, then the minimizer of the Lagrangian is

    (v*, y*) = −M_11^{-1} M_12 (z, 1).

Thus our problem becomes

    maximize    (z, 1)^T ( M_22 − M_12^T M_11^{-1} M_12 ) (z, 1)
    subject to  (16),  S_M = S_{M-1},
                P_i ⪰ 0,  i = 0, ..., M,

which is convex in the variables P_i, p_i, s_i, i = 0, ..., M, and λ_i ∈ R_+, ν_i ∈ R^n_+, τ_i ∈ R^n_+, i = 0, ..., M. To implement the policy, at each time t we solve the above optimization problem as an SDP with z = x_t, and let

    u_t = −[ I  0 ] (M_11*)^{-1} M_12* (x_t, 1),

where M_11* and M_12* denote the matrices M_11 and M_12 computed from the optimal P_0, p_0, s_0, λ_0, ν_0, τ_0.

C. Numerical instance

We consider a numerical example with n = 8 assets and γ = 0. The initial portfolio x_0 is Gaussian, with zero mean. The returns follow a log-normal distribution, i.e., log ρ_t ∼ N(µ̄, Σ̄). The parameters µ and Σ are given by

    µ_i = exp( µ̄_i + Σ̄_ii / 2 ),   Σ_ij = µ_i µ_j exp( Σ̄_ij ).

Table II compares the performance of the min-max policy for M = 40 and M = 5, and certainty equivalent model predictive control with horizon T = 40, over 50 simulations each consisting of 50 time steps. We can see that the min-max policy significantly outperforms MPC, which actually makes a loss on average, since its average cost is positive. The cost achieved by the min-max policy is close to the lower bound, which shows that both the policy and the bound are nearly optimal. In fact, the gap is small even for M = 5, which corresponds to a relatively myopic policy.

TABLE II: Performance comparison, portfolio example.

    Bound / Policy          Value
    MPC policy, T = 40      —
    Min-max policy, M = 40  —
    Min-max policy, M = 5   —
    Lower bound, M = 40     —
    Lower bound, M = 5      —
VII. CONCLUSIONS

In this paper we introduce a control policy which we refer to as min-max approximate dynamic programming. Evaluating this policy at each time step requires the solution of a min-max or saddle-point problem; in addition, we obtain a lower bound on the value function, which can be used to estimate, via Monte Carlo simulation, the optimal value of the stochastic control problem. We demonstrate the method with two examples, where the policy can be evaluated by solving a convex optimization problem at each time step. In both examples the lower bound and the achieved performance are very close, certifying that the min-max policy is very close to optimal.

REFERENCES

[1] R. Kalman, "When is a linear control system optimal?" Journal of Basic Engineering, vol. 86, no. 1, 1964.
[2] S. Boyd and C. Barratt, Linear Controller Design: Limits of Performance. Prentice-Hall, 1991.
[3] D. Bertsekas, Dynamic Programming and Optimal Control: Volume 1. Athena Scientific, 2005.
[4] D. Bertsekas, Dynamic Programming and Optimal Control: Volume 2. Athena Scientific, 2007.
[5] D. Bertsekas and S. Shreve, Stochastic Optimal Control: The Discrete-Time Case. Athena Scientific, 1996.
[6] D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming, 1st ed. Athena Scientific, 1996.
[7] W. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality. John Wiley & Sons, 2007.
[8] Y. Wang and S. Boyd, "Performance bounds for linear stochastic control," Systems & Control Letters, vol. 58, no. 3, Mar. 2009.
[9] A. Manne, "Linear programming and sequential decisions," Management Science, vol. 6, no. 3, 1960.
[10] D. de Farias and B. Van Roy, "The linear programming approach to approximate dynamic programming," Operations Research, vol. 51, no. 6, 2003.
[11] P. Schweitzer and A. Seidmann, "Generalized polynomial approximations in Markovian decision processes," Journal of Mathematical Analysis and Applications, vol. 110, no. 2, 1985.
[12] Y. Wang and S. Boyd, "Approximate dynamic programming via iterated Bellman inequalities," 2010. Manuscript.
[13] C. Garcia, D. Prett, and M. Morari, "Model predictive control: theory and practice," Automatica, vol. 25, no. 3, 1989.
[14] J. Maciejowski, Predictive Control with Constraints. Prentice-Hall, 2002.
[15] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear Matrix Inequalities in System and Control Theory. Society for Industrial and Applied Mathematics, 1994.
[16] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[17] Y. Wang and S. Boyd, "Fast model predictive control using online optimization," IEEE Transactions on Control Systems Technology, vol. 18, 2010.
[18] Y. Wang and S. Boyd, "Fast evaluation of quadratic control-Lyapunov policy," IEEE Transactions on Control Systems Technology, 2010.
[19] J. Mattingley, Y. Wang, and S. Boyd, "Code generation for receding horizon control," in IEEE Multi-Conference on Systems and Control, 2010.
[20] J. Mattingley and S. Boyd, "CVXGEN: A code generator for embedded convex optimization," 2010. Manuscript.
[21] D. Cohn, Measure Theory. Birkhäuser, 1997.
[22] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, Robust Optimization. Princeton University Press, 2009.
[23] A. Gattami, "Generalized linear quadratic control theory," Proceedings of the 45th IEEE Conference on Decision and Control, 2006.
Chapter 3 Sparse Portfolio Allocation This chapter touches some practical aspects of portfolio allocation and risk assessment from a large pool of financial assets (e.g. stocks) How to assess the risk
Exponential Approximation of Multi-Skill Call Centers Architecture
Exponential Approximation of Multi-Skill Call Centers Architecture Ger Koole and Jérôme Talim Vrije Universiteit - Division of Mathematics and Computer Science De Boelelaan 1081 a - 1081 HV Amsterdam -
Introduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
10. Proximal point method
L. Vandenberghe EE236C Spring 2013-14) 10. Proximal point method proximal point method augmented Lagrangian method Moreau-Yosida smoothing 10-1 Proximal point method a conceptual algorithm for minimizing
INTEREST RATES AND FX MODELS
INTEREST RATES AND FX MODELS 8. Portfolio greeks Andrew Lesniewski Courant Institute of Mathematical Sciences New York University New York March 27, 2013 2 Interest Rates & FX Models Contents 1 Introduction
LINES AND PLANES IN R 3
LINES AND PLANES IN R 3 In this handout we will summarize the properties of the dot product and cross product and use them to present arious descriptions of lines and planes in three dimensional space.
Vector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
Orthogonal Distance Regression
Applied and Computational Mathematics Division NISTIR 89 4197 Center for Computing and Applied Mathematics Orthogonal Distance Regression Paul T. Boggs and Janet E. Rogers November, 1989 (Revised July,
The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy
BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.
Solving polynomial least squares problems via semidefinite programming relaxations
Solving polynomial least squares problems via semidefinite programming relaxations Sunyoung Kim and Masakazu Kojima August 2007, revised in November, 2007 Abstract. A polynomial optimization problem whose
Black-Scholes model under Arithmetic Brownian Motion
Black-Scholes model under Arithmetic Brownian Motion Marek Kolman Uniersity of Economics Prague December 22 23 Abstract Usually in the Black-Scholes world it is assumed that a stock follows a Geometric
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
GenOpt (R) Generic Optimization Program User Manual Version 3.0.0β1
(R) User Manual Environmental Energy Technologies Division Berkeley, CA 94720 http://simulationresearch.lbl.gov Michael Wetter [email protected] February 20, 2009 Notice: This work was supported by the U.S.
Support Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
Bidding with Securities: Auctions and Security Design.
Bidding with Securities: Auctions and Security Design. Peter M. DeMarzo, Ilan Kremer and Andrzej Skrzypacz * Stanford Graduate School of Business September, 2004 We study security-bid auctions in which
Message-passing sequential detection of multiple change points in networks
Message-passing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal
Several Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
A Log-Robust Optimization Approach to Portfolio Management
A Log-Robust Optimization Approach to Portfolio Management Dr. Aurélie Thiele Lehigh University Joint work with Ban Kawas Research partially supported by the National Science Foundation Grant CMMI-0757983
4.6 Linear Programming duality
4.6 Linear Programming duality To any minimization (maximization) LP we can associate a closely related maximization (minimization) LP. Different spaces and objective functions but in general same optimal
The Virtual Spring Mass System
The Virtual Spring Mass Syste J. S. Freudenberg EECS 6 Ebedded Control Systes Huan Coputer Interaction A force feedbac syste, such as the haptic heel used in the EECS 6 lab, is capable of exhibiting a
Climate control of a bulk storage room for foodstuffs
Climate control of a bulk storage room for foodstuffs S. an Mourik, H. Zwart, Twente Uniersity, K.J. Keesman*, Wageningen Uniersity, The Netherlands Corresponding author: S. an Mourik Department of Applied
Distribution of the Ratio of Normal and Rice Random Variables
Journal of Modern Applied Statistical Methods Volume 1 Issue Article 7 11-1-013 Distribution of the Ratio of Normal and Rice Random Variables Nayereh B. Khoolenjani Uniersity of Isfahan, Isfahan, Iran,
Key Stage 3 Mathematics Programme of Study
Deeloping numerical reasoning Identify processes and connections Represent and communicate Reiew transfer mathematical across the curriculum in a ariety of contexts and eeryday situations select, trial
Binomial lattice model for stock prices
Copyright c 2007 by Karl Sigman Binomial lattice model for stock prices Here we model the price of a stock in discrete time by a Markov chain of the recursive form S n+ S n Y n+, n 0, where the {Y i }
R&D White Paper WHP 031. Reed-Solomon error correction. Research & Development BRITISH BROADCASTING CORPORATION. July 2002. C.K.P.
R&D White Paper WHP 03 July 00 Reed-olomon error correction C.K.P. Clarke Research & Deelopment BRITIH BROADCATING CORPORATION BBC Research & Deelopment White Paper WHP 03 Reed-olomon Error Correction
Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010
Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte
Proximal mapping via network optimization
L. Vandenberghe EE236C (Spring 23-4) Proximal mapping via network optimization minimum cut and maximum flow problems parametric minimum cut problem application to proximal mapping Introduction this lecture:
Constrained optimization.
ams/econ 11b supplementary notes ucsc Constrained optimization. c 2010, Yonatan Katznelson 1. Constraints In many of the optimization problems that arise in economics, there are restrictions on the values
Credit Risk Models: An Overview
Credit Risk Models: An Overview Paul Embrechts, Rüdiger Frey, Alexander McNeil ETH Zürich c 2003 (Embrechts, Frey, McNeil) A. Multivariate Models for Portfolio Credit Risk 1. Modelling Dependent Defaults:
Portfolio Distribution Modelling and Computation. Harry Zheng Department of Mathematics Imperial College [email protected]
Portfolio Distribution Modelling and Computation Harry Zheng Department of Mathematics Imperial College [email protected] Workshop on Fast Financial Algorithms Tanaka Business School Imperial College
12. Inner Product Spaces
1. Inner roduct Spaces 1.1. Vector spaces A real vector space is a set of objects that you can do to things ith: you can add to of them together to get another such object, and you can multiply one of
Robust Optimization for Unit Commitment and Dispatch in Energy Markets
1/41 Robust Optimization for Unit Commitment and Dispatch in Energy Markets Marco Zugno, Juan Miguel Morales, Henrik Madsen (DTU Compute) and Antonio Conejo (Ohio State University) email: [email protected] Modeling
MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.
MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column
Examples of Functions
Examples of Functions In this document is provided examples of a variety of functions. The purpose is to convince the beginning student that functions are something quite different than polynomial equations.
Real-Time Embedded Convex Optimization
Real-Time Embedded Convex Optimization Stephen Boyd joint work with Michael Grant, Jacob Mattingley, Yang Wang Electrical Engineering Department, Stanford University Spencer Schantz Lecture, Lehigh University,
AIMS AND FUNCTIONS OF FINANCE
1 AIMS AND FUNCTIONS OF FINANCE Introduction Finance Function Importance Concept of Financial Management Nature of Financial Management Scope of Financial Management Traditional Approach Modern Approach
Stochastic Inventory Control
Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the
ALGORITHMIC TRADING WITH MARKOV CHAINS
June 16, 2010 ALGORITHMIC TRADING WITH MARKOV CHAINS HENRIK HULT AND JONAS KIESSLING Abstract. An order book consists of a list of all buy and sell offers, represented by price and quantity, available
4 Impulse and Impact. Table of contents:
4 Impulse and Impact At the end of this section you should be able to: a. define momentum and impulse b. state principles of conseration of linear momentum c. sole problems inoling change and conseration
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
1 Portfolio mean and variance
Copyright c 2005 by Karl Sigman Portfolio mean and variance Here we study the performance of a one-period investment X 0 > 0 (dollars) shared among several different assets. Our criterion for measuring
An Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia [email protected] Tata Institute, Pune,
Estimating an ARMA Process
Statistics 910, #12 1 Overview Estimating an ARMA Process 1. Main ideas 2. Fitting autoregressions 3. Fitting with moving average components 4. Standard errors 5. Examples 6. Appendix: Simple estimators
On SDP- and CP-relaxations and on connections between SDP-, CP and SIP
On SDP- and CP-relaations and on connections between SDP-, CP and SIP Georg Still and Faizan Ahmed University of Twente p 1/12 1. IP and SDP-, CP-relaations Integer program: IP) : min T Q s.t. a T j =
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
Group Theory and Molecular Symmetry
Group Theory and Molecular Symmetry Molecular Symmetry Symmetry Elements and perations Identity element E - Apply E to object and nothing happens. bject is unmoed. Rotation axis C n - Rotation of object
t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d).
1. Line Search Methods Let f : R n R be given and suppose that x c is our current best estimate of a solution to P min x R nf(x). A standard method for improving the estimate x c is to choose a direction
Discrete Optimization
Discrete Optimization [Chen, Batson, Dang: Applied integer Programming] Chapter 3 and 4.1-4.3 by Johan Högdahl and Victoria Svedberg Seminar 2, 2015-03-31 Todays presentation Chapter 3 Transforms using
Introduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
Summer course on Convex Optimization. Fifth Lecture Interior-Point Methods (1) Michel Baes, K.U.Leuven Bharath Rangarajan, U.
Summer course on Convex Optimization Fifth Lecture Interior-Point Methods (1) Michel Baes, K.U.Leuven Bharath Rangarajan, U.Minnesota Interior-Point Methods: the rebirth of an old idea Suppose that f is
Inner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
Optimization under uncertainty: modeling and solution methods
Optimization under uncertainty: modeling and solution methods Paolo Brandimarte Dipartimento di Scienze Matematiche Politecnico di Torino e-mail: [email protected] URL: http://staff.polito.it/paolo.brandimarte
Optimal Pricing for Multiple Services in Telecommunications Networks Offering Quality of Service Guarantees
To Appear in IEEE/ACM Transactions on etworking, February 2003 Optimal Pricing for Multiple Serices in Telecommunications etworks Offering Quality of Serice Guarantees eil J. Keon Member, IEEE, G. Anandalingam,
A Log-Robust Optimization Approach to Portfolio Management
A Log-Robust Optimization Approach to Portfolio Management Ban Kawas Aurélie Thiele December 2007, revised August 2008, December 2008 Abstract We present a robust optimization approach to portfolio management
Numerical Methods for Option Pricing
Chapter 9 Numerical Methods for Option Pricing Equation (8.26) provides a way to evaluate option prices. For some simple options, such as the European call and put options, one can integrate (8.26) directly
Rotated Ellipses. And Their Intersections With Lines. Mark C. Hendricks, Ph.D. Copyright March 8, 2012
Rotated Ellipses And Their Intersections With Lines b Mark C. Hendricks, Ph.D. Copright March 8, 0 Abstract: This paper addresses the mathematical equations for ellipses rotated at an angle and how to
