Optimal linear-quadratic control



Martin Ellison

1 Motivation

The lectures so far have described a general method - value function iteration - for solving dynamic programming problems. However, one problem alluded to at the end of the last lecture was that the method suffers from the curse of dimensionality. If the number of state variables is large then there are many arguments in the value function and it becomes computationally very intensive to iterate the value function. In practice, a state space with dimension of about four or five already tests the limits of the current generation of computers.

In this lecture, we examine a particular class of dynamic programming problems that can be solved relatively easily. We focus on problems of linear-quadratic control, in which the payoff function is quadratic and the transition equation is linear. Many standard problems in economics can be cast in such a linear-quadratic framework. We will show how a combination of analytical and numerical methods can be used to derive the solution to the linear-quadratic problem.

2 Key reading

The formal analysis for this lecture is taken from Dynamic Macroeconomic Theory by Tom Sargent, Harvard University Press, 1987.

3 Other reading

Optimal linear-quadratic control is discussed in most graduate macroeconomics textbooks, e.g. chapter 4 of Recursive Macroeconomic Theory, 2nd ed., by Lars Ljungqvist and Tom Sargent, MIT Press, 2000. The concepts are taken from the engineering theory of optimal control, so more sophisticated treatments can be found in books such as Analysis and Control of Dynamic Economic Systems by Gregory Chow, 1975. Gauss codes for matrix Riccati equation iterations in a dynamic general equilibrium context are available from Morten Ravn's homepage at http://faculty.london.edu/mravn/

4 Linear-quadratic control

The general framework we will analyse is one in which the agent chooses a vector of controls to influence a vector of state variables. We do not limit the dimension of either of these vectors, although it is natural to consider cases where the number of state variables exceeds the number of controls; otherwise there may well be a trivial solution in which the controls steer the states perfectly. In the context of linear-quadratic control, we assume that the transition equation governing the evolution of the states is linear: linear in past values of the state variables and linear in current values of the control variables. We allow the problem to be stochastic by including random shocks (with variance-covariance matrix Σ) in the equations for the state variables. The payoff function is assumed to be quadratic in the state and control variables,

giving quadratic forms in the objective. The fully-specified linear-quadratic control problem is stated below. The symmetric matrices R and Q are the weights of the state and control variables in the payoff function. Matrices A and B govern the linear evolution of the state variables in the transition equation.

min_{u_t} E_0 Σ_{t=0}^∞ β^t [x_t' R x_t + u_t' Q u_t]
s.t. x_{t+1} = A x_t + B u_t + ε_{t+1}

In dynamic programming form, the value function is defined over the state variables:

V(x_t) = min_{u_t} [x_t' R x_t + u_t' Q u_t + β E_t V(A x_t + B u_t + ε_{t+1})]   (1)

It is certainly possible for us to proceed as before by discretising the state space for the value function and applying value function iteration to converge to the optimal policy. However, such a procedure is computationally very intensive and unnecessary in the special linear-quadratic case. Instead, we will use a different approach which combines analytical and numerical methods. The key to the method is that we know the general form of the policy and value functions for linear-quadratic control problems. Armed with this knowledge, it is much easier to proceed. We begin by postulating a quadratic form for the value function, in which P is a symmetric matrix, so P' = P:

V(x_t) = x_t' P x_t + d

We proceed by substituting this form (with the matrix P and the scalar d as yet undetermined) into the value function (1). For convenience of notation, we drop the time subscripts: in all cases, x and u refer to time-t variables.

V(x) = min_u [x'Rx + u'Qu + βE((Ax + Bu + ε)'P(Ax + Bu + ε)) + βd]

Expanding the quadratic terms in brackets, while remembering that (AB)' = B'A', gives

V(x) = min_u [x'Rx + u'Qu + β(x'A'PAx + x'A'PBu + u'B'PAx + u'B'PBu)
       + βE(x'A'Pε + u'B'Pε + ε'PAx + ε'PBu + ε'Pε) + βd]

The expected values of the stochastic shocks are zero, so terms of the form x'A'Pε, u'B'Pε, ε'PAx and ε'PBu drop out. We are left with

V(x) = min_u [x'Rx + u'Qu + β(x'A'PAx + x'A'PBu + u'B'PAx + u'B'PBu) + βE(ε'Pε) + βd]   (2)

The first-order condition with respect to u can be used to derive optimal policy. Note that ∂(u'Qu)/∂u = 2Qu, P' = P and (A'PB)' = B'PA:

∂V/∂u = 2Qu + 2βB'PAx + 2βB'PBu = 0

Solving in terms of u implies

u = -β(Q + βB'PB)^(-1) B'PA x

Or, more succinctly,

u = -Fx,   where F = β(Q + βB'PB)^(-1) B'PA
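As a quick sanity check, the first-order condition can be verified numerically: for any fixed state, the candidate u = -β(Q + βB'PB)^(-1)B'PA x should minimise the one-period objective. The sketch below is illustrative only; the matrices A, B, P, the scalar weight Q and the state x are assumptions chosen for the example, not taken from the lecture.

```python
# Numerical check of the first-order condition: for a fixed state x, the
# objective J(u) = u'Qu + beta*(Ax + Bu)'P(Ax + Bu) is minimised at
# u* = -beta*(Q + beta*B'PB)^(-1) B'PA x.  All values below are illustrative.
beta = 0.95
A = [[0.9, 0.1], [0.0, 0.5]]      # assumed 2x2 transition matrix
B = [[0.4], [0.7]]                # assumed 2x1 control loading
P = [[2.0, 0.3], [0.3, 1.5]]      # assumed symmetric positive definite P
Q = 0.5                           # scalar control weight
x = [1.0, -0.8]                   # an arbitrary state vector

Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def J(u):
    """Objective as a function of the scalar control u (constants dropped)."""
    z = [Ax[i] + B[i][0] * u for i in range(2)]          # next-period state
    zPz = sum(z[i] * P[i][j] * z[j] for i in range(2) for j in range(2))
    return Q * u * u + beta * zPz

BPB = sum(B[i][0] * P[i][j] * B[j][0] for i in range(2) for j in range(2))
BPAx = sum(B[i][0] * P[i][j] * Ax[j] for i in range(2) for j in range(2))
u_star = -beta * BPAx / (Q + beta * BPB)                 # the FOC solution

# u* should beat any perturbed control, since the objective is convex in u
for du in (-0.1, -0.01, 0.01, 0.1):
    assert J(u_star) < J(u_star + du)
```

The loop confirms that perturbing the control in either direction raises the objective, which is exactly what the first-order condition asserts.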

Several things are worthy of note at this stage. Firstly, optimal control requires the control vector to react linearly to the state variables. We have yet to confirm that this implies a quadratic value function as first postulated, but it already suggests that the policy function has a very simple form. Secondly, the coefficient matrix F in the policy function is a non-linear function of the fundamental matrices Q, R, A and B and of the matrix P in the postulated value function. We can therefore approach the problem as one of determining either P or F. Our choice is to calculate P, and then calculate the implied F, but other techniques take the opposite approach. Economically, the policy reaction function is interesting because it is independent of the stochastic shocks. This is because certainty equivalence holds in a linear-quadratic framework: the shocks have no effect on policy unless they enter multiplicatively or the payoffs are not quadratic.

We continue next to demonstrate that the linear policy function (derived from a postulated quadratic value function) does actually imply a quadratic value function. In the process, we will be able to determine the two unknowns P and d. To do this, we substitute the policy function u = -Fx back into the value function (2). Note that x'F'B'PAx is a scalar and so equal to its transpose x'A'PBFx.

x'Px + d = x'Rx + x'F'QFx + β(x'A'PAx - 2x'A'PBFx + x'F'B'PBFx) + βE(ε'Pε) + βd

Comparing coefficients on the constant terms,

d = βE(ε'Pε) + βd

We simplify this equation by applying the result E(ε'Pε) = E(tr(ε'Pε)) = E(tr(Pεε')) = tr(PΣ).
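The trace result E(ε'Pε) = tr(PΣ) can be checked by simulation. The sketch below draws Gaussian shocks with an assumed covariance matrix Σ and compares the sample mean of ε'Pε with tr(PΣ); the particular P and Σ are illustrative, not taken from the lecture.

```python
import math
import random

# Monte Carlo check of E(e'Pe) = tr(P*Sigma) for illustrative P and Sigma.
random.seed(0)
P     = [[2.0, 0.5], [0.5, 1.0]]
Sigma = [[0.04, 0.01], [0.01, 0.04]]

# Cholesky factor L of Sigma (2x2 case), so e = L z with z standard normal
l11 = math.sqrt(Sigma[0][0])
l21 = Sigma[1][0] / l11
l22 = math.sqrt(Sigma[1][1] - l21 * l21)

total = 0.0
n = 200000
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    e = [l11 * z1, l21 * z1 + l22 * z2]
    total += sum(e[i] * P[i][j] * e[j] for i in range(2) for j in range(2))

mc_mean = total / n
trace = sum(P[i][j] * Sigma[j][i] for i in range(2) for j in range(2))  # tr(P*Sigma)
assert abs(mc_mean - trace) < 0.005   # trace = 0.13; sampling error is ~0.0005
```

With 200,000 draws the sample mean of the quadratic form matches tr(PΣ) to roughly three decimal places, as the identity predicts.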

d = β tr(PΣ) / (1 - β)

This equation shows how the additive uncertainty caused by the stochastic element does have an effect on the value function, but that this effect is limited to the constant term, which is independent of policy. Hence, certainty equivalence holds in this respect. Comparing coefficients on the terms quadratic in x,

P = R + F'QF + β(A'PA - 2A'PBF + F'B'PBF)

Rearranging,

P = R + βA'PA - 2βA'PBF + F'(Q + βB'PB)F

We know that optimal policy defines F as β(Q + βB'PB)^(-1)B'PA. Hence, we have

P = R + βA'PA - 2β²A'PB(Q + βB'PB)^(-1)B'PA
    + β²A'PB((Q + βB'PB)^(-1))'(Q + βB'PB)(Q + βB'PB)^(-1)B'PA

Using the facts that (X^(-1))' = (X')^(-1) and (Q + βB'PB)' = Q + βB'PB, this reduces to

P = R + βA'PA - β²A'PB(Q + βB'PB)^(-1)B'PA

This equation confirms that a linear policy function does imply a quadratic value function. It is often known as the algebraic matrix Riccati equation. At present, it implicitly defines the matrix P in the value function in terms of the structural matrices Q, R, A and B. The matrix Riccati equation is as far as we can go analytically in linear-quadratic control. It does define P as a function of Q, R, A and B, but the relationship is not linear and is potentially highly non-linear. Fortunately, a relatively simple iterative technique based on a matrix Riccati difference equation can be applied. Instead of trying to solve the Riccati equation directly, we start from an initial guess P_k of the matrix in the value function. The guess is updated to P_{k+1} according to

P_{k+1} = R + βA'P_k A - β²A'P_k B(Q + βB'P_k B)^(-1)B'P_k A

This equation is iterated until convergence, which is guaranteed to be unique under very weak conditions. Specifically, all eigenvalues of A having modulus less than unity is a sufficient condition. In fact, even explosive systems with eigenvalues greater than one in absolute value can be handled if some other weak conditions hold. Iteration of the matrix Riccati equation is directly analogous to the value function iterations we discussed in previous lectures. In fact, what we are doing is iterating over the value function, with each successive matrix P_k equivalent to our earlier iterations over V_k. Once P has converged, it is a simple matter to calculate F in the optimal policy function.

5 Numerical application

To illustrate the practicalities of matrix Riccati difference equation iterations, we discuss Matlab code to solve a simple example of linear-quadratic control. Our model is one in which a central bank is trying to simultaneously control inflation π_t and output y_t by choosing the interest rate i_t. The instantaneous payoff function of the central bank is assumed to be quadratic in inflation, output and the interest rate:

L_t = π_t² + y_t² + 0.1 i_t²

We assume that the central bank places equal weight on inflation and output deviations from target (normalised to zero for convenience) and a smaller weight on deviations of the interest rate from target. The objective of the central bank is to minimise the present discounted value of expected losses, with discounting at the rate β. The structure of the economy is given by two equations:

π_{t+1} = 0.75 π_t - 0.5 i_t + ε_{t+1}
y_{t+1} = 0.25 y_t - 0.5 i_t + η_{t+1}

It is not intended that these equations be considered a serious representation of the structure of the economy. Rather, the purpose is to illustrate our technique. The first equation determines inflation, which is assumed to be highly persistent and negatively correlated with interest rates. The timing is such that current interest rate decisions only affect inflation with a lag - a timing convention favoured by Athanasios Orphanides amongst others. The second equation determines output in a similar fashion. High interest rates depress output, but output itself is not as persistent as inflation. The timing convention remains the same, so interest rate decisions only affect output with a lag. Both inflation and output are subject to (potentially correlated) random disturbances in the form of the shocks ε_{t+1} and η_{t+1}. The full minimisation problem is

min_{i_t} E_0 Σ_{t=0}^∞ β^t [π_t² + y_t² + 0.1 i_t²]
s.t. π_{t+1} = 0.75 π_t - 0.5 i_t + ε_{t+1}
     y_{t+1} = 0.25 y_t - 0.5 i_t + η_{t+1}
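To get a feel for the dynamics, the two transition equations can be simulated under an ad hoc feedback rule. The rule coefficients and shock standard deviations in the sketch below are assumptions chosen purely for illustration; this is deliberately not the optimal policy, which is derived later.

```python
import random

# Simulate pi_{t+1} = 0.75*pi_t - 0.5*i_t + eps, y_{t+1} = 0.25*y_t - 0.5*i_t + eta
# under a hypothetical rule i_t = 0.5*pi_t + 0.2*y_t (illustrative, not optimal).
random.seed(1)
pi, y = 1.0, 1.0          # start with inflation and output above target
path = []
for t in range(20):
    i_rate = 0.5 * pi + 0.2 * y          # interest rate set before shocks hit
    eps = random.gauss(0, 0.01)          # assumed small inflation shock
    eta = random.gauss(0, 0.01)          # assumed small output shock
    pi, y = 0.75 * pi - 0.5 * i_rate + eps, 0.25 * y - 0.5 * i_rate + eta
    path.append((pi, y))

# Under this rule the closed-loop dynamics are stable, so the economy
# is pulled back towards the zero target within a few periods
assert abs(path[-1][0]) < 0.1 and abs(path[-1][1]) < 0.1
```

Note the timing: the interest rate is a function of the current state and only affects inflation and output in the following period, matching the lag convention described above.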

The general form of the optimal linear-quadratic control problem is

min_{u_t} E_0 Σ_{t=0}^∞ β^t [x_t' R x_t + u_t' Q u_t]
s.t. x_{t+1} = A x_t + B u_t + ε_{t+1}

To cast our model in this general form, we define the state variables as x_t = (π_t  y_t)', the control variable as u_t = i_t, and the vector of disturbances as (ε_t  η_t)'. The matrices R, Q, A and B are given by

R = [1 0; 0 1]    Q = 0.1    A = [0.75 0; 0 0.25]    B = [-0.5; -0.5]

The theory discussed in the previous section implies that all we need to do is iterate the matrix Riccati equation to find P, then calculate the policy reaction coefficients F. The equations we will need are therefore

P_{k+1} = R + βA'P_k A - β²A'P_k B(Q + βB'P_k B)^(-1)B'P_k A
F = β(Q + βB'PB)^(-1)B'PA

The Matlab code to solve the optimal linear-quadratic control problem is discussed below. Firstly, a new program is started by clearing the workspace and the discount factor is defined.

clear; beta=0.99;

The matrices Q, R, A and B are first defined to be of the correct dimension and the non-zero elements are set.

Q=zeros(1,1); R=zeros(2,2); A=zeros(2,2); B=zeros(2,1);
Q(1,1)=0.1;
R(1,1)=1; R(1,2)=0; R(2,1)=0; R(2,2)=1;
A(1,1)=0.75; A(1,2)=0; A(2,1)=0; A(2,2)=0.25;
B(1,1)=-0.5; B(2,1)=-0.5;

The next section initialises the matrix Riccati equation iterations. The variable d is used to measure the largest absolute difference in the elements of P between successive iterations. The variable i is simply a count of how many iterations have been carried out. The initial guess of the matrix P is contained in the matrix P0. As initial values, we use

P0 = [-0.000001 0; 0 -0.000001]

These starting values are used rather than zeros because, if Q = 0 and P0 is zero, then the matrix Q + βB'P0B in the Riccati equation is not invertible. In our example, Q ≠ 0 and we could just as easily have used zeros as starting values. In practice, the algorithm is not sensitive to starting values in the vast majority of cases.
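The invertibility point is easy to see in a scalar version of the Riccati update (all parameter values below are illustrative). With Q = 0 and P0 = 0 the denominator Q + βB'P0B is exactly zero, while a tiny non-zero starting value iterates without difficulty.

```python
# Scalar Riccati update P1 = R + beta*A*P0*A - (beta*A*P0*B)^2/(Q + beta*B*P0*B).
# Illustrative parameters; note the division by Q + beta*B*P0*B.
beta, A, B, R = 0.99, 0.9, 1.0, 1.0

def riccati_step(P0, Q):
    denom = Q + beta * B * P0 * B
    return R + beta * A * P0 * A - (beta * A * P0 * B) ** 2 / denom

Q = 0.0
try:
    riccati_step(0.0, Q)          # Q = 0 and P0 = 0: denominator is zero
    blew_up = False
except ZeroDivisionError:
    blew_up = True
assert blew_up

P = -0.000001                     # the tiny starting value used in the text
for _ in range(500):
    P = riccati_step(P, Q)        # iterates fine from the perturbed start
assert P > 0                      # converges to a positive fixed point
```

In this scalar Q = 0 case the iteration in fact settles at P = R: with a free control the state can be steered to zero next period, so only the current-period loss remains in the value function.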

d=1; i=0; I=i; D=d; P0=-0.000001*eye(2);

We now begin the matrix Riccati equation iterations. We continue iterating until the maximum absolute difference in the elements of P between iterations is less than 0.0000000001. The new value P_{k+1} is stored in the matrix P1. After each iteration, the new value P1 is compared to the old value P0. The difference is contained in Pd, from which the maximum absolute value is extracted into d. If d is not sufficiently small then the guess P0 is updated and the iterations continue. For each iteration, the iteration number and the maximum absolute deviation are appended to I and D respectively (initialised above alongside i and d), in order to be printed at the end.

while d > 0.0000000001
  P1=R+beta*A'*P0*A-(beta*A'*P0*B)*inv(Q+beta*B'*P0*B)*(beta*B'*P0*A);
  Pd=P1-P0;
  d=max(abs(Pd)); d=max(d');
  P0=P1;
  i=i+1;
  I=[I;i]; D=[D;d];
end

The matrix Riccati equation iterations are now complete. The policy function matrix F is calculated from the final iterate of the matrix P. Note that the code defines F with the opposite sign to section 4, so that the policy reads u_t = F x_t. Both the policy function matrix F and the value function matrix P are printed in the command window.

P=P0;
F=-inv(Q+beta*B'*P*B)*(beta*B'*P*A);

ID=[I(2:length(I)) D(2:length(I))];
disp('       i                  d');
disp(ID);
disp('SOLUTIONS');
disp('F'); disp(F);
disp('P'); disp(P);

The output of the computer code is as follows:

       i                  d
   1.00000000000000   1.00000093812485
   2.00000000000000   0.32523416362466
   3.00000000000000   0.08055819178924
   4.00000000000000   0.01887570616174
   5.00000000000000   0.00433868656498
   6.00000000000000   0.00099273060023
   7.00000000000000   0.00022690780983
   8.00000000000000   0.00005185174817
   9.00000000000000   0.00001184823275
  10.00000000000000   0.00000270731205
  11.00000000000000   0.00000061861695
  12.00000000000000   0.00000014135300
  13.00000000000000   0.00000003229893
  14.00000000000000   0.00000000738025
  15.00000000000000   0.00000000168638
  16.00000000000000   0.00000000038533
  17.00000000000000   0.00000000008805

SOLUTIONS
F
   0.74495417123607   0.17590987848800
P
   1.43029303877617  -0.10618330436474
  -0.10618330436474   1.04418992871296

As can be seen from the low number of iterations, the matrix Riccati equation iterations converge quickly. Returning to the context of our numerical model, the results imply policy and value functions of the following form:

i_t = 0.745 π_t + 0.176 y_t
V(π_t, y_t) = 1.43 π_t² + 1.04 y_t² - 2 × 0.11 π_t y_t

According to the policy function, the interest rate needs to rise whenever inflation or output is above target. The result is intuitively appealing, with the central bank deflating the economy when inflation and/or output is too high. The larger reaction to inflation than to output is due to our assumption that inflation is more persistent than output. Inflation is intrinsically more problematic in the model since, if inflation deviates from target in the current period, the deviation is likely to persist into the next period. The value function can be interpreted similarly. The coefficient on the square of inflation exceeds that on the square of output precisely because the higher persistence of inflation makes it more problematic. The negative coefficient on the cross-product of inflation and output reflects the fact that it is easier to control inflation and output when they are deviating from target in the same direction. A rise in the interest rate depresses both inflation and output, so if inflation is above target and output is below target (i.e. stagflation) then it is very difficult to stabilise the economy.
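For readers without Matlab, the whole calculation can be replicated in a short plain-Python sketch (no matrix libraries; this is an illustrative translation, not the original code). It reproduces the printed values of F and P to several decimal places.

```python
# Plain-Python replication of the Riccati iteration for the central bank
# example: 2x2 matrices as nested lists, scalar control, so the matrix
# inverse in the update reduces to a scalar division.
beta = 0.99
R = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.75, 0.0], [0.0, 0.25]]
B = [-0.5, -0.5]        # 2x1 loading of the interest rate on the states
Q = 0.1                 # weight on the interest rate in the loss function

def bpb(P):             # scalar B'PB
    return sum(B[i] * P[i][j] * B[j] for i in range(2) for j in range(2))

def bpa(P):             # 1x2 row vector B'PA
    return [sum(B[i] * P[i][j] * A[j][k] for i in range(2) for j in range(2))
            for k in range(2)]

P = [[-0.000001, 0.0], [0.0, -0.000001]]
for it in range(1000):
    denom = Q + beta * bpb(P)
    row = bpa(P)
    APA = [[sum(A[m][i] * P[m][n] * A[n][j] for m in range(2) for n in range(2))
            for j in range(2)] for i in range(2)]                  # A'PA
    P1 = [[R[i][j] + beta * APA[i][j]
           - (beta * row[i]) * (beta * row[j]) / denom
           for j in range(2)] for i in range(2)]
    d = max(abs(P1[i][j] - P[i][j]) for i in range(2) for j in range(2))
    P = P1
    if d < 1e-10:
        break

# Policy coefficients with the sign convention i_t = F[0]*pi_t + F[1]*y_t
denom = Q + beta * bpb(P)
F = [-beta * g / denom for g in bpa(P)]

assert abs(F[0] - 0.744954) < 1e-4 and abs(F[1] - 0.175910) < 1e-4
assert abs(P[0][0] - 1.430293) < 1e-4 and abs(P[0][1] + 0.106183) < 1e-4
```

The final assertions check the results against the Matlab output printed above, and the loop exits after roughly the same seventeen iterations.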