Dynamic Programming Handout

14.451 Recitation, February 8th, 2005 -- Todd Gormley

What is Dynamic Programming (DP)?
----------------------------------

DP is the technique we use to change a problem from:

$$\max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(x_t, u_t) \quad \text{s.t.} \quad x_{t+1} = g(x_t, u_t), \quad x_0 \text{ given}$$

Into something of the form,

$$V(x) = \max_{u} \left\{ r(x, u) + \beta V[g(x, u)] \right\}$$

Which is the state variable? Which is the control?

Maximization on the RHS gives us a policy function, $h(x)$, such that,

$$V(x) = r(x, h(x)) + \beta V(g(x, h(x)))$$

The goal of this handout is to show you how to solve for $h(x)$ and $V(x)$.

Conditions for the DP Solution
----------------------------------

In order to solve this problem, we need certain conditions to be true. One set typically used is:

1. $r$ concave and bounded
2. constraint set generated by $g$ is convex and compact

But, for the purpose of 14.451, you should just assume that the necessary conditions for solving with the Bellman equation are satisfied.

[For greater details on dynamic programming and the necessary conditions, see Stokey and Lucas (1989) or Ljungqvist and Sargent (2000). Ivan's 14.128 course also covers this in greater detail.]

General Results of Dynamic Programming
----------------------

1. $V(x)$ unique and strictly concave
2. $u = h(x)$ is time invariant and unique
3. Off corners, $V(x)$ is differentiable
4. Solution can be found by iteration:

$$V_{j+1}(x) = \max_{u} \left\{ r(x, u) + \beta V_j(x') \right\} \quad \text{s.t.} \quad x' = g(x, u), \quad x \text{ given}$$

Two Methods for Solving DP Problem
----------------------------

1. Iteration over the Value Function
2. Guess and Verify

This handout will now work you through an example of how to solve a DP problem in the context of a basic growth model with no leisure. I will first solve the model in the usual way with Optimal Control. Then I will show you how to solve it using DP.

A Simple Model of Economic Growth
-----------------------------

The basic setup of the problem for the social planner is:

$$\max_{\{c_t, k_{t+1}\}_{t=0}^{\infty}} U_0 = \sum_{t=0}^{\infty} \beta^t U(c_t)$$

$$\text{s.t.} \quad c_t + k_{t+1} \le (1 - \delta) k_t + f(k_t), \qquad c_t, k_{t+1} \ge 0, \qquad k_0 > 0 \text{ given}$$

As usual, assume all the other typical growth model assumptions apply, e.g. Inada conditions.

Why can we ignore the conditions $c_t, k_{t+1} \ge 0$?

Finally, to be explicit, assume log preferences and a Cobb-Douglas production function.

$$U(c) = \ln c, \qquad f(k) = k^{\alpha}$$

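Solving Using Optimal Control (i.e. the typical Lagrangian)
------------------

The Lagrangian looks as follows:

$$L = \sum_{t=0}^{\infty} \beta^t U(c_t) + \sum_{t=0}^{\infty} \mu_t \left[ (1 - \delta) k_t + f(k_t) - c_t - k_{t+1} \right]$$

Note that we have an infinite number of constraints, one for each period. We can now solve this just like a normal Lagrangian.

The FOCs are as follows:

$$c_t: \quad \frac{\beta^t}{c_t} = \mu_t$$

$$k_{t+1}: \quad \mu_{t+1} \left[ (1 - \delta) + \alpha k_{t+1}^{\alpha - 1} \right] = \mu_t$$

And our budget constraint is:

$$c_t + k_{t+1} = (1 - \delta) k_t + k_t^{\alpha}$$

Combining our FOCs, we have:

$$\frac{c_{t+1}}{c_t} = \beta \left[ \alpha k_{t+1}^{\alpha - 1} + 1 - \delta \right]$$

So, we can find the growth rate of consumption in the economy, which is nice. However, it is a bit difficult to see how much individuals consume and invest in each period of time. This is one way Dynamic Programming can help us.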
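For readers who want the intermediate step, here is a short sketch of how the two FOCs combine. It assumes log preferences and $f(k) = k^{\alpha}$ as specified in the model setup, and simply substitutes the consumption FOC (at dates $t$ and $t+1$) into the capital FOC.

```latex
\begin{align*}
\mu_t &= \frac{\beta^t}{c_t}, \qquad \mu_{t+1} = \frac{\beta^{t+1}}{c_{t+1}}
  && \text{(FOC for } c \text{ at } t \text{ and } t+1\text{)} \\
\frac{\beta^{t+1}}{c_{t+1}} \left[ (1-\delta) + \alpha k_{t+1}^{\alpha-1} \right]
  &= \frac{\beta^t}{c_t}
  && \text{(substitute into the FOC for } k_{t+1}\text{)} \\
\frac{c_{t+1}}{c_t} &= \beta \left[ \alpha k_{t+1}^{\alpha-1} + 1 - \delta \right]
  && \text{(multiply by } c_{t+1} / \beta^t\text{)}
\end{align*}
```

Setting $c_{t+1} = c_t$ in the last line pins down the steady-state capital stock, but it still does not tell us the level of consumption along the transition path.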

Solving Using Dynamic Programming
----------------------------------

First, let's rewrite the problem in the DP form. This is done as follows:

$$V(k_t) = \max_{c_t, k_{t+1}} \left\{ \ln c_t + \beta V(k_{t+1}) \right\} \quad \text{s.t.} \quad c_t + k_{t+1} = k_t^{\alpha}$$

Note: I've set $\delta = 1$ in order to simplify the math.

How should we interpret $V(k_t)$?

We can actually now drop the time subscripts of the problem, and plug in for the constraint. After doing this, we have the following:

$$V(k) = \max_{k' \le k^{\alpha}} \left\{ \ln(k^{\alpha} - k') + \beta V(k') \right\}$$

There are now two ways to solve the problem: via Guess and Verify or via Iteration. I'll first start with Guess and Verify.

Guess and Verify Method: The Policy Functions
----------------------------------

Let's guess that $V(k)$ is of the form $E + F \ln k$, where $E$ and $F$ are unknown constants. Using this guess, we solve the maximization problem on the RHS of the Bellman equation.

$$\max_{k'} \left\{ \ln(k^{\alpha} - k') + \beta E + \beta F \ln k' \right\}$$

The FOC is:

$$-\frac{1}{k^{\alpha} - k'} + \frac{\beta F}{k'} = 0$$

Solving this, we have:

$$k' = \frac{\beta F}{1 + \beta F} k^{\alpha}$$

How can we interpret this? Using the budget constraint, we see that it is optimal for the individual to consume

$$c = \frac{1}{1 + \beta F} k^{\alpha}$$

Both of these solutions are the policy functions, i.e. capital tomorrow, $k'$, and consumption today, $c$, are only functions of capital today, $k$! We express these policy functions as $c(k)$ and $k'(k)$. This is clearly a much cleaner result than when we solved with optimal control.

Guess and Verify Method: Finding & Proving the Value Function
------------------

To finish the problem, we actually need to finish solving for the constants $E$ and $F$, and prove that our guess satisfies the Bellman equation. To do this, we plug our policy functions back into the Bellman equation, and solve for our constants. This is shown below:

$$V(k) = \max_{k'} \left\{ \ln(k^{\alpha} - k') + \beta E + \beta F \ln k' \right\}$$

$$E + F \ln k = \ln\left(k^{\alpha} - k'(k)\right) + \beta E + \beta F \ln k'(k)$$

$$E + F \ln k = \ln\left(k^{\alpha} - \frac{\beta F}{1 + \beta F} k^{\alpha}\right) + \beta E + \beta F \ln\left(\frac{\beta F}{1 + \beta F} k^{\alpha}\right)$$

With a fair amount of algebra, you can solve for the constants. I'll leave that math to you.
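If you want to check your algebra, one way to organize it is to match coefficients on $\ln k$ and on the constant term separately. The sketch below assumes $f(k) = k^{\alpha}$, $\delta = 1$ and $0 < \alpha\beta < 1$; try it yourself before reading on.

```latex
\begin{align*}
E + F \ln k
  &= \alpha \ln k - \ln(1 + \beta F) + \beta E
     + \beta F \ln\!\left(\frac{\beta F}{1 + \beta F}\right) + \alpha \beta F \ln k \\[4pt]
\text{coefficients on } \ln k: \quad
  F &= \alpha + \alpha \beta F
  \;\Longrightarrow\; F = \frac{\alpha}{1 - \alpha\beta} \\[4pt]
\text{so that} \quad
  1 + \beta F &= \frac{1}{1 - \alpha\beta}, \qquad
  \frac{\beta F}{1 + \beta F} = \alpha\beta \\[4pt]
\text{constants:} \quad
  (1 - \beta) E &= \ln(1 - \alpha\beta) + \frac{\alpha\beta}{1 - \alpha\beta} \ln(\alpha\beta)
  \;\Longrightarrow\;
  E = \frac{1}{1 - \beta}\left[ \ln(1 - \alpha\beta)
      + \frac{\alpha\beta}{1 - \alpha\beta} \ln(\alpha\beta) \right]
\end{align*}
```

Under these assumptions the policy functions reduce to $k'(k) = \alpha\beta k^{\alpha}$ and $c(k) = (1 - \alpha\beta) k^{\alpha}$: the agent saves a constant fraction $\alpha\beta$ of output each period.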

Solving DP via Iteration: Why Does it Work?
-----------------------------

Suppose we have the following type of Bellman Equation:

$$V(k) = \max_{c, k'} \left\{ U(c) + \beta V(k') \right\} \quad \text{s.t.} \quad c + k' = f(k)$$

This can then be rewritten as:

$$V(k) = \max_{k' \le f(k)} \left\{ U(f(k) - k') + \beta V(k') \right\}$$

Under the proper assumptions, the operator defined by the RHS of this Bellman equation is a contraction mapping and has a unique fixed point. More specifically:

Let $B$ be a set of continuous and bounded functions $v: [0, f(k)] \to \mathbb{R}$ and consider the mapping $T: B \to B$ defined as follows:

$$Tv(k) = \max_{c, k'} \left\{ U(c) + \beta v(k') \right\} \quad \text{s.t.} \quad c + k' = f(k)$$

Because $T$ is a contraction mapping, and has a unique fixed point $V = TV$, we can use iteration to solve the problem. Specifically, we use $V_{j+1}(k) = \max \left\{ U(c) + \beta V_j(k') \right\}$, and do the following:

1. Make an initial guess at the value function, $V(k)$. Call this $V_0(k)$.
2. Perform the maximization of the Bellman equation, using your guess $V_0(k)$.
3. This yields you a new value function, $V_1(k)$.
4. Replace $V_0(k)$ with $V_1(k)$, and repeat Step 2.
5. Continue this iteration until convergence to the fixed point $V(k)$.
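To see the contraction property at work, here is a minimal Python sketch (Python is used here only for illustration). It applies a discretized version of the operator T to two very different initial guesses and records the sup-norm distance between them after each application. The parameter values, the grid, and the names `apply_T` and `sup_norm` are all illustrative assumptions, with log utility and f(k) = k^alpha as in the growth example.

```python
import math

ALPHA, BETA = 0.3, 0.95                      # illustrative values (assumed)
GRID = [0.05 * (i + 1) for i in range(10)]   # hypothetical capital grid

def apply_T(v):
    """Apply the Bellman operator once to a value function stored on GRID."""
    out = []
    for k in GRID:
        # Only choices k' that leave strictly positive consumption are allowed.
        candidates = [
            math.log(k ** ALPHA - k_next) + BETA * v[j]
            for j, k_next in enumerate(GRID)
            if k ** ALPHA - k_next > 0
        ]
        out.append(max(candidates))
    return out

def sup_norm(v, w):
    """Largest absolute difference between two value functions on GRID."""
    return max(abs(a - b) for a, b in zip(v, w))

# Two deliberately different starting guesses.
v = [0.0] * len(GRID)
w = [100.0 * k for k in GRID]

distances = [sup_norm(v, w)]
for _ in range(20):
    v, w = apply_T(v), apply_T(w)
    distances.append(sup_norm(v, w))

# Each application of T shrinks the distance by at least the factor BETA.
for d_old, d_new in zip(distances, distances[1:]):
    print(f"{d_old:10.4f} -> {d_new:10.4f}")
```

Because the distance between any two guesses shrinks by at least the factor beta per step, both sequences are pulled toward the same fixed point, which is why the starting guess does not matter.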

Solving via Iteration on the Value Function (in practice)
-------------------------

We first need to make a guess at the value function. We'll call the first guess $V_0(k)$. And to keep it simple, I'll guess that $V_0(k) = 0$. Now let's solve the following problem:

$$V_1(k) = \max_{k' \le k^{\alpha}} \left\{ \ln(k^{\alpha} - k') + \beta V_0(k') \right\}$$

Plugging in for our guess, we have:

$$V_1(k) = \max_{k' \le k^{\alpha}} \left\{ \ln(k^{\alpha} - k') \right\}$$

What is $k'$ and $V_1(k)$?

Now, we use the same method to find $V_2(k)$, and we continue doing this until convergence. Unfortunately, this method is quite time and algebra intensive, so I won't work out any more steps here.

Thankfully, there is a quicker way to do this, and this is done by using a mathematical program such as MATLAB. This handout will now provide a rather detailed sketch on how to numerically solve a dynamic programming problem using a mathematical program, such as MATLAB. For help with MATLAB syntax, please see the handout written by Francesco Franco.
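For reference, here is a sketch of what the first two analytical iterations look like, assuming $f(k) = k^{\alpha}$, $\delta = 1$, the guess $V_0(k) = 0$, and the constraint $k' \ge 0$. Try the question above yourself before reading it.

```latex
\begin{align*}
V_0(k) = 0: \quad & k' = 0 \ \text{(nothing is gained by saving)}, \qquad
  V_1(k) = \ln k^{\alpha} = \alpha \ln k \\[4pt]
V_2(k) &= \max_{k'} \left\{ \ln(k^{\alpha} - k') + \alpha\beta \ln k' \right\} \\
\text{FOC:} \quad & \frac{1}{k^{\alpha} - k'} = \frac{\alpha\beta}{k'}
  \;\Longrightarrow\; k' = \frac{\alpha\beta}{1 + \alpha\beta} k^{\alpha} \\[4pt]
V_2(k) &= \alpha (1 + \alpha\beta) \ln k
  + \alpha\beta \ln(\alpha\beta) - (1 + \alpha\beta) \ln(1 + \alpha\beta)
\end{align*}
```

The coefficient on $\ln k$ goes from $\alpha$ to $\alpha(1 + \alpha\beta)$, and continuing the iteration gives $\alpha(1 + \alpha\beta + (\alpha\beta)^2 + \dots)$, which converges to $\alpha / (1 - \alpha\beta)$ when $\alpha\beta < 1$.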

A Rough Outline on How to Numerically Solve a DP Problem

1. Create a vector of discrete values for your state variable, k
   a. This will be your vector of potential state variables to choose from. You might want to create a vector of values that spans the steady state value of the economy.
   b. For example, suppose the steady state is k* = 3. Then, you might create the following vector of state values:

         1 2 3 4 5

2. For each value of the state variable in your vector, calculate the potential utility possible from each choice over your vector of possible states and store these values.
   a. For example, using the above 5 possible states: k = 1,2,3,4,5
      i. Calculate the consumption and utility of the agent if k = 1 and she chooses k' = 1. Suppose this yields U(c) = 2.4. Store this value.
      ii. Now calculate and store the utility of the agent when k = 1 and k' = 2. Suppose this yields U(c) = 2.7.
      iii. In this example, you will have five utility values for each of the five different states, depending on your five potential choices of k'. Thus, you should have 25 values saved in total.

      Note: You have to be a bit careful here. Some of the values for k' in your vector of potential states may imply c < 0, i.e. you are violating the constraints of the problem. You need to watch for these cases, and make sure they are not considered a viable choice for the individual.
   b. Save these values to a matrix as follows:

         2.4 3 2.6 2.7 2. 4. 2 3 2 3.3

      i. Columns represent k, rows represent k'

3. Make an initial guess at the value function. Call this Vold(k).
   a. One way to do this is to create an appropriately long vector of zeros, i.e. your initial guess is that for any of the states you are considering, the value function returns zero: V0(k) = 0 for all k.
      i. In the example I've been using so far, I would need to create a vector of length five since I have five possible states: k = 1,2,3,4,5.
      ii. This would look like:

         0 0 0 0 0

4. Now iterate over the value function until it converges, i.e. use your initial guess at the value function, Vold(k), to calculate a new guess at the value function, Vnew(k).
   a. First, think of your Bellman equation as follows:

         Vnew(k) = max { U(c) + β Vold(k') }

   b. Second, choose the maximum value for each potential state variable by using your initial guess at the value function, Vold(k), and the utilities you calculated in part 2, i.e. calculate U(c) + β Vold(k') for each k and k' combo and choose the maximum value for each k. This is your Vnew(k).
      i. In the above example of k = 1, it is easy to see from my matrix of utility values in part 2b that the maximum utility the agent can achieve is 2.7, and this occurs when she chooses k' = 2.
      ii. So, using U(c) = 2.7 and β Vold(2) = 0 (because my initial guess is Vold(k) = 0 for all k), we see that Vnew(1) = 2.7.
   c. Third, store these values into a vector, Vnew(k). This is your new guess at the value function for each potential state.
      i. In the above example, I would have the following after the first iteration:

         2.7 2. 4. 2.6 3.3

   d. Use some test to see if your new guess at the value function, Vnew(k), is arbitrarily close to your initial guess, Vold(k).
      i. If it fails the test, replace your old guess at the value function with the new one, and start at 4a again, i.e. Vold(k) = Vnew(k).
      ii. If it passes the test, you're done! Move on to Step 5.

5. Plot your value function

6. Plot your policy functions: c(k) and k'(k)
   a. In the above example when k = 1, we saw that it was optimal to choose k' = 2. Suppose this remains true after we achieved convergence. Then, k'(1) = 2. You want to store this in a vector, along with the k'(k) for all other possible values of k.
      i. You can do something similar to find c(k).

Note: If you wanted to, you could calculate the policy function after each iteration on the value function.
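The outline above can be sketched in a few dozen lines of code. The version below is written in Python rather than MATLAB, purely as an illustration. It assumes log utility, f(k) = k^alpha and delta = 1 as in the growth example. The parameter values, grid size, tolerance and all function names (`make_grid`, `utility_matrix`, `solve`) are hypothetical choices, not part of the handout. Plotting (Steps 5 and 6) is left out; the function returns the vectors you would plot.

```python
import math

ALPHA, BETA = 0.3, 0.9   # illustrative parameters (assumed, not from the handout)
N_GRID = 100             # number of grid points (assumed)

def make_grid(n=N_GRID):
    """Step 1: grid for k spanning the steady state k* = (alpha*beta)^(1/(1-alpha))."""
    k_star = (ALPHA * BETA) ** (1.0 / (1.0 - ALPHA))
    return [k_star * (0.2 + 1.6 * i / (n - 1)) for i in range(n)]

def utility_matrix(grid):
    """Step 2: util[j][i] = ln(c) for state grid[i] and choice grid[j].

    Rows are k', columns are k. Choices that leave c <= 0 keep the value
    -inf, so they can never be picked by the maximization.
    """
    util = [[-math.inf] * len(grid) for _ in grid]
    for j, k_next in enumerate(grid):
        for i, k in enumerate(grid):
            c = k ** ALPHA - k_next
            if c > 0:
                util[j][i] = math.log(c)
    return util

def solve(grid, util, tol=1e-6, max_iter=5000):
    """Steps 3 and 4: iterate on the value function until it converges."""
    n = len(grid)
    v_old = [0.0] * n                              # Step 3: initial guess of zeros
    for _ in range(max_iter):
        v_new, best = [], []
        for i in range(n):                         # Step 4b: best k' for each k
            j_star = max(range(n), key=lambda j: util[j][i] + BETA * v_old[j])
            best.append(j_star)
            v_new.append(util[j_star][i] + BETA * v_old[j_star])
        gap = max(abs(a - b) for a, b in zip(v_new, v_old))   # Step 4d: the test
        v_old = v_new
        if gap < tol:
            break
    else:
        raise RuntimeError("value function iteration did not converge")
    k_policy = [grid[j] for j in best]             # Step 6: k'(k)
    c_policy = [k ** ALPHA - kp for k, kp in zip(grid, k_policy)]   # and c(k)
    return v_old, k_policy, c_policy

grid = make_grid()
value, k_policy, c_policy = solve(grid, utility_matrix(grid))

# For this special case the known closed-form policy is k'(k) = alpha*beta*k^alpha,
# so we can report how far the grid solution is from it.
max_error = max(abs(kp - ALPHA * BETA * k ** ALPHA) for k, kp in zip(grid, k_policy))
print(f"largest policy error on the grid: {max_error:.5f}")
```

A finer grid brings the numerical policy closer to the closed form at the cost of more computation, since each iteration compares every state with every possible choice. Storing the utility matrix once, as in Step 2, avoids recomputing ln(c) on every iteration.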