Inaugural - Dissertation


Inaugural dissertation for the attainment of the doctoral degree of the Combined Faculty of Natural Sciences and Mathematics (Naturwissenschaftlich-Mathematische Gesamtfakultät) of the Ruprecht-Karls-Universität Heidelberg, submitted by Diplom-Mathematiker Markus Fischer from Berlin. Date of the oral examination: 2 November 2007


Discretisation of continuous-time stochastic optimal control problems with delay

Referees: Prof. Dr. Markus Reiß, Universität Heidelberg; Prof. Salah-Eldin A. Mohammed, Southern Illinois University, Carbondale


Abstract

In the present work, we study discretisation schemes for continuous-time stochastic optimal control problems with time delay. The dynamics of the control problems to be approximated are described by controlled stochastic delay (or functional) differential equations. The value functions associated with such control problems are defined on an infinite-dimensional function space. The discretisation schemes studied are obtained by replacing the original control problem by a sequence of approximating discrete-time Markovian control problems with finite or finite-dimensional state space. Such a scheme is convergent if the value functions associated with the approximating control problems converge to the value function of the original problem.

Following a general method for the discretisation of continuous-time control problems, sufficient conditions for the convergence of discretisation schemes for a class of stochastic optimal control problems with delay are derived. The general method itself is cast in a formal framework. A semi-discretisation scheme for a second class of stochastic optimal control problems with delay is proposed. Under standard assumptions, convergence of the scheme as well as uniform upper bounds on the discretisation error are obtained. The question of how to numerically solve the resulting discrete-time finite-dimensional control problems is also addressed.

Zusammenfassung (Summary)

In the present work, we investigate schemes for the discretisation of continuous-time stochastic control problems with time delay. The dynamics of such problems are described by controlled stochastic differential equations with memory. The associated value functions are defined on an infinite-dimensional function space. The discretisation schemes we consider are obtained by replacing the original problem with a sequence of approximating discrete-time Markovian control problems whose state space is finite-dimensional or finite. Such a scheme is convergent if the value functions of the approximating control problems tend to the value function of the original problem.

By applying a general method for the discretisation of continuous-time control problems, we obtain sufficient conditions for the convergence of discretisation schemes for a class of stochastic control problems with time delay. The method of convergence analysis itself is cast in a formal framework. We then introduce a semi-discretisation scheme for a second class of stochastic control problems with time delay. Under standard assumptions, convergence of the scheme as well as uniform upper bounds on the discretisation error are derived. Finally, we address the question of how the resulting finite-dimensional control problems can be solved numerically.

Danksagung (Acknowledgements)

For suggesting the area in which the present work lies, and for his supervision and uninterrupted support in all professional and also non-professional matters, I thank Prof. Markus Reiß. My thanks are due to Prof.ssa Giovanna Nappo of La Sapienza University for the fruitful collaboration and her hospitality during my stay in Rome from April to September 2006. For his support in organising this stay, I thank Prof. Peter Imkeller. I thank the members of the research group Stochastische Algorithmen und Nichtparametrische Statistik at the Weierstraß Institute (WIAS) in Berlin and the members of the statistics group at the Universität Heidelberg for the pleasant time spent together. For valuable advice on the theory of deterministic control problems, I thank Prof. Maurizio Falcone. For technical discussions, suggestions and support, I thank Stefan Ankirchner, Christian Bender, Christine Grün, Jan Johannes, Alexander Linke, Jan Neddermeyer, Eva Saskia Rohrbach and Karsten Tabelow. I gratefully acknowledge financial support from the Deutsche Forschungsgemeinschaft (DFG) and the ESF programme Advanced Methods in Mathematical Finance. Last but not least, I thank Stella for her patience and her understanding.


Contents

Notation and abbreviations

1 Introduction
    Stochastic optimal control problems with delay
        Stochastic delay differential equations
        Optimal control problems with delay
    Examples of optimal control problems with delay
        Linear quadratic control problems
        A simple model of resource allocation
        Pricing of weather derivatives
        Delay problems reducible to finite dimension
    Approximation of continuous-time control problems
    Aim and scope

2 The Markov chain method
    Kushner's approximation method
    An abstract framework
        Optimisation and control problems
        Approximation and convergence
    Application to stochastic control problems with delay
        The original control problem
        Existence of optimal strategies
        Approximating chains
        Convergence of the minimal costs
        An auxiliary result
    Discussion

3 Two-step time discretisation and error bounds
    The original control problem
    First discretisation step: Euler-Maruyama scheme
    Second discretisation step: piecewise constant strategies
    Bounds on the total error
    Solving the control problems of degree (N, M)
    Conclusions and open questions

A Appendix
    A.1 On the Principle of Dynamic Programming
    A.2 On the modulus of continuity of Itô diffusions
    A.3 Proofs of constant coefficients error bounds

Bibliography

Notation and abbreviations

$a \wedge b$ : the smaller of the two numbers $a$, $b$
$a \vee b$ : the bigger of the two numbers $a$, $b$
$1_A$ : indicator function of the set $A$
$\lfloor x \rfloor$ : Gauß bracket of the real number $x$, that is, the largest integer not greater than $x$
$\lceil x \rceil$ : the least integer not smaller than the real number $x$
$\mathbb{N}$ : the set of natural numbers starting from one
$\mathbb{N}_0$ : the set of all non-negative integers
$\mathbb{Z}$ : the set of all integers
$B(X)$ : the space of all bounded real-valued functions on the set $X$
$C(X, Y)$ : the space of all continuous functions from the topological space $X$ to the topological space $Y$
$C(X)$ : the space of all continuous real-valued functions on the topological space $X$
$D(I)$ : the Skorohod space of all real-valued càdlàg functions on the interval $I$
$C$ : in Chapter 3: the space $C([-r, 0], \mathbb{R}^d)$
$C_N$ : in Chapter 3: the space $C([-r - \tfrac{r}{N}, 0], \mathbb{R}^d)$
$\hat{C}_N$ : in Chapter 3: the space of all $\varphi \in C$ which are piecewise linear w.r.t. the grid $\{k \tfrac{r}{N} \mid k \in \mathbb{Z}\} \cap [-r, 0]$
$A^T$ : transpose of the matrix $A$
càdlàg : right-continuous with left-hand limits (French acronym)
iff : if and only if
w.r.t. : with respect to


Chapter 1

Introduction

In this thesis, discretisation schemes for the approximation of continuous-time stochastic optimal control problems with time delay in the state dynamics are studied. Optimal control problems of this kind are infinite-dimensional control problems in a sense to be made precise below; they arise in engineering, economics and finance, among other fields. We will derive results about the convergence of discretisation schemes. For a more specific semi-discretisation scheme, a priori bounds on the discretisation error will also be obtained. Such results are useful in the numerical solution of the original control problems.

Section 1.1 presents the class of optimal control problems we will be concerned with. In Section 1.2, some examples of optimal control problems with delay are given. Section 1.3 provides an overview of approaches and some results from the literature related to the discretisation of continuous-time optimal control problems with or without delay. The organisation of the main part of the present work, its aim and scope are specified in Section 1.4.

1.1 Stochastic optimal control problems with delay

Here, we introduce the type of optimal control problems we will be concerned with in this thesis. An optimal control problem is composed of two parts: a controlled system and a performance criterion. Given an initial condition of the system and a strategy, the system produces a unique output. A numerical value is assigned to each output according to the performance criterion. In this way, the performance of any strategy for any given initial condition is measured. The objective is to find strategies which perform as well as possible, and to calculate optimal performance values.

A controlled system is usually modelled as a discrete- or continuous-time parametrised dynamical system. In continuous time, controlled systems are often described by some kind of differential equation. The continuous-time controlled systems we are interested in here are modelled as stochastic or deterministic delay differential equations. We describe this class of equations in Subsection 1.1.1; a standard reference is Mohammed 1984. In Subsection 1.1.2, the class of stochastic optimal control problems with delay we study in this work is introduced. If the time delay is zero, then those problems reduce to ordinary stochastic optimal control problems. For this latter class of problems a well-developed theory exists; see, for instance, Yong and Zhou 1999 or Fleming and Soner 2006. Basic optimality criteria, in particular the Principle of Dynamic Programming, are also mentioned in Subsection 1.1.2.

1.1.1 Stochastic delay differential equations

An ordinary Itô stochastic differential equation (SDE) is an equation of the form

(1.1)  $dX(t) = b(t, X(t))\,dt + \sigma(t, X(t))\,dW(t), \quad t \ge 0,$

where $b$ is the drift coefficient, $\sigma$ the diffusion coefficient and $W(.)$ a Wiener process. When the diffusion coefficient $\sigma$ is zero, then Equation 1.1 takes on the form of an ordinary differential equation (ODE).

Let the state space be $\mathbb{R}^d$. The unknown function $X(.)$ in Equation 1.1 is then an $\mathbb{R}^d$-valued stochastic process with continuous or càdlàg¹ trajectories. The drift coefficient $b$ is a function $[0, \infty) \times \mathbb{R}^d \to \mathbb{R}^d$, the diffusion coefficient $\sigma$ a matrix-valued function $[0, \infty) \times \mathbb{R}^d \to \mathbb{R}^{d \times d_1}$, and $W$ is a $d_1$-dimensional Wiener process defined on a filtered probability space $(\Omega, \mathcal{F}, P)$ adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$. In the notation, we often omit the dependence on $\omega \in \Omega$.

Equation 1.1 is to be understood as an integral equation. Standard assumptions on the coefficients $b$, $\sigma$ guarantee that the initial value problem

(1.2)  $X(t) = \begin{cases} X(0) + \int_0^t b(s, X(s))\,ds + \int_0^t \sigma(s, X(s))\,dW(s), & t > 0, \\ x, & t = 0, \end{cases}$

possesses, for each $x \in \mathbb{R}^d$, a unique strong solution, that is, there is a unique (up to indistinguishability) $\mathbb{R}^d$-valued stochastic process $X = (X(t))_{t \ge 0}$ with continuous or càdlàg trajectories which is defined on $(\Omega, \mathcal{F}, P)$ and adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$ such that Equation 1.2 is satisfied. The initial condition may also be stochastic, namely an $\mathcal{F}_0$-measurable $\mathbb{R}^d$-valued random variable.

Standard assumptions guaranteeing strong existence and uniqueness of solutions to Equation 1.2 are that $b$, $\sigma$ are jointly measurable, Lipschitz continuous in the second variable uniformly in the first, and that they satisfy a condition of sublinear growth in the second variable uniformly in the first; see Karatzas and Shreve 1991: p. 289, for example.

An important property of solutions of SDEs is that they are Markov processes w.r.t. the given filtration. Another equally important property is that they are continuous semi-martingales with semi-martingale decomposition given by the SDE itself.

In addition to the notion of strong solution, there is the notion of weak solution to an SDE. While strong solutions must live on the given probability space and must be adapted to the given filtration, weak solutions are only required to exist on some suitable stochastic basis; for example, the given filtration may be the one induced by the driving Wiener process, but solutions exist only when they are adapted to some larger filtration. Thus, there are two notions of existence and also two notions of uniqueness for an SDE, cf. Karatzas and Shreve 1991: Sects. 5.2 & 5.3.

¹ A function defined on an interval is càdlàg iff it is right-continuous with limits from the left.

The basic existence and uniqueness results carry over to the case of random coefficients, that is, $b$, $\sigma$ are defined on $[0, \infty) \times \mathbb{R}^d \times \Omega$, provided $b$, $\sigma$ are $(\mathcal{F}_t)$-adapted.²

A controlled SDE can be represented in the form

(1.3)  $dX(t) = b(t, X(t), u(t))\,dt + \sigma(t, X(t), u(t))\,dW(t), \quad t \ge 0,$

where $u(.)$ is a control process, that is, an $(\mathcal{F}_t)$-adapted function $[0, \infty) \times \Omega \to \Gamma$. Here, $\Gamma$ is a separable metric space, called the space of control actions. The coefficients in Equation 1.3 are deterministic functions $[0, \infty) \times \mathbb{R}^d \times \Gamma \to \mathbb{R}^d$ and $[0, \infty) \times \mathbb{R}^d \times \Gamma \to \mathbb{R}^{d \times d_1}$, respectively. For any given control process $u(.)$, however, $b(., ., u(.))$, $\sigma(., ., u(.))$ are adapted random coefficients.

A control process $u(.)$ such that the initial value problem corresponding to the controlled equation, here Equation 1.3, has a unique solution for each initial condition of interest will be called an admissible strategy or, simply, a strategy. Throughout this thesis, we will represent control processes and strategies as $\Gamma$-valued functions defined on $[0, \infty) \times \Omega$, that is, defined on the product of time and scenario space. In the deterministic case, control processes reduce to functions $[0, \infty) \to \Gamma$, so-called open-loop controls. In the literature, control processes are often represented as feedback controls, that is, as deterministic functions defined on the product of time and state space. This representation, though natural for the control of Markov processes, leads to technical difficulties already for discrete-time control problems, see Bertsekas and Shreve. Feedback controls give rise to control processes in the form considered here.

Systems with delay are characterised by the property that their future evolution, as seen from any instant $t$, depends not only on $t$ and the current state at $t$ (and possibly the control), but also on states of the system a certain amount of time into the past. We will assume throughout that the system has bounded memory; thus, there is some finite $r > 0$ such that the future evolution of the system as seen from time $t$ depends only on $t$ and system states over the period $[t - r, t]$. The parameter $r$ is the maximal length of the memory or delay.

Stochastic delay differential equations (SDDEs) model systems with delay. The drift and diffusion coefficient of an SDDE are functions of time and trajectory segments and, possibly, the control action. For an $\mathbb{R}^d$-valued function $\psi = \psi(.)$ living on the time interval $[-r, \infty)$, the segment of length $r$ at time $t \in [0, \infty)$ is the function

$\psi_t : [-r, 0] \to \mathbb{R}^d, \quad \psi_t(s) := \psi(t + s), \quad s \in [-r, 0].$

If $\psi$ is a continuous function, then the segment $\psi_t$ at time $t$ is a continuous function defined on $[-r, 0]$. Likewise, if $\psi$ is a càdlàg function, then the segment $\psi_t$ at time $t$ is a càdlàg function defined on $[-r, 0]$. Accordingly, if $(X(t))_{t \ge -r}$ is an $\mathbb{R}^d$-valued stochastic process with continuous trajectories, then the associated segment process $(X_t)_{t \ge 0}$ is a stochastic process taking its values in $C := C([-r, 0], \mathbb{R}^d)$, the space of all $\mathbb{R}^d$-valued continuous functions on the interval $[-r, 0]$. In this work, the space $C$ will always be equipped with the supremum norm induced by the standard norm on $\mathbb{R}^d$.

² Strictly speaking, the statement about SDEs with random coefficients is true only if existence and uniqueness are understood in the strong sense. The notions of weak existence and weak uniqueness make sense also for solutions to controlled SDEs with or without delay, cf. Section 3.1.
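The segment notation can be made concrete on a time grid. The following is a minimal sketch, assuming a scalar path sampled on a uniform grid with step $h$ over $[-r, T]$; the function name `segment` and the list layout are illustrative assumptions and do not come from the thesis.

```python
def segment(path, k, n_delay):
    """Return the discretised segment psi_t for t = k * h.

    Assumed layout: path[i] holds psi(-r + i * h) with h = r / n_delay,
    so time t = k * h sits at index k + n_delay. The result lists
    psi(t + s) for s = -r, -r + h, ..., 0.
    """
    if k < 0 or k + n_delay >= len(path):
        raise IndexError("time index outside the sampled path")
    return path[k : k + n_delay + 1]


# Example: psi(t) = t sampled on [-1, 2] with h = 0.5 and delay r = 1.
h, n_delay = 0.5, 2
psi = [-1.0 + h * i for i in range(7)]
seg = segment(psi, 2, n_delay)   # segment at time t = 1.0
```

In this representation a segment is simply a sliding window of `n_delay + 1` grid values, which is one way the infinite-dimensional segment space can be replaced by a finite-dimensional one.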

The segment process associated with an $\mathbb{R}^d$-valued stochastic process with càdlàg trajectories takes its values in the space $D := D([-r, 0], \mathbb{R}^d)$ of all $\mathbb{R}^d$-valued càdlàg functions on $[-r, 0]$. We will refer to the space of trajectory segments as the segment space. As segment space we will choose either $D$ or $C$. Notice that both spaces depend on the dimension $d$ and the maximal length of the delay $r$; both $d$ and $r$ may vary.

In the notation just introduced, an SDDE is of the form

(1.4)  $dX(t) = b(t, X_t)\,dt + \sigma(t, X_t)\,dW(t), \quad t \ge 0.$

The coefficients $b$, $\sigma$ are now functions defined on $[0, \infty) \times \mathcal{D}$ or, in the case of random coefficients, on $[0, \infty) \times \mathcal{D} \times \Omega$, where $\mathcal{D}$ is the segment space. In order to obtain unique solutions, as initial condition we have to prescribe not a point $x \in \mathbb{R}^d$, but an initial segment $\varphi \in \mathcal{D}$. The initial segment might also be stochastic, namely a $\mathcal{D}$-valued $\mathcal{F}_0$-measurable random variable.

Let the segment space be the space $C$ of continuous functions. Theorem II.2.1 in Mohammed 1984: p. 36 gives sufficient conditions such that, for each initial segment $\varphi \in C$, the initial value problem

(1.5)  $X(t) = \begin{cases} X(0) + \int_0^t b(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dW(s), & t > 0, \\ \varphi(t), & t \in [-r, 0], \end{cases}$

possesses a unique strong solution. Sufficient conditions are that the coefficients $b$, $\sigma$ are measurable, are Lipschitz continuous in their segment variable under the supremum norm on $C$ uniformly in the time variable, satisfy a linear growth condition and, in case they are random, are $(\mathcal{F}_t)$-progressively measurable.

Existence and uniqueness results for SDDEs can also be derived from the existence and uniqueness results for general functional SDEs as given, for instance, in Protter 2003: Ch. 5. There, the coefficients of the SDE are allowed to be random and to depend on the entire trajectory of the solution from time zero up to the current time. Initial conditions, however, are not trajectory segments, but points in $\mathbb{R}^d$ or $\mathbb{R}^d$-valued random variables. Hence, to transfer the results, the drift and diffusion coefficient of the SDDE have to be redefined according to the given initial condition.
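To illustrate how an initial segment, rather than an initial point, drives the solution of an SDDE, here is a minimal sketch of an explicit Euler-Maruyama discretisation. It assumes a scalar equation with a single point delay, $dX(t) = b(X(t), X(t-r))\,dt + \sigma(X(t), X(t-r))\,dW(t)$, and a grid step $h = r / N$; the function name and its parameters are hypothetical, and this is a generic textbook scheme rather than the specific scheme analysed later in the thesis.

```python
import math
import random


def euler_maruyama_sdde(b, sigma, phi, r, T, n_delay, seed=0):
    """Simulate one path of dX = b(X(t), X(t-r)) dt + sigma(X(t), X(t-r)) dW.

    phi     : initial segment, a function on [-r, 0]
    n_delay : grid steps per delay length, so the step is h = r / n_delay
    Returns the grid values X(-r), X(-r + h), ..., up to time n_steps * h.
    """
    rng = random.Random(seed)
    h = r / n_delay
    n_steps = int(round(T / h))
    # The initial condition is a whole segment: fill the grid on [-r, 0].
    x = [phi(-r + i * h) for i in range(n_delay + 1)]
    for _ in range(n_steps):
        current, delayed = x[-1], x[-1 - n_delay]
        dw = rng.gauss(0.0, math.sqrt(h))   # Wiener increment over one step
        x.append(current + b(current, delayed) * h + sigma(current, delayed) * dw)
    return x


# Deterministic sanity check: x'(t) = -x(t - 1) with phi = 1 on [-1, 0].
path = euler_maruyama_sdde(lambda x, y: -y, lambda x, y: 0.0,
                           lambda s: 1.0, r=1.0, T=1.0, n_delay=4)
```

With zero diffusion and constant initial segment 1, the delayed argument equals 1 throughout $[0, 1]$, so the computed path decreases linearly from 1 at time 0 to 0 at time 1.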
A controlled SDDE can be represented in the form

(1.6)  $dX(t) = b(t, X_t, u(t))\,dt + \sigma(t, X_t, u(t))\,dW(t), \quad t \ge 0,$

where $u(.)$ is a $\Gamma$-valued control process as above and $b$, $\sigma$ are deterministic functions defined on $[0, \infty) \times \mathcal{D} \times \Gamma$, with $\mathcal{D}$ denoting the segment space. Existence and uniqueness are again a consequence of the general results applied to the random coefficients $b(., ., u(.))$, $\sigma(., ., u(.))$.

Observe that, in Equation 1.6, there is no delay in the control. At time $t > 0$, the coefficients $b$, $\sigma$ depend on $u(t)$, and $u(t)$ is $\mathcal{F}_t$-measurable. Systems with delay in the control are outside the scope of the present work. Some kind of implementation delay, however, can be captured. Let $w$ be some measurable function $\Gamma \to \mathbb{R}^l$. We can now add $l$ additional dimensions to the state space $\mathbb{R}^d$ and consider an SDDE of the form

$dX(t) = b(t, X_t, Y_t, u(t))\,dt + \sigma(t, X_t, Y_t, u(t))\,dW(t),$
$dY(t) = w(u(t))\,dt,$

where $X(.)$ represents the first $d$ components and $Y(.)$ the remaining $l$ components. The coefficients $b$, $\sigma$ do not directly depend on the trajectory of $u(.)$, but, through $Y_t$, on segments of the process $\bigl(\int_0^t w(u(s))\,ds\bigr)_{t \ge 0}$ and the initial segment $Y_0$; $b$, $\sigma$ may, for example, be functions of the difference $Y(t - \delta) - Y(t)$, where $\delta \in [0, r)$. In this way, distributed implementation delay can be modelled.

The solution of an SDDE like Equation 1.4 is, in general, not a Markov process. Suppose the coefficients of the SDDE are deterministic and uncontrolled (or else a constant control is applied), and let $X(.)$ be a solution process. Then the segment process $(X_t)_{t \ge 0}$ associated with $X(.)$ enjoys the Markov property, cf. Theorem III.1.1 in Mohammed 1984: p. 51. The Markov semigroup of linear operators induced by the transition probabilities of the segment process is weakly, but not strongly continuous. In particular, only the weak infinitesimal generator exists. A representation of the weak infinitesimal generator on a subset of its domain as a differential operator can be derived, cf. Theorem III.4.3 in Mohammed 1984.

The solution of an SDDE like Equation 1.4, although generally not a Markov process, is an Itô diffusion and a continuous semi-martingale after time zero, and the Itô formula is applicable as usual. However, the usual Itô formula does not apply to the segment process. It is possible to develop an Itô-like calculus also for the segment processes associated with solutions of SDDEs, see Hu et al. 2004 and Yan and Mohammed 2005.

In this thesis, the driving noise process of the continuous-time systems will always be a Wiener process. Extensions of some of the results of this thesis, in particular the convergence analysis of Section 2.3, to systems driven by more general Lévy processes are possible.

In Chapters 2 and 3, we will be concerned with the discretisation of controlled systems with delay; here, we give some references to works concerned with the discretisation of uncontrolled systems with delay.
An overview of numerical methods for uncontrolled SDDEs is given in Buckwar 2000. The simplest discretisation procedure is the Euler-Maruyama scheme. The work by Mao 2003 gives the rate of convergence for this scheme provided the SDDE has globally Lipschitz continuous coefficients and the dependence on the segments is in the form of generalised distributed delays; Proposition 3.3 in Section 3.2 of the present work provides a partial generalisation of Mao's results and uses arguments similar to those in Calzolari et al. 2007. The most common first order scheme is due to Milstein; see Hu et al. 2004 for the rate of convergence of this scheme applied to SDDEs with point delay.

1.1.2 Optimal control problems with delay

Recall that an optimal control problem is composed of a controlled system and a performance criterion. In what follows, the controlled system will always be described by a controlled SDDE like Equation 1.6 in Subsection 1.1.1. As initial condition, an element of the segment space $\mathcal{D}$ has to be prescribed; the segment space $\mathcal{D}$ will be either $C := C([-r, 0], \mathbb{R}^d)$ or $D := D([-r, 0], \mathbb{R}^d)$. When, in addition to the initial segment $\varphi \in \mathcal{D}$, also the initial time $t_0 \in [0, \infty)$ is allowed to vary, then the system output for initial condition $(t_0, \varphi)$ under control process $u(.)$ is determined by

(1.7)  $X(t) = \begin{cases} \varphi(0) + \int_0^t b(t_0 + s, X_s, u(s))\,ds + \int_0^t \sigma(t_0 + s, X_s, u(s))\,dW(s), & t > 0, \\ \varphi(t), & t \in [-r, 0], \end{cases}$

provided a unique solution $X = X^{t_0, \varphi, u}$ exists. Notice that the solution process $X(.)$ is defined over time $[-r, \infty)$, and the evolution of the system starts at time zero. The initial time $t_0$ only appears in the time argument of the coefficients.

The performance criterion is usually given in terms of a cost functional. The cost functionals we will consider are of the form

(1.8)  $(t_0, \varphi, u(.)) \mapsto E\Bigl[\int_0^{\tau} f(t_0 + s, X_s, u(s))\,ds + g(X_\tau)\Bigr],$

where $X = X^{t_0, \varphi, u}$ is the solution to Equation 1.7 with initial condition $(t_0, \varphi)$ under strategy $u(.)$ and $\tau$ is the remaining time, which may depend on $t_0$ and $X^{t_0, \varphi, u}$. The functions $f$, $g$ are called the cost rate and terminal cost, respectively; they may depend on segments of the solution process; in general, $f$ is a function $[0, \infty) \times \mathcal{D} \times \Gamma \to \mathbb{R}$, while $g$ is a function $\mathcal{D} \to \mathbb{R}$.

Two versions of 1.8 will play a role. The first version gives rise to optimal control problems with finite time horizon. Choose $T > 0$, the deterministic time horizon, and set $\tau := T - t_0$. For the second version, choose a bounded open set $O \subset \mathbb{R}^d$, let $\hat{\tau}^O$ be the time of first exit of $X^{t_0, \varphi, u}$ from $O$ and set $\tau := \hat{\tau}^O \wedge (T - t_0)$, where $T \in (0, \infty]$. This leads to optimal control problems with random time horizon.

Let $T > 0$ be finite, and let $\tau$ in 1.8 be $T - t_0$. Denote by $\mathcal{U}$ the set of admissible strategies, that is, the set of all those control processes $u(.)$ such that the initial value problem 1.7 yields a unique solution and the expectation in 1.8 a finite value for each initial condition $(t_0, \varphi) \in [0, T] \times \mathcal{D}$. Let the function $J : [0, T] \times \mathcal{D} \times \mathcal{U} \to \mathbb{R}$ be defined according to 1.8. Then $J$ is the cost functional of an optimal control problem with finite time horizon.

Given an optimal control problem, there is a twofold objective: determine the minimal costs and find an optimal strategy for any initial condition. A strategy $u^*$ is optimal for a given initial condition $(t_0, \varphi)$ iff

(1.9)  $J(t_0, \varphi, u^*) = \inf_{u \in \mathcal{U}} J(t_0, \varphi, u).$

Existence of optimal strategies is not always guaranteed.
Let us assume that the right hand side of Equation 1.9 is finite for all initial conditions (which is not necessarily the case). A direct minimisation of $J(t_0, \varphi, .)$ over the set $\mathcal{U}$ is usually not possible. Observe that initial conditions are time-state pairs; here, states are segments, that is, continuous or càdlàg functions on $[-r, 0]$.

A simple, yet fundamental approach, associated with the work of R. Bellman, to solving the dynamic optimisation problem is as follows. Introduce the function which assigns the minimal costs to each time-state pair. This function is called the value function. The values of the value function are, of course, unknown at this stage. If the system, the set of strategies and the cost functional have a certain additive structure in time, then the value function obeys Bellman's Principle of Optimality or, as it is also called, the Principle of Dynamic Programming (PDP).

Let $V$ denote the value function of some optimal control problem; thus, $V$ is a function $I \times S \to \mathbb{R}$, where $I$ is a time interval and $S$ the state space. Bellman's Principle then states that $V$ satisfies

(1.10)  $V(t, x) = \mathcal{T}_{t, \tilde{t}}\bigl(V(\tilde{t}, .)\bigr)(x) \quad \text{for all } x \in S,\ t, \tilde{t} \in I,\ t \le \tilde{t},$

where $(\mathcal{T}_{t, \tilde{t}})$ is a two-parameter semigroup of monotone operators, called Bellman operators; see Fleming and Soner 2006: Sect. II.3 for this abstract formulation of the PDP.

In the case at hand, the value function is defined by

(1.11)  $V : [0, T] \times \mathcal{D} \to \mathbb{R}, \quad V(t_0, \varphi) := \inf_{u \in \mathcal{U}} J(t_0, \varphi, u).$

The Principle of Dynamic Programming takes on the form

(1.12)  $V(t_0, \varphi) = \inf_{u \in \mathcal{U}} E\Bigl[\int_0^t f(t_0 + s, X^u_s, u(s))\,ds + V(t_0 + t, X^u_t)\Bigr], \quad 0 \le t \le T - t_0,$

where $X^u$ is the solution to Equation 1.7 under control process $u$ with initial condition $(t_0, \varphi)$. The minimisation on the right hand side of Equation 1.12 could be restricted to strategies defined on the time interval $[0, t]$.

Observe that the validity of the PDP has to be verified for each class of optimal control problems. For finite horizon stochastic and deterministic optimal control problems with delay, the PDP is indeed valid, see Larssen 2002 and also Appendix A.1 for the precise statement. The Markov property of the segment processes associated with solutions to Equation 1.7 under certain strategies is essential for the validity of the PDP in the form of Equation 1.12.

Notice that an optimal control problem with delay is, generally, infinite-dimensional in the sense that the corresponding value function lives on an infinite-dimensional function space, namely the segment space.

When the controlled processes are controlled Markov processes with finite-dimensional state space and the value function is sufficiently smooth, then the PDP in conjunction with Dynkin's formula allows one to derive a partial differential equation (PDE) which is solved by the value function. Such a PDE, which involves the family of infinitesimal generators associated with the controlled Markov processes and characterises the value function, is called the Hamilton-Jacobi-Bellman equation (HJB equation). In general, the value function need not be sufficiently smooth; consequently, the HJB equation does not necessarily possess classical solutions. Viscosity solutions provide the right generalisation of the concept of solution for HJB equations, see Fleming and Soner 2006.

In principle, it is possible to derive an HJB equation and define viscosity solutions also for controlled Markov processes with infinite-dimensional state space. See Chang et al. 2006 for results in this direction in connection with controlled SDDEs; also cf. Subsection 1.2.1. While, in Chapter 3, we will make extensive use of the PDP, we will not need any kind of HJB equation.

Let us also mention the fact that knowledge of the value function of an optimal control problem enables us to construct optimal or nearly optimal strategies. When time is discrete and the space of control actions $\Gamma$ is finite or compact, then optimal strategies can be constructed in feedback form and for each initial condition. We will return to this point in Section 3.4.
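In discrete time with finitely many states and control actions, the Principle of Dynamic Programming turns into a backward recursion that yields both the value function and an optimal feedback strategy. The following is a minimal sketch under those assumptions; the function name, the transition format and the toy problem are illustrative and are not taken from the thesis.

```python
def backward_induction(n_states, actions, horizon, step, run_cost, terminal_cost):
    """Finite-horizon dynamic programming on a finite state space.

    step(x, a)     -> list of (probability, next_state) pairs
    run_cost(x, a) -> cost of applying action a in state x for one step
    Returns (V, policy): V[n][x] is the minimal expected cost-to-go from
    state x at step n, and policy[n][x] is a minimising action.
    """
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[None] * n_states for _ in range(horizon)]
    V[horizon] = [terminal_cost(x) for x in range(n_states)]
    for n in range(horizon - 1, -1, -1):
        for x in range(n_states):
            best_cost, best_action = float("inf"), None
            for a in actions:
                # One-step Bellman operator: running cost plus expected value.
                q = run_cost(x, a) + sum(p * V[n + 1][y] for p, y in step(x, a))
                if q < best_cost:
                    best_cost, best_action = q, a
            V[n][x], policy[n][x] = best_cost, best_action
    return V, policy


# Toy problem: walk on {0, ..., 4}; moving costs 0.5, ending far from 4 is penalised.
V, policy = backward_induction(
    n_states=5,
    actions=[-1, 0, 1],
    horizon=3,
    step=lambda x, a: [(1.0, min(max(x + a, 0), 4))],
    run_cost=lambda x, a: 0.5 * abs(a),
    terminal_cost=lambda x: float(abs(x - 4)),
)
```

Here `policy` is a feedback strategy in the sense used above: it maps each time-state pair to a control action, and it is obtained as a by-product of computing the value function.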

A second fundamental approach to optimal control problems is via Pontryagin's Maximum Principle, see Yong and Zhou 1999: Chs. 3 & 7 for the case of finite-dimensional controlled SDEs. Pontryagin's Principle provides necessary conditions which an optimal strategy and the associated optimal process (if such exist) have to satisfy in terms of the so-called adjoint equations, which evolve backwards in time. Under certain additional assumptions, the necessary conditions become sufficient. Versions of this principle for the control of deterministic systems with delay exist, cf. the example in Subsection 1.2.2. For stochastic control problems with delay of a special form, a version of the Pontryagin Maximum Principle is derived in Øksendal and Sulem 2001. For the results of this thesis, we will not rely on the Maximum Principle.

We have not made precise any assumptions on the coefficients of the control problems introduced above. This will be done in Subsection 2.3.1 and Section 3.1, respectively, where we specify the classes of continuous-time control problems to be approximated.

1.2 Examples of optimal control problems with delay

Some examples of continuous-time optimal control problems with delay, mostly from the literature, are given in this section. Control problems with linear dynamics and a quadratic cost criterion are well-studied in many settings. In Subsection 1.2.1, we cite results concerning the representation of optimal strategies for a class of linear quadratic regulators with point as well as distributed delay. In Subsection 1.2.2, a simple deterministic problem with point delay modelling the optimal allocation of production resources is presented. Subsection 1.2.3 describes a stochastic optimal control problem with delay which may arise in finance when pricing derivatives that depend on market-external processes. Special cases of optimal control problems with delay are really equivalent to finite-dimensional control problems without delay. Subsection 1.2.4 contains results from the literature about those reducible problems.

A further example is the deterministic infinite horizon model of optimal economic growth studied in Boucekkine et al. 2005. Optimal control problems also arise in finance when the asset prices in a financial market are modelled as SDDEs, see Øksendal and Sulem 2001, for instance.

1.2.1 Linear quadratic control problems

When the system dynamics are linear in the state as well as in the control variable, the noise is additive and the cost functional has a quadratic form over a finite or infinite time horizon, then it is possible to derive a representation of the optimal strategies of the control problem. Such control problems are referred to as linear quadratic problems or linear quadratic regulators. Optimal strategies are given in feedback form; the representation involves the solution of an associated system of deterministic differential equations, the so-called Riccati equations. This is the case not only for finite-dimensional stochastic and deterministic systems, but also for systems described by abstract evolution equations (cf. Bensoussan et al., 2007). Here, we just cite a result for finite horizon linear quadratic systems with one point and one distributed delay and additive noise, see Kolmanovskiǐ and Shaǐkhet 1996: Ch. 5.

21 1.2. EXAMPLES OF OPTIMAL CONTROL PROBLEMS WITH DELAY 9 We conside the time-homogeneous case. The dynamics of the contol poblem ae given by the affine-linea equation dxt = A Xtdt + A 1 Xt dt + GsXt+sds dt B utdt + σdw t, t >, whee > is the delay length, W. a d 1 -dimensional standad Wiene pocess adapted to the filtation F t t, u. a stategy, σ is a d d 1 -matix, A, A 1 ae d d-matices, G is a bounded continuous function [, ] R d d, and B is a d l-matix. The stategy u. in Equation 1.13 is any R l -valued F t -adapted squae integable pocess. Let U denote the set of all such pocesses. Let D := D[, ], R d denote the space of all R d -valued càdlàg functions on [, ]. Given ϕ D and a stategy u. U, thee is a unique up to indistinguishability d-dimensional càdlàg pocess X. = X ϕ,u. such that Equation 1.13 is satisfied and Xt = ϕt fo all t [, ]. Let T > be the deteministic time hoizon. The quadatic cost functional fo fixed initial time zeo is the function J : D U R given by T 1.14 Jϕ, u := E X T T C XT + X T t C Xt + u T tm ut dt, whee C, C ae positive semi-definite d d-matices and M is a positive definite l l-matix. The associated value function at initial time zeo is defined by V ϕ := inf u U Jϕ, u, ϕ D. Fo the contol poblem detemined by 1.13 and 1.14, a vesion of the Hamilton- Jacobi-Bellman equation 3 allows to deive a epesentation in feedback fom of the optimal stategies. Define the function u : [, T ] D R l by u t, ϕ := M 1 B P T tϕ + Qt, sϕsds, whee P, Q ae matix-valued functions [, T ] R d d and [, T ] [, ] R d d, espectively. 
The functions $P$, $Q$ are determined by the following system of differential equations, which involves, in addition, the unknown functions $R: [0,T]\times[-r,0]\times[-r,0]\to\mathbb{R}^{d\times d}$ and $g: [0,T]\to\mathbb{R}$:

\[ \begin{aligned}
&\tfrac{d}{dt}P(t) + A_0^{\mathsf T}P(t) + P(t)A_0 + Q(t,0) + Q^{\mathsf T}(t,0) + C = P(t)BM^{-1}B^{\mathsf T}P(t),\\
&\big(\tfrac{\partial}{\partial t} - \tfrac{\partial}{\partial s}\big)Q(t,s) + P(t)G(s) + A_0^{\mathsf T}Q(t,s) + R(t,0,s) = P(t)BM^{-1}B^{\mathsf T}Q(t,s),\\
&\big(\tfrac{\partial}{\partial t} - \tfrac{\partial}{\partial s} - \tfrac{\partial}{\partial\tau}\big)R(t,s,\tau) + G^{\mathsf T}(s)Q(t,\tau) + Q^{\mathsf T}(t,s)G(\tau) = Q^{\mathsf T}(t,s)BM^{-1}B^{\mathsf T}Q(t,\tau),\\
&\tfrac{d}{dt}g(t) + \operatorname{trace}\big(\sigma^{\mathsf T}P(t)\sigma\big) = 0, \qquad t\in[0,T],\ s,\tau\in[-r,0],
\end{aligned} \tag{1.15} \]

with boundary conditions

\[ \begin{aligned}
&P(T) = C_0, \quad R(T,s,\tau) = 0, \quad Q(T,s) = 0, \quad g(T) = 0,\\
&P(t)A_1 = Q(t,-r), \quad A_1^{\mathsf T}Q(t,s) = R(t,-r,s), \qquad t\in[0,T],\ s,\tau\in[-r,0].
\end{aligned} \tag{1.16} \]

[3] The derivation of the HJB equation in Kolmanovskiǐ and Shaǐkhet (1996: Ch. 5) is not completely rigorous; see Chang et al. (2006) and the references therein for a more careful treatment. The development there starts from the expression for the weak infinitesimal generator of the segment process as derived in Mohammed (1984).

Equations (1.15) can be shown to possess a unique continuously differentiable solution $(P,Q,R,g)$ under boundary conditions (1.16), see the corresponding theorem in Kolmanovskiǐ and Shaǐkhet (1996). It is also shown that $u_0$ is indeed an optimal feedback control. This means the following. Let $\varphi\in D$, and let $X^* = X^{*,\varphi}$ be the unique solution to

\[ X^*(t) = \begin{cases} \varphi(0) + \displaystyle\int_0^t \Big( A_0X^*(\tau) + A_1X^*(\tau-r) + \int_{-r}^{0} G(s)X^*(\tau+s)\,ds + B\,u_0(\tau,X^*_\tau) \Big)\,d\tau + \sigma W(t) & \text{if } t > 0,\\[1ex] \varphi(t) & \text{if } t\in[-r,0]. \end{cases} \tag{1.17} \]

Recall the notation $X^*_\tau$ for the segment of $X^*(\cdot)$ at time $\tau$. Observe that $u_0$ is Lipschitz continuous in supremum norm in its segment variable, whence strong existence and uniqueness of the solution $X^*$ are guaranteed. Indeed, due to the form of $u_0$, Equation (1.17) is an affine-linear uncontrolled SDDE, and $X^*$ can be expressed by a variation-of-constants formula. Set $u^*(t) := u_0(t,X^*_t)$, $t\ge 0$. Then it holds that $J(\varphi,u^*) = V(\varphi)$, that is, $u^*$ is an optimal strategy and $X^*$ is the optimal process for the given initial condition $\varphi$.

In special cases, Equations (1.15) can be solved explicitly. For general linear quadratic problems, they may serve as a starting point for the numerical computation of optimal strategies and minimal costs.

A simple model of resource allocation

The following finite-horizon deterministic problem can be interpreted as a simplified model of optimal resource allocation; see the corresponding example in Bertsekas (2005) for the non-delay case. Let $T > 0$ be the time horizon, let $r\in[0,T)$ be the length of the time delay, and $c > 0$ a parameter. The dynamics of the model are given by

\[ \begin{cases} \dot{x}(t) = c\,u(t)\,x(t-r), & \text{if } t > 0,\\ x(t) = \varphi(t), & \text{if } t\in[-r,0], \end{cases} \tag{1.18} \]

where the initial path $\varphi$ is in $C^+ := C([-r,0],(0,\infty))$; if $r = 0$, then $\varphi$ is just a positive real number. An admissible strategy $u(\cdot)$ is any element of the set $\mathcal{U}$ of all Borel measurable functions $[0,\infty)\to[0,1]$. The initial time will be fixed and equal to zero. The objective is to maximise, for each initial segment $\varphi\in C^+$, the functional

\[ \tilde{J}(\varphi,u) := \int_0^T \big(1-u(t)\big)\,x(t-r)\,dt \]

over $u\in\mathcal{U}$. Clearly, this is equivalent to minimising

\[ J(\varphi,u) := \int_0^T \big(u(t)-1\big)\,x(t-r)\,dt \tag{1.19} \]

over $u\in\mathcal{U}$, since $\sup_{u\in\mathcal{U}}\tilde{J}(\varphi,u) = -\inf_{u\in\mathcal{U}}J(\varphi,u)$.

A possible interpretation of the control problem determined by (1.18) and (1.19) is the following (cf. Bertsekas, 2005). The state trajectory $x(\cdot) = x^u(\cdot)$ describes the production rate of certain commodities (e.g. wheat). Consequently, the total amount of goods produced in any time period $[0,\tau]$ is $\int_0^\tau x(t)\,dt$ (in suitable units). During the entire production period from time zero to time $T$ the producer has the choice between producing for reinvestment and the production of storable goods. This means that, at any time $t\in[0,T]$, a portion $u(t)\in[0,1]$ of the production rate is allocated to reinvestment, while the remaining portion $1-u(t)$ goes into the production of storable goods. The production rate changes in proportion to the level of reinvestment. If reinvestment is zero, then the production rate will remain constant.

In order to justify Equation (1.18), it is instructive to consider small time steps. Denote by $y(t)$ the total amount of goods produced up to time $t$, that is, $y(t) = y(0) + \int_0^t x(s)\,ds$, where $x(\cdot)$ is the production rate. Let $h > 0$ be the length of a small time step. Clearly, $y(t+h) = y(t) + \int_t^{t+h} x(s)\,ds$. On the other hand, $x(t+h) \approx x(t) + c\,u(t)\big(y(t-r) - y(t-r-h)\big)$, where the parameter $c > 0$ regulates the effectiveness of reinvestment. Letting $h$ tend to zero and taking into account the initial condition, we obtain Equation (1.18).

The objective is to maximise the total amount of stored goods, that is, to maximise $\tilde{J}(\varphi,u)$ over all strategies $u\in\mathcal{U}$ for each initial condition $\varphi\in C^+$ on the production rate. Equivalently, we can minimise $J(\varphi,u)$ over $u\in\mathcal{U}$ for each $\varphi\in C^+$. The parameter $r$, when positive, introduces a time delay. At time $t$, instead of allocating a portion $u(t)$ of the current production rate $x(t)$, the producer may allocate a portion of the past production rate $x(t-r)$. The total amount of stored goods is measured accordingly, namely by $\int_0^T (1-u(t))\,x(t-r)\,dt$. We may think of $r$ as the time it takes to transform or sell the goods produced. The control problem described above can be solved explicitly, and optimal strategies can be found.
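The small-time-step argument above can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: it uses an explicit Euler scheme with a step `h` that divides the delay, and the function names `stored_goods` and `switch_strategy`, as well as all parameter values, are hypothetical choices and not part of the original text. It approximates the stored amount $\int_0^T (1-u(t))\,x(t-r)\,dt$ for strategies that reinvest fully up to a switching time and not at all afterwards.

```python
def stored_goods(u, phi, c, r, T, h):
    """Euler scheme for x'(t) = c*u(t)*x(t-r) on [0, T] with x = phi on [-r, 0].

    Returns a Riemann-sum approximation of int_0^T (1 - u(t)) * x(t - r) dt.
    Assumes (for this sketch) that r and T are multiples of the step h.
    """
    k = round(r / h)                      # delay measured in grid steps
    n = round(T / h)                      # number of time steps
    # x[j] approximates x((j - k) * h), so x[k] is the state at time zero
    x = [phi((j - k) * h) for j in range(k + 1)]
    stored = 0.0
    for i in range(n):
        t = i * h
        delayed = x[i]                    # state at time t - r
        stored += (1.0 - u(t)) * delayed * h
        x.append(x[i + k] + h * c * u(t) * delayed)
    return stored


def switch_strategy(s):
    """Reinvest everything before time s, nothing afterwards (hypothetical family)."""
    return lambda t: 1.0 if t < s else 0.0


if __name__ == "__main__":
    c, r, T, h = 1.0, 0.5, 3.0, 0.001     # illustrative parameter values
    phi = lambda t: 1.0                   # constant initial production rate
    for s in (0.0, 1.0, 1.5, 2.0, 3.0):
        print(s, stored_goods(switch_strategy(s), phi, c, r, T, h))
```

Restricting attention to switching strategies is a simplification made for this sketch; the scheme itself accepts any function `u` with values in $[0,1]$.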
In the non-delay case, the problem can be solved explicitly by relying on the Pontryagin Maximum Principle, see the corresponding theorem in Yong and Zhou (1999: p. 13), for example. Pontryagin's Maximum Principle gives a set of necessary conditions an optimal strategy must satisfy (if it exists) in terms of the so-called adjoint variable. Under additional assumptions, those conditions are also sufficient for a strategy to be optimal, cf. Yong and Zhou (1999). In case $r = 0$, the solution of the above simple control problem by means of the Maximum Principle is given in Bertsekas (2005). For $r > 0$, we may rely on a version of Pontryagin's Principle for deterministic systems with delay, cf. Gabasov and Kirillova (1977: p. 84). Given any initial segment $\varphi\in C^+$, it can be shown that a corresponding optimal strategy satisfies

\[ u^*(t) = \begin{cases} 1 & \text{if } p(t) \ge \tfrac{1}{c},\\ 0 & \text{if } p(t) < \tfrac{1}{c}, \end{cases} \qquad t\in[0,T], \]

where $p(\cdot)$ is the solution to the adjoint terminal-value problem given, in the case at hand, by

\[ \begin{aligned}
p(t) &= 0, & t&\in[T-r,\,T],\\
\dot{p}(t) &= -1, & t&\in[T-r-\tfrac{1}{c},\,T-r],\\
\dot{p}(t) &= -c\,p(t+r), & t&\in[0,\,T-r-\tfrac{1}{c}].
\end{aligned} \tag{1.20} \]

Equations (1.20) describe a deterministic backward delay differential equation with terminal condition. It follows that an optimal strategy is given by

\[ u^*(t) = \begin{cases} 1 & \text{if } t\in[0,\,T-r-\tfrac{1}{c}],\\ 0 & \text{if } t\in(T-r-\tfrac{1}{c},\,T]. \end{cases} \tag{1.21} \]

Observe that $u^*$ depends on the delay length $r$ and the effectiveness parameter $c > 0$, but not on the initial condition. The minimal costs $J(\varphi,u^*)$, however, depend on $\varphi\in C^+$. If $r = 0$, we have an explicit solution; if $r > 0$, we can integrate in steps of length $r$. The optimal strategy as given by Equation (1.21) is of bang-bang type. It consists in reinvesting as much as possible before a critical switching time $T-r-\tfrac{1}{c}$, and then not reinvesting any more, but producing and storing until the final time is reached.

Pricing of weather derivatives

The example problem of this subsection is based on Ankirchner et al. (2007), where pricing and hedging of insurance derivatives that depend on external physical processes is studied. Let $X(\cdot)$ be a continuous-time stochastic process (one-dimensional, for simplicity) describing some physical quantity, e.g. surface temperature at a given place or averaged over a certain region. Suppose $X$ can be modelled as an SDDE of the form

\[ dX(t) = b(t,X_t)\,dt + \sigma\big(t,X(t)\big)\,dW(t), \quad t > 0, \tag{1.22} \]

where $X_t$ is the segment of length $r > 0$ of $X(\cdot)$ at time $t$, $W(\cdot)$ a standard Wiener process and $b$, $\sigma$ are appropriate functions; Equation (1.22) should possess a unique solution for each initial condition $\varphi\in D$, where $D = C([-r,0])$, for example. Suppose further that an economic agent $A$ (e.g. an insurance company) intends to sell a financial derivative on the process $X(\cdot)$. At maturity $T > 0$, the derivative yields (from the perspective of $A$) an income $F(X_T)$, where $F$ is some deterministic function $D\to\mathbb{R}$. The income thus may depend on the evolution of $X(\cdot)$ over the period $[T-r,T]$. Notice that the length $r$ of the time delay may be artificially increased. The question is which price $A$ should ask for the derivative corresponding to $F$. It is assumed that $A$ has the possibility to invest in a financial market.
In this market, there is a risky asset with price process $S(\cdot)$ such that $S(\cdot)$ and $X(\cdot)$ are correlated. We assume that $S(\cdot)$ is given by the modified Black and Scholes model

\[ dS(t) = \mu\big(t,S(t)\big)\,S(t)\,dt + \beta_1 S(t)\,dW(t) + \beta_2 S(t)\,d\tilde{W}(t), \tag{1.23} \]

where $\tilde{W}$ is a second standard Wiener process independent of the first. The processes $S(\cdot)$ and $X(\cdot)$ are correlated through $\beta_1$. The financial market is incomplete, as the physical quantity described by $X$ is not traded.

The price $p$ of the derivative that $A$ should ask can be determined as the utility indifference price, provided a utility function describing $A$'s attitude towards risk is given. Let $\Psi: \mathbb{R}\to\mathbb{R}$ denote such a function. We assume that $\Psi$ is an exponential utility function. Then the price $p$ is determined by the equation

\[ \sup_{u\in\mathcal{U}} \mathbf{E}\big[\Psi\big(V^u(T) + F(X_T) - p\big)\big] = \sup_{u\in\mathcal{U}} \mathbf{E}\big[\Psi\big(V^u(T)\big)\big], \tag{1.24} \]

where $V^u(\cdot)$ is the value of $A$'s portfolio under investment strategy $u\in\mathcal{U}$; see Ankirchner et al. (2007) for the details. For an exponential utility function $\Psi$, the unknown $p$ in Equation (1.24) factors out, and, on the left hand side of (1.24), we have a stochastic optimal control problem with delay of the type studied in a later chapter.

Delay problems reducible to finite dimension

In this subsection, we follow Bauer and Rieder (2005); but also cf. Elsanosi et al. (2000) and Larssen and Risebro (2003), where a similar approach is taken. The value function of an optimal control problem with delay lives, for fixed initial time, on the segment space associated with the system dynamics. The segment space is, apart from the case when the delay length is equal to zero, an infinite-dimensional space of functions, say $D$; for example, $D = C([-r,0],\mathbb{R}^d)$. In general, it is not possible to reduce the value function to a finite-dimensional object, that is, it is not generally possible to find a number $n\in\mathbb{N}$ and continuous functions $\Theta: D\to\mathbb{R}^n$, $\Psi: \mathbb{R}^n\to\mathbb{R}$ such that $V = \Psi\circ\Theta$. If the controlled SDDE as well as the cost functional have a special form and certain additional assumptions are fulfilled, then the optimal control problem with delay is reducible to a control problem without delay, that is, the problem is effectively finite-dimensional.

Let $\Gamma$ be a closed subset of Euclidean space of any dimension. Let $W$ be a one-dimensional standard Wiener process adapted to the filtration $(\mathcal{F}_t)_{t\ge 0}$. Denote by $\mathcal{U}$ the set of all $(\mathcal{F}_t)$-progressively measurable $\Gamma$-valued processes.
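Let $r > 0$, and let the dynamics of the control problem with delay be given by the one-dimensional controlled SDDE

\[ dX(t) = \mu_1\big(t,X(t),Y(t),u(t)\big)\,dt + \mu_2\big(X(t),Y(t)\big)\,\xi(t)\,dt + \sigma\big(t,X(t),Y(t),u(t)\big)\,dW(t), \quad t > 0, \tag{1.25} \]

where $u\in\mathcal{U}$ is a strategy, $\xi(t) := w\big(X(t-r)\big)$ and $Y(t) := \int_{-r}^{0} e^{\lambda s}\,w\big(X(t+s)\big)\,ds$ for some continuously differentiable function $w: \mathbb{R}\to\mathbb{R}$ and a constant $\lambda\in\mathbb{R}$. Here, we only give the one-dimensional result with initial time set to zero; see Bauer and Rieder (2005) for a full account. The coefficients of Equation (1.25) are measurable functions $\mu_1: [0,\infty)\times\mathbb{R}\times\mathbb{R}\times\Gamma\to\mathbb{R}$, $\mu_2: \mathbb{R}\times\mathbb{R}\to\mathbb{R}$, $\sigma: [0,\infty)\times\mathbb{R}\times\mathbb{R}\times\Gamma\to\mathbb{R}$. Equation (1.25) describes a system whose evolution depends not only on the current state $X(\cdot)$, but also on a certain weighted average over the past, namely $Y(\cdot)$, as well as a point delay, namely $\xi(\cdot)$. Let us assume, for example, that (i) $\mu_2$ is bounded and Lipschitz continuous, (ii) there is a constant $K > 0$ such that for all $t\ge 0$, $\gamma\in\Gamma$, $x,y\in\mathbb{R}$,

\[ |\mu_1(t,x,y,\gamma)| + |\sigma(t,x,y,\gamma)| \le K\big(1 + |x|\vee|y|\big), \]

and (iii) $\mu_1$, $\sigma$ are Lipschitz continuous in their respective second and third variable uniformly in the other variables. Then, for every initial segment $\varphi\in C := C([-r,0])$ and every strategy $u\in\mathcal{U}$, there is a unique continuous process $X = X^{\varphi,u}$ such that Equation (1.25) is satisfied and $X(t) = \varphi(t)$ for all $t\in[-r,0]$.

Let $T > 0$ be the deterministic time horizon. The cost functional of the optimal control problem is the function $J: C\times\mathcal{U}\to\mathbb{R}$ given by

\[ J(\varphi,u) := \mathbf{E}\Big[ \int_0^T f\big(t,X(t),Y(t),u(t)\big)\,dt + g\big(X(T),Y(T)\big) \Big]. \tag{1.26} \]

The associated value function $V$ is defined by $V(\varphi) := \inf_{u\in\mathcal{U}} J(\varphi,u)$, $\varphi\in C$.

At this point, an idea could be that $V(\varphi)$ depends on its argument $\varphi\in C$ only through $\varphi(0)$ (corresponding to $X(t)$) and $\int_{-r}^{0} e^{\lambda s}\,w(\varphi(s))\,ds$ (corresponding to $Y(t)$). Observe that in Equation (1.25) there is still the point delay $\xi(t) = w(X(t-r))$. Also notice that the process $Y(\cdot)$ is of bounded variation. Let $\Psi\in C^{2,1}(\mathbb{R}\times\mathbb{R})$. By Itô's formula, for any solution $X(\cdot)$ and the associated process $Y(\cdot)$,

\[ d\Psi\big(X(t),Y(t)\big) = \tfrac{\partial}{\partial x}\Psi\big(X(t),Y(t)\big)\,dX(t) + \tfrac{\partial}{\partial y}\Psi\big(X(t),Y(t)\big)\,dY(t) + \tfrac{1}{2}\,\tfrac{\partial^2}{\partial x^2}\Psi\big(X(t),Y(t)\big)\,d\langle X,X\rangle(t). \]

While expressions for $dX(t)$ and $d\langle X,X\rangle(t)$ now follow from Equation (1.25), we have

\[ dY(t) = w\big(X(t)\big)\,dt - e^{-\lambda r}\,\xi(t)\,dt - \lambda\,Y(t)\,dt, \tag{1.27} \]

by construction of $Y$. Introduce the hypothesis that

(HT) there is $\Psi\in C^{2,1}(\mathbb{R}\times\mathbb{R})$ such that for all $x,y\in\mathbb{R}$, $\tfrac{\partial}{\partial x}\Psi(x,y)\,\mu_2(x,y) - e^{-\lambda r}\,\tfrac{\partial}{\partial y}\Psi(x,y) = 0$.

If Hypothesis (HT) holds, then the transformed process $\Psi(X,Y)$ obeys an equation of the form

\[ d\Psi\big(X(t),Y(t)\big) = \tilde{\mu}\big(t,X(t),Y(t),u(t)\big)\,dt + \tilde{\sigma}\big(t,X(t),Y(t),u(t)\big)\,dW(t), \tag{1.28} \]

where the coefficients $\tilde{\mu}$, $\tilde{\sigma}$ can be expressed in terms of the original coefficients. Notice that the point delay $\xi(t)$ has disappeared. Indeed, Hypothesis (HT) has been chosen so that the $\xi(t)$ term stemming from Equation (1.25) and the $\xi(t)$ term in Equation (1.27) cancel out. The appearance of the point delay in Equation (1.27), on the other hand, is inevitable in view of the form of $Y$. If the coefficients $\tilde{\mu}$, $\tilde{\sigma}$ are such that they depend on their $x$- and $y$-variable only through $\Psi(x,y)$, then $\Psi(X,Y)$, the transformed process, obeys an ordinary SDE of the form

\[ d\Psi\big(X(t),Y(t)\big) = \bar{\mu}\big(t,\Psi(X(t),Y(t)),u(t)\big)\,dt + \bar{\sigma}\big(t,\Psi(X(t),Y(t)),u(t)\big)\,dW(t), \]

where $\bar{\mu}$, $\bar{\sigma}$ are the new coefficients, which can be found by hypothesis.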
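The differential of the weighted average $Y$, namely $dY(t) = \big(w(X(t)) - e^{-\lambda r}\,w(X(t-r)) - \lambda Y(t)\big)\,dt$, can be checked numerically along a smooth path. The following Python sketch is an illustration only: the deterministic test path `math.sin`, the choice of $w$ and the helper names `weighted_average`, `dY_formula` and `dY_numeric` are assumptions made here, and a smooth path is used merely to make a finite-difference comparison meaningful. It compares a central difference quotient of $Y$ with the right-hand side of the formula.

```python
import math


def weighted_average(X, w, lam, r, t, n=2000):
    """Y(t) = int_{-r}^0 exp(lam*s) * w(X(t+s)) ds by the composite trapezoidal rule."""
    h = r / n
    total = 0.0
    for i in range(n + 1):
        s = -r + i * h
        weight = 0.5 if i in (0, n) else 1.0   # end points count half
        total += weight * math.exp(lam * s) * w(X(t + s))
    return total * h


def dY_formula(X, w, lam, r, t):
    """Drift of Y: w(X(t)) - exp(-lam*r) * w(X(t-r)) - lam * Y(t)."""
    xi = w(X(t - r))                           # point delay term
    return w(X(t)) - math.exp(-lam * r) * xi - lam * weighted_average(X, w, lam, r, t)


def dY_numeric(X, w, lam, r, t, delta=1e-4):
    """Central difference quotient of t -> Y(t)."""
    upper = weighted_average(X, w, lam, r, t + delta)
    lower = weighted_average(X, w, lam, r, t - delta)
    return (upper - lower) / (2.0 * delta)


if __name__ == "__main__":
    X = math.sin                               # smooth stand-in for a state path
    w = lambda x: x * x                        # a continuously differentiable w
    for t in (0.3, 1.0, 2.5):
        print(t, dY_formula(X, w, 0.7, 1.0, t), dY_numeric(X, w, 0.7, 1.0, t))
```

The two printed columns should agree up to the quadrature and finite-difference errors; the term involving `xi` is exactly the point delay that Hypothesis (HT) is designed to cancel.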
Under Hypothesis (HT) and the reducibility hypothesis, the transformed dynamics can be written in terms of $Z(t) := \Psi(X(t),Y(t))$. If also the coefficients $f$, $g$ in (1.26) are reducible, that is, if the coefficients of the cost functional depend on their $x$- and $y$-variable only through $\Psi(x,y)$, then a finite-dimensional control problem without delay arises which is related to the original control problem through the transformation $\Psi$ and the corresponding reduction of the coefficients. Notice that the reducibility of the coefficients is a second hypothesis. If the Hamilton-Jacobi-Bellman equation associated with the finite-dimensional control problem without delay admits a classical solution and if optimal strategies exist, then the finite-dimensional and the delay problem are equivalent in that their value functions are equivalent, see Theorem 1 in Bauer and Rieder (2005). That all hypotheses can be satisfied at once is shown in Bauer and Rieder (2005: Sects. 4-6) by way of specific examples: a linear quadratic regulator, a model of optimal consumption, and a deterministic model for congestion control.

1.3 Approximation of continuous-time control problems

There are various possible approaches to approximating continuous-time optimal control problems. We focus on those approaches which yield an approximation to the value function of the original problem. Recall that knowledge of the value function allows one to choose optimal or nearly optimal strategies, so that an optimal control problem is essentially solved once its value function is known. The methods we mention were mostly developed for finite-dimensional systems (stochastic as well as deterministic). A basic idea is to replace the original control problem by a sequence of control problems which are numerically solvable in such a way that the associated value functions converge to the value function of the original problem. It is often possible to reinterpret a given scheme in terms of approximating control problems even though the scheme itself need not be defined in these terms.
A natural ansatz for constructing a suitable sequence of control problems is to derive their dynamics and cost functionals from a discretisation of the dynamics and cost functional of the original problem. This method, known as the Markov chain method, was introduced by H. J. Kushner and is well-established in the case of finite-dimensional stochastic and deterministic optimal control problems, see Kushner and Dupuis (2001) and the references therein. The method allows one to prove convergence of the approximating value functions to the value function of the original problem under very broad conditions. The most important condition to be satisfied is that of local consistency of the discretised dynamics with the original dynamics. Due to its general nature, the Markov chain method can also be applied to control problems with delay. In Chapter 2, we will study this method in detail and develop an abstract framework for the proof of convergence. The framework may serve as a guide for using the Markov chain method in the convergence analysis of approximation schemes for various classes of optimal control problems. In Section 2.3, the convergence analysis is carried out for the discretisation of stochastic optimal control problems with delay and a random time horizon. We note, however, that while the method is well-suited for establishing convergence of a scheme, it usually provides no information about the speed of convergence.

The value function of a continuous-time finite-dimensional optimal control problem can often be characterised as the unique viscosity solution of an associated partial differential equation. For classical control problems, that equation is the Hamilton-Jacobi-Bellman equation (HJB equation) associated with the control problem, which is a first order PDE in the case of a deterministic system and a second order PDE in the case of a stochastic system driven by a Wiener process. Examples show that the value function of a deterministic or degenerate stochastic control problem is not necessarily continuously differentiable (e.g. Fleming and Soner, 2006: II.2), whence classical solutions to the HJB equation do not always exist. [4]

An approximation to the value function of a continuous-time optimal control problem can be obtained by discretising the associated HJB equation. In particular, finite difference schemes can be used for the discretisation. In the case of finite-dimensional deterministic optimal control problems, convergence as well as rates of convergence for such schemes were obtained in the 1980s, see, for instance, Capuzzo Dolcetta and Ishii (1984) or Capuzzo Dolcetta and Falcone. Mere convergence of a discretisation scheme for finite-dimensional deterministic and stochastic equations (without error bounds) can be checked by relying on a theorem due to Barles and Souganidis. Their result is not limited to the analysis of HJB equations arising in control theory in that it applies to a wide class of equations possessing a viscosity solution.

About ten years ago, N. V. Krylov was the first to obtain rates of convergence for finite difference schemes approximating finite-dimensional stochastic control problems with controlled and possibly degenerate diffusion matrix, see Krylov (1999, 2000) and the references therein. The error bound obtained there in the special case of a time discretisation scheme with coefficients that are Lipschitz continuous in space and $\tfrac{1}{2}$-Hölder continuous in time is of order $h^{1/6}$ with $h$ the length of the time step.
Notice that in Krylov (1999) the order of convergence is given as $h^{1/3}$, where the time step has length $h^2$. When the space too is discretised, the ratio between time and space step is like $h^2$ against $h$ or, equivalently, $h$ vs. $\sqrt{h}$, which explains why the order of convergence is expressed in two different ways. In Krylov (2005), sharp error bounds are obtained for fully discrete finite difference schemes in a special form; the bounds are of order $h^{1/2}$ in the mesh size $h$ of the space discretisation and of order $\tau^{1/4}$ in the length $\tau$ of the time step. Using purely analytic techniques from the theory of viscosity solutions, Barles and Jakobsen (2005, 2007) obtain error bounds for a broad class of finite difference schemes for the approximation of PDEs of Hamilton-Jacobi-Bellman type. In the case of a simple time discretisation scheme, the estimate for the speed of convergence they find is of order $h^{1/10}$ in the length $h$ of the time step.

A possible ansatz for extending those results to the approximation of control problems with delay is to try to derive an HJB equation for the value function. Recall that a version of the Principle of Dynamic Programming still holds for delay systems, cf. Appendix A.1. As in the finite-dimensional setting, such an HJB equation is not guaranteed to admit classical (i.e. Fréchet-differentiable) solutions, and viscosity solutions have to be defined. The HJB equation can then be used as a starting point for constructing finite difference schemes; see Chang et al. (2006) for first results in this direction.

[4] Generalised solutions for the HJB equation can be shown to exist also in the case when there are no classical solutions, but uniqueness of generalised solutions does not always hold. For viscosity solutions, on the other hand, existence and uniqueness can be guaranteed; moreover, viscosity solutions are the right solutions in the sense that they coincide with the value function of the underlying control problem.

A different approach to the approximation of control problems with delay is to start from a representation of the system dynamics as an evolution equation in Hilbert space. A suitable Hilbert space for this purpose is the space $M^2 := L^2([-r,0],\mathbb{R}^d)\times\mathbb{R}^d$, the Delfour-Mitter space, where $r > 0$ is the maximal length of the delay. Notice that the segment space $C([-r,0],\mathbb{R}^d)$ introduced in Section 1.1 can be continuously embedded into $M^2$. Projection methods could be used to obtain an approximation scheme. For the representation of controlled deterministic systems with delay, especially linear systems, see Bensoussan et al. (2007: II.4); for how to represent stochastic systems with delay in Hilbert space, see Da Prato and Zabczyk.

A further approach to the discretisation of optimal control problems is based on the Markov property. For a suitable choice of the state space, the controlled processes enjoy the Markov property provided only feedback controls are used as strategies. In the case of problems with delay, the Markov property holds for the segment processes. The dynamics of the original problem are represented by the family of controlled Markov semigroups. Discretisation schemes, especially for time discretisation, can then be studied in terms of convergence of the infinitesimal generators associated with the Markov semigroups; see van Dijk (1984) for an early work. Observe, however, that in order to obtain rates of convergence strong regularity hypotheses may be necessary already in the finite-dimensional case; this amounts to assuming that an optimal strategy in feedback form with sufficiently regular (e.g. Lipschitz continuous) feedback function exists or that the value function is two or three times continuously differentiable.

In this work, we will not use any infinite-dimensional representation of the system dynamics; instead, we will stick to the semi-martingale setting. The Markov property of the infinite-dimensional segment processes will nevertheless be exploited.
In Section 2.3, we construct approximating discrete-time processes as extended Markov chains. In Chapter 3, we will make extensive use of a version of the Principle of Dynamic Programming, which is based on the Markov property of the segment processes, cf. Appendix A.1.

Working in the semi-martingale setting has several advantages. Existence and uniqueness results for controlled SDDEs are well-established. There is an elaborate theory characterising weak convergence of $\mathbb{R}^d$-valued semi-martingales (e.g. Jacod and Shiryaev). This theory will be essential for the convergence analysis of Section 2.3. When the noise process of the dynamics of the original system is a Wiener process (as will be the case in this work), then the solution processes are Itô diffusions. Strong results on their path regularity, in particular on the moments of their moduli of continuity, are available, cf. Appendix A.2 and Section 3.2. In Section 3.3, we will make use of a finite-dimensional stochastic mean value theorem due to N. V. Krylov. The main ingredients in the proof of that result are a mollification trick, the usual PDP and the Itô formula, cf. Theorem A.2 in Appendix A.

1.4 Aim and scope

The aim of this thesis is to study discretisation schemes for continuous-time stochastic optimal control problems with time delay in the state dynamics. The noise process driving the system of the original control problem will always be a Wiener process (one-dimensional in Section 2.3 and multi-dimensional in Chapter 3). The object to be approximated is the value function associated with the original control problem. We are concerned with questions of convergence as well as rates of convergence or bounds on the discretisation error. Error bounds indicate how much can at most be lost or gained in passing from the original model to a discretised model. This is also the first step in the approximate numerical solution of continuous-time models. For a continuous-time control problem, an approximate numerical solution is usually the only kind of explicit solution available.

The general idea we follow is to replace the original continuous-time control problem by a sequence of approximating discrete-time control problems which are easier to solve numerically. Observe that the value function associated with a continuous-time control problem with delay of the type studied here lives, in general, on a function space, namely the segment space, whence the problem may be considered to be infinite-dimensional.

We will take two approaches. In Chapter 2, we follow the Markov chain method mentioned above, which is a recipe for constructing discretisation schemes and proving convergence in the sense of convergence of associated value functions. In Section 2.1, we present the method as it is found in the work of H. J. Kushner and others. In Section 2.2, we develop an abstract framework in which to state sufficient conditions guaranteeing convergence of approximation schemes. We then apply the method to the discretisation of a class of stochastic optimal control problems with delay and a random time horizon (the time of first exit from a compact set), cf. Section 2.3.

In Chapter 3, we study a more specific scheme, which applies to finite-horizon stochastic control problems with delay, controlled and possibly degenerate diffusion coefficient and multi-dimensional state as well as noise process, cf. Section 3.1. According to the scheme, time and segment space are discretised in two steps, see Sections 3.2 and 3.3.
Under quite natural assumptions, we obtain not only convergence, but also bounds on the error of the discretisation scheme, see Section 3.4. The worst-case bound on the discretisation error in the general case is of order nearly $h^{1/12}$ in the length of the inner time step $h$.

The two-step scheme produces a sequence of approximating finite-dimensional control problems in discrete time. In Section 3.5, we address the question of how to solve these problems numerically. Instead of further discretising the state space as in Section 2.3, we propose to use a variant of approximate Dynamic Programming, exploiting the two-step structure of the scheme. Memory requirements, in particular, can be kept at a realistic level. [5] Notwithstanding the special structure of the discretisation scheme, its use is not confined to the approximation of finite horizon control problems. It should also apply to systems with a reflecting boundary or systems controlled up to the time of first exit from a compact set.

[5] The amount of computer memory required for the two-step scheme depends on the mesh size of the outer time grid. In terms of the length $h$ of this outer time step, a worst-case error bound of order $h^{1/2}\ln(1/h)$ holds.

In this thesis, we are interested in discretisation schemes which yield an approximation to the value function of the original problem. The value function gives the globally minimal costs, and knowing it allows one to construct globally optimal or nearly optimal strategies for each initial condition. There are efficient procedures for finding locally optimal strategies and calculating locally minimal costs, but we will not be concerned with any of them. Moreover, we will not use any hypotheses on the regularity of optimal strategies (not even existence) nor any regularity assumptions on the value function which are not a consequence of properties of the system coefficients. If such hypotheses were assumed, it would be possible to derive much better rates of convergence. The reason why we refrain from making such assumptions is that they are, usually, difficult or impossible to verify based on the information available about the system to be controlled.


Chapter 2

The Markov chain method

There is a general procedure, known as the Markov chain method and developed by Harold J. Kushner, for rendering optimal control problems in continuous time accessible to numerical computation. The basic idea is to construct a family of discrete optimal control problems by discretising the original dynamics and the original cost functional in time and space. The important point to establish then is whether the value functions associated with the discrete problems converge to the original value function as the mesh size of the discretisation tends to zero. If the value functions converge, then the discrete control problems are a valid approximation to the original problem, and standard algorithms, notably those based on Dynamic Programming (e.g. Bertsekas, 2005, 2007), can be applied, at least in principle, to calculate the minimal costs and to find optimal strategies for each of the discrete control problems.

When the dynamics of the original problem are given by ordinary deterministic or stochastic differential equations, suitable discrete control problems are obtained by replacing the original controlled differential equations with controlled Markov chains whose transition probabilities are consistent with the original dynamics. Under compactness and continuity assumptions on the original problem, a condition of local consistency for the transition probabilities of the controlled Markov chains suffices to guarantee convergence of the corresponding value functions.

In Section 2.1 we describe the Markov chain method following Kushner and Dupuis (2001) by means of a deterministic example problem. Section 2.2 sets up an abstract framework for approximating a given optimal control problem by a sequence of discrete problems. There the continuity and compactness assumptions underlying Kushner's method are made explicit. In Section 2.3, we apply the method to a class of stochastic control problems with delay and a stopping condition as time horizon. Most of the material of that section has been published in Fischer and Reiß (2007).
In Kushner (2005), discretisation schemes for a class of stochastic control problems with delay and reflection are studied; however, the proofs for the delay case do not seem to be as closely analogous to the non-delay case as is suggested there. Section 2.4 contains a brief discussion of the scope of the Markov chain method.

2.1 Kushner's approximation method

As an illustration of how Kushner's method works, let us consider a deterministic optimal control problem with finite time horizon. The system dynamics are described by a controlled ordinary differential equation:

\[ \dot{x}(t) = b\big(t_0+t,\,x(t),\,u(t)\big), \quad t > 0, \tag{2.1} \]

where $b$ is a measurable function $[0,\infty)\times\mathbb{R}^d\times\Gamma\to\mathbb{R}^d$ and $u(\cdot)$ a measurable function $[0,\infty)\to\Gamma$. The space $\Gamma$ is called the space of control actions and it is assumed that $\Gamma$ is a compact metric space. This hypothesis will be crucial later. The initial state is $x(0) = y$ for some $y\in\mathbb{R}^d$. In the formulation adopted here, solutions $x(\cdot)$ to Equation (2.1), if there are any, always start at time zero, while the initial time $t_0$ enters the equation through the coefficient $b$.

Let $\mathcal{U}_{ad}$ be the set of all Borel measurable functions $u: [0,\infty)\to\Gamma$ such that Equation (2.1) possesses a unique absolutely continuous solution $x(\cdot) = x^{t_0,y,u}(\cdot)$ for each $(t_0,y)\in[0,\infty)\times\mathbb{R}^d$. The elements of $\mathcal{U}_{ad}$ are called admissible strategies or, simply, strategies.

Let $T > 0$ be the deterministic time horizon. Associated with strategy $u\in\mathcal{U}_{ad}$ and initial condition $(t_0,y)\in[0,T]\times\mathbb{R}^d$ are the costs

\[ J_{det}\big(t_0,y,u(\cdot)\big) := \int_0^{T-t_0} f\big(t_0+t,\,x^{t_0,y,u}(t),\,u(t)\big)\,dt + g\big(x^{t_0,y,u}(T-t_0)\big), \tag{2.2} \]

where $f$ and $g$ are suitable measurable functions $[0,\infty)\times\mathbb{R}^d\times\Gamma\to\mathbb{R}$ and $\mathbb{R}^d\to\mathbb{R}$, respectively, such that the above integral makes sense as an element of $[-\infty,\infty]$. The value function of the control problem determined by (2.1) and (2.2) is given by

\[ V_{det}(t_0,y) := \inf_{u\in\mathcal{U}_{ad}} J_{det}\big(t_0,y,u(\cdot)\big), \quad (t_0,y)\in[0,T]\times\mathbb{R}^d. \]

The idea is now to construct a suitable family $(\mathcal{P}^M)_{M\in\mathbb{N}}$ of optimal control problems in discrete time and with discrete state space so that the corresponding value functions converge pointwise to $V_{det}$. The problem $\mathcal{P}^M$ of degree $M$ may be obtained as follows. Let $S_M\subset\mathbb{R}^d$ be a regular triangulation of the state space $\mathbb{R}^d$. Hence, any state $y\in\mathbb{R}^d$ can be represented as the convex combination of at most $d+1$ elements of $S_M$.
The dynamics of the control problem $\mathcal{P}^M$ are determined by the choice of a time-inhomogeneous controlled transition function $p_M: \mathbb{N}_0\times S_M\times\Gamma\times S_M\to[0,1]$, that is, a function $p_M$ which is jointly measurable and such that $p_M(n,y,\gamma,\cdot)$ defines a probability distribution on $S_M$ for all $n\in\mathbb{N}_0$, $y\in S_M$, $\gamma\in\Gamma$. Observe that the set $S_M$ is at most countable. The number $p_M(n,y,\gamma,z)$ should be interpreted as the probability that, between time step $n$ and $n+1$, the system switches from state $y\in S_M$ to state $z\in S_M$ under the action of control $\gamma\in\Gamma$.

Admissible strategies for the problem $\mathcal{P}^M$ are adapted sequences $(u(n))_{n\in\mathbb{N}_0}$ of $\Gamma$-valued random variables such that, for each initial condition $(n_0,y)\in\mathbb{N}_0\times S_M$, there is an adapted $S_M$-valued sequence $(\xi(n))_{n\in\mathbb{N}_0}$ whose transition probabilities are given by the function $p_M$. Strictly speaking, an admissible strategy consists in a complete probability space $(\Omega,\mathcal{F},\mathbf{P})$ equipped with a filtration $(\mathcal{F}_n)$ and an $(\mathcal{F}_n)$-adapted sequence $(u(n))$ of $\Gamma$-valued random variables; thus, the underlying filtered probability space is part of the strategy. For simplicity, we usually omit the stochastic basis from the notation.
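To make the notion of a controlled transition function concrete, here is a minimal one-dimensional Python sketch. It rests on assumptions that are not in the text above: the state space is taken to be the uniform grid $\{k/M : k\in\mathbb{Z}\}$, one time step is taken to have length $1/M$, and the chain jumps to one of the two grid neighbours of an explicit Euler step, with the interpolation weights as probabilities. The function name `transition_probs` and the drift `b` passed to it are hypothetical.

```python
import math


def transition_probs(M, n, y, gamma, b):
    """Return {z: p_M(n, y, gamma, z)} on the grid {k/M : k integer}.

    Assumption of this sketch: the chain moves to the two grid points that
    bracket the Euler step y + h*b(n*h, y, gamma), weighted so that the
    mean of the next state equals that Euler step.
    """
    h = 1.0 / M
    target = y + h * b(n * h, y, gamma)   # one explicit Euler step
    k = math.floor(target * M)            # index of the left grid neighbour
    lam = target * M - k                  # weight of the right neighbour, in [0, 1)
    probs = {k / M: 1.0 - lam}
    if lam > 0.0:
        probs[(k + 1) / M] = lam
    return probs


if __name__ == "__main__":
    drift = lambda t, x, g: g             # hypothetical drift: the control itself
    print(transition_probs(10, 0, 0.3, 0.5, drift))
```

For each fixed `(n, y, gamma)` the returned weights are non-negative and sum to one, so they define a probability distribution on the grid, as required of a transition function.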

If $u$ is admissible, then, for each $(n_0, y) \in \mathbb{N}_0 \times S_M$, there is a discrete-time process $(\xi(n))$ such that for all $z \in S_M$, all $n \in \mathbb{N}_0$,

$\mathbb{P}\bigl(\xi(n+1) = z \mid \mathcal{F}_n\bigr) = p_M\bigl(n_0 + n, \xi(n), u(n), z\bigr), \quad \xi(0) = y \quad \mathbb{P}\text{-a.s.},$

where $\mathbb{P}$ is the probability measure which is part of the strategy. The distribution of $\xi$ is uniquely determined by $u$, the transition function $p_M$ and the initial condition $(n_0, y)$. Denote by $U^M_{ad}$ the set of all admissible strategies of degree $M$.

An alternative to the above definition of the set of admissible strategies is to take feedback controls as strategies, that is, measurable functions of the time step, the current and past states of the system and the past control actions; see, for example, Hernández-Lerma and Lasserre (1996: Ch. 2). The advantage of the seemingly more complicated formulation here is that the admissible strategies are defined directly on the underlying probability space. The admissibility requirement above means that the underlying stochastic basis is rich enough so that the controlled process $(\xi(n))$ corresponding to a $\Gamma$-valued adapted sequence $(u(n))$ can be constructed. In Section 2.3, we will restrict the set of admissible stochastic bases to those carrying a Wiener process.

To conclude the construction of the problem $\mathcal{P}_M$ we need an analogue of the cost functional (2.2). Let $T_M \in \mathbb{N}$ be the discrete time horizon of degree $M$; for example, $T_M$ could be equal to $\lfloor M T \rfloor$. We replace the integral by a sum and take expectations since $\mathcal{P}_M$ is a stochastic control problem. For $n_0 \in \{0, \ldots, T_M\}$, $y \in S_M$, $(u(n)) \in U^M_{ad}$ set

(2.3) $J^M_{det}(n_0, y, u) := \mathbb{E}\Bigl(\sum_{n=0}^{T_M - n_0 - 1} f_M\bigl(n_0 + n, \xi(n), u(n)\bigr) + g_M\bigl(\xi(T_M - n_0)\bigr)\Bigr),$

where $\xi = (\xi(n))$ is the discrete-time process associated with strategy $u$ and initial condition $(n_0, y)$. The functions $f_M$, $g_M$ should be appropriate discretisations of $f$ and $g$, respectively.

Suppose that one time step for the discrete problem of degree $M$ corresponds to a step of length $h_M := \frac{1}{M}$ in continuous time.
Then the requirement that the family $(p_M)_{M \in \mathbb{N}}$ of transition functions be locally consistent with the original dynamics means that for all $n_0, n \in \mathbb{N}_0$, $y \in S_M$, $\gamma \in \Gamma$,

(2.4) $\sum_{z \in S_M} p_M(n_0 + n, y, \gamma, z)\, z = y + h_M\, b\bigl(\tfrac{n_0 + n}{M}, y, \gamma\bigr) + o(h_M),$

where $o(h_M)$ is the $M$-th element of a sequence that tends to zero faster than $(h_M)_{M \in \mathbb{N}}$. In addition, one only has to require that the maximal jump size of the associated controlled Markov chains tend to zero as the discretisation degree $M$ goes to infinity. Condition (2.4) can also be expressed in terms of the controlled Markov chains, cf. Section [...].

It is straightforward to construct a sequence of transition functions such that the jump height and the local consistency conditions can be fulfilled. We may define the function $p_M$ by, for example,

$p_M(n, y, \gamma, z) := \begin{cases} \lambda_i & \text{if } z = x_i, \\ 0 & \text{else}, \end{cases}$

where $x_1, \ldots, x_{d+1} \in S_M$ are the vertices of the simplex containing $y + h_M\, b(n, y, \gamma)$ and $\lambda_1, \ldots, \lambda_{d+1} \in [0,1]$ are such that

$y + h_M\, b(n, y, \gamma) = \sum_{i=1}^{d+1} \lambda_i x_i.$

This choice yields a family of locally consistent transition functions provided the mesh size of the triangulations $S_M$, $M \in \mathbb{N}$, tends to zero like $h_M = \frac{1}{M}$ as $M$ goes to infinity.

Besides local consistency of the family of transition probabilities there is a second important hypothesis in Kushner's method, namely the semi-continuity of the cost functionals with respect to a suitable notion of convergence. The cost functional $J_{det}$ of the original problem, for instance, can be extended to a mapping which takes an initial condition $(t_0, y) \in [0,T] \times \mathbb{R}^d$, a strategy $u(.) \in U_{ad}$ and an absolutely continuous function $x(.)$ and which yields a real number or $\pm\infty$. In (2.2), the definition of $J_{det}$, on the other hand, the connection between $(t_0, y)$, $u(.)$ and $x(.)$ is determined by the system dynamics as given by Equation (2.1). For the discrete stochastic problem of degree $M$, the cost functional $J^M_{det}$ is a mapping which assigns a cost to any initial condition $(n_0, y) \in \{0, \ldots, T_M\} \times S_M$, strategy $(u(n)) \in U^M_{ad}$ and $S_M$-valued adapted sequence $(\xi(n))$.

Consequently, we may interpret the cost functionals as defined on product spaces whose components are the set of initial conditions, the space of strategies and a suitable space of functions or random processes encompassing all possible trajectories of the system. We can even find a common product space for all the cost functionals involved. This can often be achieved by replacing the strategies and state sequences of the discrete-time problems by their piecewise constant continuous-time interpolations. As for the example problem, the deterministic strategies and solutions of the system equation are re-interpreted as particular degenerate random processes.
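The interpolation-based transition function described above can be sketched in one space dimension, where the "simplex" containing the target point is simply the grid cell around it. This is a minimal sketch under stated assumptions: the uniform grid `mesh * Z`, the drift `b` and all function names are hypothetical choices made for illustration and do not come from the text.

```python
import math


def transition_probs(y, drift, h, mesh):
    """Interpolation-based transition probabilities on the grid mesh*Z (d = 1).

    The target point y + h*drift is written as a convex combination of its two
    neighbouring grid points; the weights serve as transition probabilities.
    """
    target = y + h * drift
    k = math.floor(target / mesh)        # index of the left grid neighbour
    left, right = k * mesh, (k + 1) * mesh
    lam_right = (target - left) / mesh   # barycentric weight of the right vertex
    return {left: 1.0 - lam_right, right: lam_right}


def b(t, y, gamma):
    # Hypothetical bounded drift, chosen only for illustration.
    return gamma * math.cos(y)


M = 100
h = 1.0 / M          # time step h_M
mesh = 1.0 / M       # mesh size shrinking like h_M
y, gamma = 0.25, 0.5

probs = transition_probs(y, b(0.0, y, gamma), h, mesh)
mean_next = sum(z * p for z, p in probs.items())
target = y + h * b(0.0, y, gamma)
```

In this sketch the conditional mean of the next state equals `target` up to rounding, so the consistency relation holds with vanishing remainder term, and the jump size is bounded by `mesh` plus `h` times the bound on the drift.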
The product space forming the domain of the extended cost functionals is endowed with a notion of convergence, namely that of weak convergence of random processes or weak convergence of the associated probability distributions. The induced topology renders the cost functionals $J_{det}$, $J^M_{det}$ continuous provided the coefficients $f$, $g$ and $f_M$, $g_M$ in (2.2) and (2.3), respectively, are continuous. We will see more details of this construction in Section 2.2.

There is a last important point in the set-up of Kushner's method: the compactification of the space of admissible strategies. Remember that the space of control actions $\Gamma$ is assumed to be a compact metric space. Nonetheless, the space $U_{ad}$ of admissible strategies, equipped with the topology of weak convergence, need not be compact. The reason why compactness of the strategy space is desirable, here, is that it guarantees the existence of optimal strategies for the original problem (for discrete-time control problems the compactness of $\Gamma$ itself is sufficient). More generally, any sequence of strategies will possess limit points that are themselves strategies. A similar compactness property will be implicit in the assumptions of Theorem 2.1 in Section 2.2, the convergence result for the abstract framework. There, however, topological properties will regard the system space only.

We should stress that Kushner's method is not confined to such simple schemes as we have sketched for the example problem. In particular, for the discretisation of time, the grid need not be uniformly spaced. It is possible to analyse non-deterministic, adaptive schemes. Also a wide variety of different deterministic and stochastic optimal control problems can be handled. The system dynamics, for instance, might be described as a controlled jump-diffusion and the performance criterion might involve a random time horizon.

In the remainder of this section, we introduce the concept of relaxed controls necessary for the compactification of the space of admissible strategies of a continuous-time control problem, cf. Kushner (1990: Ch. 3) and the references therein.

Definition 2.1. A deterministic relaxed control over a compact metric space $\Gamma$ is a positive measure $\rho$ on $\mathcal{B}(\Gamma \times [0,\infty))$, the Borel σ-algebra on $\Gamma \times [0,\infty)$, such that

(2.5) $\rho(\Gamma \times [0,t]) = t$ for all $t \geq 0$.

Denote by $\mathcal{R}_\Gamma$ the set of all deterministic relaxed controls over $\Gamma$.

For each $G \in \mathcal{B}(\Gamma)$, the function $t \mapsto \rho(G \times [0,t])$ is absolutely continuous with respect to the Lebesgue measure on $[0,\infty)$ by virtue of property (2.5). Denote by $\dot\rho(., G)$ any density of $\rho(G \times [0,.])$. The family of densities $\dot\rho(., G)$, $G \in \mathcal{B}(\Gamma)$, can be chosen in a Borel measurable way such that $\dot\rho(t, .)$ is a probability measure on $\mathcal{B}(\Gamma)$ for each $t \geq 0$, and

$\rho(B) = \int_0^\infty \int_\Gamma 1_{\{(\gamma,t) \in B\}}\, \dot\rho(t, d\gamma)\, dt$ for all $B \in \mathcal{B}(\Gamma \times [0,\infty))$.

The space $\mathcal{R}_\Gamma$ of all deterministic relaxed controls over $\Gamma$ is equipped with the weak-compact topology induced by the following notion of convergence: a sequence $(\rho_n)_{n \in \mathbb{N}}$ of relaxed controls converges to $\rho \in \mathcal{R}_\Gamma$ if and only if

$\int_{\Gamma \times [0,\infty)} g(\gamma, t)\, d\rho_n(\gamma, t) \xrightarrow{n \to \infty} \int_{\Gamma \times [0,\infty)} g(\gamma, t)\, d\rho(\gamma, t)$ for all $g \in C_c(\Gamma \times [0,\infty))$,

where $C_c(\Gamma \times [0,\infty))$ is the space of all real-valued continuous functions on $\Gamma \times [0,\infty)$ having compact support. By the compactness of $\Gamma$, $\mathcal{R}_\Gamma$ is sequentially compact under the weak-compact topology.

Suppose $(\rho_n)_{n \in \mathbb{N}}$ is a convergent sequence in $\mathcal{R}_\Gamma$ with limit point $\rho$. Given $T > 0$, let $\rho_n|_T$ denote the restriction of $\rho_n$ to the Borel σ-algebra on $\Gamma \times [0,T]$, and denote by $\rho|_T$ the restriction of $\rho$ to $\mathcal{B}(\Gamma \times [0,T])$. Then $\rho_n|_T$, $n \in \mathbb{N}$, $\rho|_T$ are all finite measures and $(\rho_n|_T)$ converges weakly to $\rho|_T$.

Any ordinary deterministic strategy $u(.)$ gives rise to a deterministic relaxed control, namely to

(2.6) $\rho(B) := \int_0^\infty \int_\Gamma 1_{\{(\gamma,t) \in B\}}\, \delta_{u(t)}(d\gamma)\, dt, \quad B \in \mathcal{B}(\Gamma \times [0,\infty)),$

where $\delta_\gamma$ is the Dirac measure at $\gamma \in \Gamma$.
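The notion of convergence used for relaxed controls can be illustrated numerically. The sketch below assumes $\Gamma = [-1, 1]$, a relaxed control whose density is the fixed mixture $\frac{1}{2}(\delta_{-1} + \delta_{+1})$, and rapidly switching ("chattering") ordinary controls $u_n$ represented via (2.6). The test function `g` and all names are hypothetical choices for illustration, and checking a single test function does not of course prove convergence in the weak-compact topology.

```python
import math


def u_n(t, n):
    """Ordinary 'chattering' control: switches between +1 and -1 every 1/(2n)."""
    return 1.0 if math.floor(2 * n * t) % 2 == 0 else -1.0


def g(gamma, t):
    """Continuous test function supported in Gamma x [0, 1]."""
    return gamma * max(0.0, 1.0 - t)


def integral_ordinary(n, cells_per_switch=50):
    """Midpoint rule for the integral of g(u_n(t), t) dt over [0, 1]."""
    cells = 2 * n * cells_per_switch
    dt = 1.0 / cells
    return sum(g(u_n((i + 0.5) * dt, n), (i + 0.5) * dt) for i in range(cells)) * dt


def integral_relaxed(cells=1000):
    """Integral of g against the relaxed control with density (delta_{-1} + delta_{+1}) / 2."""
    dt = 1.0 / cells
    return sum(0.5 * g(-1.0, (i + 0.5) * dt) + 0.5 * g(1.0, (i + 0.5) * dt)
               for i in range(cells)) * dt


# Distance between the two integrals for increasingly fast switching.
gaps = [abs(integral_ordinary(n) - integral_relaxed()) for n in (1, 4, 16, 64)]
```

For this particular `g` the gap can be computed by hand as $\frac{1}{4n}$, so the values in `gaps` shrink as the switching frequency grows, which is the behaviour the definition asks for.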
Moreover, any deterministic relaxed control can be approximated in the weak-compact topology by a sequence of ordinary deterministic strategies.

The dynamics of a control problem described by controlled ordinary differential equations can be rewritten using relaxed controls. The relaxed version in integral form of Equation (2.1), for instance, is

(2.7) $x(t) = y + \int_{\Gamma \times [0,t]} b\bigl(t_0 + s, x(s), \gamma\bigr)\, d\rho(\gamma, s), \quad t \geq 0,$

where $(t_0, y) \in [0,\infty) \times \mathbb{R}^d$ is the initial condition. In the stochastic case, the analogue of deterministic relaxed controls are relaxed control processes.

Definition 2.2. A relaxed control process over a compact metric space $\Gamma$ is an $\mathcal{R}_\Gamma$-valued random variable $R$ defined on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$ such that the mapping

$\Omega \ni \omega \mapsto R(G \times [0,t])(\omega) \in [0,t]$

is $\mathcal{F}_t$-measurable for all $t \geq 0$, $G \in \mathcal{B}(\Gamma)$.

Since, by definition, Condition (2.5) holds scenario-wise for a relaxed control process $R$, there is a family $\dot R(t, .)$ of derivative measures such that, $\mathbb{P}$-almost surely,

$R(B)(\omega) = \int_0^\infty \int_\Gamma 1_{\{(\gamma,t) \in B\}}\, \dot R(t, d\gamma)(\omega)\, dt$ for all $B \in \mathcal{B}(\Gamma \times [0,\infty))$.

The family $\dot R(t, .)$ can be constructed in a measurable way (cf. Kushner, 1990: p. 52). Any ordinary control process, that is, any $\Gamma$-valued $(\mathcal{F}_t)$-adapted process, can be represented as a relaxed control process: for $u$ an ordinary control process, set

(2.8) $R(B)(\omega) := \int_0^\infty \int_\Gamma 1_{\{(\gamma,t) \in B\}}\, \delta_{u(t,\omega)}(d\gamma)\, dt, \quad B \in \mathcal{B}(\Gamma \times [0,\infty)),\ \omega \in \Omega,$

where $\delta_\gamma$ is the Dirac measure at $\gamma \in \Gamma$. Then $R$ is the relaxed control representation of $u$.

2.2 An abstract framework

In this section we provide an abstract framework for the convergence analysis of discretisation schemes constructed according to the Markov chain method. The framework not only formalises the ideas outlined in Section 2.1, it also extends their scope of applicability. This is possible because Kushner's method does not require the system dynamics or cost functional to have any special structure. In particular, no additivity properties like the Principle of Dynamic Programming are exploited; not even the Markov property of the system is needed.

The definitions to be given below are illustrated by means of the deterministic control problem from Section 2.1. In Section 2.3, the convergence analysis for a class of stochastic optimal control problems with delay is carried out in detail. The work to be done there consists mainly in verifying that the hypotheses of Theorem 2.1 below are satisfied.

Optimisation and control problems

Optimal control problems are parametrised optimisation problems; the parameters correspond to the initial data for the system dynamics.
Since the parameter set may be a singleton, we omit the modifier "parametrised" in the following definition.

Definition 2.3. An optimisation problem is a triple $(D, A, F)$, where $D$, $A$ are non-empty sets and $F$ is a mapping $D \times A \to [-\infty, \infty]$.

The function $F$ of an optimisation problem $(D, A, F)$ is called the objective function or target function. The set $D$ is the data set of the problem, the set $A$ may be called the restrictor set. Given a datum $\varphi \in D$, the aim is to minimise or maximise $F(\varphi, .)$ over $A$.

Definition 2.4. Let $\mathcal{P} = (D, A, F)$ be an optimisation problem. The function $V : D \to [-\infty, \infty]$ defined by

$V(\varphi) := \inf\{F(\varphi, \alpha) \mid \alpha \in A\}$

is called the value function associated with $\mathcal{P}$. The problem $\mathcal{P}$ is finite iff its value function is finite, that is, iff $V$ is $\mathbb{R}$-valued.

We restrict attention to minimisation problems. Maximisation problems can be rewritten as minimisation problems in the obvious way. More general optimisation problems could be formulated by letting the target function attain values in an arbitrary partially ordered set. Clearly, there is more structure to an optimal control problem than to an optimisation problem.

Definition 2.5. An optimal control problem is a quintuple $(D, A, H, \Psi, J)$, where $D$, $A$, $H$ are non-empty sets, $\Psi$ is a mapping $D \times A \to H$ and $J$ is a mapping $H \to [-\infty, \infty]$.

The components of an optimal control problem $(D, A, H, \Psi, J)$ are denominated as follows: $D$ is the data set, $A$ is the set of admissible strategies or, simply, the set of strategies, $H$ is called the system space, the mapping $\Psi$ is the system functional, and $J$ is called the cost functional. An optimal control problem $(D, A, H, \Psi, J)$ gives rise to an optimisation problem, namely the triple $(D, A, F)$ with $F := J \circ \Psi$.

Definition 2.6. The value function associated with an optimal control problem is defined to be the value function associated with the induced optimisation problem. An optimal control problem is finite iff its value function is finite.

For simplicity, we will use the expression "control problem" without the modifying "optimal" also in the sense of optimal control problem. Let us illustrate the definitions by applying them to the deterministic example problem of Section 2.1.
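The quintuple of Definition 2.5 and the induced value function of Definition 2.6 can be mirrored almost literally in code when all sets are finite. The following is a minimal sketch under that assumption; the class name `ControlProblem`, the toy dynamics and the toy cost are hypothetical and serve only to make the roles of $D$, $A$, $H$, $\Psi$ and $J$ concrete.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class ControlProblem:
    data: tuple          # D, the data set (finite here)
    strategies: tuple    # A, the set of strategies (finite here)
    system: Callable     # Psi: D x A -> H
    cost: Callable       # J: H -> [-inf, inf]

    def objective(self, phi, alpha):
        """F := J o Psi, the induced optimisation problem."""
        return self.cost(self.system(phi, alpha))

    def value(self, phi):
        """V(phi) = inf over A of F(phi, alpha); a minimum since A is finite."""
        return min(self.objective(phi, alpha) for alpha in self.strategies)


def psi(phi, alpha):
    """Toy system functional: returns datum, strategy and trajectory x(k+1) = x(k) + a(k)."""
    traj = [phi]
    for a in alpha:
        traj.append(traj[-1] + a)
    return phi, alpha, tuple(traj)


def cost(h):
    """Toy cost functional on the system space: running cost plus terminal cost."""
    _, alpha, traj = h
    running = sum(x * x + 0.5 * abs(a) for x, a in zip(traj, alpha))
    return running + traj[-1] ** 2


problem = ControlProblem(
    data=(0, 1, 2),
    strategies=tuple(product((-1, 0, 1), repeat=2)),
    system=psi,
    cost=cost,
)
values = {phi: problem.value(phi) for phi in problem.data}
```

As in the deterministic example of the text, the system space here carries the datum and the strategy along with the trajectory, so that the cost functional can be written as a function on $H$ alone.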
In order to identify the components of that control problem according to Definition 2.5, set $D_{det} := [0,T] \times \mathbb{R}^d$, let $A_{det}$ be the set of strategies $U_{ad}$, and let $H_{det}$ be the set $D_{det} \times A_{det} \times C([0,\infty), \mathbb{R}^d)$. Define the system functional $\Psi_{det}$ as the mapping

$D_{det} \times A_{det} \to H_{det}, \quad \bigl(t_0, y, u(.)\bigr) \mapsto \bigl(t_0, y, u(.), x^{t_0,y,u}(.)\bigr),$

where $x^{t_0,y,u}(.)$ is the unique solution to Equation (2.1) under strategy $u(.)$ and initial condition $(t_0, y)$. Lastly, define the cost functional $J_{det}$ to be the mapping $H_{det} \to [-\infty, \infty]$ given by

$\bigl(t_0, y, u(.), x(.)\bigr) \mapsto \int_0^{T - t_0} f\bigl(t_0 + t, x(t), u(t)\bigr)\, dt + g\bigl(x(T - t_0)\bigr).$

Notice that in the above definition of $J_{det}$ the function $x(.)$ is not necessarily a solution to Equation (2.1), but may be any continuous function $[0,\infty) \to \mathbb{R}^d$. The quintuple $(D_{det}, A_{det}, H_{det}, \Psi_{det}, J_{det})$ thus defined is an optimal control problem in the sense of Definition 2.5. Let $V_{det}$ be the associated value function according to Definition 2.6. Then $V_{det}$ coincides with the value function induced by the cost functional (2.2) in Section 2.1.

The representation of our example problem as an optimal control problem in the sense of Definition 2.5 is not unique. For example, in the definition of the system space $H_{det}$ we could replace the component $C([0,\infty), \mathbb{R}^d)$ by $C_{abs}([0,\infty), \mathbb{R}^d)$, the space of all absolutely continuous functions $[0,\infty) \to \mathbb{R}^d$.

The definitions given so far are about sets without any additional structure. For the discretisation and convergence analysis, we will require the system space to carry a suitable topology and the system functional and cost functional to have certain continuity properties. Nevertheless, we will neither obtain nor need unique representations of control problems. Before coming to this, let us introduce some basic relations between control problems.

Definition 2.7. Let $\mathcal{P} = (D, A, H, \Psi, J)$, $\tilde{\mathcal{P}} = (\tilde D, \tilde A, \tilde H, \tilde\Psi, \tilde J)$ be optimal control problems. Then $\mathcal{P}$ is a subproblem of $\tilde{\mathcal{P}}$ iff there are injective mappings $\iota_D : D \to \tilde D$, $\iota_A : A \to \tilde A$, $\iota_H : H \to \tilde H$ such that $\tilde\Psi \circ (\iota_D \times \iota_A) = \Psi$ and $\tilde J \circ \iota_H = J$.

The mappings $\iota_D$, $\iota_A$, $\iota_H$ which make $\mathcal{P}$ a subproblem of $\tilde{\mathcal{P}}$ are called data embedding, strategy embedding and system embedding, respectively. Notice that these embeddings need not be unique. We may say that $\tilde{\mathcal{P}}$ is a superproblem of $\mathcal{P}$ to indicate that $\mathcal{P}$ is a subproblem of $\tilde{\mathcal{P}}$.

Let $V$, $\tilde V$ be the value functions associated with the control problems $\mathcal{P}$ and $\tilde{\mathcal{P}}$, respectively. Suppose $\mathcal{P}$ is a subproblem of $\tilde{\mathcal{P}}$ with data embedding $\iota_D$. Then, by definition, $\tilde V(\iota_D(\varphi)) \leq V(\varphi)$ for all $\varphi \in D$. However, Definition 2.7 does not guarantee that $\tilde V \circ \iota_D = V$. The relation defined next is to ensure this property.

Definition 2.8. Let $\mathcal{P}$, $\tilde{\mathcal{P}}$ be optimal control problems with associated value functions $V$ and $\tilde V$, respectively. Then $\tilde{\mathcal{P}}$ is a relaxation of $\mathcal{P}$ iff $\mathcal{P}$ is a subproblem of $\tilde{\mathcal{P}}$ for some data embedding $\iota_D$ such that $\tilde V \circ \iota_D = V$. The control problem $\tilde{\mathcal{P}}$ is a restriction of $\mathcal{P}$ iff $\tilde{\mathcal{P}}$ is a subproblem of $\mathcal{P}$ for some data embedding $\iota_D$ such that $\tilde V = V \circ \iota_D$.

Definition 2.9.
Two optimal control problems $\mathcal{P}$ and $\tilde{\mathcal{P}}$ are said to be compatible iff $\tilde{\mathcal{P}}$ is a relaxation or restriction of $\mathcal{P}$ such that the data embedding involved is onto. In this case, we also say that $\mathcal{P}$ is compatible with $\tilde{\mathcal{P}}$ or vice versa.

Passing to a relaxation or restriction of a given control problem allows us to vary the set of strategies as well as the system. Hence, when two control problems are compatible, we can replace one with the other, at least as far as the value functions are concerned.

Definition 2.10. Let $(D, A, H, \Psi, J)$ be a control problem and $\varphi \in D$. A strategy $\alpha \in A$ is called an optimal strategy for the datum $\varphi$ iff $J(\varphi, \alpha) = V(\varphi)$. A strategy $\alpha \in A$ is called an ε-optimal strategy iff $\varepsilon > 0$ and $J(\varphi, \alpha) \leq V(\varphi) + \varepsilon$.

Thus, if $\alpha$ is an optimal strategy for a given datum $\varphi \in D$, then $J(\varphi, .)$ attains its minimum at $\alpha$. The existence of an optimal strategy cannot always be guaranteed; see Kushner and Dupuis (2001: p. 86) for a deterministic example. The passage to a relaxation of the control problem may allow us to work with a larger set of strategies where optimal strategies are guaranteed to exist. Recall that, at least in discrete time, the value function can be used for the synthesis of optimal or ε-optimal strategies. Observe that, while compatible control problems have value functions that can be identified with each other, the corresponding optimal or nearly optimal strategies do not, in general, coincide since the system functionals may be different.
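The difference between optimal and ε-optimal strategies in Definition 2.10 shows up already in a toy problem that is not taken from the text: one datum, the positive integers as strategy set, and cost $1/\alpha$. The infimum $V = 0$ is not attained, yet an ε-optimal strategy exists for every $\varepsilon > 0$. The names below are hypothetical, and the value `V = 0.0` is supplied by hand rather than computed, since the strategy set is infinite.

```python
import math


def cost(alpha):
    """Toy cost of strategy alpha in {1, 2, 3, ...}; its infimum 0 is never attained."""
    return 1.0 / alpha


V = 0.0  # value of the toy problem, known analytically


def eps_optimal_strategy(eps):
    """Return a strategy alpha with cost(alpha) <= V + eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1.0 / eps) + 1


strategies = {eps: eps_optimal_strategy(eps) for eps in (0.5, 0.1, 1e-3)}
```

Every strategy has strictly positive cost, so no optimal strategy exists in this example, which is the kind of situation that motivates passing to a relaxation.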

Let us again apply these notions to the deterministic example problem. Recall the definition of the quintuple $\mathcal{P}_{det} = (D_{det}, A_{det}, H_{det}, \Psi_{det}, J_{det})$. We construct a control problem $\tilde{\mathcal{P}}_{det} = (\tilde D_{det}, \tilde A_{det}, \tilde H_{det}, \tilde\Psi_{det}, \tilde J_{det})$ such that $\tilde{\mathcal{P}}_{det}$ is a relaxation of $\mathcal{P}_{det}$. To this end, set $\tilde D_{det} := D_{det} = [0,T] \times \mathbb{R}^d$. Recall from Definition 2.1 how the set $\mathcal{R}_\Gamma$ of deterministic relaxed controls with values in $\Gamma$ was defined. Let $\tilde A_{det}$ be the set of all $\rho \in \mathcal{R}_\Gamma$ such that Equation (2.7) under $\rho$ has a unique absolutely continuous solution for each initial condition $(t_0, \varphi) \in [0,\infty) \times \mathbb{R}^d$. Define the system space $\tilde H_{det}$ in analogy to $H_{det}$ as the set $\tilde D_{det} \times \tilde A_{det} \times C([0,\infty), \mathbb{R}^d)$. Define the system functional $\tilde\Psi_{det}$ by

$\tilde D_{det} \times \tilde A_{det} \to \tilde H_{det}, \quad (t_0, y, \rho) \mapsto \bigl(t_0, y, \rho, x^{t_0,y,\rho}(.)\bigr),$

where $x^{t_0,y,\rho}(.)$ is the unique solution to Equation (2.7), the relaxed version of the system equation (2.1), under the deterministic relaxed control $\rho$ and initial condition $(t_0, y)$. Finally, define $\tilde J_{det}$ to be the mapping $\tilde H_{det} \to [-\infty, \infty]$ given by

$\bigl(t_0, y, \rho, x(.)\bigr) \mapsto \int_{\Gamma \times [0, T - t_0]} f\bigl(t_0 + t, x(t), \gamma\bigr)\, d\rho(\gamma, t) + g\bigl(x(T - t_0)\bigr).$

In order to verify that $\tilde{\mathcal{P}}_{det} = (\tilde D_{det}, \tilde A_{det}, \tilde H_{det}, \tilde\Psi_{det}, \tilde J_{det})$ thus constructed is indeed a relaxation of $\mathcal{P}_{det}$, we recall from Section 2.1 that any ordinary control is associated with a deterministic relaxed control according to (2.6). This defines the strategy embedding. The data embedding is just the identity on $[0,T] \times \mathbb{R}^d$. The system embedding again uses the interpretation of ordinary control strategies as relaxed controls. The value functions of $\mathcal{P}_{det}$ and $\tilde{\mathcal{P}}_{det}$ are identical, because any deterministic relaxed control can be approximated by ordinary deterministic strategies, the cost functionals $J_{det}$ and $\tilde J_{det}$ coincide for ordinary strategies, and $\tilde J_{det}$ is continuous with respect to the weak topology on $\tilde H_{det}$.

The problem $\tilde{\mathcal{P}}_{det}$ can be further relaxed by allowing for relaxed control processes as strategies.
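In place of Equation (2.7), we then have the random ordinary differential equation

(2.9) $x(t, \omega) = y + \int_{\Gamma \times [0,t]} b\bigl(t_0 + s, x(s, \omega), \gamma\bigr)\, dR(\gamma, s, \omega), \quad t \geq 0,\ \omega \in \Omega,$

where $R$ is a relaxed control process on the stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$ in the sense of Definition 2.2. In the cost functional we must now take expectations, that is, instead of $\tilde J_{det}$ we have

$(t_0, y, R, X) \mapsto \mathbb{E}\Bigl(\int_{\Gamma \times [0, T - t_0]} f\bigl(t_0 + t, X(t), \gamma\bigr)\, dR(\gamma, t) + g\bigl(X(T - t_0)\bigr)\Bigr),$

where $X(.)$ is any $\mathbb{R}^d$-valued continuous stochastic process adapted to the filtration coming with the relaxed control process $R$. The value functions of $\mathcal{P}_{det}$, $\tilde{\mathcal{P}}_{det}$ and the new problem will still be identical, because an ε-optimal strategy of the deterministic problem is also ε-optimal for almost all trajectories of the randomized problem.

Let us denote by $\hat{\mathcal{P}}_{det} = (\hat D_{det}, \hat A_{det}, \hat H_{det}, \hat\Psi_{det}, \hat J_{det})$ the stochastic relaxation of $\mathcal{P}_{det}$ and $\tilde{\mathcal{P}}_{det}$. We choose $\hat D_{det} := D_{det}$ as the data set. The set of strategies $\hat A_{det}$ is the set of pairs of stochastic bases and adapted relaxed control processes over $\Gamma$. Observe that the new cost functional as defined above actually depends only on the joint distribution of the processes $X$ and $R$, and on the initial data. Therefore, as system space we could choose

$\hat D_{det} \times \{\text{probability measures on } \mathcal{B}(\mathcal{R}_\Gamma \times C([0,\infty), \mathbb{R}^d))\}.$

In place of $C([0,\infty), \mathbb{R}^d)$ we will take $D([0,\infty), \mathbb{R}^d)$, the Skorohod space of all functions $[0,\infty) \to \mathbb{R}^d$ which are continuous from the right and have limits from the left. The space $D([0,\infty), \mathbb{R}^d)$ is equipped with the Skorohod topology, cf. Billingsley (1999: Ch. 3) and also Section [...]. The Skorohod space allows for an easier approximation of functions, even when they are continuous, in particular, for the approximation by piecewise constant functions. Hence, we define the system space $\hat H_{det}$ to be the product space

$\hat D_{det} \times \{\text{probability measures on } \mathcal{B}(\mathcal{R}_\Gamma \times D([0,\infty), \mathbb{R}^d))\}.$

The system functional $\hat\Psi_{det}$ is the mapping

(2.10) $\hat D_{det} \times \hat A_{det} \ni \bigl(t_0, \varphi, (\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P}), R\bigr) \mapsto \bigl(t_0, \varphi, \mathbb{P}_{(R,X)}\bigr) \in \hat H_{det},$

where $X = x^{t_0,y,R}(.)$ is the solution to Equation (2.9) and $\mathbb{P}_{(R,X)}$ denotes the joint distribution of $R$ and $X$ under $\mathbb{P}$, the probability measure which is part of the admissible strategy. For $\hat\Psi_{det}$ to be well-defined, we need that solutions to Equation (2.9) be unique in distribution. Lastly, we rewrite the cost functional and define $\hat J_{det}$ to be the mapping $\hat H_{det} \to [-\infty, \infty]$ given by

(2.11) $(t_0, y, Q) \mapsto \int \Bigl(\int_{\Gamma \times [0, T - t_0]} f\bigl(t_0 + t, x(t), \gamma\bigr)\, d\rho(\gamma, t) + g\bigl(x(T - t_0)\bigr)\Bigr)\, dQ\bigl(\rho, x(.)\bigr),$

where the integral with respect to the probability measure $Q$ is over $\mathcal{R}_\Gamma \times D([0,\infty), \mathbb{R}^d)$.

For the approximation of a given control problem we need the following notions of discretisation.

Definition. Let $\mathcal{P} = (D, A, H, \Psi, J)$, $\bar{\mathcal{P}} = (\bar D, \bar A, \bar H, \bar\Psi, \bar J)$ be control problems. Then $\bar{\mathcal{P}}$ is a direct discretisation of $\mathcal{P}$ iff $\bar H = H$ and there is a surjective mapping $\pi_D : D \to \bar D$ and an injective mapping $\iota_A : \bar A \to A$.

The mappings $\pi_D$, $\iota_A$ which make $\bar{\mathcal{P}}$ a direct discretisation of $\mathcal{P}$ are called data projection and strategy embedding, respectively.

Definition. A control problem $\bar{\mathcal{P}}$ is a discretisation of $\mathcal{P}$ iff there are control problems $\mathcal{P}'$, $\bar{\mathcal{P}}'$ such that $\mathcal{P}'$ is compatible with $\mathcal{P}$, $\bar{\mathcal{P}}'$ is compatible with $\bar{\mathcal{P}}$, and $\bar{\mathcal{P}}'$ is a direct discretisation of $\mathcal{P}'$.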
Clearly, a control problem $\bar{\mathcal{P}}$ which is a direct discretisation of some other control problem $\mathcal{P}$ is also a discretisation of $\mathcal{P}$. When we have two control problems where one is a direct discretisation of the other, then both problems must have the same system space. This is not necessarily the case when one control problem is just a discretisation of the other.

Let us see how the discrete control problems described in Section 2.1 fit into our framework. We define them in such a way that they are direct discretisations of the control problem $\hat{\mathcal{P}}_{det}$.

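For $M \in \mathbb{N}$, a control problem $\hat{\mathcal{P}}_M = (\hat D_M, \hat A_M, \hat H_M, \hat\Psi_M, \hat J_M)$ compatible with the control problem of degree $M$ can be defined in the following way. Set $\hat D_M := \{0, \ldots, T_M\} \times S_M$, where $S_M$ is a regular triangulation of $\mathbb{R}^d$ as in Section 2.1. Set $\hat A_M := U^M_{ad}$. Thus, admissible strategies are pairs of stochastic bases $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \in \mathbb{N}_0}, \mathbb{P})$ and $(\mathcal{F}_n)$-adapted $\Gamma$-valued sequences $(\bar u(n))_{n \in \mathbb{N}_0}$. The system space $\hat H_M$ is the same as $\hat H_{det}$. Denote by $\iota_{rel}$ the mapping induced by the representation of ordinary $\Gamma$-valued control processes as relaxed control processes according to (2.8). Define the system functional $\hat\Psi_M$ as the mapping

$\hat D_M \times \hat A_M \ni \bigl(n_0, y, (\Omega, \mathcal{F}, (\mathcal{F}_n), \mathbb{P}), (\bar u(n))\bigr) \mapsto \bigl(\tfrac{n_0}{M}, y, Q\bigr) \in \hat H_{det},$

where the probability measure $Q$ is the distribution under $\mathbb{P}$ of the random variable

$\Omega \ni \omega \mapsto \Bigl(\iota_{rel}\bigl([0,\infty) \ni t \mapsto \bar u(\lfloor tM \rfloor, \omega)\bigr),\ [0,\infty) \ni t \mapsto \xi(\lfloor tM \rfloor, \omega)\Bigr) \in \mathcal{R}_\Gamma \times D([0,\infty), \mathbb{R}^d),$

and $(\xi(n))$ is the $S_M$-valued $(\mathcal{F}_n)$-adapted sequence determined by the strategy $\bar u$, the initial condition $(n_0, y)$ and the transition function $p_M$ from Section 2.1. The cost functional $\hat J_M$ is defined to be the mapping $\hat H_{det} \to [-\infty, \infty]$ given by

$(t_0, y, Q) \mapsto \int \Bigl(\sum_{n=0}^{T_M - n_0 - 1} \int_\Gamma f_M\bigl(n_0 + n, x(\tfrac{n}{M}), \gamma\bigr)\, \dot\rho(\tfrac{n}{M}, d\gamma) + g_M\bigl(x(\tfrac{T_M - n_0}{M})\bigr)\Bigr)\, dQ\bigl(\rho, x(.)\bigr),$

where the integral with respect to $Q$ is again over $\mathcal{R}_\Gamma \times D([0,\infty), \mathbb{R}^d)$.

In order to check whether the control problem $\hat{\mathcal{P}}_M$ just constructed is indeed a direct discretisation of $\hat{\mathcal{P}}_{det}$, it is enough to find a strategy embedding $\iota^M_A$ and a data projection $\pi^M_D$. Define $\iota^M_A$ to be the mapping

$\hat A_M \ni (\bar u(n))_{n \in \mathbb{N}_0} \mapsto \iota_{rel}\bigl([0,\infty) \ni t \mapsto \bar u(\lfloor tM \rfloor)\bigr) \in \hat A_{det},$

that is, the sequence $(\bar u(n))$ is associated with the relaxed control representation of its piecewise constant interpolation relative to a grid of mesh size $\frac{1}{M}$. This is the same operation as in the definition of the system functional $\hat\Psi_M$. As data projection $\pi^M_D$ we may choose the mapping

$\hat D_{det} \ni (t_0, \varphi) \mapsto \bigl(\lfloor M t_0 \rfloor, \Lambda_M(\varphi)\bigr) \in \hat D_M,$

where $\Lambda_M$ maps $\varphi \in \mathbb{R}^d$ to its nearest neighbour in $S_M \subset \mathbb{R}^d$.

Approximation and convergence

Recall that our objective is to provide sufficient conditions for the convergence of the value functions associated with a family of discrete control problems to the value function of some given continuous control problem.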
The conditions are stated for approximating control problems which are direct discretisations of the original problem.

Definition. A sequence $(\mathcal{P}_M)_{M \in \mathbb{N}}$ of optimal control problems approximates an optimal control problem $\mathcal{P}$ iff $\mathcal{P}_M$ is a direct discretisation of $\mathcal{P}$ with data projection $\pi^M_D$, each $M \in \mathbb{N}$, and $(V_M \circ \pi^M_D)_{M \in \mathbb{N}}$ converges to $V$ pointwise over $D$, where $V_M$, $V$ are the value functions associated with $\mathcal{P}_M$, $\mathcal{P}$, respectively.

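Let $\mathcal{P} = (D, A, H, \Psi, J)$, $\mathcal{P}_M = (D_M, A_M, H, \Psi_M, J_M)$, $M \in \mathbb{N}$, be control problems such that, for each $M \in \mathbb{N}$, the problem $\mathcal{P}_M$ is a direct discretisation of $\mathcal{P}$ with data projection $\pi^M_D$. Note that the system space $H$ is the same for all control problems involved, while the cost functional may vary depending on the discretisation degree $M \in \mathbb{N}$. For proving convergence we will suppose that there is a topology on $H$ such that

(H1) the mapping $J : H \to (-\infty, \infty]$ is sequentially lower semi-continuous,

(H2) $J_M$ tends to $J$ as $M \to \infty$ uniformly on sequentially compact subsets of $H$,

(H3) for each $\varphi \in D$, each $\alpha \in A$, there is a sequence $(\alpha_M)_{M \in \mathbb{N}}$ with $\alpha_M \in A_M$ such that $\limsup_{M \to \infty} J\bigl(\Psi_M(\pi^M_D(\varphi), \alpha_M)\bigr) \leq J\bigl(\Psi(\varphi, \alpha)\bigr)$,

(H4) for each $\varphi \in D$, any sequence $(\alpha_M)_{M \in \mathbb{N}}$ such that $\alpha_M \in A_M$, the closure of the set $\{\Psi_M(\pi^M_D(\varphi), \alpha_M) \mid M \in \mathbb{N}\}$ is sequentially compact in $H$,

(H5) for each $\varphi \in D$, any sequence $(\alpha_M)_{M \in \mathbb{N}}$ such that $\alpha_M \in A_M$, the limit points of the sequence $\bigl(\Psi_M(\pi^M_D(\varphi), \alpha_M)\bigr)_{M \in \mathbb{N}}$ are contained in $\Psi(\varphi, .)(A)$.

The conditions just stated guarantee that the sequence of control problems $(\mathcal{P}_M)_{M \in \mathbb{N}}$ approximates $\mathcal{P}$.

Theorem 2.1. Let $\mathcal{P}$ be a control problem and $(\mathcal{P}_M)_{M \in \mathbb{N}}$ be a sequence of direct discretisations of $\mathcal{P}$ as above. If all the control problems are finite and there is a topology on the system space $H$ such that Assumptions (H1)-(H5) hold, then $(\mathcal{P}_M)$ approximates $\mathcal{P}$.

Proof. Let $\varphi \in D$. For $M \in \mathbb{N}$ set $\varphi_M := \pi^M_D(\varphi)$. We have to show that $V_M(\varphi_M) \to V(\varphi)$ as $M \to \infty$. The first step is to check that $\limsup_{M \to \infty} V_M(\varphi_M) \leq V(\varphi)$. To this end, let $\varepsilon > 0$ and choose $\alpha \in A$ such that $V(\varphi) \geq J(\Psi(\varphi, \alpha)) - \varepsilon$. For this $\varphi$ and this $\alpha$, choose a sequence $(\alpha_M)_{M \in \mathbb{N}}$ with $\alpha_M \in A_M$ according to Assumption (H3) such that

$\limsup_{M \to \infty} J\bigl(\Psi_M(\pi^M_D(\varphi), \alpha_M)\bigr) \leq J\bigl(\Psi(\varphi, \alpha)\bigr).$

By Assumptions (H2) and (H4) we have that the difference between $J(\Psi_M(\varphi_M, \alpha_M))$ and $J_M(\Psi_M(\varphi_M, \alpha_M))$ tends to zero as $M \to \infty$. Hence we find $M_0 \in \mathbb{N}$ such that for all $M \geq M_0$

$V_M(\varphi_M) \leq J_M\bigl(\Psi_M(\varphi_M, \alpha_M)\bigr) \leq J\bigl(\Psi(\varphi, \alpha)\bigr) + \varepsilon \leq V(\varphi) + 2\varepsilon.$

Since $\varepsilon > 0$ was arbitrary, it follows that $\limsup_{M \to \infty} V_M(\varphi_M) \leq V(\varphi)$.

The second step is to show that $\liminf_{M \to \infty} V_M(\varphi_M) \geq V(\varphi)$. For each $M \in \mathbb{N}$, choose $\alpha_M \in A_M$ such that $V_M(\varphi_M) \geq J_M(\Psi_M(\varphi_M, \alpha_M)) - \frac{1}{M}$.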
Set $x_M := \Psi_M(\varphi_M, \alpha_M)$. Then, by construction,

$\liminf_{M \to \infty} V_M(\varphi_M) = \liminf_{M \to \infty} J_M(x_M).$

Assume we had $\liminf_{M \to \infty} J_M(x_M) < V(\varphi)$. By Assumption (H4), $(x_M)$ would be contained in a sequentially compact set, whence, after first passing to a subsequence along which $J_M(x_M)$ converges to its limit inferior, we could choose a convergent subsequence $(x_{M(i)}) \subset (x_M)$ with unique limit point $x := \lim_{i \to \infty} x_{M(i)}$. By Assumption (H5), there would be a strategy $\alpha \in A$ such that $x = \Psi(\varphi, \alpha)$, whence $J(x) = J(\Psi(\varphi, \alpha)) \geq V(\varphi)$. By Assumption (H1) we would have $\liminf_{i \to \infty} J(x_{M(i)}) \geq J(x)$, while Assumptions (H2) and (H4) together would imply that the difference between $J(x_{M(i)})$ and $J_{M(i)}(x_{M(i)})$ tends to zero as $i \to \infty$. This would yield

$\liminf_{i \to \infty} J_{M(i)}(x_{M(i)}) = \liminf_{i \to \infty} \Bigl(J_{M(i)}(x_{M(i)}) - J(x_{M(i)}) + J(x_{M(i)}) - J(x) + J(x)\Bigr) \geq V(\varphi),$

a contradiction to the hypothesis that $\liminf_{M \to \infty} J_M(x_M) < V(\varphi)$. Therefore, we must have $\liminf_{M \to \infty} J_M(x_M) \geq V(\varphi)$. It follows that $\liminf_{M \to \infty} V_M(\varphi_M) \geq V(\varphi)$.

The conclusion of Theorem 2.1 continues to hold if we replace Assumptions (H1) and (H5) by the following hypotheses:

(H1') the mapping $J : H \to (-\infty, \infty]$ is sequentially continuous,

(H5') for all $\varphi \in D$, any sequence $(\alpha_M)_{M \in \mathbb{N}}$ such that $\alpha_M \in A_M$, the limit points of the sequence $\bigl(\Psi_M(\pi^M_D(\varphi), \alpha_M)\bigr)_{M \in \mathbb{N}}$ are contained in the closure of $\Psi(\varphi, .)(A)$.

Let us briefly comment on Assumptions (H1)-(H5). Hypothesis (H1) is a continuity assumption on the cost functional of the original problem only. Hypothesis (H2) states that the cost functionals of the approximating problems converge locally uniformly to the costs of the original problem. Hypothesis (H3) could be called a scattering assumption, because it implies that any continuous strategy can be approximated by discrete strategies in the sense of converging costs. Hypothesis (H4) is about the existence of convergent subsequences of solutions to the dynamics of the approximating problems. It is usually a consequence of the compactification of the space of strategies mentioned in Section 2.1. Hypothesis (H5) says that limits of solutions to the approximating dynamics can be identified as solutions to the original dynamics.

There are two important points when it comes to applying Theorem 2.1. The first is to re-formulate the control problems involved so that the approximating problems are direct discretisations of the original problem. The second point is the choice of a suitable topology on the system space.

As far as the example problem is concerned, $\hat{\mathcal{P}}_{det}$, $\hat{\mathcal{P}}_M$, $M \in \mathbb{N}$, are appropriate reformulations of the original control problem and the associated discrete problems, respectively. Also, for each $M \in \mathbb{N}$, $\hat{\mathcal{P}}_M$ is a direct discretisation of $\hat{\mathcal{P}}_{det}$. It remains to choose the topology on the system space $\hat H_{det}$.
In view of how the cost functionals ae defined, the topology of weak convegence of pobability measues on BRΓ D[,, R d coupled with the standad topology on [, T ] R d is a good choice. We do not povide the convegence analysis fo the example poblem. Notice that we did not make pecise any assumptions egading the coefficients b, f, g of the oiginal poblem. In Section 2.3, howeve, the details of the convegence analysis fo a class of stochastic optimal contol poblems with delay ae woked out. 2.3 Application to stochastic contol poblems with delay Hee, we study the appoximation of cetain continuous-time stochastic optimal contol poblems with time delay in the dynamics accoding to the Makov chain method. The contol poblems whose value functions ae to be appoximated ae specified in Subsection In Subsection 2.3.2, the oiginal contol poblem is efomulated by enlaging and compactifying the set of admissible stategies. Fo the esulting elaxed contol poblem, optimal stategies ae guaanteed to exist. The dynamics of the appoximating contol poblems ae defined in Subsection 2.3.3; time as well as the state space ae discetised, and an appopiate condition of local consistency is given. In Subsection 2.3.4, the cost functionals of the appoximating poblems ae specified and convegence of the coesponding

46 34 CHAPTER 2. THE MARKOV CHAIN METHOD value functions is shown. Subsection contains a technical esult which is needed in the poof of Poposition The oiginal contol poblem We conside the contol of a dynamical system given by a one-dimensional stochastic delay diffeential equation diven by a Wiene pocess. Both dift and diffusion coefficient may depend on the solution s histoy a cetain amount of time into the past. Let > denote the delay length, i. e. the maximal length of dependence on the past. In ode to simplify the analysis, we estict attention to the case whee only the dift tem can be diectly contolled. Typically, the solution pocess of an SDDE does not enjoy the Makov popety, while the segment pocess associated with that solution does, cf. Subsection The segment pocess X t t associated with a eal-valued càdlàg pocess Xt t takes its values in D := D[, ], the space of all eal-valued càdlàg functions on the inteval [, ]. Thee ae two natual topologies on D. The fist is the one induced by the supemum nom. The second is the Skoohod topology of càdlàg convegence e. g. Billingsley, 1999: Ch. 3. The main diffeence between the Skoohod and the unifom topology lies in the diffeent evaluation of convegence of functions with jumps, which appea natually as initial segments and discetised pocesses. Fo continuous functions both topologies coincide. Simila statements hold fo D := D[, and D := D[,, the spaces of all eal-valued càdlàg functions on the intevals [, and [,, espectively. The spaces D and D will always be supposed to cay the Skoohod topology, while D will canonically be equipped with the unifom topology. Let Γ, d Γ be a compact metic space, the space of contol actions. Denote by b the dift coefficient of the contolled dynamics, and by σ the diffusion coefficient. Let W t t be a one-dimensional standad Wiene pocess on a filteed pobability space Ω, F, F t t, P satisfying the usual conditions, and let ut t be a contol pocess, i. e. 
an $(\mathcal{F}_t)$-adapted measurable process with values in $\Gamma$. Consider the controlled SDDE

(2.12) $dX(t) = b(X_t, u(t))\,dt + \sigma(X_t)\,dW(t), \quad t \ge 0.$

The control process $u(\cdot)$ together with its stochastic basis including the Wiener process is called an admissible strategy if, for every deterministic initial condition $\varphi \in D_0$, Equation (2.12) has a unique solution which is also weakly unique. Write $U_{ad}$ for the set of admissible strategies of Equation (2.12). The stochastic basis coming with an admissible control will often be omitted in the notation.

A solution in the sense used here is an adapted càdlàg process defined on the stochastic basis of the control process such that the integral version of Equation (2.12) is satisfied. Given a control process together with a standard Wiener process, a solution to Equation (2.12) is unique if it is indistinguishable from any other solution almost surely satisfying the same initial condition. A solution is weakly unique if it has the same law as any other solution with the same initial distribution and satisfying Equation (2.12) for a control process on a possibly different stochastic basis so that the joint distributions of control and driving Wiener process are the same for both solutions. Let us specify the regularity assumptions to be imposed on the coefficients $b$ and $\sigma$:

(A1) Càdlàg functionals: the mappings $(\psi, \gamma) \mapsto [t \mapsto b(\psi_t, \gamma),\ t \ge 0]$ and $\psi \mapsto [t \mapsto \sigma(\psi_t),\ t \ge 0]$ define measurable functionals $D_\infty \times \Gamma \to \tilde D_\infty$ and $D_\infty \to \tilde D_\infty$, respectively, where $D_\infty$, $\tilde D_\infty$ are equipped with their Borel σ-algebras.

(A2) Continuity of the drift coefficient: there is an at most countable subset of $[-r,0]$, denoted by $I_{ev}$, such that for every $t \ge 0$ the function defined by $D_\infty \times \Gamma \ni (\psi, \gamma) \mapsto b(\psi_t, \gamma)$ is continuous on $D_{ev}(t) \times \Gamma$ uniformly in $\gamma \in \Gamma$, where $D_{ev}(t) := \{\psi \in D_\infty \mid \psi \text{ is continuous at } t+s \text{ for all } s \in I_{ev}\}$.

(A3) Global boundedness: $|b|$, $|\sigma|$ are bounded by a constant $K > 0$.

(A4) Uniform Lipschitz condition: there is a constant $K_L > 0$ such that for all $\varphi, \tilde\varphi \in D_0$, all $\gamma \in \Gamma$

$|b(\varphi,\gamma) - b(\tilde\varphi,\gamma)| + |\sigma(\varphi) - \sigma(\tilde\varphi)| \le K_L \sup_{s \in [-r,0]} |\varphi(s) - \tilde\varphi(s)|.$

(A5) Ellipticity of the diffusion coefficient: $\sigma(\varphi) \ge \sigma_0$ for all $\varphi \in D_0$, where $\sigma_0 > 0$ is a positive constant.

Assumptions (A1) and (A4) on the coefficients allow us to invoke Theorem V.7 in Protter (2003: p. 253), which guarantees the existence of a unique solution to Equation (2.12) for every piecewise constant control attaining only finitely many different values. The boundedness Assumption (A3) poses no limitation except for the initial conditions, because the state evolution will be stopped when the state process leaves a bounded interval. Assumption (A2) allows us to use segmentwise approximations of the solution process, see the proof of Proposition 2.1. The assumptions imposed on the drift coefficient $b$ are satisfied, for example, by

(2.13) $b(\varphi, \gamma) := f\Big(\varphi(r_1), \dots, \varphi(r_n), \int_{-r}^{0} \varphi(s) w_1(s)\,ds, \dots, \int_{-r}^{0} \varphi(s) w_m(s)\,ds\Big) \cdot g(\gamma),$

where $r_1, \dots, r_n \in [-r,0]$ are fixed, $f$, $g$ are bounded continuous functions and $f$ is Lipschitz, and the weight functions $w_1, \dots, w_m$ lie in $L^1([-r,0])$. Apart from the control term, the diffusion coefficient $\sigma$ may have the same structure as $b$ in (2.13). We next give an example of a function that could be taken for $\sigma$ if the càdlàg continuity in Assumption (A1) were missing.
In Subsection 2.3.3 it will become clear that the corresponding control problem cannot be approximated by a simple discretisation procedure, because the evaluation of $\sigma(\varphi)$ for any $\varphi \in D_0$ depends on the discretisation grid. Let $A_M$ be the subset of the interval $[-r,0]$ given by

$A_M := \bigcup \Big\{ \big(t - \tfrac{r}{2^{3M}},\, t\big] \;\Big|\; t = \big(\tfrac{n}{2^M} - 1\big) r \text{ for some } n \in \{1, \dots, 2^M\} \Big\}.$

Let $A$ be the union of the sets $A_M$, $M \in \mathbb{N}$. With positive constants $\sigma_0$, $K$, we define a functional $\sigma : D_0 \to \mathbb{R}$ by

(2.14) $\sigma(\varphi) := \sigma_0 + K \wedge \sup\{ |\varphi(t) - \varphi(t-)| \mid t \in A \},$

where $\varphi(t-)$ is the left hand limit of $\varphi$ at $t \in [-r,0]$. Assumptions (A3) and (A4) are clearly satisfied if we choose $\sigma$ according to (2.14), but $\sigma$ would not induce a càdlàg functional $D_\infty \to \tilde D_\infty$. This can be seen by considering the mapping $[0,\infty) \ni t \mapsto \sigma(\psi_t)$ for a function $\psi \in D_\infty$ which is constant except for a single discontinuity. If we had defined $\sigma$ with the set $A$ being the union of only finitely many sets $A_M$, then we would have obtained a càdlàg functional.

We consider control problems in the weak formulation (cf. Yong and Zhou, 1999: p. 64). Given an admissible control $u(\cdot)$ and a deterministic initial segment $\varphi \in D_0$, denote by $X^{\varphi,u}$ the unique solution to Equation (2.12). Let $I$ be a compact interval with non-empty interior. Define the stopping time $\tau^T_{\varphi,u}$ of first exit from the interior of $I$ before time $T > 0$ by

(2.15) $\tau^T_{\varphi,u} := \inf\{t \ge 0 \mid X^{\varphi,u}(t) \notin \operatorname{int}(I)\} \wedge T.$

In order to define the costs, we prescribe a cost rate $k : \mathbb{R} \times \Gamma \to [0,\infty)$ and a boundary cost $g : \mathbb{R} \to [0,\infty)$, which we take to be jointly continuous bounded functions. Let $\beta \ge 0$ denote the exponential discount rate. Then define the cost functional on $D_0 \times U_{ad}$ by

(2.16) $J(\varphi, u) := E\Big( \int_0^{\tau} \exp(-\beta s)\, k\big(X^{\varphi,u}(s), u(s)\big)\,ds + g\big(X^{\varphi,u}(\tau)\big) \Big),$

where $\tau = \tau^T_{\varphi,u}$. Our aim is to minimize $J(\varphi, \cdot)$. We introduce the value function

(2.17) $V(\varphi) := \inf\{ J(\varphi, u) \mid u \in U_{ad} \}, \quad \varphi \in D_0.$

The control problem now consists in calculating the function $V$ and finding admissible controls that minimize $J$. Such control processes are called optimal controls or optimal strategies.

2.3.2 Existence of optimal strategies

In the class $U_{ad}$ of admissible strategies it may happen that there is no optimal control. A way out is to enlarge the class of strategies, allowing for so-called relaxed controls, cf. Subsection and the discussion after Definition 2.8 in Subsection. Let $R$ be a relaxed control process in the sense of Definition 2.2. Then Equation (2.12) takes on the form

(2.18) $dX(t) = \Big( \int_\Gamma b(X_t, \gamma)\, \dot R(t, d\gamma) \Big) dt + \sigma(X_t)\,dW(t), \quad t \ge 0,$

where $(\dot R(t, \cdot))_{t \ge 0}$ is the family of derivative measures associated with $R$. A relaxed control process together with its stochastic basis including the Wiener process is called an admissible relaxed control or an admissible strategy if, for every deterministic initial condition, Equation (2.18) has a unique solution which is also weakly unique. Denote by $\hat U_{ad}$ the set of all admissible relaxed controls. Instead of (2.16) we define a cost functional on $D_0 \times \hat U_{ad}$ by

(2.19) $\hat J(\varphi, R) := E\Big( \int_0^{\tau} \exp(-\beta s) \int_\Gamma k\big(X^{\varphi,R}(s), \gamma\big)\, \dot R(s, d\gamma)\,ds + g\big(X^{\varphi,R}(\tau)\big) \Big),$

where $X^{\varphi,R}$ is the solution to Equation (2.18) under the relaxed control process $R$ with initial segment $\varphi$ and $\tau$ is defined in analogy to (2.15). Instead of (2.17) as value function we have

(2.20) $\hat V(\varphi) := \inf\{ \hat J(\varphi, R) \mid R \in \hat U_{ad} \}, \quad \varphi \in D_0.$

The cost functional $\hat J$ depends only on the joint distribution of the solution $X^{\varphi,R}$ and the underlying control process $R$, since $\tau$, the time horizon, is a deterministic function of the solution. The distribution of $X^{\varphi,R}$, in turn, is determined by the initial condition $\varphi$ and the joint distribution of the control process and its accompanying Wiener process. Letting the time horizon vary, we may regard $\hat J$ as a function of the law of $(X, R, W, \tau)$, that is, as being defined on a subset of the set of probability measures on $\mathcal{B}(D_\infty \times \mathcal{R} \times \tilde D_\infty \times [0,\infty])$. Notice that the time interval has been compactified. The domain of definition of $\hat J$ is determined by the class of admissible relaxed controls for Equation (2.18), the definition of the time horizon and the distributions of the initial segments $X_0$.

The following proposition gives the analogue of a theorem in Kushner and Dupuis (2001) for our setting. We present the proof in detail, because the identification of the limit process is different from the classical case.

Proposition 2.1. Assume (A1)–(A4). Let $((R^M, W^M))_{M \in \mathbb{N}}$ be any sequence of admissible relaxed controls for Equation (2.18), where $(R^M, W^M)$ is defined on the filtered probability space $(\Omega_M, \mathcal{F}^M, (\mathcal{F}^M_t), P_M)$. Let $X^M$ be a solution to Equation (2.18) under control $(R^M, W^M)$ with deterministic initial condition $\varphi^M \in D_0$, and assume that $(\varphi^M)$ tends to $\varphi$ uniformly for some $\varphi \in D_0$. For each $M \in \mathbb{N}$, let $\tau^M$ be an $(\mathcal{F}^M_t)$-stopping time. Then $((X^M, R^M, W^M, \tau^M))_{M \in \mathbb{N}}$ is tight. Denote by $(X, R, W, \tau)$ a limit point of the sequence $((X^M, R^M, W^M, \tau^M))_{M \in \mathbb{N}}$. Define a filtration by $\mathcal{F}_t := \sigma(X(s), R(s), W(s), \tau \mathbf{1}_{\{\tau \le t\}},\ s \le t)$, $t \ge 0$. Then $W(\cdot)$ is an $(\mathcal{F}_t)$-adapted Wiener process, $\tau$ is an $(\mathcal{F}_t)$-stopping time, $(R, W)$ is an admissible relaxed control, and $X$ is a solution to Equation (2.18) under $(R, W)$ with initial condition $\varphi$.

Proof.
Tightness of $(X^M)$ follows from the Aldous criterion (cf. Billingsley, 1999): given $M \in \mathbb{N}$, any bounded $(\mathcal{F}^M_t)$-stopping time $\nu$ and $\delta > 0$ we have

$E_M\big( |X^M(\nu + \delta) - X^M(\nu)|^2 \,\big|\, \mathcal{F}^M_\nu \big) \le 2K^2 \delta(\delta + 1)$

as a consequence of Assumption (A3) and the Itô isometry. Notice that $X^M_0$ tends to $X_0$ as $M$ goes to infinity by hypothesis. The sequences $(R^M)$ and $(\tau^M)$ are tight, because the value spaces $\mathcal{R}$ and $[0,\infty]$, respectively, are compact. The sequence $(W^M)$ is tight, since all $W^M$ induce the same measure. Finally, componentwise tightness implies tightness of the product (cf. Billingsley, 1999: p. 65). By abuse of notation, we do not distinguish between the convergent subsequence and the original sequence and assume that $((X^M, R^M, W^M, \tau^M))$ converges weakly to $(X, R, W, \tau)$.

The random time $\tau$ is an $(\mathcal{F}_t)$-stopping time by construction of the filtration. Likewise, $R$ is $(\mathcal{F}_t)$-adapted by construction, and it is indeed a relaxed control process, because $R(\Gamma \times [0,t]) = t$, $t \ge 0$, $P$-almost surely by weak convergence of the relaxed control processes $(R^M)$ to $R$. The process $W$ has Wiener distribution and continuous paths with probability one, being the limit of standard Wiener processes. To check that $W$ is an

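$(\mathcal{F}_t)$-Wiener process, we use the martingale problem characterization of Brownian motion. To this end, for $g \in C_c(\Gamma \times [0,\infty))$, $\rho \in \mathcal{R}$ define the pairing

$(g, \rho)(t) := \int_{\Gamma \times [0,t]} g(\gamma, s)\,d\rho(\gamma, s), \quad t \ge 0.$

Notice that real-valued continuous functions on $\mathcal{R}$ can be approximated by functions of the form

$\mathcal{R} \ni \rho \mapsto \tilde H\big( (g_j, \rho)(t_i),\ (i,j) \in \mathbb{N}_p \times \mathbb{N}_q \big) \in \mathbb{R},$

where $p$, $q$ are natural numbers, $\{t_i \mid i \in \mathbb{N}_p\} \subset [0,\infty)$, and $\tilde H$, $g_j$, $j \in \mathbb{N}_q$, are suitable continuous functions with compact support and $\mathbb{N}_N := \{1, \dots, N\}$ for any $N \in \mathbb{N}$. Let $t \ge 0$, $t_1, \dots, t_p \in [0,t]$, $h \ge 0$, let $g_1, \dots, g_q$ be functions in $C_c(\Gamma \times [0,\infty))$, and let $H$ be a continuous function of $2p + p \cdot q + 1$ arguments with compact support. Since $W^M$ is an $(\mathcal{F}^M_t)$-Wiener process for each $M \in \mathbb{N}$, we have for all $f \in C^2_c(\mathbb{R})$

$E_M\Big( H\big( X^M(t_i), (g_j, R^M)(t_i), W^M(t_i), \tau^M \mathbf{1}_{\{\tau^M \le t\}},\ (i,j) \in \mathbb{N}_p \times \mathbb{N}_q \big) \cdot \Big( f(W^M(t+h)) - f(W^M(t)) - \tfrac{1}{2} \int_t^{t+h} \tfrac{\partial^2 f}{\partial x^2}(W^M(s))\,ds \Big) \Big) = 0.$

By the weak convergence of $((X^M, R^M, W^M, \tau^M))_{M \in \mathbb{N}}$ to $(X, R, W, \tau)$ we see that

$E\Big( H\big( X(t_i), (g_j, R)(t_i), W(t_i), \tau \mathbf{1}_{\{\tau \le t\}},\ (i,j) \in \mathbb{N}_p \times \mathbb{N}_q \big) \cdot \Big( f(W(t+h)) - f(W(t)) - \tfrac{1}{2} \int_t^{t+h} \tfrac{\partial^2 f}{\partial x^2}(W(s))\,ds \Big) \Big) = 0$

for all $f \in C^2_c(\mathbb{R})$. As $H$, $p$, $q$, $t_i$, $g_j$ vary over all possibilities, the corresponding random variables $H(X(t_i), (g_j, R)(t_i), W(t_i), \tau \mathbf{1}_{\{\tau \le t\}},\ (i,j) \in \mathbb{N}_p \times \mathbb{N}_q)$ induce the σ-algebra $\mathcal{F}_t$. Since $t \ge 0$, $h \ge 0$ were arbitrary, it follows that

$f(W(t)) - f(W(0)) - \tfrac{1}{2} \int_0^t \tfrac{\partial^2 f}{\partial x^2}(W(s))\,ds, \quad t \ge 0,$

is an $(\mathcal{F}_t)$-martingale for every $f \in C^2_c(\mathbb{R})$. Consequently, $W$ is an $(\mathcal{F}_t)$-Wiener process.

It remains to show that $X$ solves Equation (2.18) under control $(R, W)$ with initial condition $\varphi$. Notice that $X$ has continuous paths on $[0,\infty)$ $P$-almost surely, because the process $(X(t))_{t \ge 0}$ is the weak limit in $\tilde D_\infty$ of continuous processes. Fix $T > 0$. We have to check that, $P$-almost surely,

$X(t) = \varphi(0) + \int_0^t \int_\Gamma b(X_s, \gamma)\, \dot R(s, d\gamma)\,ds + \int_0^t \sigma(X_s)\,dW(s) \quad \text{for all } t \in [0,T].$

By virtue of the Skorohod representation theorem (cf. Billingsley, 1999: p. 70) we may assume that the processes $(X^M, R^M, W^M)$, $M \in \mathbb{N}$, are all defined on the same probability space $(\Omega, \mathcal{F}, P)$ as $(X, R, W)$ and that convergence of $(X^M, R^M, W^M)$ to $(X, R, W)$ is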

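$P$-almost sure. Since $X$, $W$ have continuous paths on $[0,T]$ and $(\varphi^M)$ converges to $\varphi$ in the uniform topology, one finds $\tilde\Omega \in \mathcal{F}$ with $P(\tilde\Omega) = 1$ such that for all $\omega \in \tilde\Omega$

$\sup_{t \in [-r,T]} |X^M(t)(\omega) - X(t)(\omega)| \to 0, \qquad \sup_{t \in [0,T]} |W^M(t)(\omega) - W(t)(\omega)| \to 0 \quad \text{as } M \to \infty,$

and also $R^M(\omega) \to R(\omega)$ in $\mathcal{R}$. Let $\omega \in \tilde\Omega$. We first show that

$\int_0^t \int_\Gamma b\big(X^M_s(\omega), \gamma\big)\, \dot R^M(s, d\gamma)(\omega)\,ds \;\to\; \int_0^t \int_\Gamma b\big(X_s(\omega), \gamma\big)\, \dot R(s, d\gamma)(\omega)\,ds \quad \text{as } M \to \infty$

uniformly in $t \in [0,T]$. As a consequence of Assumption (A4), the uniform convergence of the trajectories on $[-r,T]$ and property (2.5) of the relaxed controls, we have

$\int_{\Gamma \times [0,T]} \big| b\big(X^M_s(\omega), \gamma\big) - b\big(X_s(\omega), \gamma\big) \big|\, dR^M(\gamma, s)(\omega) \to 0 \quad \text{as } M \to \infty.$

By Assumption (A2), we find a countable set $A_\omega \subset [0,T]$ such that the mapping $(\gamma, s) \mapsto b(X_s(\omega), \gamma)$ is continuous in all $(\gamma, s) \in \Gamma \times ([0,T] \setminus A_\omega)$. Since $A_\omega$ is countable we have $R(\omega)(\Gamma \times A_\omega) = 0$. Hence, by the generalized mapping theorem (cf. Billingsley, 1999: p. 21), we obtain for each $t \in [0,T]$

$\int_{\Gamma \times [0,t]} b\big(X_s(\omega), \gamma\big)\, dR^M(\gamma, s)(\omega) \;\to\; \int_{\Gamma \times [0,t]} b\big(X_s(\omega), \gamma\big)\, dR(\gamma, s)(\omega) \quad \text{as } M \to \infty.$

The convergence is again uniform in $t \in [0,T]$, as $b$ is bounded and $R^M$, $M \in \mathbb{N}$, $R$ are all positive measures with mass $T$ on $\Gamma \times [0,T]$.

Define càdlàg processes $C^M$, $M \in \mathbb{N}$, on $[0,\infty)$ by

$C^M(t) := \varphi^M(0) + \int_{\Gamma \times [0,t]} b(X^M_s, \gamma)\, dR^M(\gamma, s), \quad t \ge 0,$

and define $C$ in analogy to $C^M$ with $\varphi$, $R$, $X$ in place of $\varphi^M$, $R^M$, $X^M$, respectively. From the above, we know that $C^M(t) \to C(t)$ holds uniformly over $t \in [0,T]$ for any $T > 0$ with probability one. Define operators $F_M : \tilde D_\infty \to \tilde D_\infty$, $M \in \mathbb{N}$, mapping càdlàg processes to càdlàg processes by

$F_M(Y)(t)(\omega) := \sigma\Big( [-r,0] \ni s \mapsto \begin{cases} Y(t+s)(\omega) & \text{if } t+s \ge 0, \\ \varphi^M(t+s) & \text{else} \end{cases} \Big), \quad t \ge 0,\ \omega \in \Omega,$

and define $F$ in the same way as $F_M$ with $\varphi^M$ replaced by $\varphi$. Observe that $X^M$ solves

$X^M(t) = C^M(t) + \int_0^t F_M(X^M)(s-)\,dW^M(s), \quad t \ge 0.$

Denote by $(\hat X(t))_{t \ge 0}$ the unique solution to

$\hat X(t) = C(t) + \int_0^t F(\hat X)(s-)\,dW(s), \quad t \ge 0,$

and set $\hat X(t) := \varphi(t)$ for $t \in [-r,0)$. Assumption (A4) and the uniform convergence of $(\varphi^M)$ to $\varphi$ imply that $F_M(\hat X)$ converges to $F(\hat X)$ uniformly on compacts in probability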

(convergence in ucp). Theorem V.15 in Protter (2003: p. 265) yields that $X^M$ converges to $\hat X$ in ucp, that is

$\sup_{t \in [0,T]} |X^M(t) - \hat X(t)| \to 0 \quad \text{as } M \to \infty$

in probability $P$ for any $T > 0$. Therefore, $X$ is indistinguishable from $\hat X$. By definition of $C$ and $F$, this implies that $\hat X$ solves Equation (2.18) under control $(R, W)$ with initial condition $\varphi$, and so does $X$.

If the time horizon were deterministic, then the existence of optimal strategies in the class of relaxed controls would be clear. Given an initial condition $\varphi \in D_0$, one would select a sequence $((R^M, W^M))_{M \in \mathbb{N}}$ such that $(\hat J(\varphi, R^M))$ converges to its infimum. By Proposition 2.1, a suitable subsequence of $((R^M, W^M))$ and the associated solution processes would converge weakly to $(R, W)$ and the associated solution to Equation (2.18). Taking into account (2.19), the definition of the costs, this in turn would imply that $\hat J(\varphi, \cdot)$ attains its minimum value at $R$ or, more precisely, at $(X, R, W)$.

A similar argument is still valid if the time horizon depends continuously on the paths with probability one under every possible solution. That is to say, the mapping

(2.21) $\hat\tau : D_\infty \to [0,\infty], \quad \hat\tau(\psi) := \inf\{t \ge 0 \mid \psi(t) \notin \operatorname{int}(I)\} \wedge T,$

is Skorohod continuous with probability one under the measure induced by any solution $X^{\varphi,R}$, $R$ any relaxed control. This is indeed the case if the diffusion coefficient $\sigma$ is bounded away from zero as required by Assumption (A5), cf. Kushner and Dupuis (2001).

By introducing relaxed controls, we have enlarged the class of possible strategies. The infimum of the costs, however, remains the same for the new class. This is a consequence of the fact that stochastic relaxed controls can be arbitrarily well approximated by piecewise constant ordinary stochastic controls which attain only a finite number of different control values. A proof of this assertion is given in Kushner (1990) in case the time horizon is finite, and extended to the case of control up to an exit time in Kushner and Dupuis (2001). Notice that nothing hinges on the presence or absence of delay in the controlled dynamics. Let us summarize our findings.
Theorem 2.2. Assume (A1)–(A5). Given any deterministic initial condition $\varphi \in D_0$, the relaxed control problem determined by (2.18) and (2.19) possesses an optimal strategy, and the minimal costs are the same as for the original control problem.

When reformulated along the lines of the example problem in Section 2.2, the relaxed control problem determined by (2.18) and (2.19) is indeed a relaxation in the sense of Definition 2.8 of the original control problem from Subsection 2.3.1.

2.3.3 Approximating chains

In order to construct finite-dimensional approximations to our control problem, we discretise time and state space. In the non-delay case a random time grid permits simple proofs. Since in the delay case the segment process must be well approximated, a deterministic grid is natural and preferable, but calls for proof techniques deviating from the classical way adopted by Kushner and Dupuis (2001) or Kushner (2005).

Denote by $h > 0$ the mesh size of an equidistant time discretisation starting at zero. Let $S_h := \sqrt{h}\,\mathbb{Z}$ be the corresponding state space, and set $I_h := I \cap S_h$. Notice that $S_h$ is countable and $I_h$ is finite. Let $\Lambda_h : \mathbb{R} \to S_h$ be a round-off function. We will simplify things even further by considering only mesh sizes $h = \frac{r}{M}$ for some $M \in \mathbb{N}$, where $r$ is the delay length. The number $M$ will be referred to as discretisation degree.

The admissible strategies for the finite-dimensional control problems correspond to piecewise constant processes in continuous time. A discrete-time process $u = (u(n))_{n \in \mathbb{N}_0}$ on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ with values in $\Gamma$ is a discrete admissible control of degree $M$ if $u$ takes on only finitely many different values in $\Gamma$ and $u(n)$ is $\mathcal{F}_{nh}$-measurable for all $n \in \mathbb{N}_0$. Denote by $(\bar u(t))_{t \ge 0}$ the piecewise constant càdlàg interpolation to $u$ on the time grid. We call a discrete-time process $(\xi(n))_{n \in \{-M, \dots, 0\} \cup \mathbb{N}}$ a discrete chain of degree $M$ if $\xi$ takes its values in $S_h$ and $\xi(n)$ is $\mathcal{F}_{nh}$-measurable for all $n \in \mathbb{N}_0$. In analogy to $\bar u$, write $(\bar\xi(t))_{t \ge -r}$ for the càdlàg interpolation to the discrete chain $(\xi(n))_{n \in \{-M, \dots, 0\} \cup \mathbb{N}}$. We denote by $\bar\xi_t$ the $D_0$-valued segment of $\bar\xi(\cdot)$ at time $t \ge 0$.

Let $\varphi \in D_0$ be a deterministic initial condition, and suppose we are given a sequence of discrete admissible controls $(u^M)_{M \in \mathbb{N}}$, that is, $u^M$ is a discrete admissible control of degree $M$ on a stochastic basis $(\Omega_M, \mathcal{F}^M, (\mathcal{F}^M_t), P_M)$ for each $M \in \mathbb{N}$. In addition, suppose that the sequence $(\bar u^M)$ of interpolated discrete controls converges weakly to some relaxed control $R$. We are then looking for a sequence approximating the solution $X$ of Equation (2.18) under control $(R, W)$ with initial condition $\varphi$, where the Wiener process $W$ has to be constructed from the approximating sequence.
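The grid objects just introduced can be made concrete in a few lines. The following is only a minimal sketch under stated assumptions: the spatial grid is taken to have spacing `sqrt(h)`, the round-off function is taken to be rounding to the nearest grid point (the text leaves $\Lambda_h$ unspecified), and the names `round_off`, `discretise_segment` and `interpolate` are hypothetical helpers, not notation from the text.

```python
import math

TOL = 1e-9  # guards floor() against floating-point error at grid points


def round_off(x: float, h: float) -> float:
    """Assumed round-off Lambda_h: nearest point of sqrt(h) * Z (ties go up)."""
    step = math.sqrt(h)
    return step * math.floor(x / step + 0.5)


def discretise_segment(phi, r: float, M: int) -> list:
    """Grid values Lambda_h(phi(n * h)) for n = -M, ..., 0 with h = r / M."""
    h = r / M
    return [round_off(phi(n * h), h) for n in range(-M, 1)]


def interpolate(values: list, first_index: int, h: float, t: float) -> float:
    """Piecewise constant cadlag interpolation: value at grid index floor(t / h)."""
    n = math.floor(t / h + TOL)
    return values[n - first_index]


if __name__ == "__main__":
    r, M = 1.0, 4                      # delay length and discretisation degree
    grid = discretise_segment(lambda t: t, r, M)
    print(grid)                        # discretised initial segment on {-M, ..., 0}
    print(interpolate(grid, -M, r / M, -0.3))
```

The interpolation is right-continuous by construction: on each interval $[nh, (n+1)h)$ it returns the value attached to the grid index $n$.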
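Given $M$-step or extended Markov transition functions $p^M : S_h^{M+1} \times \Gamma \times S_h \to [0,1]$, $M \in \mathbb{N}$, we define a sequence of approximating chains associated with $\varphi$ and $(u^M)$ as a family $(\xi^M)_{M \in \mathbb{N}}$ of processes such that $\xi^M$ is a discrete chain of degree $M$ defined on the same stochastic basis as $u^M$, provided the following conditions are fulfilled for $h = h_M := \frac{r}{M}$ tending to zero:

(i) Initial condition: $\xi^M(n) = \Lambda_h(\varphi(nh))$ for all $n \in \{-M, \dots, 0\}$.

(ii) Extended Markov property: for all $n \in \mathbb{N}_0$, all $x \in S_h$

$P_M\big( \xi^M(n+1) = x \,\big|\, \mathcal{F}^M_{nh} \big) = p^M\big( \xi^M(n-M), \dots, \xi^M(n), u^M(n), x \big).$

(iii) Local consistency with the drift coefficient:

$\mu_{\xi^M}(n) := E_M\big( \xi^M(n+1) - \xi^M(n) \,\big|\, \mathcal{F}^M_{nh} \big) = h \cdot b\big(\bar\xi^M_{nh}, u^M(n)\big) + o(h) =: h \cdot b_h\big(\bar\xi^M_{nh}, u^M(n)\big).$

(iv) Local consistency with the diffusion coefficient:

$E_M\big( (\xi^M(n+1) - \xi^M(n) - \mu_{\xi^M}(n))^2 \,\big|\, \mathcal{F}^M_{nh} \big) = h \cdot \sigma^2\big(\bar\xi^M_{nh}\big) + o(h) =: h \cdot \sigma_h^2\big(\bar\xi^M_{nh}\big).$

(v) Jump heights: there is a positive number $\tilde N$, independent of $M$, such that

$\sup_n |\xi^M(n+1) - \xi^M(n)| \le \tilde N \sqrt{h_M}.$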

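It is straightforward, under Assumptions (A3) and (A5), to construct a sequence of extended Markov transition functions such that the jump height and the local consistency conditions can be fulfilled. Assuming that the bounding constant $K$ from (A3) is a natural number, we may define the functions $p^M$ for all $M \in \mathbb{N}$ big enough by, for example,

$p^M\big( (Z(-M), \dots, Z(0)), \gamma, x \big) := \begin{cases} \frac{\sigma^2(\bar Z)}{2K^2} + \frac{\sqrt{h}}{2K}\, b(\bar Z, \gamma) & \text{if } x = Z(0) + K\sqrt{h}, \\ \frac{\sigma^2(\bar Z)}{2K^2} - \frac{\sqrt{h}}{2K}\, b(\bar Z, \gamma) & \text{if } x = Z(0) - K\sqrt{h}, \\ 1 - \frac{\sigma^2(\bar Z)}{K^2} & \text{if } x = Z(0), \\ 0 & \text{else,} \end{cases}$

where $h = h_M$, $Z = (Z(-M), \dots, Z(0)) \in S_h^{M+1}$, $\gamma \in \Gamma$, $x \in S_h$, and $\bar Z \in D_0$ is the piecewise constant interpolation associated with $Z$. The family $(p^M)$ as just defined, in turn, is all we need in order to construct a sequence of approximating chains associated with any given $\varphi$, $(u^M)$.

We will represent the interpolation $\bar\xi^M$ as a solution to an equation corresponding to Equation (2.12) with control process $\bar u^M$ and initial condition $\varphi^M$, where $\varphi^M$ is the piecewise constant $S_h$-valued càdlàg interpolation to $\varphi$, that is, $\varphi^M = \bar\xi^M_0$. Define the discrete process $(L^M(n))_{n \in \mathbb{N}_0}$ by $L^M(0) := 0$ and

$\xi^M(n) = \varphi^M(0) + \sum_{i=0}^{n-1} h \cdot b_h\big(\bar\xi^M_{ih}, u^M(i)\big) + L^M(n), \quad n \in \mathbb{N}.$

Observe that $L^M$ is a martingale in discrete time with respect to the filtration $(\mathcal{F}^M_{nh})$. Setting

$\varepsilon^M_1(t) := \sum_{i=0}^{\lfloor t/h \rfloor - 1} h \cdot b_h\big(\bar\xi^M_{ih}, \bar u^M(ih)\big) - \int_0^t b\big(\bar\xi^M_s, \bar u^M(s)\big)\,ds, \quad t \ge 0,$

the interpolated process $\bar\xi^M$ can be represented as solution to

$\bar\xi^M(t) = \varphi^M(0) + \int_0^t b\big(\bar\xi^M_s, \bar u^M(s)\big)\,ds + L^M\big(\lfloor t/h \rfloor\big) + \varepsilon^M_1(t), \quad t \ge 0.$

With $T > 0$, we have for the error term

$E_M\Big( \sup_{t \in [0,T]} |\varepsilon^M_1(t)| \Big) \le \sum_{i=0}^{\lfloor T/h \rfloor - 1} h \cdot E_M\big| b_h\big(\bar\xi^M_{ih}, u^M(i)\big) - b\big(\bar\xi^M_{ih}, u^M(i)\big) \big| + K \cdot h + \int_0^{h \lfloor T/h \rfloor} E_M\big| b\big(\bar\xi^M_{h \lfloor s/h \rfloor}, \bar u^M(s)\big) - b\big(\bar\xi^M_s, \bar u^M(s)\big) \big|\,ds,$

which tends to zero as $M$ goes to infinity by Assumptions (A2), (A3), dominated convergence and the defining properties of $(\xi^M)$. Moreover, $|\varepsilon^M_1(t)|$ is bounded by $2K \cdot T$ for all $t \in [0,T]$ and all $M$ big enough, whence also

$E_M\Big( \sup_{t \in [0,T]} |\varepsilon^M_1(t)|^2 \Big) \to 0 \quad \text{as } M \to \infty.$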
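The example transition function above can be checked numerically. The sketch below is an illustration under assumptions, not the author's implementation: it reads the example kernel as a three-point distribution with jumps $\pm K\sqrt{h}$ and $0$, and the function names, the toy coefficients and the calling convention (coefficients receive the last $M + 1$ grid values as a list) are all hypothetical.

```python
import math
import random


def transition_probs(sigma_val: float, b_val: float, K: float, h: float):
    """Probabilities of the jumps +K*sqrt(h), -K*sqrt(h) and 0, in that order."""
    base = sigma_val ** 2 / (2.0 * K ** 2)
    tilt = math.sqrt(h) * b_val / (2.0 * K)
    probs = (base + tilt, base - tilt, 1.0 - sigma_val ** 2 / K ** 2)
    if min(probs) < 0.0:
        raise ValueError("h is too large for these coefficient values")
    return probs


def one_step_moments(sigma_val: float, b_val: float, K: float, h: float):
    """Conditional mean and variance of a single chain increment."""
    p_up, p_down, _ = transition_probs(sigma_val, b_val, K, h)
    jump = K * math.sqrt(h)
    mean = jump * (p_up - p_down)
    var = jump ** 2 * (p_up + p_down) - mean ** 2
    return mean, var


def simulate_chain(initial, b, sigma, control, K, h, n_steps, rng):
    """Simulate n_steps of the chain; `initial` holds the M + 1 starting values."""
    path = list(initial)
    window = len(initial)
    jump = K * math.sqrt(h)
    for n in range(n_steps):
        segment = path[-window:]            # values at times (n - M) h, ..., n h
        gamma = control(n, segment)         # discrete control, adapted to the past
        p_up, p_down, _ = transition_probs(sigma(segment), b(segment, gamma), K, h)
        u = rng.random()
        step = jump if u < p_up else (-jump if u < p_up + p_down else 0.0)
        path.append(path[-1] + step)
    return path


if __name__ == "__main__":
    # Toy coefficients (assumptions): bounded by K = 2, sigma bounded below by 0.5.
    drift = lambda seg, gamma: gamma * math.tanh(seg[0])      # delayed value
    diffusion = lambda seg: 1.0 + 0.5 * math.tanh(seg[-1])    # current value
    path = simulate_chain([0.0] * 5, drift, diffusion, lambda n, seg: 1.0,
                          2.0, 0.01, 100, random.Random(0))
    print(one_step_moments(1.0, 0.5, 2.0, 0.01))
```

For this kernel the conditional mean of an increment equals $h \cdot b$ exactly, and the conditional variance equals $h \sigma^2 - h^2 b^2$, which is $h \sigma^2 + o(h)$ as required by the local consistency conditions.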

The discrete-time martingale $L^M$ can be rewritten as a discrete stochastic integral. Define $(W^M(n))_{n \in \mathbb{N}_0}$ by setting $W^M(0) := 0$ and

$W^M(n) := \sum_{i=0}^{n-1} \frac{1}{\sigma\big(\bar\xi^M_{ih}\big)} \big( L^M(i+1) - L^M(i) \big), \quad n \in \mathbb{N}.$

Using the piecewise constant interpolation $\bar W^M$ of $W^M$, the process $\bar\xi^M$ can be expressed as the solution to

(2.22) $\bar\xi^M(t) = \varphi^M(0) + \int_0^t b\big(\bar\xi^M_s, \bar u^M(s)\big)\,ds + \int_0^t \sigma\big(\bar\xi^M_{h \lfloor s/h \rfloor}\big)\,d\bar W^M(s) + \varepsilon^M_2(t), \quad t \ge 0,$

where the error terms $(\varepsilon^M_2)$ converge to zero as $(\varepsilon^M_1)$ before.

We are now prepared for the convergence result, which should be compared to the corresponding theorem in Kushner and Dupuis (2001). The proof is similar to that of Proposition 2.1. We merely point out the main differences.

Proposition 2.2. Assume (A1)–(A5). For each $M \in \mathbb{N}$, let $\tau^M$ be a stopping time with respect to the filtration generated by $(\bar\xi^M(s), \bar u^M(s), \bar W^M(s),\ s \le t)$. Let $R^M$ denote the relaxed control representation of $\bar u^M$. Suppose $(\varphi^M)$ converges to the initial condition $\varphi$ uniformly on $[-r,0]$. Then $((\bar\xi^M, R^M, \bar W^M, \tau^M))_{M \in \mathbb{N}}$ is tight. For a limit point $(X, R, W, \tau)$ set $\mathcal{F}_t := \sigma(X(s), R(s), W(s), \tau \mathbf{1}_{\{\tau \le t\}},\ s \le t)$, $t \ge 0$. Then $W$ is an $(\mathcal{F}_t)$-adapted Wiener process, $\tau$ is an $(\mathcal{F}_t)$-stopping time, $(R, W)$ is an admissible relaxed control, and $X$ is a solution to Equation (2.18) under $(R, W)$ with initial condition $\varphi$.

Proof. The main differences in the proof are establishing the tightness of $(\bar W^M)$ and the identification of the limit points. We calculate the order of convergence for the discrete-time previsible quadratic variations of $(W^M)$:

$\langle W^M \rangle_n = \sum_{i=0}^{n-1} E\big( (W^M(i+1) - W^M(i))^2 \,\big|\, \mathcal{F}^M_{ih} \big) = nh + o(h) \sum_{i=0}^{n-1} \frac{1}{\sigma^2\big(\bar\xi^M_{ih}\big)}$

for all $M \in \mathbb{N}$, $n \in \mathbb{N}_0$. Taking into account Assumption (A5) and the definition of the time-continuous processes $\bar W^M$, we see that $\langle \bar W^M \rangle$ tends to $\mathrm{Id}_{[0,\infty)}$ in probability uniformly on compact time intervals. By Theorem VIII.3.11 of Jacod and Shiryaev (1987: p. 432) we conclude that $(\bar W^M)$ converges weakly in $\tilde D_\infty$ to a standard Wiener process $W$.
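The order of the previsible quadratic variation can be illustrated with a short computation. This is a sketch under assumptions: it uses a three-point kernel whose increments have conditional variance $h\sigma^2 - h^2 b^2$ (one way to realise the local consistency conditions), and it freezes the coefficient values along the path purely for illustration; the function names are hypothetical.

```python
def increment_variance(sigma_val: float, b_val: float, h: float) -> float:
    """Conditional variance of W^M(i+1) - W^M(i) for the assumed three-point kernel.

    The chain increment has variance h * sigma^2 - (h * b)^2; dividing the
    centred increment by sigma rescales the variance by 1 / sigma^2.
    """
    chain_variance = h * sigma_val ** 2 - (h * b_val) ** 2
    return chain_variance / sigma_val ** 2


def predictable_qv(t: float, h: float, sigma_val: float, b_val: float) -> float:
    """Previsible quadratic variation of W^M at time t with frozen coefficients."""
    n = int(round(t / h))
    return sum(increment_variance(sigma_val, b_val, h) for _ in range(n))


if __name__ == "__main__":
    for M in (10, 100, 1000):
        h = 1.0 / M   # delay length r = 1 (assumption)
        print(M, predictable_qv(1.0, h, sigma_val=1.0, b_val=1.0))
```

With frozen coefficients the deviation from $t$ is $t \cdot h \cdot b^2 / \sigma^2$, which is at most $t \cdot h \cdot K^2 / \sigma_0^2$ under (A3) and (A5). This shows why the lower bound on $\sigma$ matters for the convergence to the identity.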
That $W$ has independent increments with respect to the filtration $(\mathcal{F}_t)$ can be seen by considering the first and second conditional moments of the increments of $W^M$ for each $M \in \mathbb{N}$ and applying the conditions on local consistency and the jump heights of $(\xi^M)$.

Suppose $((\bar\xi^M, R^M, \bar W^M))$ is weakly convergent with limit point $(X, R, W)$. The remaining different part is the identification of $X$ as a solution to Equation (2.18) under the relaxed control $(R, W)$ with initial condition $\varphi$. Notice that $X$ is continuous on $[0,\infty)$ because of the condition on the jump heights of $(\xi^M)$, cf. Ethier and Kurtz (1986). Let us define càdlàg processes $C^M$, $C$ on $[0,\infty)$ by

$C^M(t) := \varphi^M(0) + \int_0^t b\big(\bar\xi^M_s, \bar u^M(s)\big)\,ds + \varepsilon^M_2(t), \quad t \ge 0,$

$C(t) := \varphi(0) + \int_{\Gamma \times [0,t]} b(X_s, \gamma)\,dR(s, \gamma), \quad t \ge 0.$

Then $C$, $C^M$ are bounded on compact time intervals uniformly in $M \in \mathbb{N}$. Invoking Skorohod's representation theorem, one establishes weak convergence of $(C^M)$ to $C$ as in the proof of Proposition 2.1. The sequence $(\bar W^M)$ is of uniformly controlled variations, hence a good sequence of integrators in the sense of Kurtz and Protter (1991), because the jump heights are uniformly bounded and $\bar W^M$ is a martingale for each $M \in \mathbb{N}$. We have weak convergence of $(\bar W^M)$ to $W$. The results in Kurtz and Protter (1991) guarantee weak convergence of the corresponding adapted quadratic variation processes, that is, $([\bar W^M, \bar W^M])$ converges weakly to $[W, W]$ in $\tilde D_\infty = D_{\mathbb{R}}([0,\infty))$, where the square brackets indicate the adapted quadratic (co-)variation. Convergence also holds for the sequence of process pairs $(\bar W^M, [\bar W^M, \bar W^M])$ in $D_{\mathbb{R}^2}([0,\infty))$, see Theorem 36 in Kurtz and Protter (2004).

We now know that each of the sequences $(\bar\xi^M)$, $(C^M)$, $(\bar W^M)$, $([\bar W^M, \bar W^M])$ is weakly convergent in $D_{\mathbb{R}}([0,\infty))$. Actually, we have weak convergence for the sequence of process quadruples $(\bar\xi^M, C^M, \bar W^M, [\bar W^M, \bar W^M])$ in $D_{\mathbb{R}^4}([0,\infty))$. To see this, notice that each of the sequences $(\bar\xi^M + C^M)$, $(\bar\xi^M + \bar W^M)$, $(\bar\xi^M + [\bar W^M, \bar W^M])$, $(C^M + \bar W^M)$, $(C^M + [\bar W^M, \bar W^M])$, and $(\bar W^M + [\bar W^M, \bar W^M])$ is tight in $D_{\mathbb{R}}([0,\infty))$, because the limit processes $C$, $X$, $W$, and $[W, W] = \mathrm{Id}_{[0,\infty)}$ are all continuous on $[0,\infty)$. According to Problem 22 in Ethier and Kurtz (1986: p. 153) this implies tightness of the quadruple sequence in $D_{\mathbb{R}^4}([0,\infty))$. Since the four component sequences are all weakly convergent, the four-dimensional sequence must have a unique limit point, namely $(X, C, W, [W, W])$.

By virtue of Skorohod's theorem, we may again work under $P$-almost sure convergence. Since $C$, $X$, $W$, $[W, W]$ are all continuous, it follows that $C^M \to C$, $\bar\xi^M \to X$, $\bar W^M \to W$, $[\bar W^M, \bar W^M] \to [W, W]$ uniformly on compact subintervals of $[0,\infty)$ with probability one. Define the mapping $F : D_0 \times \tilde D_\infty \to \tilde D_\infty$ by

$F(\varphi, x)(t) := \sigma\Big( [-r,0] \ni s \mapsto \begin{cases} x(t+s) & \text{if } t+s \ge 0, \\ \varphi(t+s) & \text{else} \end{cases} \Big), \quad t \ge 0.$

For $M \in \mathbb{N}$, let $F_M$ be the mapping from $\tilde D_\infty$ to $\tilde D_\infty$ given by $F_M(x) := F(\varphi^M, x)$.
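Let $H_M : \tilde D_\infty \to \tilde D_\infty$ be the càdlàg interpolation operator of degree $M$, that is, $H_M(x)$ is the piecewise constant càdlàg interpolation to $x \in \tilde D_\infty$ along the time grid of mesh size $\frac{r}{M}$ starting at zero. Define $\bar F_M : \tilde D_\infty \to \tilde D_\infty$ by

$\bar F_M(x)(t) := F\big(\varphi^M, H_M(x)\big)\big(\lfloor t \rfloor_M\big), \quad t \ge 0,$

where $\lfloor t \rfloor_M := \frac{r}{M} \lfloor \frac{M}{r} t \rfloor$. If $\psi \in D_\infty$, we will take $F_M(\psi)$, $\bar F_M(\psi)$ and $F(\psi)$ to equal $F_M(x)$, $\bar F_M(x)$ and $F(\varphi, x)$, respectively, where $x$ is the restriction of $\psi$ to $[0,\infty)$. Equation (2.22) translates to

$\bar\xi^M(t) = C^M(t) + \int_0^t \bar F_M(\bar\xi^M)(s-)\,d\bar W^M(s), \quad t \ge 0.$

Let $\hat\xi$ be the unique càdlàg process solving

$\hat\xi(s) = \varphi(s), \quad s \in [-r,0), \qquad \hat\xi(t) = C(t) + \int_0^t F(\hat\xi)(s-)\,dW(s), \quad t \ge 0.$

Fix $T > 0$. Since $\bar\xi^M$ converges to $X$ as $M$ goes to infinity uniformly on compacts with probability one, it is enough to show that

$E\Big( \sup_{t \in [-r,T]} |\hat\xi(t) - \bar\xi^M(t)|^2 \Big) \to 0 \quad \text{as } M \to \infty.$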

First observe that

$E\Big( \sup_{t \in [0,T]} |C(t) - C^M(t)|^2 \Big) \to 0, \qquad \sup_{t \in [-r,0)} |\hat\xi(t) - \bar\xi^M(t)|^2 \to 0 \quad \text{as } M \to \infty,$

because $C$ is uniformly bounded on compact time intervals and $\varphi$ is càdlàg and continuous on $[-r,0)$. Given $\varepsilon > 0$, by Lemma 2.1 in Section 2.3.5 and by Gronwall's lemma we find that there is a positive number $M_0 = M_0(\varepsilon)$ such that for all $M \ge M_0$

$E\Big( \sup_{t \in [0,T]} \Big| \int_0^t F(\hat\xi)(s-)\,dW(s) - \int_0^t \bar F_M(\bar\xi^M)(s-)\,d\bar W^M(s) \Big|^2 \Big) \le 76\,T\,\varepsilon\,(K^2 + 1) \exp\big(4 K_L^2 T\big).$

This yields the required convergence, and the assertion follows.

If we consider approximations along all equidistant partitions of $[-r,0]$, then the hypothesis about the uniform convergence of the initial conditions implies that $\varphi$ must be continuous on $[-r,0] \setminus \{0\}$. In case $\varphi$ has jumps at positions locatable on one of the equidistant partitions, the convergence results continue to hold when we restrict to a sequence of refining partitions.

2.3.4 Convergence of the minimal costs

The objective behind the introduction of sequences of approximating chains was to obtain a device for approximating the value function $V$ of the original problem. At this point we define, for each discretisation degree $M \in \mathbb{N}$, a discrete control problem with cost functional $J_M$ so that $J_M$ is an approximation of the cost functional $J$ of the original problem in the following sense: given a suitable initial segment $\varphi \in D_0$ and a sequence of discrete admissible controls $(u^M)$ such that $(\bar u^M)$ weakly converges to a relaxed control $R$, we have $J_M(\varphi, u^M) \to \hat J(\varphi, R)$ as $M$ tends to infinity. Under the assumptions introduced above, it will follow that also the value functions associated with the discrete cost functionals converge to the value function of the original problem.

Fix $M \in \mathbb{N}$, and set $h := \frac{r}{M}$. Denote by $U^M_{ad}$ the set of discrete admissible controls of degree $M$. Define the cost functional of degree $M$ by

(2.23) $J_M(\varphi, u) := E\Big( \sum_{n=0}^{N_h - 1} \exp(-\beta n h)\, k\big(\xi(n), u(n)\big) \cdot h + g\big(\xi(N_h)\big) \Big),$

where $\varphi \in D_0$, $u \in U^M_{ad}$ is defined on the stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ and $(\xi(n))$ is a discrete chain of degree $M$ defined according to $p^M$ and $u$ with initial condition $\varphi$.
The discrete exit time step $N_h$ is given by

(2.24) $N_h := \min\{ n \in \mathbb{N}_0 \mid \xi(n) \notin I_h \} \wedge \lfloor T/h \rfloor.$

Denote by $\bar\tau^M := h \cdot N_h$ the exit time for the corresponding interpolated processes. The value function of degree $M$ is defined as

(2.25) $V_M(\varphi) := \inf\{ J_M(\varphi, u) \mid u \in U^M_{ad} \}, \quad \varphi \in D_0.$

We are now in a position to state the result about convergence of the minimal costs. Proposition 2.3 and Theorem 2.3 are comparable to theorems in Kushner

and Dupuis (2001). Let us suppose that the initial condition $\varphi \in D_0$ and the sequence of partitions of $[-r,0]$ are such that the discretised initial conditions converge to $\varphi$ uniformly on $[-r,0]$.

Proposition 2.3. Assume (A1)–(A5). If the sequence $((\bar\xi^M, \bar u^M, \bar W^M, \bar\tau^M))$ of interpolated processes converges weakly to a limit point $(X, R, W, \tau)$, then $X$ is a solution to Equation (2.18) under relaxed control $(R, W)$ with initial condition $\varphi$, $\tau$ is the exit time for $X$ as given by (2.15), and we have $J_M(\varphi, u^M) \to \hat J(\varphi, R)$ as $M \to \infty$.

Proof. The convergence assertion for the costs is a consequence of Proposition 2.2, the fact that, by virtue of Assumption (A5), the exit time $\hat\tau$ defined in (2.21) is Skorohod-continuous, and the definition of $J_M$ and $J$ (or $\hat J$).

Theorem 2.3. Assume (A1)–(A5). Then we have $\lim_{M \to \infty} V_M(\varphi) = V(\varphi)$.

Proof. First notice that $\liminf_{M \to \infty} V_M(\varphi) \ge V(\varphi)$ as a consequence of Propositions 2.2 and 2.3. In order to show $\limsup_{M \to \infty} V_M(\varphi) \le V(\varphi)$ choose a relaxed control $(R, W)$ so that $\hat J(\varphi, R) = V(\varphi)$ according to Proposition 2.1. Given $\varepsilon > 0$, one can construct a sequence of discrete admissible controls $(u^M)$ such that $((\bar\xi^M, \bar u^M, \bar W^M, \bar\tau^M))$ is weakly convergent, where $\bar\xi^M$, $\bar W^M$, $\bar\tau^M$ are constructed as above, and

$\limsup_{M \to \infty} |J_M(\varphi, u^M) - \hat J(\varphi, R)| \le \varepsilon.$

The existence of such a sequence of discrete admissible controls is guaranteed, cf. the discussion at the end of Subsection 2.3.2. By definition, $V_M(\varphi) \le J_M(\varphi, u^M)$ for each $M \in \mathbb{N}$. Using Proposition 2.3 we find that

$\limsup_{M \to \infty} V_M(\varphi) \le \limsup_{M \to \infty} J_M(\varphi, u^M) \le V(\varphi) + \varepsilon,$

and since $\varepsilon$ was arbitrary, the assertion follows.

The assertion of Theorem 2.3 corresponds to the convergence statement of Theorem 2.1. Let us check whether the hypotheses of Theorem 2.1 are satisfied. Hypothesis (H1) is met since the cost functional $\hat J$ given by (2.19) may be regarded as a mapping defined on the product of $D_0$ and a set of probability measures on $\mathcal{B}(D_\infty \times \mathcal{R} \times \tilde D_\infty \times [0,\infty])$ with values in $[0,\infty)$, which is continuous with respect to the topology of weak convergence on the second component and uniform convergence on $D_0$. Hypothesis (H2) is satisfied because of (2.23), the definition of the discrete cost functionals.
Hypothesis (H4) is a consequence of the first part of Proposition 2.2 and the compactness of the space of relaxed control processes. Hypothesis (H5) follows from the second part of Proposition 2.2. Lastly, Hypothesis (H3), which we have skipped so far, is implied by Proposition 2.3 and the fact that continuous-time relaxed control processes can be approximated in the sense of weak convergence by ordinary control processes which are piecewise constant on uniform grids of mesh size $\frac{1}{M}$. A different criterion for the approximation of continuous-time strategies will be applied in Chapter 3. There, not only the drift coefficient of the state equation, but also the diffusion coefficient may be controlled.
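To make the discrete cost functional (2.23) and the exit step (2.24) tangible, the following sketch estimates $J_M(\varphi, u)$ by plain Monte Carlo simulation for a constant control. Everything here is an assumption made for illustration: the three-point kernel with jumps $\pm K\sqrt{h}$, the constant control, the calling convention for the coefficients, and the function names. It does not compute the value function $V_M$, which would require minimising over controls.

```python
import math
import random


def sample_step(sigma_val, b_val, K, h, rng):
    """One increment of the assumed three-point kernel: +-K*sqrt(h) or 0."""
    base = sigma_val ** 2 / (2.0 * K ** 2)
    tilt = math.sqrt(h) * b_val / (2.0 * K)
    jump = K * math.sqrt(h)
    u = rng.random()
    if u < base + tilt:
        return jump
    if u < 2.0 * base:
        return -jump
    return 0.0


def estimate_cost(initial, b, sigma, k, g, gamma, K, h, T, beta,
                  interval, n_paths, seed=0):
    """Monte Carlo estimate of J_M(phi, u) for the constant control u = gamma.

    `initial` holds the M + 1 discretised values of the initial segment,
    `interval` = (lo, hi) describes the compact interval I.
    """
    rng = random.Random(seed)
    lo, hi = interval
    max_steps = math.floor(T / h + 1e-9)      # floor(T / h) with a rounding guard
    window = len(initial)
    total = 0.0
    for _ in range(n_paths):
        path = list(initial)
        cost = 0.0
        n = 0
        # Loop ends at N_h: first n with xi(n) outside I, capped at floor(T / h).
        while n < max_steps and lo <= path[-1] <= hi:
            cost += math.exp(-beta * n * h) * k(path[-1], gamma) * h
            segment = path[-window:]
            path.append(path[-1] + sample_step(sigma(segment),
                                               b(segment, gamma), K, h, rng))
            n += 1
        total += cost + g(path[-1])           # boundary cost at xi(N_h)
    return total / n_paths


if __name__ == "__main__":
    M, r = 10, 1.0
    h = r / M
    value = estimate_cost(
        initial=[0.0] * (M + 1),
        b=lambda seg, gamma: gamma * math.tanh(seg[0]),   # depends on delayed state
        sigma=lambda seg: 1.0,
        k=lambda x, gamma: min(x * x + gamma * gamma, 4.0),
        g=lambda x: min(abs(x), 2.0),
        gamma=-0.5, K=2.0, h=h, T=1.0, beta=0.1,
        interval=(-1.0, 1.0), n_paths=2000)
    print(value)
```

Comparing such estimates across a finite set of constant controls gives only an upper bound for $V_M(\varphi)$, since admissible discrete controls may depend on the past of the chain.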

59 2.3. APPLICATION TO STOCHASTIC CONTROL PROBLEMS WITH DELAY An auxiliay esult The poof of the following lemma makes use of standad techniques. In the context of appoximation of SDDEs, it should be compaed to Section 7 in Mao 23. Lemma 2.1. In the notation and unde the assumptions of Poposition 2.2, it holds that fo evey ε > thee is M N such that fo all M M, Poof. Clealy, 2.26 E 4K 2 L E sup t [,T ] T 2 E + 2 E sup t [,T ] E sup t [,T ] F ˆξs dw s sup t [,s] sup t [,T ] F M ξ M s d W M s 2 ˆξt ξm t 2 ds + 76T εk F ˆξs dw s F ˆξs dw s F M ξ M s dw s F M ξ M s d W M s 2 F M ξ M s dw s 2 F M ξ M s d W M s 2 Using Doob s maximal inequality, Itô s isomety, Fubini s theoem and Assumption A4, fo the fist expectation on the ight hand side of 2.26 we obtain the estimate E sup t [,T ] F ˆξs dw s F M ξ M s dw s T 4 E F ˆξs F M ξ M s 2 ds 4K 2 L T E sup t [,s] ˆξt ξm t 2 ds. Fix any N N. The second expectation on the ight hand side of 2.26 splits up into thee tems accoding to E F M ξ M s dw s F M ξ M s d W M s E + 4 E + 4 E sup t [,T ] sup t [,T ] sup t [,T ] sup t [,T ] F M ξ M s dw s F N ξ M s dw s F N ξ M s d W M s F N ξ M s dw s 2 F N ξ M s d W M s 2 F M ξ M s d W M s 2. Again using Doob s maximal inequality and a genealized vesion of Itô s isomety cf. Potte, 23: pp , fo the fist and thid expectation on the ight hand side of

60 48 CHAPTER 2. THE MARKOV CHAIN METHOD Inequality 2.28 we obtain 2.29 and 2.3 E sup t [,T ] F M ξ M s dw s T 4 E F M ξ M s F N ξ M s 2 ds E sup t [,T ] F N ξ M s d W M s F N ξ M s dw s 2 F M ξ M s d W M s 2 T 4 E F M ξ M s F N ξ M s 2 d [ W M, W M] s. Notice that, path-by-path, we have T F M ξ M s F N ξ M s 2 d [ W M, W M] s M T i= F M ξ M M i F N ξ M M i 2 [ W M, W M] M i+1 [ W M, W M] M i. In ode to estimate the second expectation on the ight hand side of 2.28, obseve that, P-almost suely, fo all t [, T ] F N ξ M s dw s = F N ξ M t N W t W t N + N t 1 i= F N ξ M N i W N i+1 W N i, as F N ξ M is piecewise constant on the gid of mesh size N. On the othe hand, F N ξ M s d W M s = F N ξ M t N W M t W M t N + N t 1 By Assumption A3, σ is bounded by a constant K, hence i= F N ξ M N i W M N i+1 W M N i. 2K N F N ξ M s dw s F N ξ M s d W M s t sup W s W M s 2K N T sup W s W M s. s [,t] s [,T ] Bounded convegence yields fo each fixed N N 2.31 E sup t [,T ] F N ξ M s dw s F N ξ M s d W M s 2 M.

Let $x, y \in D$. By Assumption (A4) we have for all $t \in [0,T]$
\[
\begin{aligned}
|F_N(y)(t) - F(\varphi,x)(t)| &= |F(\varphi_N, H_N(y))(\lfloor t\rfloor_N) - F(\varphi,x)(t)| \\
&\le K_L \sup_{s\in[-r,0)} |\varphi_N(s)-\varphi(s)| + K_L \sup_{s\in[0,T]} |H_N(y)(s)-x(s)| + |F(\varphi,x)(\lfloor t\rfloor_N) - F(\varphi,x)(t)|.
\end{aligned}
\]
By Assumption (A1), the map $[0,T] \ni t \mapsto F(\varphi,x)(t)$ is càdlàg, whence it has only finitely many jumps larger than any given positive lower bound. Thus, given $\varepsilon > 0$, there is a finite subset $A = A(\varepsilon,T,\varphi,x) \subset [0,T]$ such that
\[
\limsup_{N\to\infty} |F(\varphi,x)(\lfloor t\rfloor_N) - F(\varphi,x)(t)| \le \varepsilon \quad \text{for all } t \in [0,T]\setminus A.
\]
Moreover, the convergence is uniform in the following sense (cf. Billingsley, 1999): We can choose the finite set $A$ in such a way that there is $N_0 = N_0(\varepsilon,T,\varphi,x) \in \mathbb{N}$ so that
\[
|F(\varphi,x)(\lfloor t\rfloor_N) - F(\varphi,x)(t)| \le 2\varepsilon \quad \text{for all } t \in [0,T]\setminus A,\ N \ge N_0.
\]
Given $\varepsilon > 0$, we therefore find $N \in \mathbb{N}$ and an event $\tilde\Omega$ with $P(\tilde\Omega) \ge 1-\varepsilon$ so that for each $\omega \in \tilde\Omega$ there is a finite subset $A_\omega \subset [0,T]$ with $\#A_\omega \le N_\varepsilon$ and such that for all $t \in [0,T]\setminus A_\omega$ and all $M \ge N$ we have
\[
|F_M(\xi^M_\omega)(t) - F(X_\omega)(t)|^2 + |F_N(\xi^M_\omega)(t) - F(X_\omega)(t)|^2 \le \varepsilon.
\]
The expression on the right hand side of (2.29) is then bounded from above by $9T\varepsilon(K^2+1)$. For $M$ big enough, also the expression on the right hand side of (2.30) is smaller than $9T\varepsilon(K^2+1)$, and the expectation in (2.31) is smaller than $T\varepsilon$.

2.4 Discussion

Kushner's method applies to approximation schemes which replace a given optimal control problem, usually one over continuous time and with continuous state space, by a sequence of approximating control problems, usually defined for discrete time and discrete state space. When the dynamics of the original problem are described by some kind of deterministic or stochastic differential equation, conditions of local consistency indicate how to choose the dynamics of the approximating problems in a consistent way; that is, in such a way that the associated value functions converge to the value function of the original problem. Local consistency is easy to check when the dynamics are discretised according to some finite differences scheme.

A crucial assumption on the original problem is that the space of control actions is compact. This assumption is less restrictive than it might appear insofar as, in actual numerical computations, optimisation is often performed only with respect to a finite set of control actions; see, for example, the appendix by M. Falcone in Bardi and Capuzzo Dolcetta. However, for the proof of convergence to work, compactness of the space of control actions must carry over to the space of admissible strategies; at least, compactness for sequences of solutions must hold as required by Hypothesis (H4) of Theorem 2.1, the abstract convergence result.

In the case of dynamics described by a stochastic differential equation, compactness of the space of strategies is achieved by introducing relaxed control processes in a way analogous to the deterministic case, provided the diffusion coefficient is independent of the control. This is the case for the control problems of Section 2.3, for instance. When the diffusion coefficient also depends on the control, the situation gets more complicated. Martingale measures on the space of control actions may be introduced to obtain the desired compactness of the space of strategies, see Kushner and Dupuis (2001: Ch. 13).

As far as the structure of the control problem that has to be approximated is concerned, the Markov chain method is extremely general. This was demonstrated in Section 2.2, where we set up an abstract framework which encompasses quite arbitrary optimal control problems. The generality of the approach is also its limitation. In particular, it is not clear how to obtain a priori bounds on the approximation error, in addition to convergence.

A priori bounds on the discretisation error are important for several reasons. Error bounds provide an assurance, though usually an over-pessimistic one, about the accuracy of the approximations relative to the original problem. They also allow one to compare different schemes for the same class of problems, or they may serve as a benchmark for new schemes. Lastly, they give an indication of the computational resources required for solving the discretised problems.

In Chapter 3, we will change our approach and develop a more specific scheme for the approximation of control problems with delay, exploiting, in particular, the additivity of the minimal costs as expressed by the Principle of Dynamic Programming. We will obtain bounds on the error for the discretisation in time. Questions of computational requirements and complexity for the solution of the resulting semi-discrete problems will be discussed. The idea, also present in Kushner's method, to discretise a continuous-time control problem by constructing a sequence of approximating problems will be retained.

Chapter 3

Two-step time discretisation and error bounds

In this chapter, we study a semi-discretisation scheme for stochastic systems with delay. Material of this chapter appears in Fischer and Nappo (2007). The control problems to be approximated are characterised as follows: The system dynamics are given by a multidimensional controlled stochastic functional differential equation with bounded memory driven by a Wiener process. The driving noise process and the state process may have different dimensions. The optimal control problem itself is, in general, infinite-dimensional in the sense that the associated value function lives on an infinite-dimensional function space. There will be no need to assume ellipticity of the diffusion matrix so that deterministic control problems are included as special cases. The performance criterion is a cost functional of evolutional type over a finite deterministic time horizon. For simplicity, there will be neither state constraints nor state-dependent control constraints.

Our scheme is based on a time discretisation of Euler-Maruyama type and yields a sequence of finite-dimensional optimal control problems in discrete time. Here, as in Chapter 2, we follow the approach where a given control problem is approximated by a sequence of control problems which are easier to solve numerically, or solvable at all. Under quite natural assumptions, we obtain upper bounds on the discretisation error, or worst-case estimates for the rate of convergence, in terms of differences in supremum norm between the value functions corresponding to the original control problem and the approximating control problems, respectively.

The approximation of the original control problem is carried out in two steps. The idea is to separate the discretisation of the dynamics from that of the strategies. The dynamics are discretised first. By freezing the dynamics, the problem of approximating the strategies is reduced to the finite-dimensional constant coefficients case and results available in the literature can be applied. Notice that the state processes always have a certain time regularity (they are Hölder continuous like typical trajectories of Brownian motion), while the strategies need not have any regularity in time besides being measurable.

The first discretisation step consists in constructing a sequence of control problems whose coefficients are piecewise constant in both the time and the segment variable. The admissible strategies are the same as those of the original problem. We obtain a rate of convergence for the controlled state processes, which is uniform in the strategies, thanks to

the fact that the modulus of continuity of Itô diffusions with bounded coefficients has finite moments of all orders. This result can be found in Słomiński (2001), cf. Appendix A.2 below. The convergence rate for the controlled processes carries over to the approximation of the corresponding value functions.

The second discretisation step consists in approximating the original strategies by control processes which are piecewise constant on a sub-grid of the time grid introduced in the first step. A main ingredient in the derivation of an error bound is the Principle of Dynamic Programming (PDP) or, as it is also known, Bellman's Principle of Optimality. The validity of the PDP for the non-Markovian dynamics at hand is proved in Larssen (2002), cf. Appendix A.1 below. A version of the PDP for controlled diffusions with time delay is also proved in Gihman and Skorokhod (1979: Ch. 3); there are differences, though, in the formulation of the control problem.

We apply the PDP to obtain a global error bound from an estimate of the local truncation error. The fact that the value functions of the approximating problems from the first step are Lipschitz continuous under the supremum norm guarantees stability of the method. This way of error localisation and, in particular, the use of the PDP are adapted from Falcone and Ferretti (1994) and Falcone and Rosace (1996), who study deterministic optimal control problems with and without delay. Their proof technique is not confined to such simple approximation schemes as we adopt here; it extends the usual convergence analysis of finite difference methods for initial-boundary value problems, cf. Section 5.3 in Atkinson and Han (2001), for example.

To estimate the local truncation error we only need an error bound for the approximation by piecewise constant strategies of finite-dimensional control problems with "constant coefficients"; that is, the cost rate and the coefficients of the state equation are functions of the control variable only. Such a result is provided by a stochastic mean value theorem due to Krylov (2001). When the space of control actions is finite and the diffusion coefficient is not directly controlled, it is possible to derive an analogous result with an error bound of higher order, namely of order $h^{1/2}$ instead of $h^{1/4}$, where $h$ is the length of the time step. When the control problem is deterministic, the error bound is at least of order $h^{1/2}$; it is of order $h$ if, in addition, the space of control actions is finite. In Appendix A.3, we state a reduced version of Krylov's theorem and provide a detailed proof. The more elementary error bounds for special cases are also given.

In a final step, we put together the two error estimates to obtain bounds on the total approximation error. The error bound in the most general case is of order nearly $h^{1/12}$ with $h$ the length of the time step, see Theorem 3.4 in Section 3.4. To the best of our knowledge, this is the first result on the speed of convergence of a time-discretisation scheme for controlled stochastic systems with delay. We do not expect our worst-case estimates to be optimal; in any case, they may serve as benchmarks on the way towards sharp error bounds. Moreover, the scheme's special structure can be exploited so that the computational requirements are lower than what might be expected by looking at the order of the error bound.

In the finite-dimensional setting, our two-step time-discretisation procedure allows us to get from the case of constant coefficients to the case of general coefficients, even though it yields a worse rate of convergence in comparison with the results cited in Section 1.3, namely $1/12$ instead of $1/6$ and $1/10$, respectively. This is the price we pay for separating

the approximation of the dynamics from that of the strategies. On the other hand, it is this separation that enables us to reduce the problem of strategy approximation to an elementary form. Observe that certain techniques like mollification of the value function employed in the works cited above are not available, because the space of initial values is not locally compact.

Our procedure also allows us to estimate the error incurred when using strategies which are nearly optimal for the approximating problems with the dynamics of the original problem. This would be the way to apply the approximation scheme in many practically relevant situations. However, this method of nearly optimally controlling the original system is viable only if the available information includes perfect samples of the underlying noise process. The question is more complicated when information is restricted to samples of the state process.

In Section 3.1, the original control problem is described in detail. The dynamics of the original control problem are discretised in Section 3.2. The second discretisation step, based on the PDP and local error bounds for the approximation of the original strategies, is carried out in Section 3.3. In Section 3.4, bounds on the overall discretisation error are derived. In Section 3.5, a procedure for solving the resulting finite-dimensional problems is outlined. Section 3.6 contains some concluding remarks and open questions.

3.1 The original control problem

The dynamics of the control problems we want to approximate are described by a controlled $d$-dimensional stochastic delay (or functional) differential equation driven by a Wiener process. Both the drift and the diffusion coefficient may depend on the solution's history a certain amount of time into the past. The delay length gives a bound on the maximal time the system is allowed to look back into the past; as before, we take it to be a finite deterministic time $r > 0$. For simplicity, we restrict attention to control problems with finite and deterministic time horizon. The performance of the admissible control processes or strategies will be measured in terms of a cost functional of evolutional type.

Recall that, in general, the solution process of an SDDE does not enjoy the Markov property, while the segment process associated with that solution does. For an $\mathbb{R}^d$-valued stochastic process $(X(t))_{t \ge -r}$ living on $(\Omega, \mathcal{F}, P)$, we denote by $(X_t)_{t \ge 0}$ the associated segment process of delay length $r$. Thus, for any $t \ge 0$, any $\omega \in \Omega$, $X_t(\omega)$ is the function $[-r,0] \ni s \mapsto X(t+s,\omega) \in \mathbb{R}^d$. If the original process $(X(t))_{t \ge -r}$ has continuous trajectories, then $(X_t)_{t \ge 0}$ is a stochastic process taking its values in $C := C([-r,0], \mathbb{R}^d)$, the space of all $\mathbb{R}^d$-valued continuous functions on the interval $[-r,0]$. The space $C$ comes equipped with the supremum norm, written $\|\cdot\|$, induced by the standard norm on $\mathbb{R}^d$.

Let $(\Gamma, \rho)$ be a complete and separable metric space, the set of control actions. We first state our control problem in the weak Wiener formulation, cf. Larssen (2002) and Yong and Zhou (1999). This is to justify our use of the Principle of Dynamic Programming. In subsequent sections we will only need the strong formulation.

Definition 3.1. A Wiener basis of dimension $d_1$ is a triple $((\Omega, P, \mathcal{F}), (\mathcal{F}_t), W)$ such that

(i) $(\Omega, \mathcal{F}, P)$ is a complete probability space carrying a standard $d_1$-dimensional Wiener process $W$,

(ii) $(\mathcal{F}_t)$ is the completion by the $P$-null sets of $\mathcal{F}$ of the filtration induced by $W$.

A Wiener control basis is a quadruple $((\Omega, P, \mathcal{F}), (\mathcal{F}_t), W, u)$ such that $((\Omega, P, \mathcal{F}), (\mathcal{F}_t), W)$ is a Wiener basis and $u\colon [0,\infty) \times \Omega \to \Gamma$ is progressively measurable with respect to $(\mathcal{F}_t)$. The $(\mathcal{F}_t)$-progressively measurable process $u$ is called a control process. Write $\mathcal{U}_W$ for the set of all Wiener control bases.

By abuse of notation, we will often hide the stochastic basis involved in the definition of a Wiener control basis; thus, we may write $(W,u) \in \mathcal{U}_W$ meaning that $W$ is the Wiener process and $u$ the control process of a Wiener control basis.

Let $b$, $\sigma$ be Borel measurable functions defined on $[0,\infty) \times C \times \Gamma$ and taking values in $\mathbb{R}^d$ and $\mathbb{R}^{d \times d_1}$, respectively. The functions $b$, $\sigma$ are the coefficients of the controlled SDDE that describes the dynamics of the control problem. The SDDE is of the form
\[
dX(t) = b\bigl(t_0+t, X_t, u(t)\bigr)\,dt + \sigma\bigl(t_0+t, X_t, u(t)\bigr)\,dW(t), \quad t > 0, \tag{3.1}
\]
where $t_0 \ge 0$ is a deterministic initial time and $((\Omega, P, \mathcal{F}), (\mathcal{F}_t), W, u)$ a Wiener control basis. The assumptions on the coefficients stated below will allow $b$, $\sigma$ to depend on the segment variable in different ways. Let $\varphi \in C$ be a generic segment function. The coefficients $b$, $\sigma$ may depend on $\varphi$ through bounded Lipschitz functions of, for example,
\[
\begin{aligned}
&\varphi(r_1), \dots, \varphi(r_n) && \text{(point delay)},\\
&\int_{-r}^{0} v_1(s,\varphi(s))\,w_1(s)\,ds, \dots, \int_{-r}^{0} v_n(s,\varphi(s))\,w_n(s)\,ds && \text{(distributed delay)},\\
&\int_{-r}^{0} \tilde v_1(s,\varphi(s))\,d\mu_1(s), \dots, \int_{-r}^{0} \tilde v_n(s,\varphi(s))\,d\mu_n(s) && \text{(generalised distributed delay)},
\end{aligned}
\]
where $n \in \mathbb{N}$, $r_1, \dots, r_n \in [-r,0]$, $w_1, \dots, w_n$ are Lebesgue integrable, $\mu_1, \dots, \mu_n$ are finite Borel measures on $[-r,0]$, $v_i$, $\tilde v_i$ are Lipschitz continuous in the second variable uniformly in the first, $v_i(\cdot,0)\,w_i(\cdot)$ is Lebesgue integrable and $\tilde v_i(\cdot,0)$ is $\mu_i$-integrable, $i \in \{1,\dots,n\}$. Notice that the generalised distributed delay comprises the point delay as well as the Lebesgue absolutely continuous distributed delay. Let us call functional delay any type of delay that cannot be written in integral form. An example of a functional delay, which is also covered by the regularity assumptions stated below, is the dependence on the segment variable $\varphi$ through bounded Lipschitz functions of
\[
\sup_{s,t \in [-r,0]} v_1\bigl(s,t,\varphi(s),\varphi(t)\bigr), \dots, \sup_{s,t \in [-r,0]} v_n\bigl(s,t,\varphi(s),\varphi(t)\bigr),
\]
where $v_i$ is a measurable function which is Lipschitz continuous in the last two variables uniformly in the first two variables and $v_i(\cdot,\cdot,0,0)$ is bounded, $i \in \{1,\dots,n\}$.

As initial condition for Equation (3.1), in addition to the time $t_0$, we have to prescribe the values of $X(t)$ for all $t \in [-r,0]$, not only for $t = 0$. Thus, a deterministic initial condition for Equation (3.1) is a pair $(t_0, \varphi)$, where $t_0 \ge 0$ is the initial time and $\varphi \in C$ the initial segment. We understand Equation (3.1) in the sense of an Itô equation. An adapted process $X$ with continuous paths defined on the stochastic basis $((\Omega, P, \mathcal{F}), (\mathcal{F}_t))$ of $(W,u)$ is a solution with initial condition $(t_0,\varphi)$ if it satisfies, $P$-almost surely,
\[
X(t) = \begin{cases}
\varphi(0) + \displaystyle\int_0^t b\bigl(t_0+s, X_s, u(s)\bigr)\,ds + \int_0^t \sigma\bigl(t_0+s, X_s, u(s)\bigr)\,dW(s), & t > 0,\\[1ex]
\varphi(t), & t \in [-r,0].
\end{cases} \tag{3.2}
\]

Observe that the solution process $X$ always starts at time zero; it depends on the initial time $t_0$ only through the coefficients $b$, $\sigma$. As far as the control problem is concerned, this formulation is equivalent to the usual one, where the process $X$ starts at time $t_0$ with initial condition $X_{t_0} = \varphi$ and $t_0$ does not appear in the time argument of the coefficients.

A solution $X$ to Equation (3.2) under $(W,u)$ with initial condition $(t_0,\varphi)$ is strongly unique if it is indistinguishable from any other solution $\tilde X$ satisfying Equation (3.2) under $(W,u)$ with the same initial condition. A solution $X$ to Equation (3.2) under $(W,u)$ with initial condition $(t_0,\varphi)$ is weakly unique if $(X,W,u)$ has the same distribution as $(\tilde X, \tilde W, \tilde u)$ whenever $(\tilde W, \tilde u)$ has the same distribution as $(W,u)$ and $\tilde X$ is a solution to Equation (3.2) under Wiener control basis $(\tilde W, \tilde u)$ with initial condition $(t_0,\varphi)$. Here, the space of Borel measurable functions $[0,\infty) \to \Gamma$ is equipped with the topology of convergence locally in Lebesgue measure.

Definition 3.2. A Wiener control basis $(W,u) \in \mathcal{U}_W$ is called admissible or an admissible strategy if, for each deterministic initial condition, Equation (3.2) has a strongly unique solution under $(W,u)$ which is also weakly unique. Write $\mathcal{U}_{ad}$ for the set of admissible control bases.

Denote by $T > 0$ the finite deterministic time horizon. Let $f$, $g$ be Borel measurable real-valued functions with $f$ having domain $[0,\infty) \times C \times \Gamma$ and $g$ having domain $C$. They will be referred to as the cost rate and the terminal cost, respectively. We introduce a cost functional $J$ defined on $[0,T] \times C \times \mathcal{U}^J_{ad}$ by setting
\[
J\bigl(t_0,\varphi,(W,u)\bigr) := E\left( \int_0^{T-t_0} f\bigl(t_0+s, X_s, u(s)\bigr)\,ds + g\bigl(X_{T-t_0}\bigr) \right), \tag{3.3}
\]
where $X$ is the solution to Equation (3.2) under $(W,u) \in \mathcal{U}^J_{ad}$ with initial condition $(t_0,\varphi)$ and $\mathcal{U}^J_{ad} \subseteq \mathcal{U}_{ad}$ is the set of all admissible Wiener control bases such that the expectation in (3.3) is well defined for all deterministic initial conditions.

The value function corresponding to Equation (3.2) and cost functional (3.3) is the function $V\colon [0,T] \times C \to [-\infty,\infty)$ given by
\[
V(t_0,\varphi) := \inf\bigl\{ J\bigl(t_0,\varphi,(W,u)\bigr) \bigm| (W,u) \in \mathcal{U}^J_{ad} \bigr\}. \tag{3.4}
\]
It is this function that we wish to approximate. Let us specify the hypotheses we make about the regularity of the coefficients $b$, $\sigma$, the cost rate $f$ and the terminal cost $g$.

(A1) Measurability: the functions $b\colon [0,\infty) \times C \times \Gamma \to \mathbb{R}^d$, $\sigma\colon [0,\infty) \times C \times \Gamma \to \mathbb{R}^{d \times d_1}$, $f\colon [0,\infty) \times C \times \Gamma \to \mathbb{R}$, $g\colon C \to \mathbb{R}$ are jointly Borel measurable.

(A2) Boundedness: $|b|$, $|\sigma|$, $|f|$, $|g|$ are bounded by some constant $K > 0$.

(A3) Uniform Lipschitz and Hölder condition: there is a constant $L > 0$ such that for all $\varphi, \tilde\varphi \in C$, $t, s \ge 0$, all $\gamma \in \Gamma$
\[
\begin{aligned}
|b(t,\varphi,\gamma) - b(s,\tilde\varphi,\gamma)| \vee |\sigma(t,\varphi,\gamma) - \sigma(s,\tilde\varphi,\gamma)| &\le L\bigl( \|\varphi - \tilde\varphi\| + \sqrt{|t-s|} \bigr),\\
|f(t,\varphi,\gamma) - f(s,\tilde\varphi,\gamma)| \vee |g(\varphi) - g(\tilde\varphi)| &\le L\bigl( \|\varphi - \tilde\varphi\| + \sqrt{|t-s|} \bigr).
\end{aligned}
\]

(A4) Continuity in the control: $b(t,\varphi,\cdot)$, $\sigma(t,\varphi,\cdot)$, $f(t,\varphi,\cdot)$ are continuous functions on $\Gamma$ for any $t \ge 0$, $\varphi \in C$.

Here and in the sequel, $|\cdot|$ denotes the Euclidean norm of appropriate dimension and $x \vee y$ denotes the maximum of $x$ and $y$. The above measurability, boundedness and Lipschitz continuity assumptions on the coefficients $b$, $\sigma$ guarantee the existence of a strongly unique solution $X = X^{t_0,\varphi,u}$ to Equation (3.2) for every initial condition $(t_0,\varphi) \in [0,T] \times C$ and $(W,u) \in \mathcal{U}_W$ any Wiener control basis; see, for example, Theorem 2.1 and Remark 1.12 in Chapter 2 of Mohammed. Moreover, weak uniqueness of solutions holds for all deterministic initial conditions. This is a consequence of a theorem due to Yamada and Watanabe, see Larssen (2002) for the necessary generalisation to SDDEs. Consequently, under Assumptions (A1)–(A3), we have $\mathcal{U}_{ad} = \mathcal{U}_W$. Moreover, since $f$ and $g$ are assumed to be measurable and bounded, the expectation in (3.3) is always well defined, whence it holds that $\mathcal{U}^J_{ad} = \mathcal{U}_{ad} = \mathcal{U}_W$. Assumption (A4) will not be needed before Section 3.3.

The fact that weak uniqueness holds allows us to discard the weak formulation and consider our control problem in the strong Wiener formulation. Thus, we may work with a fixed Wiener basis. Under Assumptions (A1)–(A3), the admissible strategies will be precisely the natural strategies, that is, those that are representable as functionals of the driving Wiener process.

From now on, let $((\Omega, P, \mathcal{F}), (\mathcal{F}_t), W)$ be a fixed $d_1$-dimensional Wiener basis. Denote by $\mathcal{U}$ the set of control processes defined on this stochastic basis. The dynamics of our control problem are still given by Equation (3.2). Due to Assumptions (A1)–(A3), all control processes are admissible in the sense that Equation (3.2) has a strongly unique solution under any $u \in \mathcal{U}$ for every deterministic initial condition.

In the definition of the cost functional, the Wiener basis does not vary any more. The corresponding value function
\[
[0,T] \times C \ni (t_0,\varphi) \mapsto \inf\bigl\{ J(t_0,\varphi,u) \bigm| u \in \mathcal{U} \bigr\}
\]
is identical to the function $V$ determined by (3.4). By abuse of notation, we write $J(t_0,\varphi,u)$ for $J(t_0,\varphi,(W,u))$. We next state some important properties of the value function.

Proposition 3.1. Assume (A1)–(A3). Then the value function $V$ is bounded and Lipschitz continuous in the segment variable uniformly in the time variable. More precisely, there is $L_V > 0$ such that for all $t_0 \in [0,T]$, $\varphi, \tilde\varphi \in C$,
\[
|V(t_0,\varphi)| \le K(T+1), \qquad |V(t_0,\varphi) - V(t_0,\tilde\varphi)| \le L_V \|\varphi - \tilde\varphi\|.
\]
The constant $L_V$ need not be greater than $3L(T+1)\exp\bigl(3T(T+4d_1)L^2\bigr)$. Moreover, $V$ satisfies Bellman's Principle of Dynamic Programming, that is, for all $t \in [0,T-t_0]$,
\[
V(t_0,\varphi) = \inf_{u \in \mathcal{U}} E\left( \int_0^{t} f\bigl(t_0+s, X^u_s, u(s)\bigr)\,ds + V\bigl(t_0+t, X^u_t\bigr) \right),
\]
where $X^u$ is the solution to Equation (3.2) under control process $u$ with initial condition $(t_0,\varphi)$.

Proof. For the boundedness and Lipschitz continuity of $V$ see Proposition A.1, for the Bellman Principle see Theorem A.1 in Appendix A.1, where we set $\tilde r := r$, $\tilde b := b$ and so on. Notice that the Hölder continuity in time of the coefficients $b$, $\sigma$, $f$ as stipulated in Assumption (A3) is not needed in the proofs.
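The explicit constants in Proposition 3.1 are easy to evaluate numerically. The following Python sketch simply computes the sup-norm bound $K(T+1)$ and the upper bound $3L(T+1)\exp(3T(T+4d_1)L^2)$ on the Lipschitz constant $L_V$; the function name `value_function_bounds` is hypothetical and chosen here for illustration only.

```python
import math


def value_function_bounds(K: float, L: float, T: float, d1: int) -> tuple[float, float]:
    """Evaluate the two constants stated in Proposition 3.1.

    K  : bound on |b|, |sigma|, |f|, |g| from Assumption (A2)
    L  : Lipschitz/Hoelder constant from Assumption (A3)
    T  : finite deterministic time horizon
    d1 : dimension of the driving Wiener process
    """
    # Bound on |V(t0, phi)|.
    sup_bound = K * (T + 1.0)
    # Upper bound on the Lipschitz constant L_V of V in the segment variable.
    lipschitz_bound = 3.0 * L * (T + 1.0) * math.exp(3.0 * T * (T + 4.0 * d1) * L**2)
    return sup_bound, lipschitz_bound
```

The exponential dependence on $T$ and $L^2$ illustrates why such a priori constants are typically pessimistic in practice.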

The value function $V$ has some regularity in the time variable, too. It is Hölder continuous in time with parameter $\alpha$ for any $\alpha \in (0, \tfrac12]$ provided the initial segment is at least $\alpha$-Hölder continuous. Notice that the coefficients $b$, $\sigma$, $f$ need not be Hölder continuous in time. Except for the role of the initial segment, statement and proof of Proposition 3.2 are analogous to the non-delay case, see Krylov (1980: p. 167), for example.

Proposition 3.2. Assume (A1)–(A3). Let $\varphi \in C$. If $\varphi$ is $\alpha$-Hölder continuous with Hölder constant not greater than $L_H$, then the function $V(\cdot,\varphi)$ is Hölder continuous; that is, there is a constant $\tilde L_V > 0$ depending only on $L_H$, $K$, $T$ and the dimensions such that for all $t_0, t_1 \in [0,T]$,
\[
|V(t_0,\varphi) - V(t_1,\varphi)| \le \tilde L_V \bigl( |t_1-t_0|^{\alpha} \vee \sqrt{|t_1-t_0|} \bigr).
\]

Proof. Let $\varphi \in C$ be $\alpha$-Hölder continuous with Hölder constant not greater than $L_H$. Without loss of generality, we suppose that $t_1 = t_0 + h$ for some $h > 0$. We may also suppose $h \le \tfrac12$, because we can choose $\tilde L_V$ greater than $4K(T+1)$ so that the asserted inequality certainly holds for $|t_0 - t_1| > \tfrac12$. By Bellman's Principle as stated in Proposition 3.1, we see that
\[
\begin{aligned}
|V(t_0,\varphi) - V(t_1,\varphi)| &= |V(t_0,\varphi) - V(t_0+h,\varphi)| \\
&= \left| \inf_{u \in \mathcal{U}} E\left( \int_0^h f\bigl(t_0+s, X^u_s, u(s)\bigr)\,ds + V\bigl(t_0+h, X^u_h\bigr) \right) - V(t_0+h,\varphi) \right| \\
&\le \sup_{u \in \mathcal{U}} E\left( \int_0^h \bigl| f\bigl(t_0+s, X^u_s, u(s)\bigr) \bigr|\,ds \right) + \sup_{u \in \mathcal{U}} E\bigl( \bigl| V\bigl(t_0+h, X^u_h\bigr) - V(t_0+h,\varphi) \bigr| \bigr) \\
&\le K h + \sup_{u \in \mathcal{U}} L_V\, E\bigl( \|X^u_h - \varphi\| \bigr),
\end{aligned}
\]
where $K$ is the constant from Assumption (A2) and $L_V$ the Lipschitz constant for $V$ in the segment variable according to Proposition 3.1. We notice that $\varphi = X^u_0$ for all $u \in \mathcal{U}$ since $X^u$ is the solution to Equation (3.2) under control $u$ with initial condition $(t_0,\varphi)$. By Assumption (A2), Hölder's inequality, Doob's maximal inequality and Itô's isometry, for arbitrary $u \in \mathcal{U}$ it holds that
\[
\begin{aligned}
E\bigl( \|X^u_h - \varphi\| \bigr) &\le \sup_{t \in [-r,-h]} |\varphi(t+h) - \varphi(t)| + \sup_{t \in [-h,0]} |\varphi(0) - \varphi(t)| + E\left( \int_0^h \bigl| b\bigl(t_0+s, X^u_s, u(s)\bigr) \bigr|\,ds \right) \\
&\quad + E\left( \sup_{t \in [0,h]} \left| \int_0^t \sigma\bigl(t_0+s, X^u_s, u(s)\bigr)\,dW(s) \right| \right) \\
&\le 2L_H h^{\alpha} + K h + 4K\sqrt{d_1 h}.
\end{aligned}
\]
Putting everything together, we obtain the assertion.

From the proof of Proposition 3.2 we see that the time regularity of the value function $V$ is independent of the time regularity of the coefficients $b$, $\sigma$, $f$; it is always $\tfrac12$-Hölder provided the initial segment is at least that regular.

3.2 First discretisation step: Euler-Maruyama scheme

In this section, the dynamics and the cost functional of the original control problem are discretised in time and segment space. More precisely, we define a sequence of approximating control problems where the coefficients of the dynamics, the cost rate, and the terminal cost are piecewise constant functions of the time and segment variable, while the dependence on the strategies remains the same as in the original problem. We will obtain an upper bound on the approximation error which is uniform over all initial segments of a given Hölder continuity.

Let $N \in \mathbb{N}$. In order to construct the $N$-th approximating control problem, set $h_N := r/N$, and define $\lfloor\cdot\rfloor_N$ by $\lfloor t\rfloor_N := h_N \lfloor t/h_N \rfloor$, where $\lfloor\cdot\rfloor$ is the usual Gauss bracket, that is, $\lfloor t\rfloor$ is the integer part of the real number $t$. Set $T_N := \lfloor T\rfloor_N$ and $I_N := \{k h_N \mid k \in \mathbb{N}_0\} \cap [0,T_N]$. As $T$ is the time horizon for the original control problem, $T_N$ will be the time horizon for the $N$-th approximating problem. The set $I_N$ is the time grid of discretisation degree $N$. Denote by $\mathrm{Lin}_N$ the operator $C \to C$ which maps a function in $C$ to its piecewise linear interpolation on the grid $\{k h_N \mid k \in \mathbb{Z}\} \cap [-r,0]$.

We want to express the dynamics and the cost functional of the approximating problems in the same form as those of the original problem, so that the Principle of Dynamic Programming as stated in Appendix A.1 can be readily applied; see Propositions 3.5 and 3.6 in Section 3.3. To this end, the segment space has to be enlarged according to the discretisation degree $N$. Denote by $C_N$ the space $C([-r-h_N,0], \mathbb{R}^d)$ of $\mathbb{R}^d$-valued continuous functions living on the interval $[-r-h_N,0]$. For a continuous function or a continuous process $Z$ defined on the time interval $[-r-h_N,\infty)$, let $\Pi_N(Z)(t)$ denote the segment of $Z$ at time $t \ge 0$ of length $r+h_N$, that is, $\Pi_N(Z)(t)$ is the function $[-r-h_N,0] \ni s \mapsto Z(t+s)$.

Given $t_0 \ge 0$, $\psi \in C_N$ and $u \in \mathcal{U}$, we define the Euler-Maruyama approximation $Z = Z^{N,t_0,\psi,u}$ of degree $N$ of the solution $X$ to Equation (3.2) under control process $u$ with initial condition $(t_0,\psi)$ as the solution to
\[
Z(t) = \begin{cases}
\psi(0) + \displaystyle\int_0^t b_N\bigl(t_0+s, \Pi_N(Z)(s), u(s)\bigr)\,ds + \int_0^t \sigma_N\bigl(t_0+s, \Pi_N(Z)(s), u(s)\bigr)\,dW(s), & t > 0,\\[1ex]
\psi(t), & t \in [-r-h_N,0],
\end{cases} \tag{3.5}
\]
where the coefficients $b_N$, $\sigma_N$ are given by
\[
\begin{aligned}
b_N(t,\psi,\gamma) &:= b\bigl( \lfloor t\rfloor_N,\ \mathrm{Lin}_N\bigl( [-r,0] \ni s \mapsto \psi(s + \lfloor t\rfloor_N - t) \bigr),\ \gamma \bigr),\\
\sigma_N(t,\psi,\gamma) &:= \sigma\bigl( \lfloor t\rfloor_N,\ \mathrm{Lin}_N\bigl( [-r,0] \ni s \mapsto \psi(s + \lfloor t\rfloor_N - t) \bigr),\ \gamma \bigr),
\end{aligned}
\qquad t \ge 0,\ \psi \in C_N,\ \gamma \in \Gamma.
\]
Thus, $b_N(t,\psi,\gamma)$ and $\sigma_N(t,\psi,\gamma)$ are calculated by evaluating the corresponding coefficients $b$ and $\sigma$ at $(\lfloor t\rfloor_N, \hat\varphi, \gamma)$, where $\hat\varphi$ is the segment in $C$ which arises from the piecewise linear interpolation with mesh size $r/N$ of the restriction of $\psi$ to the interval $[\lfloor t\rfloor_N - t - r,\ \lfloor t\rfloor_N - t]$. Notice that the control action $\gamma$ remains unchanged.

Assumptions (A1)–(A3) guarantee that, given any control process $u \in \mathcal{U}$, Equation (3.5) has a unique solution for each initial condition $(t_0,\psi) \in [0,\infty) \times C_N$. Thus, the process $Z = Z^{N,t_0,\psi,u}$ of discretisation degree $N$ is well defined. Notice that the approximating coefficients $b_N$, $\sigma_N$ are still Lipschitz continuous in the segment variable uniformly in the time and control variables, although they are only piecewise continuous in time.
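The grid floor $\lfloor t\rfloor_N = h_N\lfloor t/h_N\rfloor$ and the interpolation operator $\mathrm{Lin}_N$ can be sketched in a few lines of Python. This is only an illustration for scalar segments ($d = 1$); the names `grid_floor` and `lin_interp` are hypothetical, and the small tolerance added before taking the floor is an assumption made here to guard against floating-point round-off when $t$ lies on the grid.

```python
import math

import numpy as np


def grid_floor(t: float, r: float, N: int) -> float:
    """Return h_N * floor(t / h_N) with mesh size h_N = r / N."""
    h = r / N
    # Tolerance (an assumption of this sketch) so that grid points are not
    # pushed to the previous node by round-off in t / h.
    return h * math.floor(t / h + 1e-12)


def lin_interp(phi, r: float, N: int):
    """Return the piecewise linear interpolation of phi on the grid of [-r, 0].

    phi is a scalar function on [-r, 0]; the grid nodes are -r, -r + h_N, ..., 0.
    """
    nodes = np.linspace(-r, 0.0, N + 1)
    values = np.array([phi(s) for s in nodes])

    def interpolant(s):
        # np.interp performs linear interpolation between the grid values.
        return np.interp(s, nodes, values)

    return interpolant
```

For instance, with `r = 1.0`, `N = 2` and `phi = lambda s: s**2`, the interpolant agrees with `phi` at the nodes `-1, -0.5, 0` and is linear in between.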
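For an initial time on the grid and a control that is constant on each grid interval, the scheme reduces to a simple recursion over the grid values of $Z$. The Python sketch below illustrates this under several simplifying assumptions that are not part of the text: the state and the noise are scalar ($d = d_1 = 1$), the control is piecewise constant on the grid, and the coefficients receive the segment as the array of its $N+1$ grid values (which determine its piecewise linear interpolation). The function name `euler_maruyama_delay` and its signature are hypothetical.

```python
import numpy as np


def euler_maruyama_delay(b, sigma, psi, u, t0, r, N, horizon, rng):
    """Simulate Z on the grid {k * h : k = -N, ..., n_steps}, h = r / N.

    b, sigma : callables (t, seg, gamma) -> float; seg holds the grid values
               of the current segment on [-r, 0] (oldest value first)
    psi      : scalar function on [-r, 0], the initial segment
    u        : callable k -> control action applied on [k * h, (k + 1) * h)
    t0       : initial time, assumed to lie on the grid
    horizon  : length of the simulated time interval
    rng      : numpy random Generator supplying the Wiener increments
    """
    h = r / N
    n_steps = int(np.floor(horizon / h + 1e-12))
    z = np.empty(N + 1 + n_steps)
    # Index j of z corresponds to time (j - N) * h; fill in the initial segment.
    z[: N + 1] = [psi(s) for s in np.linspace(-r, 0.0, N + 1)]
    for k in range(n_steps):
        seg = z[k : k + N + 1]              # values of Z on [k*h - r, k*h]
        gamma = u(k)
        dw = rng.normal(0.0, np.sqrt(h))    # Wiener increment over one step
        t = t0 + k * h
        z[k + N + 1] = z[k + N] + b(t, seg, gamma) * h + sigma(t, seg, gamma) * dw
    return z
```

A point delay such as $b(t,\varphi,\gamma) = -\varphi(-r)$ corresponds to `b = lambda t, seg, g: -seg[0]` in this representation.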

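Define the cost functional $J_N\colon [0,T_N] \times C_N \times \mathcal{U} \to \mathbb{R}$ of discretisation degree $N$ by
\[
J_N(t_0,\psi,u) := E\left( \int_0^{T_N-t_0} f_N\bigl(t_0+s, \Pi_N(Z)(s), u(s)\bigr)\,ds + g_N\bigl( \Pi_N(Z)(T_N-t_0) \bigr) \right), \tag{3.6}
\]
where $f_N$, $g_N$ are given by
\[
\begin{aligned}
f_N(t,\psi,\gamma) &:= f\bigl( \lfloor t\rfloor_N,\ \mathrm{Lin}_N\bigl( [-r,0] \ni s \mapsto \psi(s + \lfloor t\rfloor_N - t) \bigr),\ \gamma \bigr),\\
g_N(\psi) &:= g\bigl( \mathrm{Lin}_N( \psi|_{[-r,0]} ) \bigr),
\end{aligned}
\qquad t \ge 0,\ \psi \in C_N,\ \gamma \in \Gamma.
\]
As $b_N$, $\sigma_N$ above, $f_N$, $g_N$ are Lipschitz continuous in the segment variable (uniformly in time and control) under the supremum norm on $C_N$. The value function $V_N$ corresponding to (3.5) and (3.6) is the function $[0,T_N] \times C_N \to \mathbb{R}$ determined by
\[
V_N(t_0,\psi) := \inf\bigl\{ J_N(t_0,\psi,u) \bigm| u \in \mathcal{U} \bigr\}. \tag{3.7}
\]
If $t_0 \in I_N$, then $\lfloor t_0+s\rfloor_N = t_0 + \lfloor s\rfloor_N$ for all $s \ge 0$. Thus, the solution $Z$ to Equation (3.5) under control process $u \in \mathcal{U}$ with initial condition $(t_0,\psi) \in I_N \times C_N$ satisfies
\[
Z(t) = \psi(0) + \int_0^t b\bigl( t_0+\lfloor s\rfloor_N,\ \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}),\ u(s) \bigr)\,ds + \int_0^t \sigma\bigl( t_0+\lfloor s\rfloor_N,\ \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}),\ u(s) \bigr)\,dW(s) \tag{3.8}
\]
for all $t \ge 0$. Moreover, $(Z(t))_{t \ge 0}$ depends on the initial segment $\psi$ only through the restriction of $\psi$ to the interval $[-r,0]$. In analogy, whenever $t_0 \in I_N$, the cost functional $J_N$ takes on the form
\[
J_N(t_0,\psi,u) = E\left( \int_0^{T_N-t_0} f\bigl( t_0+\lfloor s\rfloor_N,\ \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}),\ u(s) \bigr)\,ds + g\bigl( \mathrm{Lin}_N(Z_{T_N-t_0}) \bigr) \right). \tag{3.9}
\]
Hence, if $t_0 \in I_N$, then $J_N(t_0,\psi,u) = J_N(t_0,\psi|_{[-r,0]},u)$ for all $\psi \in C_N$, $u \in \mathcal{U}$; that is, $J_N(t_0,\cdot,\cdot)$ coincides with its projection onto $C \times \mathcal{U}$. Consequently, if $t_0 \in I_N$, then $V_N(t_0,\psi) = V_N(t_0,\psi|_{[-r,0]})$ for all $\psi \in C_N$; that is, $V_N(t_0,\cdot)$ can be interpreted as a function with domain $C$ instead of $C_N$. If $t_0 \in I_N$, by abuse of notation, we will write $V_N(t_0,\cdot)$ also for this function. Notice that, as a consequence of Equations (3.8) and (3.9), in this case we have $V_N(t_0,\varphi) = V_N(t_0,\mathrm{Lin}_N(\varphi))$ for all $\varphi \in C$.

By Proposition 3.2, we know that the original value function $V$ is Hölder continuous in time provided the initial segment is Hölder continuous. It is therefore enough to compare $V$ and $V_N$ on the grid $I_N \times C$. This is the content of the next two statements. Again, the order of the error will be uniform only over those initial segments which are $\alpha$-Hölder continuous for some $\alpha > 0$; the constant in the error bound also depends on the Hölder constant of the initial segment. We start with comparing solutions to Equations (3.2) and (3.5) for initial times in $I_N$.

Proposition 3.3. Assume (A1)–(A3). Let $\varphi \in C$ be Hölder continuous with parameter $\alpha > 0$ and Hölder constant not greater than $L_H$. Then there is a constant $C$ depending only on $\alpha$, $L_H$, $L$, $K$, $T$ and the dimensions such that for all $N \in \mathbb{N}$ with $N \ge 2r$, all $t_0 \in I_N$, $u \in \mathcal{U}$ it holds that
\[
E\left( \sup_{t \in [-r,T]} |X(t) - Z^N(t)| \right) \le C \left( h_N^{\alpha} \vee \sqrt{h_N \ln\bigl(\tfrac{1}{h_N}\bigr)} \right),
\]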
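For a fixed strategy, the cost (3.9) can be estimated by plain Monte Carlo simulation of the recursion (3.8). The Python sketch below does this under assumptions made here for illustration only: scalar state and noise, an initial time on the grid, and a control given in feedback form as a function of the step index and the current grid segment (hence constant on each grid interval). It is not the solution procedure of the thesis, and the name `monte_carlo_cost` is hypothetical.

```python
import numpy as np


def monte_carlo_cost(b, sigma, f, g, psi, u, t0, r, N, T, n_paths, seed=0):
    """Estimate J_N(t0, psi, u) by averaging over simulated paths.

    b, sigma, f : callables (t, seg, gamma) -> float; seg holds the N + 1 grid
                  values of the current segment on [-r, 0] (oldest first)
    g           : callable seg -> float, the terminal cost
    psi         : scalar function on [-r, 0], the initial segment
    u           : callable (k, seg) -> control action on [k * h, (k + 1) * h)
    t0          : initial time, assumed to lie on the grid
    """
    rng = np.random.default_rng(seed)
    h = r / N
    k_T = int(np.floor(T / h + 1e-12))   # T_N = k_T * h
    k_0 = int(round(t0 / h))             # t0 = k_0 * h
    n_steps = k_T - k_0                  # number of steps on [0, T_N - t0]
    nodes = np.linspace(-r, 0.0, N + 1)
    total = 0.0
    for _ in range(n_paths):
        z = np.empty(N + 1 + n_steps)
        z[: N + 1] = [psi(s) for s in nodes]
        running = 0.0
        for k in range(n_steps):
            seg = z[k : k + N + 1]
            t = t0 + k * h
            gamma = u(k, seg)
            # Coefficients are frozen on each step, so the time integral of
            # the cost rate is a finite sum.
            running += f(t, seg, gamma) * h
            dw = rng.normal(0.0, np.sqrt(h))
            z[k + N + 1] = z[k + N] + b(t, seg, gamma) * h + sigma(t, seg, gamma) * dw
        total += running + g(z[n_steps : n_steps + N + 1])
    return total / n_paths
```

Minimising such estimates over a family of strategies gives only an upper bound on $V_N(t_0,\psi)$, since the infimum in (3.7) runs over all control processes.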

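where $X$ is the solution to Equation (3.2) under control process $u$ with initial condition $(t_0,\varphi)$ and $Z^N$ is the solution to Equation (3.5) of discretisation degree $N$ under $u$ with initial condition $(t_0,\psi)$ with $\psi \in C_N$ being such that $\psi|_{[-r,0]} = \varphi$.

Proof. Notice that $h_N \le \tfrac12$ since $N \ge 2r$, and observe that $Z := Z^N$ as defined in the assertion satisfies Equation (3.8), as the initial time $t_0$ lies on the grid $I_N$. Moreover, $Z$ depends on the initial segment $\psi$ only through $\psi|_{[-r,0]} = \varphi$. Using Hölder's inequality, Doob's maximal inequality, Itô's isometry, Assumption (A3), and Fubini's theorem we find that
\[
\begin{aligned}
E\Bigl( \sup_{t \in [-r,T]} |X(t) - Z(t)|^2 \Bigr) &= E\Bigl( \sup_{t \in [0,T]} |X(t) - Z(t)|^2 \Bigr) \\
&\le 2T\, E\left( \int_0^T \bigl| b\bigl(t_0+s, X_s, u(s)\bigr) - b\bigl(t_0+\lfloor s\rfloor_N, \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}), u(s)\bigr) \bigr|^2 ds \right) \\
&\quad + 8d_1\, E\left( \int_0^T \bigl| \sigma\bigl(t_0+s, X_s, u(s)\bigr) - \sigma\bigl(t_0+\lfloor s\rfloor_N, \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}), u(s)\bigr) \bigr|^2 ds \right) \\
&\le 4T\, E\left( \int_0^T \bigl| b\bigl(t_0+\lfloor s\rfloor_N, X_s, u(s)\bigr) - b\bigl(t_0+\lfloor s\rfloor_N, \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}), u(s)\bigr) \bigr|^2 ds \right) \\
&\quad + 16d_1\, E\left( \int_0^T \bigl| \sigma\bigl(t_0+\lfloor s\rfloor_N, X_s, u(s)\bigr) - \sigma\bigl(t_0+\lfloor s\rfloor_N, \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N}), u(s)\bigr) \bigr|^2 ds \right) \\
&\quad + 4T(T+4d_1)L^2 h_N \\
&\le 4(T+4d_1)L^2\, T h_N + 4(T+4d_1)L^2 \int_0^T E\bigl( \|X_s - \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N})\|^2 \bigr)\,ds \\
&\le 4(T+4d_1)L^2\, T h_N + 12(T+4d_1)L^2 \int_0^T \Bigl( E\bigl( \|X_s - X_{\lfloor s\rfloor_N}\|^2 \bigr) + E\bigl( \|Z_{\lfloor s\rfloor_N} - \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N})\|^2 \bigr) \Bigr)\,ds \\
&\quad + 12(T+4d_1)L^2 \int_0^T E\bigl( \|X_{\lfloor s\rfloor_N} - Z_{\lfloor s\rfloor_N}\|^2 \bigr)\,ds \\
&\le 4T(T+4d_1)L^2 \Bigl( h_N + 18L_H^2 h_N^{2\alpha} + 18C_{2,T}\, h_N \ln\bigl(\tfrac{1}{h_N}\bigr) \Bigr) \\
&\quad + 12(T+4d_1)L^2 \int_0^T E\Bigl( \sup_{t \in [-r,s]} |X(t) - Z(t)|^2 \Bigr)\,ds.
\end{aligned}
\]
Applying Gronwall's lemma, we obtain the assertion. In the last step of the above estimate Lemma A.1 from Appendix A.2 and the Hölder continuity of $\varphi$ have both been used twice. Firstly, to get for all $s \in [0,T]$,
\[
\begin{aligned}
E\bigl( \|X_s - X_{\lfloor s\rfloor_N}\|^2 \bigr) &\le 2\, E\Bigl( \sup_{t,\tilde t \in [-r,0],\ |t-\tilde t| \le h_N} |\varphi(t) - \varphi(\tilde t)|^2 \Bigr) + 2\, E\Bigl( \sup_{t,\tilde t \in [0,T],\ |t-\tilde t| \le h_N} |X(t) - X(\tilde t)|^2 \Bigr) \\
&\le 2L_H^2 h_N^{2\alpha} + 2C_{2,T}\, h_N \ln\bigl(\tfrac{1}{h_N}\bigr).
\end{aligned}
\]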

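Secondly, to obtain
\[
\begin{aligned}
E\bigl( \|Z_{\lfloor s\rfloor_N} - \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N})\|^2 \bigr) &= E\Bigl( \sup_{t \in [\lfloor s\rfloor_N - r,\ \lfloor s\rfloor_N]} \bigl| Z(t) - \mathrm{Lin}_N(Z_{\lfloor s\rfloor_N})(t - \lfloor s\rfloor_N) \bigr|^2 \Bigr) \\
&\le 2\, E\Bigl( \sup_{t \in [-r,0)} \bigl( |\varphi(t) - \varphi(\lfloor t\rfloor_N)|^2 + |\varphi(t) - \varphi(\lfloor t\rfloor_N + h_N)|^2 \bigr) \Bigr) \\
&\quad + 2\, E\Bigl( \sup_{t \in [0,s)} \bigl( |Z(t) - Z(\lfloor t\rfloor_N)|^2 + |Z(t) - Z(\lfloor t\rfloor_N + h_N)|^2 \bigr) \Bigr) \\
&\le 4L_H^2 h_N^{2\alpha} + 4\, E\Bigl( \sup_{t,\tilde t \in [0,s],\ |t-\tilde t| \le h_N} |Z(t) - Z(\tilde t)|^2 \Bigr) \\
&\le 4L_H^2 h_N^{2\alpha} + 4C_{2,T}\, h_N \ln\bigl(\tfrac{1}{h_N}\bigr)
\end{aligned}
\]
for all $s \in [0,T]$.

The order of the approximation error obtained in Proposition 3.3 for the underlying dynamics carries over to the approximation of the corresponding value functions. This works thanks to the Lipschitz continuity of the cost rate and terminal cost in the segment variable, the bound on the moments of the modulus of continuity from Lemma A.1 in Appendix A.2, and the fact that the error bound in Proposition 3.3 is uniform over all strategies.

Theorem 3.1. Assume (A1)–(A3). Let $\varphi \in C$ be Hölder continuous with parameter $\alpha > 0$ and Hölder constant not greater than $L_H$. Then there is a constant $\tilde C$ depending only on $\alpha$, $L_H$, $L$, $K$, $T$ and the dimensions such that for all $N \in \mathbb{N}$ with $N \ge 2r$, all $t_0 \in I_N$ it holds that
\[
|V(t_0,\varphi) - V_N(t_0,\varphi)| \le \sup_{u \in \mathcal{U}} |J(t_0,\varphi,u) - J_N(t_0,\psi,u)| \le \tilde C \left( h_N^{\alpha} \vee \sqrt{h_N \ln\bigl(\tfrac{1}{h_N}\bigr)} \right),
\]
where $\psi \in C_N$ is such that $\psi|_{[-r,0]} = \varphi$.

Proof. To verify the first inequality, we distinguish the cases $V(t_0,\varphi) > V_N(t_0,\varphi)$ and $V(t_0,\varphi) < V_N(t_0,\varphi)$. First suppose that $V(t_0,\varphi) > V_N(t_0,\varphi)$. Then for each $\varepsilon \in (0,1]$ we find a strategy $u_\varepsilon \in \mathcal{U}$ such that $V_N(t_0,\varphi) \ge J_N(t_0,\varphi,u_\varepsilon) - \varepsilon$. Since $V(t_0,\varphi) \le J(t_0,\varphi,u)$ for all $u \in \mathcal{U}$ by definition, it follows that
\[
\begin{aligned}
|V(t_0,\varphi) - V_N(t_0,\varphi)| &= V(t_0,\varphi) - V_N(t_0,\varphi) \\
&\le J(t_0,\varphi,u_\varepsilon) - J_N(t_0,\varphi,u_\varepsilon) + \varepsilon \\
&\le \sup_{u \in \mathcal{U}} |J(t_0,\varphi,u) - J_N(t_0,\psi,u)| + \varepsilon.
\end{aligned}
\]
Sending $\varepsilon$ to zero, we obtain the asserted inequality provided that $V(t_0,\varphi) > V_N(t_0,\varphi)$. If, on the other hand, $V(t_0,\varphi) < V_N(t_0,\varphi)$, then we choose a sequence of minimising strategies $u_\varepsilon \in \mathcal{U}$ such that $V(t_0,\varphi) \ge J(t_0,\varphi,u_\varepsilon) - \varepsilon$, notice that $|V(t_0,\varphi) - V_N(t_0,\varphi)| = V_N(t_0,\varphi) - V(t_0,\varphi)$ and obtain the asserted inequality as in the first case.

Now, let $u \in \mathcal{U}$ be any control process. Let $X$ be the solution to Equation (3.2) under $u$ with initial condition $(t_0,\varphi)$ and $Z = Z^N$ be the solution to Equation (3.5) under $u$ with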

74 62 CHAPTER 3. TWO-STEP TIME DISCRETISATION AND ERROR BOUNDS initial condition t, ψ. Using Assumption A2 and the hypothesis that t I N, we get Jt, ϕ, u J N t, ψ, u K T T N + E g Lin N ZTN t g XT t + E TN t f t + s N, Lin N Z s N, us f t +s, X s, us ds Recall that T T N = T T N h N. Hence, K T T N K h N. Now, using Assumption A3, we see that E g Lin N ZTN t g XT t E Z TN t X TN t + E Lin N ZTN t ZTN t + E X TN t X T t L L C h α N h N ln 1 h N + 3L H h α N + 3C 1,T h N ln 1 h N, whee C is a constant as in Poposition 3.3 and C 1,T is a constant as in Lemma A.1 in Appendix A.2. Notice that Xt t as well as Zt t ae Itô diffusions with coefficients bounded by the constant K fom Assumption A2. In the same way, also using the Hölde continuity of f in time and ecalling that s s N h N fo all s, we see that TN t E f t + s N, Lin N Z s N, us f t +s, X s, us ds L T N t hn + 3C 1,T h N ln 1 h N + C + 3L H h α N h N ln 1 h N. Putting the thee estimates togethe, we obtain the assetion. In vitue of Theoem 3.1, we can eplace the oiginal contol poblem of Section 3.1 with the sequence of appoximating contol poblems defined above. The eo between the poblem of degee N and the oiginal poblem in tems of the diffeence between the coesponding value functions V and V N is not geate than a multiple of N α fo α-hölde continuous initial segments if α, 1 2, whee the popotionality facto is affine in the Hölde constant; it is less than a multiple of lnn/n if α 1 2. Fom the poofs of Poposition 3.3 and Theoem 3.1 it is clea that the coefficients b, σ, f of the oiginal poblem, instead of being 1 2-Hölde continuous in time as postulated by Assumption A3, need only satisfy a bound of the fom t s ln 1 t s, t, s [, T ] with t s small, fo the eo estimates to hold. Let us assume fo a moment that σ, that is, the diffusion coefficient σ is zeo. Then Equation 3.2 becomes a andom odinay diffeential equation. 
It is still random, because the admissible strategies are still Γ-valued stochastic processes adapted to the given Wiener filtration. The minimal costs $V(t_0,\varphi)$ for any deterministic initial condition $(t_0,\varphi) \in [0,T] \times C$, however, can be arbitrarily well approximated by using deterministic strategies, that is, Borel measurable functions $[0,\infty) \to \Gamma$. In case $\sigma \equiv 0$, the optimal control problem of Section 3.1 is therefore equivalent to the purely deterministic control problem where minimisation is performed with respect to all deterministic strategies. The cost functional of the deterministic problem is again given by (3.3), but without expectation. The same observation applies to the control problems of degree $N$, $N \in \mathbb{N}$, introduced in this section. In the sequel, we will not always distinguish between a control problem with zero diffusion matrix and the corresponding

75 3.3. SECOND DISCRETISATION STEP: PIECEWISE CONSTANT STRATEGIES 63 puely deteministic poblem. If the diffusion coefficient σ is zeo and the coefficients b, f ae Lipschitz continuous in time, then the eo between the value functions V and V N is of ode N fo all Lipschitz continuous initial segments, as one would expect fom the classical Eule scheme. Coollay 3.1. Assume A1 A3. Assume in addition that σ is equal to zeo and that b, f ae Lipschitz continuous also in the time vaiable with Lipschitz constant not geate than L. Let ϕ C be Hölde continuous with paamete α, 1] and Hölde constant not geate than L H. Then thee is a constant C depending only on L H, L, K, T such that fo all N N with N, all t I N it holds that whee ψ C N is such that ψ [,] = ϕ. V t, ϕ V N t, ϕ C h α N h N, Although we obtain an eo bound fo the appoximation of V by the sequence of value functions V N N N only fo Hölde continuous initial segments, the poofs of Poposition 3.3 and Theoem 3.1 show that pointwise convegence of the value functions holds tue fo all initial segments ϕ C. Recall that a function ϕ : [, ] R d is continuous if and only if sup t,s [,], t s h ϕt ϕs tends to zeo as h. Let us ecod the esult fo the value functions. Coollay 3.2. Assume A1 A3. Then fo all t, ϕ [, T ] C, V t, ϕ V N t N, ϕ N. Similaly to the value function of the oiginal poblem, also the function V N t,. is Lipschitz continuous in the segment vaiable unifomly in t I N with Lipschitz constant not depending on the discetisation degee N. Since t I N, we may intepet V N t,. as a function defined on C. Poposition 3.4. Assume A1 A3. Let V N be the value function of discetisation degee N. Then V N is bounded by KT +1. Moeove, if t I N, then V N t,. as a function of C satisfies the following Lipschitz condition: V N t, ϕ V N t, ϕ 3LT +1 exp 3T T +4d 1 L 2 ϕ ϕ fo all ϕ, ϕ C. Poof. The assetion is again a consequence of Poposition A.1 in Appendix A.1. 
To see this, set $\tilde r := r + h_N$, $\tilde T := T_N$, $\tilde b := b_N$, $\tilde\sigma := \sigma_N$, $\tilde f := f_N$, and $\tilde g := g_N$. Equation (3.5) then describes the same dynamics as Equation (A.1), $\tilde J$ is the same functional as $J_N$, whence $V_N = \tilde V$. The hypotheses of Appendix A.1 are satisfied. Finally, recall that $T_N \le T$ and that, since $t_0 \in I_N$, $V_N(t_0,\psi)$ depends on $\psi \in C_N$ only through $\psi_{|[-r,0]}$.

3.3 Second discretisation step: piecewise constant strategies

In Section 3.2, we have discretised the time as well as the segment space in time. The resulting control problem of discretisation degree $N \in \mathbb{N}$ has dynamics described by Equation (3.5), cost functional $J_N$ defined by (3.6) and value function $V_N$ given by (3.7). Here,

we will also approximate the control processes $u \in U$, which up to now have been those of the original problem, by introducing further control problems defined over sets of piecewise constant strategies. To this end, for $n \in \mathbb{N}$, set

$$U_n := \bigl\{ u \in U \;\big|\; u(t) \text{ is } \sigma\bigl(W(k\tfrac{r}{n}),\, k \in \mathbb{N}_0\bigr)\text{-measurable and } u(t) = u(\lfloor t\rfloor_n),\ t \ge 0 \bigr\}. \tag{3.10}$$

Recall that $\lfloor t\rfloor_n = \tfrac{r}{n}\lfloor \tfrac{n}{r}\,t\rfloor$. Hence, $U_n$ is the set of all Γ-valued $(\mathcal{F}_t)$-progressively measurable processes which are right-continuous and piecewise constant in time relative to the grid $\{k\tfrac{r}{n} \mid k \in \mathbb{N}_0\}$ and, in addition, are $\sigma(W(k\tfrac{r}{n}),\, k \in \mathbb{N}_0)$-measurable. In particular, if $u \in U_n$ and $t \ge 0$, then the random variable $u(t)$ can be represented as

$$u(t)(\omega) = \theta\bigl(\lfloor t\rfloor_n,\, W(0)(\omega), \ldots, W(\lfloor t\rfloor_n)(\omega)\bigr), \quad \omega \in \Omega,$$

where $\theta$ is some Γ-valued Borel measurable function depending on $u$ and $n$.

For the purpose of approximating the control problem of degree $N$, we will use strategies in $U_{N\cdot M}$ with $M \in \mathbb{N}$. Let us write $U_{N,M}$ for $U_{N\cdot M}$. With the same dynamics and the same performance criterion as before, for each $N \in \mathbb{N}$, we introduce a family of value functions $V_{N,M}$, $M \in \mathbb{N}$, defined on $[0,T_N] \times C_N$ by setting

$$V_{N,M}(t_0,\psi) := \inf\bigl\{ J_N(t_0,\psi,u) \;\big|\; u \in U_{N,M} \bigr\}. \tag{3.11}$$

We will refer to $V_{N,M}$ as the value function of degree $(N,M)$. By construction, it holds that $V_N(t_0,\psi) \le V_{N,M}(t_0,\psi)$ for all $(t_0,\psi) \in [0,T_N] \times C_N$. Hence, in estimating the approximation error, we only need an upper bound for $V_{N,M} - V_N$. As with $V_N$, if the initial time $t_0$ lies on the grid $I_N$, then $V_{N,M}(t_0,\psi)$ depends on $\psi$ only through its restriction $\psi_{|[-r,0]} \in C$ to the interval $[-r,0]$. We write $V_{N,M}(t_0,\cdot)$ for this function, too. The dynamics and costs, in this case, can again be represented by Equations (3.8) and (3.9), respectively. And again, if $t_0 \in I_N$, we have $V_{N,M}(t_0,\varphi) = V_{N,M}(t_0,\mathrm{Lin}_N(\varphi))$ for all $\varphi \in C$.

Propositions 3.5 and 3.6 state Bellman's Principle of Dynamic Programming for the value functions $V_N$ and $V_{N,M}$, respectively. The special case when the initial time as well as the time step lie on the grid $I_N$ is given separately, as it is this representation which will be used in the approximation result; see the proof of Theorem 3.2.
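The grid projection $\lfloor t\rfloor_n$ and the resulting piecewise constant strategies can be illustrated with a short sketch. The function names `floor_to_grid` and `piecewise_constant` are hypothetical and chosen only for this illustration; the sketch treats a strategy as a deterministic function of time and ignores the measurability requirements in the definition of $U_n$.

```python
import math


def floor_to_grid(t: float, r: float, n: int) -> float:
    """Return floor_n(t) = (r/n) * floor((n/r) * t), the last grid point not after t."""
    return (r / n) * math.floor((n / r) * t)


def piecewise_constant(u, r: float, n: int):
    """Turn a function u of time into one that is constant on each cell [k r/n, (k+1) r/n)."""
    return lambda t: u(floor_to_grid(t, r, n))


# Delay length r = 1 and n = 4 give the grid {0, 0.25, 0.5, ...}.
u_bar = piecewise_constant(lambda s: s * s, r=1.0, n=4)
print(floor_to_grid(0.7, 1.0, 4), u_bar(0.7))
```

With $r = 1$ and $n = 4$, the time $t = 0.7$ is mapped to the grid point $0.5$, so the frozen strategy returns the value of the original function at $0.5$.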
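Proposition 3.5. Assume (A1)–(A3). Let $t_0 \in [0,T_N]$, $\psi \in C_N$. Then for $t \in [0, T_N - t_0]$,

$$V_N(t_0,\psi) = \inf_{u \in U} E\Bigl( \int_0^t f_N\bigl(t_0+s,\, \Pi_N(Z^u)(s),\, u(s)\bigr)\,ds + V_N\bigl(t_0+t,\, \Pi_N(Z^u)(t)\bigr) \Bigr),$$

where $Z^u$ is the solution to Equation (3.5) of degree $N$ under control process $u$ and with initial condition $(t_0,\psi)$. If $t_0 \in I_N$ and $t \in I_N \cap [0, T_N - t_0]$, then

$$V_N(t_0,\varphi) = \inf_{u \in U} E\Bigl( \int_0^t f\bigl(t_0+\lfloor s\rfloor_N,\, \mathrm{Lin}_N(Z^u_{\lfloor s\rfloor_N}),\, u(s)\bigr)\,ds + V_N\bigl(t_0+t,\, \mathrm{Lin}_N(Z^u_t)\bigr) \Bigr),$$

where $V_N(t_0,\cdot)$, $V_N(t_0+t,\cdot)$ are defined as functionals on $C$, and $\varphi$ is the restriction of $\psi$ to the interval $[-r,0]$.

Proof. Apply Theorem A.1 in Appendix A.1. To this end, let $\tilde U$ be the set of strategies $U$ and set $\tilde r := r + h_N$, $\tilde T := T_N$, $\tilde b := b_N$, $\tilde\sigma := \sigma_N$, $\tilde f := f_N$, and $\tilde g := g_N$. Observe that Equation (3.5) describes the same dynamics as Equation (A.1), that $\tilde J = J_N$, whence $V_N = \tilde V$, and verify that the hypotheses of Appendix A.1 are satisfied.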

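Proposition 3.6. Assume (A1)–(A3). Let $t_0 \in [0,T_N]$, $\psi \in C_N$. Then for $t \in I_{N\cdot M} \cap [0, T_N - t_0]$,

$$V_{N,M}(t_0,\psi) = \inf_{u \in U_{N,M}} E\Bigl( \int_0^t f_N\bigl(t_0+s,\, \Pi_N(Z^u)(s),\, u(s)\bigr)\,ds + V_{N,M}\bigl(t_0+t,\, \Pi_N(Z^u)(t)\bigr) \Bigr),$$

where $Z^u$ is the solution to Equation (3.5) of degree $N$ under control process $u$ and with initial condition $(t_0,\psi)$. If $t_0 \in I_N$ and $t \in I_N \cap [0, T_N - t_0]$, then

$$V_{N,M}(t_0,\varphi) = \inf_{u \in U_{N,M}} E\Bigl( \int_0^t f\bigl(t_0+\lfloor s\rfloor_N,\, \mathrm{Lin}_N(Z^u_{\lfloor s\rfloor_N}),\, u(s)\bigr)\,ds + V_{N,M}\bigl(t_0+t,\, \mathrm{Lin}_N(Z^u_t)\bigr) \Bigr),$$

where $V_{N,M}(t_0,\cdot)$, $V_{N,M}(t_0+t,\cdot)$ are defined as functionals on $C$, and $\varphi$ is the restriction of $\psi$ to the interval $[-r,0]$.

Proof. Apply Theorem A.1 of Appendix A.1 as in the proof of Proposition 3.5, except for the fact that we choose $U_{N,M} = U_{N\cdot M}$ instead of $U$ as the set of strategies $\tilde U$. Notice that, by hypothesis, the intermediate time $t$ lies on the grid $I_{N\cdot M}$.

The next result gives a bound on the order of the global approximation error between the value functions of degree $N$ and $(N,M)$ provided that the local approximation error is of order greater than one in the discretisation step.

Theorem 3.2. Assume (A1)–(A3). Let $N, M \in \mathbb{N}$. Suppose that for some constants $\hat K, \delta > 0$ the following condition holds: for any $t_0 \in I_N$, $\varphi \in C$, $u \in U$ there is $\bar u \in U_{N,M}$ such that

$$E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), \bar u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \bar Z_{h_N}\bigr) \Bigr) \le E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, Z_{h_N}\bigr) \Bigr) + \hat K\, h_N^{1+\delta},$$

where $Z$ is the solution to Equation (3.5) of degree $N$ under control process $u$, $\bar Z$ the solution to Equation (3.5) of degree $N$ under $\bar u$, both with initial condition $(t_0,\psi)$ for some $\psi \in C_N$ such that $\psi_{|[-r,0]} = \varphi$. Then

$$V_{N,M}(t_0,\varphi) - V_N(t_0,\varphi) \le T\,\hat K\, h_N^{\delta} \quad \text{for all } t_0 \in I_N,\ \varphi \in C.$$

Proof. Let $N, M \in \mathbb{N}$. Recall that $V_{N,M} \ge V_N$ by construction. It is therefore enough to prove the upper bound for $V_{N,M} - V_N$. Suppose the condition of the theorem is fulfilled for $N$, $M$ and some constants $\hat K, \delta > 0$. Observe that $V_N(T_N,\cdot) = g(\mathrm{Lin}_N(\cdot)) = V_{N,M}(T_N,\cdot)$. Let $t_0 \in I_N \setminus \{T_N\}$. Let $\varphi \in C$, and choose any $\psi \in C_N$ such that $\psi_{|[-r,0]} = \varphi$.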
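Given $\varepsilon > 0$, in virtue of Proposition 3.5, we find a control process $u \in U$ such that

$$V_N(t_0,\varphi) \ge E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \mathrm{Lin}_N(Z_{h_N})\bigr) \Bigr) - \varepsilon,$$

where $Z$ is the solution to Equation (3.5) of degree $N$ under control process $u$ with initial condition $(t_0,\psi)$. For this $u$, choose $\bar u \in U_{N,M}$ according to the condition of Theorem 3.2, and let $\bar Z$ be the solution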

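to Equation (3.5) of degree $N$ under control process $\bar u$ with the same initial condition as for $Z$. Then, using the above inequality and Proposition 3.6, we see that

$$\begin{aligned}
V_{N,M}(t_0,\varphi) - V_N(t_0,\varphi)
&\le V_{N,M}(t_0,\varphi) - E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \mathrm{Lin}_N(Z_{h_N})\bigr) \Bigr) + \varepsilon \\
&\le E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), \bar u(s)\bigr)\,ds + V_{N,M}\bigl(t_0+h_N, \mathrm{Lin}_N(\bar Z_{h_N})\bigr) \Bigr) \\
&\qquad - E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \mathrm{Lin}_N(Z_{h_N})\bigr) \Bigr) + \varepsilon \\
&= E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), \bar u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \mathrm{Lin}_N(\bar Z_{h_N})\bigr) \Bigr) \\
&\qquad - E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, \mathrm{Lin}_N(Z_{h_N})\bigr) \Bigr) \\
&\qquad + E\Bigl( V_{N,M}\bigl(t_0+h_N, \mathrm{Lin}_N(\bar Z_{h_N})\bigr) - V_N\bigl(t_0+h_N, \mathrm{Lin}_N(\bar Z_{h_N})\bigr) \Bigr) + \varepsilon \\
&\le \hat K\, h_N^{1+\delta} + \sup_{\tilde\varphi \in C} \bigl\{ V_{N,M}(t_0+h_N,\tilde\varphi) - V_N(t_0+h_N,\tilde\varphi) \bigr\} + \varepsilon,
\end{aligned}$$

where in the last line the condition of Theorem 3.2 has been exploited. Since $\varepsilon > 0$ was arbitrary and neither the first nor the last line of the above inequalities depends on $u$ or $\bar u$, it follows that for all $t_0 \in I_N \setminus \{T_N\}$,

$$\sup_{\varphi \in C} \bigl\{ V_{N,M}(t_0,\varphi) - V_N(t_0,\varphi) \bigr\} \le \hat K\, h_N^{1+\delta} + \sup_{\varphi \in C} \bigl\{ V_{N,M}(t_0+h_N,\varphi) - V_N(t_0+h_N,\varphi) \bigr\}.$$

Recalling the equality $V_{N,M}(T_N,\cdot) = V_N(T_N,\cdot)$, we conclude that for all $t_0 \in I_N$,

$$\sup_{\varphi \in C} \bigl\{ V_{N,M}(t_0,\varphi) - V_N(t_0,\varphi) \bigr\} \le \frac{1}{h_N}\,(T_N - t_0)\, \hat K\, h_N^{1+\delta} \le T\, \hat K\, h_N^{\delta},$$

which yields the assertion.

Statement and proof of Theorem 3.2 should be compared to Theorem 7 in Falcone and Rosace. We note, though, that the deterministic analogue of the condition in Theorem 3.2 is weaker than the corresponding conditions (37) and (38) in Falcone and Rosace. In particular, it is not necessary to require that any controlled process $Z$ can be approximated with local error of order $h^{1+\delta}$ by some process $\bar Z$ using only control processes which are piecewise constant in time on a grid of width $h$. In the stochastic case, such a requirement would in general be too strong to be satisfiable. In order to be able to apply Theorem 3.2, we must check whether and how the condition can be satisfied.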
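The last step of the proof, in which the one-step estimate is iterated backwards from the terminal time, can be checked numerically. The following sketch is only an illustration: the function name `accumulated_gap` and the sample values of $T_N$, $h_N$, $\hat K$ and $\delta$ are assumptions, not quantities taken from the text.

```python
import math


def accumulated_gap(T_N: float, h_N: float, K_hat: float, delta: float) -> float:
    """Iterate e(t) <= K_hat * h_N**(1+delta) + e(t + h_N) from e(T_N) = 0 down to t = 0."""
    steps = round(T_N / h_N)          # number of grid steps between 0 and T_N
    gap = 0.0                         # the two value functions agree at the terminal time
    for _ in range(steps):
        gap += K_hat * h_N ** (1.0 + delta)
    return gap


# Hypothetical sample values: each step adds a local error of order h_N**(1+delta),
# and (T_N / h_N) such steps give a global error of order h_N**delta.
gap = accumulated_gap(T_N=2.0, h_N=0.25, K_hat=3.0, delta=0.5)
bound = 2.0 * 3.0 * 0.25 ** 0.5       # T * K_hat * h_N**delta with T = T_N here
print(gap, bound)
```

The accumulated gap equals $(T_N/h_N)\,\hat K\,h_N^{1+\delta}$, which is why a local error of order strictly greater than one is needed for the global bound to tend to zero.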
Given a grid of width $\tfrac{r}{N}$ for the discretisation in time and segment space, we would expect the condition to be fulfilled provided we choose the sub-grid for the piecewise constant controls fine enough; that is, the time discretisation of the control processes should be of degree $M$ with $M$ sufficiently big in comparison to $N$. Indeed, if we choose $M$ of any order greater than three in $N$, then the condition of Theorem 3.2 holds. This is the content of Theorem 3.3. The theorem, in turn, relies on a kind of mean value theorem, due to Krylov, which we cite as Theorem A.2 in Appendix A.3.

79 3.3. SECOND DISCRETISATION STEP: PIECEWISE CONSTANT STRATEGIES 67 Theoem 3.3. Assume A1-A4. Let β > 3. Then thee is a numbe ˆK > depending only on K,, L, T, the dimensions and β such that Condition in Theoem 3.2 is satisfied with constants ˆK and δ := β 3 4 fo all N, M N such that N and M N β. Poof. Let N, M N be such that N and M N β. Let t I N, ϕ C. Define the following functions: b: Γ R d, bγ := b t, Lin N ϕ, γ, σ : Γ R d d 1, σγ := σ t, Lin N ϕ, γ, f : Γ R, fγ := f t, Lin N ϕ, γ, g : R d R d, gx := V N t +h N, Lin N Sϕ, x, whee Sϕ, x is the function in C given by Sϕ, x: [, ] s { ϕs+hn if s [, h N ], ϕ + s+h N h N x if s h N, ]. As a consequence of Assumption A4, b, σ, f as just defined ae continuous functions on Γ, ρ. By Assumption A2, b, σ, f ae all bounded by K. As a consequence of Poposition 3.4, the function g is Lipschitz continuous and fo the Lipschitz constant we have gx gy sup 3LT +1 exp 3T T +4d 1 L 2. x,y R d,x y x y Let u U, and let Z u be the solution to Equation 3.5 of degee N unde contol pocess u with initial condition t, ψ fo some ψ C N such that ψ [,] = ϕ. As Z also satisfies Equation 3.8, we see that Z u t ϕ = t b us ds + σ us dw s fo all t [, h N ]. By Theoem A.2 in Appendix A.3, we find ū U N,M such that hn E f ūs ds + g Xūh N hn E f us ds + g Z u h N ϕ C1+h N whee Xū satisfies N M N M sup fγ + sup γ Γ x,y R d,x y gx gy, x y Xūt = t būs ds + σ ūs dw s fo all t. Notice that the constant C above only depends on K and the dimensions d and d 1. Let Zū be the solution to Equation 3.5 of degee N unde contol pocess ū with initial condition t, ψ, whee ψ [,] = ϕ as above. Then, by constuction, Zūt ϕ = Xūt fo all t [, h N ]. Set ˆK := 2 C β 4 K + 3LT +1 exp 3T T +4d1 L 2.

Since $M \ge N^\beta$ by hypothesis, $\tfrac{1+\beta}{4} = 1+\delta > 1$ and $h_N = \tfrac{r}{N}$, we have

$$\Bigl(\frac{r}{N\cdot M}\Bigr)^{1/4} \le \Bigl(\frac{r}{N^{1+\beta}}\Bigr)^{1/4} = r^{-\beta/4}\, h_N^{1+\delta}.$$

Recalling the definition of the coefficients $\tilde b, \tilde\sigma, \tilde f, \tilde g$, we have thus found a piecewise constant strategy $\bar u \in U_{N,M}$ such that

$$E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), \bar u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, Z^{\bar u}_{h_N}\bigr) \Bigr) \le E\Bigl( \int_0^{h_N} f\bigl(t_0, \mathrm{Lin}_N(\varphi), u(s)\bigr)\,ds + V_N\bigl(t_0+h_N, Z^{u}_{h_N}\bigr) \Bigr) + \hat K\, h_N^{1+\delta},$$

where $Z^u$, $Z^{\bar u}$ are the solutions corresponding to $u$ and $\bar u$, respectively, as above.

We note that the constant $\hat K$, which appears in Theorem 3.3 and its proof, depends on $\beta$ only through the factor $r^{-\beta/4}$. Moreover, $\hat K$ also depends on the delay length $r$ only through the factor $r^{-\beta/4}$. Theorem 3.2 and Theorem 3.3, together with the above observation, yield the following bound on the difference between the value functions of degree $N$ and degree $(N,M)$, respectively.

Corollary 3.3. Assume (A1)–(A4). Then there is a positive constant $\bar K$ depending only on $K$, $L$, $T$, and the dimensions such that for all $\beta > 3$, all $N \in \mathbb{N}$ with $N \ge r$, all $M \in \mathbb{N}$ with $M \ge N^\beta$, all $t_0 \in I_N$, all $\varphi \in C$ it holds that

$$V_{N,M}(t_0,\varphi) - V_N(t_0,\varphi) \le \bar K\, r^{-\beta/4} \Bigl(\frac{r}{N}\Bigr)^{\frac{\beta-3}{4}}.$$

In particular, with $M = \lceil N^\beta \rceil$, where $\lceil x\rceil$ is the least integer not smaller than $x$, the upper bound on the discretisation error can be rewritten as

$$V_{N,\lceil N^\beta\rceil}(t_0,\varphi) - V_N(t_0,\varphi) \le \bar K\, r^{-\frac{\beta}{1+\beta}} \Bigl(\frac{r}{N^{1+\beta}}\Bigr)^{\frac{\beta-3}{4(1+\beta)}}.$$

From Corollary 3.3 we see that, in terms of the total number of time steps $N\lceil N^\beta\rceil$, we can achieve any rate of convergence smaller than $\tfrac14$ by choosing the sub-discretisation order $\beta$ sufficiently large.

When the diffusion coefficient $\sigma$ is zero or the space of control actions Γ is finite and $\sigma$ is not directly controlled, then the sub-discretisation degree $M$ may be chosen of an order lower than three in $N$, and the condition of Theorem 3.2 is still satisfied. For in these special cases, the error bound of Theorem A.2 can be improved on, see Appendix A.3. Let us first consider the case when $\sigma \equiv 0$, which corresponds to deterministic control problems. To obtain an analogue of Theorem 3.3, we use Lemma A.3 in place of Theorem A.2. The order exponent $\beta$ must be greater than one, and the order exponent $\delta$ in the condition is taken to be $\tfrac{\beta-1}{2}$.
If, in addition, Γ is finite, instead of Lemma A.3 we invoke Lemma A.4. The analogue of Theorem 3.3 holds true for any $\beta > 0$ and with the choice $\delta := \beta$. These observations in combination with Theorem 3.2 yield the following bounds for deterministic systems on the difference between $V_N$ and $V_{N,M}$; the results are given only for $M = \lceil N^\beta \rceil$.

81 3.4. BOUNDS ON THE TOTAL ERROR 69 Coollay 3.4. Assume A1-A4. Assume futhe that σ is equal to zeo. Then thee is a positive constant K depending only on K, L, T, and the dimension d such that fo all β > 1, all N N with N, all t I N, all ϕ C it holds that V N, N β t, ϕ V N t, ϕ K β 1+β β 1 21+β N 1+β. If, in addition, Γ is finite with cadinality N Γ, then thee is a positive constant K depending only on K, L, T such that fo all β >, all N N with N, all t I N, all ϕ C it holds that V N, N β t, ϕ V N t, ϕ K1 + N Γ β 1+β β 1+β N 1+β. If the diffusion coefficient σ is not diectly contolled, that is, if σt, ϕ, γ = σt, ϕ fo some σ and all t [, T ], ϕ C, γ Γ, then we may ely on Lemma A.5 in place of Theoem A.2. Obseve that the diffusion coefficient fo the contol poblems of degee N, M and N, espectively, is constant on time intevals of the fom [k 1 N, k N, k N. The ode exponent β fo the analogue of Theoem 3.3 must be geate than one, and the ode exponent δ in Condition is taken to equal β 1 2. In combination with Theoem 3.2, this implies the following bound. Coollay 3.5. Assume A1-A4. Assume in addition that σ does not depend on the contol vaiable and that Γ is finite with cadinality N Γ. Then thee is a positive constant K depending only on K, L, T such that fo all β > 1, all N N with N and N β a squae numbe, all t I N, all ϕ C it holds that V N, N β t, ϕ V N t, ϕ K1 + 4 T + N Γ β 1+β β 1+β N 1+β. The equiement in Coollay 3.5 that N β be a squae numbe is no seious estiction, as the optimal bound on the total discetisation eo will be achieved with β = Bounds on the total eo Hee, we put togethe the eo bounds fom Sections 3.2 and 3.3 in ode to obtain an oveall estimate fo the ate of convegence, that is, a bound on the discetisation eo incued in passing fom the oiginal value function to the value function of degee N, M. 
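In addition, we address the question of whether and in which sense nearly optimal strategies for the discrete problems can be used as nearly optimal strategies for the original system. As in Corollary 3.3, we express the error bound in terms of the total number of discretisation steps or, taking into account the presence of the delay length $r$, in terms of the length of the smallest time step.

Theorem 3.4. Assume (A1)–(A4). Let $\alpha \in (0,1]$, $L_H > 0$. Then there is a constant $C$ depending only on $\alpha$, $L_H$, $L$, $K$, $T$ and the dimensions such that for all $\beta > 3$, all $N \in \mathbb{N}$ with $N \ge 2r$, all $t_0 \in I_N$, all $\alpha$-Hölder continuous $\varphi \in C$ with Hölder constant not greater than $L_H$, it holds that, with $h = \tfrac{r}{N^{1+\beta}}$,

$$\bigl| V(t_0,\varphi) - V_{N,\lceil N^\beta\rceil}(t_0,\varphi) \bigr| \le C \Bigl( r^{\frac{\alpha\beta}{1+\beta}}\, h^{\frac{\alpha}{1+\beta}} \vee r^{\frac{\beta}{2(1+\beta)}} \sqrt{\ln\bigl(\tfrac1h\bigr)}\; h^{\frac{1}{2(1+\beta)}} + r^{-\frac{\beta}{1+\beta}}\, h^{\frac{\beta-3}{4(1+\beta)}} \Bigr).$$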

In particular, with $\beta = 5$ and $h = \tfrac{r}{N^6}$, it holds that

$$\bigl| V(t_0,\varphi) - V_{N,N^5}(t_0,\varphi) \bigr| \le C \Bigl( r^{\frac{5\alpha}{6}}\, h^{\frac{\alpha}{6}} \vee r^{\frac{5}{12}} \sqrt{\ln\bigl(\tfrac1h\bigr)}\; h^{\frac{1}{12}} + r^{-\frac56}\, h^{\frac{1}{12}} \Bigr).$$

Proof. Clearly, $|V - V_{N,\lceil N^\beta\rceil}| \le |V - V_N| + |V_N - V_{N,\lceil N^\beta\rceil}|$. The assertion now follows from Corollary 3.3 and Theorem 3.1, where $\ln(\tfrac{1}{h_N}) = \ln(\tfrac{N}{r})$ is bounded by $\ln(\tfrac{N^{1+\beta}}{r}) = \ln(\tfrac1h)$.

The choice $\beta = 5$ in Theorem 3.4 yields the same rate for both summands in the error estimate provided the initial segment is at least $\tfrac12$-Hölder continuous, because $\tfrac12 = \tfrac{\beta-3}{4}$ implies $\beta = 5$. Thus, the best overall error bound we obtain without additional assumptions is of order $h^{1/12}$ up to neglecting the logarithmic term. The rate $\tfrac{1}{12}$ is a worst-case estimate. Moreover, better error bounds are obtained in the special situations treated at the end of Section 3.3.

In the deterministic case, that is, when the diffusion coefficient $\sigma$ is zero, two different bounds on the total error depending on whether or not the space of control actions Γ is finite can be derived by combining Corollary 3.1 from Section 3.2 with Corollary 3.4 from Section 3.3. The optimal choice of the parameter $\beta$ is three for a complete and separable metric space Γ, since $1 = \tfrac{\beta-1}{2}$ implies $\beta = 3$, provided the initial segment as well as the coefficients $b$ and $f$ are Lipschitz continuous in the time variable. If Γ is finite, we choose $\beta = 1$. When the diffusion coefficient is not directly controlled and Γ is finite, we combine the assertions of Theorem 3.1 and Corollary 3.5 to obtain a bound on the overall discretisation error. The optimal choice of $\beta$ is two, since $\tfrac12 = \tfrac{\beta-1}{2}$ implies $\beta = 2$.

Table 3.1 shows the corresponding bounds on the total error, that is, bounds on the maximal difference between the value functions $V$ and $V_{N,M}$ over all initial segments of a given time regularity. The time regularity of the initial segments and of the coefficients $b$, $\sigma$, $f$ in their time variable is indicated in the first column of the table. A function $\psi$ is Hölder $\tfrac12-$ iff $|\psi(t) - \psi(s)| \le L_H \sqrt{|t-s| \ln(1/|t-s|)}$ for some $L_H > 0$ and all $t$, $s$ with $|t-s|$ small.
The second column of the table shows whether the space of control actions Γ is assumed to be finite or not. In the third column, the form of the diffusion coefficient is indicated. The second but last column shows the order of the sub-discretisation degree $M$ in terms of the degree $N$ of the outer discretisation. Notice that $M$ need only be proportional to $N^\beta$ with $\beta$ giving the optimal order, not necessarily equal to $N^\beta$. The error bounds in terms of the time step $h = \tfrac{r}{N\cdot M}$ are given in the last column of the table.

Recall that $V_{N,M} \ge V_N$ for all $N, M \in \mathbb{N}$ by construction. If, instead of the two-sided error bound of Theorem 3.4, we were merely interested in obtaining an upper bound for $V$, we would simply compute $V_{N,M}$ with $M = 1$. Theorem 3.1 implies that we would incur an error of order nearly $\tfrac12$; that is, we would have $V \le V_{N,1} + \text{constant} \cdot \sqrt{\ln(N)/N}$ for all $N \in \mathbb{N}$, $N \ge 2r$, where the initial segments are supposed to be Hölder $\tfrac12-$. This direction, however, is the less informative one, since we do not expect the minimal costs for the discretised system to be lower than the minimal costs for the original system.

Up to this point, we have been concerned with convergence of value functions only. A natural question to ask is the following: Suppose we have found a strategy $\bar u \in U_{N,M}$ which

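| Time regularity | Space Γ | Diffusion coefficient | $M$ | Error bound |
|---|---|---|---|---|
| Lipschitz | finite | $\sigma \equiv 0$ | $N$ | $h^{1/2}$ |
| Lipschitz | separable | $\sigma \equiv 0$ | $N^3$ | $h^{1/4}$ |
| Hölder $\tfrac12-$ | finite | $\sigma(t,\varphi)$ | $N^2$ | $\bigl(h \ln(\tfrac1h)\bigr)^{1/6}$ |
| Hölder $\tfrac12-$ | separable | $\sigma(t,\varphi,\gamma)$ | $N^5$ | $\bigl(h \ln(\tfrac1h)\bigr)^{1/12}$ |

Table 3.1: The table shows bounds on the difference between $V$ and $V_{N,M}$ for some special situations and the general case (last row) in terms of the time step $h = \tfrac{r}{N\cdot M}$.

is ε-optimal for the control problem of degree $(N,M)$ under initial condition $(t_0,\varphi)$. Will this same strategy $\bar u$ also be nearly optimal for the original control problem? The hypothesis that $\bar u$ be ε-optimal for the problem of degree $(N,M)$ under initial condition $(t_0,\varphi)$ means that $J_N(t_0,\varphi,\bar u) - V_{N,M}(t_0,\varphi) \le \varepsilon$. Recall that the cost functional for the problem of degree $(N,M)$ is identical to the one for the problem of degree $N$, namely $J_N$, and that, by construction, $J_N \ge V_{N,M} \ge V_N$ over the set of strategies $U_{N,M}$. The strategy $\bar u$ is nearly optimal for the original control problem if there is $\tilde\varepsilon$, which must be small for $\varepsilon$ small and $N$, $M$ big enough, such that $J(t_0,\varphi,\bar u) - V(t_0,\varphi) \le \tilde\varepsilon$. Recall that $U_{N,M} \subset U$, whence $J(t_0,\varphi,\bar u)$ is well-defined. The next theorem states that nearly optimal strategies for the approximating problems are nearly optimal for the original problem, too.

Theorem 3.5. Assume (A1)–(A4). Let $\alpha \in (0,1]$, $L_H > 0$. Then there is a constant $C$ depending only on $\alpha$, $L_H$, $L$, $K$, $T$, the dimensions and the delay length $r$ such that for all $\beta > 3$, all $N, M \in \mathbb{N}$ with $N \ge 2r$ and $M \ge N^\beta$, all $t_0 \in I_N$, all $\alpha$-Hölder continuous $\varphi \in C$ with Hölder constant not greater than $L_H$ the following holds: If $\bar u \in U_{N,M}$ is such that $J_N(t_0,\varphi,\bar u) - V_{N,M}(t_0,\varphi) \le \varepsilon$, then, with $h = \tfrac{r}{N^{1+\beta}}$,

$$J(t_0,\varphi,\bar u) - V(t_0,\varphi) \le C \Bigl( h^{\frac{\alpha}{1+\beta}} \vee \sqrt{\ln\bigl(\tfrac1h\bigr)}\; h^{\frac{1}{2(1+\beta)}} + h^{\frac{\beta-3}{4(1+\beta)}} \Bigr) + \varepsilon.$$

Proof. Let $\bar u \in U_{N,M}$ be such that $J_N(t_0,\varphi,\bar u) - V_{N,M}(t_0,\varphi) \le \varepsilon$. Then

$$\begin{aligned}
J(t_0,\varphi,\bar u) - V(t_0,\varphi)
&= J(t_0,\varphi,\bar u) - J_N(t_0,\varphi,\bar u) + J_N(t_0,\varphi,\bar u) - V_{N,M}(t_0,\varphi) + V_{N,M}(t_0,\varphi) - V(t_0,\varphi) \\
&\le \sup_{u \in U} \bigl| J(t_0,\varphi,u) - J_N(t_0,\varphi,u) \bigr| + \varepsilon + \bigl| V_{N,M}(t_0,\varphi) - V(t_0,\varphi) \bigr|.
\end{aligned}$$

The assertion is now a consequence of Theorem 3.1 and Theorem 3.4.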
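The exponents in the last column of Table 3.1 follow from balancing the two error contributions, and the arithmetic can be reproduced in a few lines. In this sketch, the function name `total_rate` and its parameters are hypothetical; `first_step_order` stands for the order of the first discretisation step in $h_N$ (1 in the Lipschitz deterministic case, $\tfrac12$ in the stochastic case), `delta` for the order exponent of the second step, and logarithmic factors are ignored.

```python
from fractions import Fraction


def total_rate(first_step_order, delta, beta):
    """Exponent of h = r / N**(1 + beta) in the combined bound, ignoring log factors.

    Both steps give an error of some order in h_N = r / N, and N**(-1) is
    proportional to h**(1 / (1 + beta)), so each order is divided by (1 + beta).
    """
    beta = Fraction(beta)
    first = Fraction(first_step_order) / (1 + beta)
    second = Fraction(delta) / (1 + beta)
    return min(first, second)


# Rows of Table 3.1: (first-step order, delta as a function of beta, chosen beta).
rows = {
    "deterministic, finite": (1, lambda b: Fraction(b), 1),
    "deterministic, separable": (1, lambda b: Fraction(b - 1, 2), 3),
    "uncontrolled sigma, finite": (Fraction(1, 2), lambda b: Fraction(b - 1, 2), 2),
    "general case": (Fraction(1, 2), lambda b: Fraction(b - 3, 4), 5),
}
rates = {name: total_rate(order, delta(b), b) for name, (order, delta, b) in rows.items()}
print(rates)
```

For each row, the chosen $\beta$ is the one at which the two contributions have the same order, which is the balancing argument given in the text.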
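Let us suppose we have found a strategy $\bar u$ for the problem of degree $(N,M)$ with fixed initial condition $(t_0,\varphi) \in I_N \times C$ which is ε-optimal or optimal and a feedback control. The latter means here that $\bar u$ can be written in the form

$$\bar u(t)(\omega) = \bar u_0\bigl(\lfloor t\rfloor_{N\cdot M},\, \Pi_N(Z^{\bar u})(\lfloor t\rfloor_{N\cdot M})(\omega)\bigr) \quad \text{for all } \omega \in \Omega,\ t \ge 0,$$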

where $Z^{\bar u}$ is the solution to Equation (3.8) under control $\bar u$ and initial condition $(t_0,\varphi)$ and $\bar u_0$ is some measurable Γ-valued function defined on $[0,\infty) \times C_N$ or, because of the discretisation, on $\{k\tfrac{r}{N\cdot M} \mid k \in \mathbb{N}_0\} \times \mathbb{R}^{d(N\cdot M+M+1)}$. We would like to use $\bar u_0$ as a feedback control for the original system. It is not clear whether this is possible unless one assumes some regularity like Lipschitz continuity of $\bar u_0$ in its segment variable. The problem is that we have to replace solutions to Equation (3.8) with solutions to Equation (3.2).

Something can be said, though. Recall the definition of $U_{N,M}$ at the beginning of Section 3.3. Strategies in $U_{N,M}$ are not only piecewise constant, they are also adapted to the filtration generated by $W(k\tfrac{r}{N\cdot M})$, $k \in \mathbb{N}_0$. Thus, if $\bar u \in U_{N,M}$ is a feedback control, then it can be re-written as

$$\bar u(t)(\omega) = \bar u_1\bigl(\lfloor t\rfloor_{N\cdot M},\, W\bigl(\lfloor t\rfloor_{N\cdot M} - k\tfrac{r}{N\cdot M}\bigr)(\omega),\ k = 0,\ldots,(N+1)M\bigr), \quad \omega \in \Omega,\ t \ge 0,$$

where $\bar u_1$ is some measurable Γ-valued function depending on the initial condition $(t_0,\varphi)$ and defined on $\{k\tfrac{r}{N\cdot M} \mid k \in \mathbb{N}_0\} \times \mathbb{R}^{d_{N,M}}$ with $d_{N,M} := d(N\cdot M+M+1)$. The above equality has to be read keeping in mind the convention that $W(t) = 0$ if $t < 0$. The function $\bar u_1$ can be used as a noise feedback control for the original problem as it directly depends on the underlying noise process, which is the same for the control problem of degree $(N,M)$ and the original problem. By Theorem 3.5, we then know that $\bar u_1$ induces a nearly optimal strategy for the original control problem provided $\bar u$ was nearly optimal for the discretised problem.

3.5 Solving the control problems of degree (N, M)

Here, we turn to the question of how to compute the value functions of the control problems resulting from the discretisation procedure analysed above. The value function of degree $(N,M)$ is the value function of a finite-dimensional optimal control problem in discrete time. One time step corresponds to a step of length $\tfrac{r}{N\cdot M}$ in continuous time.
The noise component of the control problem of degree $(N,M)$ is given by a finite sequence of independent Gaussian random variables with mean zero and variance $\tfrac{r}{N\cdot M}$, because the time horizon is finite and the strategies in $U_{N,M}$ are not only piecewise constant, but also adapted to the filtration generated by $W(k\tfrac{r}{N\cdot M})$, $k \in \mathbb{N}_0$. By construction of the approximation to the dynamics in Section 3.2, the segment space for the problem of degree $(N,M)$ is the subspace of $C_N$ consisting of all functions which are piecewise linear relative to the grid $\{k\tfrac{r}{N\cdot M} \mid k \in \mathbb{Z}\} \cap [-r-\tfrac{r}{N}, 0]$. The segment space of degree $(N,M)$, therefore, is finite-dimensional and isomorphic to $\mathbb{R}^{d_{N,M}}$ with $d_{N,M} := d(N\cdot M+M+1)$. The functions of interest are actually those whose nodes are multiples of $\tfrac{r}{N}$ units of time apart, but in each step of the evolution the segment functions and their nodes get shifted in time by $\tfrac{r}{N\cdot M}$ units.

Theoretically, the Principle of Dynamic Programming as expressed in Proposition 3.6 could be applied to compute the value function $V_{N,M}$. Practically, however, it is not possible to use any algorithm based on directly applying one-step Dynamic Programming. This difficulty arises because the state space of the controlled discrete-time Markov chains we are dealing with is $\mathbb{R}^{d_{N,M}}$ and the semi-discrete value function $V_{N,M}$ is defined on $I_{N\cdot M} \times \mathbb{R}^{d_{N,M}}$ or, in the fully discrete case, on a $d_{N,M}$-dimensional grid. In view of Theorem 3.4, the dimension $d_{N,M}$ is expected to be very large so that storing the values of $V_{N,M}$

for all initial conditions as required by the Dynamic Programming method becomes impossible. It is well known that the worst-case complexity of solving a $d$-dimensional discrete-time optimal control problem via Dynamic Programming grows exponentially in the dimension $d$. This is related to the famous curse of dimensionality (e.g. Bellman and Kalaba, 1965: p. 63). The complexity of a problem is here understood in the sense of information-based complexity theory, see Traub and Werschulz (1998) for an overview. For a result in this spirit confirming the presence of the curse of dimensionality see Chow and Tsitsiklis. Observe, though, that the complexity of a problem depends not only on the problem formulation, but crucially also on the error criterion used for determining the accuracy of approximate solutions and on the information available to the admissible algorithms.

The situation in our case is not as desperate as it might seem provided the original control problem has low dimensions $d$, $d_1$. Recall that $V_{N,M}$ is an approximation of the value function $V_N$ constructed in Section 3.2, which in turn approximates $V$, the value function of the original problem, and that the problems of degree $N$ and of degree $(N,M)$, $M \in \mathbb{N}$, have the same dynamics and the same cost functional. Moreover, for any time $t_0 \in I_N$, both $V_N(t_0,\cdot)$ and $V_{N,M}(t_0,\cdot)$ live on the space of all functions $\varphi \in C$ which are piecewise linear relative to the grid $\{k\tfrac{r}{N} \mid k \in \mathbb{Z}\} \cap [-r,0]$. Let us write $\hat C_N$ for this space. Clearly, $\hat C_N$ is isomorphic to $\mathbb{R}^{d_N}$ with $d_N := d(N+1)$.

An approximation $\hat V_{N,M}(t_0,\cdot)$ to $V_{N,M}(t_0,\cdot)$ for times $t_0 \in I_N$ can be computed by backward iteration starting from time $T_N$ and proceeding in time steps of length $\tfrac{r}{N}$. Recall that $V_{N,M}(T_N,\cdot) = g(\cdot)$, whence $\hat V_{N,M}(T_N,\cdot)$ is determined by $g$, the function giving the terminal costs. To compute $\hat V_{N,M}(t_0,\varphi)$ for any $\varphi \in \hat C_N$ when $\hat V_{N,M}(t_0+\tfrac{r}{N},\cdot)$ is available and $t_0 \in I_N$, an inner backward iteration can be performed with respect to the grid $\{t_0 + k\tfrac{r}{N\cdot M} \mid k = 0,\ldots,M\}$.
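If $t_0 \in I_N$, then, on the time interval $[t_0, t_0+\tfrac{r}{N})$, the coefficients $b$, $\sigma$, $f$ are functions of the control variable only, see Equations (3.8) and (3.9), respectively, and the proof of Theorem 3.3. The inner optimisation thus consists in solving a $d$-dimensional discrete-time optimal control problem with constant coefficients and fixed initial condition over $M$ time steps, which correspond to a time horizon of length $\tfrac{r}{N}$.

To be more precise, define, for each $n \in \mathbb{N}_0$, an operator $T^{N,M}_n$ on the space $B(\hat C_N)$ of all bounded real-valued functions on $\hat C_N$ by

$$T^{N,M}_n(\Psi)(\varphi) := \inf_{u \in U_{N,M}} E\Bigl( \int_0^{r/N} f\bigl(n\tfrac{r}{N}, \varphi, u(s)\bigr)\,ds + \Psi\bigl(\mathrm{Lin}_N(Z^u_{r/N})\bigr) \Bigr), \quad \varphi \in \hat C_N, \tag{3.12}$$

where $Z^u = Z^{u,n,\varphi}$ is the process defined on the time interval $[-r, \tfrac{r}{N}]$ by

$$Z^u(t) := \begin{cases} \varphi(0) + \displaystyle\int_0^t b\bigl(n\tfrac{r}{N}, \varphi, u(s)\bigr)\,ds + \int_0^t \sigma\bigl(n\tfrac{r}{N}, \varphi, u(s)\bigr)\,dW(s), & t \in (0, \tfrac{r}{N}], \\ \varphi(t), & t \in [-r,0]. \end{cases} \tag{3.13}$$

The definition of $T^{N,M}_n$ should be compared to Proposition 3.6. Given $\Psi \in B(\hat C_N)$, let us refer to the evaluation of $T^{N,M}_n(\Psi)$ at $\varphi \in \hat C_N$ as the Bellman step for $\Psi$ at segment $\varphi$ and time step $n$. Notice that $\mathrm{Lin}_N(\varphi) = \varphi$ for all $\varphi \in \hat C_N$. Since any strategy $u \in U_{N,M}$ is piecewise constant relative to the grid $\{k\tfrac{r}{N\cdot M} \mid k \in \mathbb{N}_0\}$, the integrals appearing in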

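(3.12) and (3.13) are really finite sums of random variables; for $n \in \mathbb{N}_0$, $\Psi \in B(\hat C_N)$, all $\varphi \in \hat C_N$, it holds that

$$T^{N,M}_n(\Psi)(\varphi) = \inf_{u \in U_{N,M}} E\Bigl( \frac{r}{N\cdot M} \sum_{k=0}^{M-1} f\bigl(n\tfrac{r}{N}, \varphi, u(k\tfrac{r}{N\cdot M})\bigr) + \Psi\bigl(\mathrm{Lin}_N(Z^u_{r/N})\bigr) \Bigr),$$

where $\mathrm{Lin}_N(Z^u_{r/N})$ is an element of $\hat C_N$ and is completely determined by $\varphi\bigl(-r + \tfrac{r}{N} + k\tfrac{r}{N}\bigr)$, $k \in \{0,\ldots,N-1\}$, and

$$Z^u(\tfrac{r}{N}) = \varphi(0) + \frac{r}{N\cdot M} \sum_{k=0}^{M-1} b\bigl(n\tfrac{r}{N}, \varphi, u(k\tfrac{r}{N\cdot M})\bigr) + \sum_{k=0}^{M-1} \sigma\bigl(n\tfrac{r}{N}, \varphi, u(k\tfrac{r}{N\cdot M})\bigr) \Bigl( W\bigl((k+1)\tfrac{r}{N\cdot M}\bigr) - W\bigl(k\tfrac{r}{N\cdot M}\bigr) \Bigr).$$

If the diffusion coefficient $\sigma$ is not directly controlled, that is, if $\sigma(t,\varphi,\gamma) = \tilde\sigma(t,\varphi)$, then the expression for $Z^u(\tfrac{r}{N})$ simplifies to

$$Z^u(\tfrac{r}{N}) = \varphi(0) + \frac{r}{N\cdot M} \sum_{k=0}^{M-1} b\bigl(n\tfrac{r}{N}, \varphi, u(k\tfrac{r}{N\cdot M})\bigr) + \tilde\sigma\bigl(n\tfrac{r}{N}, \varphi\bigr)\, W(\tfrac{r}{N}).$$

Observe that the operator $T^{N,M}_n$ is a non-expansive mapping in supremum norm on $B(\hat C_N)$, that is,

$$\sup_{\varphi \in \hat C_N} \bigl| T^{N,M}_n(\Psi)(\varphi) - T^{N,M}_n(\tilde\Psi)(\varphi) \bigr| \le \sup_{\varphi \in \hat C_N} \bigl| \Psi(\varphi) - \tilde\Psi(\varphi) \bigr| \quad \text{for all } \Psi, \tilde\Psi \in B(\hat C_N).$$

This property, though evident from (3.12), is important in that it guarantees numerical stability when the operators $T^{N,M}_n$, $n \in \mathbb{N}_0$, are repeatedly applied.

The Bellman steps need not necessarily be backward iterations of Dynamic Programming type as was suggested above. We can use any method that solves the arising $M$-step constant coefficients control problems. When the space of control actions Γ is finite, then the coefficients $b$, $\sigma$, $f$ can be evaluated in advance at $(n\tfrac{r}{N}, \varphi, \gamma)$ for all $\gamma \in \Gamma$, because the time-segment pair $(n\tfrac{r}{N}, \varphi)$ is constant during any Bellman step.

In the deterministic case, it is sometimes possible to optimise directly over the set of deterministic $M$-step strategies. If Γ has finite cardinality $N_\Gamma$, instead of checking $N_\Gamma$ to the power of $M$ possibilities, we only have to test $\binom{N_\Gamma+M-1}{M}$ possibilities, which is the number of combinations of $M$ objects when there are $N_\Gamma$ different kinds of objects. In the stochastic case, a method recently introduced by Rogers (2007) for computing value functions of high-dimensional discrete-time Markovian optimal control problems might prove useful.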
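The reduction from $N_\Gamma^M$ ordered strategies to $\binom{N_\Gamma+M-1}{M}$ unordered selections in the deterministic case can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the state is scalar, the terminal function `psi`, the coefficient tables `b` and `f`, and the function names are hypothetical, and the one-step cost depends on the chosen actions only through the sums of `b` and `f`, as it does when the coefficients are constant during a Bellman step.

```python
import math
from itertools import combinations_with_replacement, product


def step_cost(actions, b, f, psi, x0, dt):
    """Cost of applying the given actions for dt each; only the sums of f and b matter."""
    running = dt * sum(f[a] for a in actions)
    end_point = x0 + dt * sum(b[a] for a in actions)
    return running + psi(end_point)


def bellman_step_ordered(b, f, psi, x0, dt, M):
    """Minimise over all ordered M-step action sequences (N_Gamma**M candidates)."""
    return min(step_cost(seq, b, f, psi, x0, dt) for seq in product(range(len(b)), repeat=M))


def bellman_step_unordered(b, f, psi, x0, dt, M):
    """Minimise over unordered selections only (binom(N_Gamma + M - 1, M) candidates)."""
    return min(step_cost(ms, b, f, psi, x0, dt)
               for ms in combinations_with_replacement(range(len(b)), M))


# Hypothetical data: three control actions, four sub-steps.
b = [-1.0, 0.0, 2.0]
f = [0.5, 0.0, 1.0]
n_ordered = len(b) ** 4
n_unordered = math.comb(len(b) + 4 - 1, 4)
v_ordered = bellman_step_ordered(b, f, abs, x0=0.3, dt=0.25, M=4)
v_unordered = bellman_step_unordered(b, f, abs, x0=0.3, dt=0.25, M=4)
print(n_ordered, n_unordered, v_ordered, v_unordered)
```

Both searches return the same minimum because the cost is invariant under reordering the actions; only the number of candidates differs.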
The method is based on path-wise optimisation and Monte Carlo simulation of trajectories of a reference Markov chain; it uses minimisation over functions which can be interpreted as candidates for the value function. Those candidates should be chosen from a computationally nice class so that the value function can be computed at any given point without the need to store its values for the entire state space, although this problem is less acute for low dimensions $d$, $d_1$. Unlike schemes directly employing the PDP, Rogers's method does not yield an approximation of the value function over the

entire state space, but only its value at the given initial point. This is what is needed for the Bellman step.

Let us return to our procedure for computing $\hat V_{N,M}(t, .)$, $t \in I_N$. Set $n_T := \lfloor \frac{N}{r} T \rfloor$. The procedure starts by determining $\hat V_{N,M}(n_T \frac{r}{N}, .) = \hat V_{N,M}(\lfloor T \rfloor_N, .)$ from $g$. To this end, choose a finite subset $S_{n_T} \subset \hat C_N$. For each $\varphi \in S_{n_T}$, set $\hat V_{N,M}(n_T \frac{r}{N}, \varphi) := g(\varphi)$. The values of $\hat V_{N,M}(n_T \frac{r}{N}, .)$ at segments not in $S_{n_T}$ are calculated by some interpolation or regression method. Now, suppose that $\hat V_{N,M}((n+1)\frac{r}{N}, .)$ is available for some $n \in \{0, \dots, n_T - 1\}$. Then the following steps are executed:

1. Choose a finite set $S_n \subset \hat C_N$.
2. For each segment $\varphi \in S_n$, compute $\hat V_{N,M}(n\frac{r}{N}, \varphi)$ by executing the Bellman step for $\hat V_{N,M}((n+1)\frac{r}{N}, .)$ at $\varphi$ and time step $n$.
3. Compute $\hat V_{N,M}(n\frac{r}{N}, .)$ by some interpolation or regression method using the data $\{(\varphi, \hat V_{N,M}(n\frac{r}{N}, \varphi)) \mid \varphi \in S_n\}$.

In this way, by backward iteration, $\hat V_{N,M}(n\frac{r}{N}, .)$ can be calculated for all $n \in \{0, \dots, n_T\}$. The proposed procedure may be called an application of approximate Dynamic Programming or approximate value iteration^1 (e.g. Bertsekas, 2005, 2007: I.6, II.1.3). The idea is probably as old as Dynamic Programming itself, cf. Bellman and Kalaba.

    Input: SYSTEM, T, r, N, M
    Output: V[0],...,V[T*N/r]

    SYSTEM.set_parameters(r,N,M);
    SEGMENTS.set_parameters(r,N);
    n <- T*N/r;
    for i = 0 to n do V[i].set_parameters(r,N);
    SEGMENTS.generate(n);
    for each x in SEGMENTS do V[n].add(x,SYSTEM.g(x));
    V[n].interpolate;
    while n > 0 do begin
        n <- n-1;
        SEGMENTS.generate(n);
        for each x in SEGMENTS do V[n].add(x,SYSTEM.Bellman_step(n,x,V[n+1]));
        V[n].interpolate;
    end_while;

Figure 3.1: Approximate value iteration: scheme in pseudo code. The object SYSTEM contains the coefficient functions $b$, $\sigma$, $f$, $g$ and provides a method for the Bellman step. The objects V[0],...,V[T*N/r] represent approximations to $V_{N,M}(n\frac{r}{N}, .)$, $n = 0, \dots, n_T$; they possess an interpolation method, as values are calculated only at segments provided by SEGMENTS.
1 The term value iteration is usually reserved for the backward iteration in value function space when solving infinite horizon control problems, the term Dynamic Programming for the finite backward iteration when solving problems with finite time horizon.
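The backward iteration can be sketched in runnable form. This is only an illustration under strong assumptions, not the implementation used for the experiments: the system is a hypothetical scalar, deterministic toy problem (no noise term), the action set is finite, the segment sets are drawn at random, and nearest neighbour regression in supremum norm stands in for the interpolation method. All function and class names are invented for the example.

```python
import itertools
import numpy as np

R, N, M, T = 1.0, 4, 2, 2.0      # delay length, outer/inner grid sizes, horizon
GAMMA = (-1.0, 0.0, 1.0)         # finite action set (toy choice)
H = R / N                        # outer time step r/N


def b(t, phi, gam):              # toy drift depending on the delayed value phi(-r)
    return gam - np.tanh(phi[0])


def f(t, phi, gam):              # bounded running cost (toy choice)
    return min(phi[-1] ** 2, 4.0) + 0.1 * gam ** 2


def g(phi):                      # bounded terminal cost (toy choice)
    return min(phi[-1] ** 2, 4.0)


class NearestNeighbour:
    """Stores (segment, value) pairs; evaluates by nearest neighbour lookup."""

    def __init__(self):
        self.pts, self.vals = [], []

    def add(self, phi, val):
        self.pts.append(np.asarray(phi, float))
        self.vals.append(float(val))

    def interpolate(self):
        self.P, self.v = np.array(self.pts), np.array(self.vals)

    def __call__(self, phi):
        return self.v[np.argmin(np.max(np.abs(self.P - phi), axis=1))]


def bellman_step(n, phi, v_next):
    """Minimise over multisets of M actions; coefficients are frozen at (n*H, phi)."""
    t, best = n * H, np.inf
    for acts in itertools.combinations_with_replacement(GAMMA, M):
        z_end = phi[-1] + H / M * sum(b(t, phi, a) for a in acts)
        cost = H / M * sum(f(t, phi, a) for a in acts)
        # Shift the segment by one outer grid step and append the new state.
        best = min(best, cost + v_next(np.append(phi[1:], z_end)))
    return best


def generate_segments(rng, count=200, lip=2.0):
    """Random piecewise linear segments with Lipschitz constant at most lip."""
    incr = rng.uniform(-lip * H, lip * H, size=(count, N))
    start = rng.uniform(-2.0, 2.0, size=(count, 1))
    return np.hstack([start, start + np.cumsum(incr, axis=1)])


def approximate_value_iteration(seed=0):
    rng = np.random.default_rng(seed)
    n_T = int(round(T * N / R))
    V = [NearestNeighbour() for _ in range(n_T + 1)]
    for phi in generate_segments(rng):
        V[n_T].add(phi, g(phi))
    V[n_T].interpolate()
    for n in range(n_T - 1, -1, -1):
        for phi in generate_segments(rng):
            V[n].add(phi, bellman_step(n, phi, V[n + 1]))
        V[n].interpolate()
    return V
```

Calling `approximate_value_iteration()` returns one interpolant per time step; `V[0](np.zeros(N + 1))` then gives a rough value at the zero segment. In the stochastic case the Bellman step would additionally need an expectation over the Wiener increments, which this sketch does not attempt.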

Figure 3.1 represents the procedure in an object-oriented pseudo code. The object SYSTEM contains the coefficient functions $b$, $\sigma$, $f$, $g$; the terminal costs $g$ are directly accessible, the other functions are needed for the method Bellman_step, which implements the operators $T^{N,M}_n$, $n \in \mathbb{N}_0$. The object SEGMENTS generates and stores the sets $S_n$ of segments at which the Bellman step is carried out. The objects V[0],...,V[T*N/r] represent the approximations $\hat V_{N,M}(n\frac{r}{N}, .)$ to the value functions $V_{N,M}(n\frac{r}{N}, .)$, $n = 0, \dots, n_T$. The method interpolate creates an interpolant using the data stored in V[n], that is, it implements the creation of $\hat V_{N,M}(n\frac{r}{N}, .)$ from the data $\{(\varphi, \hat V_{N,M}(n\frac{r}{N}, \varphi)) \mid \varphi \in S_n\}$.

We have seen how the Bellman steps can be computed in principle, but will leave open the question of which algorithm should be used. There are two other important questions here. The first is the choice of the sets of segments $S_n \subset \hat C_N$, $n \in \{0, \dots, n_T\}$. The second regards the choice of the interpolation or regression method. Clearly, the two questions are interrelated in that the choice of a certain interpolation method may require a specific choice of the segment sets.

Suppose we have chosen, for each time step $n$, a set of segments $S_n$ as well as an interpolation method. The latter can be represented as a mapping $A^N_n \colon B(\hat C_N) \to B(\hat C_N)$ such that $A^N_n(\Psi) = A^N_n(\tilde\Psi)$ whenever $\Psi(\varphi) = \tilde\Psi(\varphi)$ for all $\varphi \in S_n$. The approximate value iteration procedure can then be written as

\[ \hat V_{N,M}(n_T\tfrac{r}{N}, .) := A^N_{n_T}(g), \qquad \hat V_{N,M}(n\tfrac{r}{N}, .) := A^N_n\Big( T^{N,M}_n\big( \hat V_{N,M}((n+1)\tfrac{r}{N}, .) \big) \Big), \quad n \in \{0, \dots, n_T - 1\}. \]

An important restriction on the choice of the interpolation method is that the corresponding operators $A^N_n$ should be non-expansive mappings. This is to preserve the non-expansiveness of the Bellman operator, which in turn guarantees numerical stability of the recursion. Admissible methods are, for example, the nearest neighbour and $k$ nearest neighbour regression, which work with any choice of the segment sets, or interpolation methods using piecewise linear basis functions.
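The non-expansiveness of nearest neighbour type regression can be illustrated numerically. The sketch below is an assumption-laden example: segments are represented by their grid values, distances are measured in supremum norm, and the function name `knn_operator` is hypothetical. Because the selected neighbours depend only on the nodes and the query point, the interpolant is an average of node values, so its distance to another interpolant built on the same nodes cannot exceed the largest difference of the node values.

```python
import numpy as np


def knn_operator(nodes, values, k=1):
    """Return the k nearest neighbour regression function for the given data."""
    nodes = np.asarray(nodes, float)
    values = np.asarray(values, float)

    def interpolant(phi):
        dist = np.max(np.abs(nodes - phi), axis=1)      # sup-norm distances
        nearest = np.argsort(dist, kind="stable")[:k]   # indices of k closest nodes
        return values[nearest].mean()

    return interpolant


rng = np.random.default_rng(0)
nodes = rng.normal(size=(50, 5))        # 50 segments with 5 grid values each
psi_1 = rng.normal(size=50)             # two sets of node values
psi_2 = rng.normal(size=50)
queries = rng.normal(size=(1000, 5))

gap_at_nodes = np.max(np.abs(psi_1 - psi_2))
for k in (1, 3):
    a_1, a_2 = knn_operator(nodes, psi_1, k), knn_operator(nodes, psi_2, k)
    gap = max(abs(a_1(q) - a_2(q)) for q in queries)
    assert gap <= gap_at_nodes + 1e-12  # non-expansive in supremum norm
```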
Recall that $\hat C_N$ is isomorphic to $\mathbb{R}^{d_N}$ with $d_N = d(N+1)$. On the other hand, the value function of degree $(N, M)$ is Lipschitz continuous, but not necessarily continuously differentiable. The problem of recovering a Lipschitz continuous function defined on a $d$-dimensional hypercube (to work on a bounded domain) is itself subject to a dimensional curse, at least when the error is measured in supremum norm. Consequently, approximate Dynamic Programming in itself provides no escape from the curse of dimensionality.

Instead of treating the values at the grid points of the segment functions in $\hat C_N$ as belonging to independent dimensions, we may exploit the fact that they are generated by continuous functions. In view of the error bounds of Sections 3.2 and 3.4, which are uniform only over sets of Lipschitz or Hölder continuous segments with bounded Lipschitz or Hölder constant, it is natural to restrict the domain of the value function of degree $(N, M)$ accordingly. Any function in $\hat C_N$ is, by construction, Lipschitz continuous, yet its Lipschitz constant may be arbitrarily large. For $L > 0$, let $\hat C_{Lip}(N, L)$ denote the convex set of all functions in $\hat C_N$ with Lipschitz constant not greater than $L$. Denote by $\hat C_{1/2}(N, L)$ the convex set of all functions $\varphi \in \hat C_N$ such that

\[ |\varphi(t) - \varphi(s)| \le L \sqrt{|t - s| \ln\big(\tfrac{e}{|t - s|}\big)} \quad \text{for all } t, s \in [-r, 0]. \]
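Membership in the set of segments with Lipschitz constant at most $L$ is easy to test, since a piecewise linear function attains its Lipschitz constant as the largest slope between neighbouring grid points. A minimal sketch, assuming that a segment is stored as an array of its $N+1$ grid values (one row per grid point) and that the Euclidean norm is used on the state space; the function names are hypothetical:

```python
import numpy as np


def lipschitz_constant(phi, r):
    """Lipschitz constant of the piecewise linear segment with grid values phi."""
    phi = np.asarray(phi, float)
    phi = phi.reshape(len(phi), -1)            # shape (N+1, d)
    mesh = r / (len(phi) - 1)                  # grid width r/N
    slopes = np.linalg.norm(np.diff(phi, axis=0), axis=1) / mesh
    return float(slopes.max())


def in_lipschitz_class(phi, r, L):
    """True if the segment has Lipschitz constant not greater than L."""
    return lipschitz_constant(phi, r) <= L


segment = [0.0, 0.5, 0.25]                     # N = 2, scalar state, r = 1
print(lipschitz_constant(segment, r=1.0))      # slopes are 1.0 and 0.5
```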

In the case of a deterministic system and for bounded drift coefficient $b$, the segments of all solution trajectories of the original dynamics are Lipschitz continuous with Lipschitz constant not greater than $L$ provided the initial segments are that regular and $L$ was chosen big enough. In the stochastic case, boundedness of $b$ and $\sigma$ does not guarantee that all trajectory segments are Hölder $\frac{1}{2}$ for some constant $L$; nevertheless, for all Hölder $\frac{1}{2}$ initial segments, all trajectory segments are Hölder $\frac{1}{2}$, and the probability that a trajectory segment has Hölder constant greater than $L$ tends to zero as $L$ goes to infinity, again provided the initial segments are Hölder $\frac{1}{2}$ with constant $L$. Moreover, the probability that a trajectory segment has Hölder constant greater than $L$ can be estimated by deriving bounds on the moments of the modulus of continuity of Itô diffusions as in Appendix A.2.

These observations can be used in choosing the sets $S_n$, $n \in \{0, \dots, n_T\}$, of grid segments. In generating appropriate Lipschitz or Hölder continuous segments, the Brownian bridge construction or a deterministic analogue may be used. The underlying idea is that not all dimensions of the piecewise linear segments are equally important. In particular, the right-most coordinate, which corresponds to the current time, plays a special role in that it provides the initial value for generating the new current state. We leave these observations to future investigation.

First numerical experiments have been carried out for the simple deterministic system presented in an earlier subsection. A rough approximation to the true value function can be obtained. The choice of the grid segments and of the interpolation method are seen to be crucial in view of the heavy requirements in memory and computing time.

3.6 Conclusions and open questions

In this chapter, we have presented and analysed a semi-discretisation scheme for finite horizon stochastic control problems with delay.
The dependence of the system on its past evolution is allowed to be of a general form; it includes point and distributed delay as well as generalised distributed and functional delay, cf. Section 3.1. Apart from the somewhat restrictive assumption of boundedness, the hypotheses on the system coefficients are quite natural. The state process and the noise process may have different dimensions $d$ and $d_1$, and no non-degeneracy assumption on the diffusion coefficient is needed. The space of control actions $\Gamma$ may be an arbitrary complete separable metric space (only separability is really needed); in particular, $\Gamma$ need not be compact.

The discretisation of time induces a discretisation of the segment space. The discrete-time optimal control problems generated by the scheme are, as a result, finite-dimensional. Convergence of the scheme has been demonstrated and bounds on the discretisation error have been derived. Under general assumptions, we have a worst-case estimate of the rate of convergence; better bounds have been obtained for important special situations, namely for the deterministic case (finite and separable $\Gamma$) and the case of uncontrolled diffusion coefficient and finite $\Gamma$.

We stress that the error bounds of Section 3.4 hold without any assumptions on regularity or existence of optimal strategies and without any additional assumptions on the regularity of the value function. Indeed, there are control problems satisfying our hypotheses which either do not possess optimal strategies or where the optimal strategies are Borel measurable, but almost surely discontinuous on any non-empty open time interval, or where the value function is Lipschitz continuous, but not everywhere Fréchet differentiable.

The structure of our two-step discretisation scheme can be exploited in designing algorithms for the numerical solution of the discrete-time control problems of degree $(N, M)$. In this way, the memory requirements can be kept within feasible limits. The Bellman steps, that is, the inner optimisation steps of the procedure proposed in Section 3.5, are of constant coefficients type, which may be computationally advantageous.

In contrast to Chapter 2, the analysis of this chapter is not confined to proving mere convergence of a discretisation scheme. Kushner's Markov chain method, on the other hand, is applicable to a wide variety of dynamic optimisation problems and discretisation schemes. Notice, however, that some kind of compactness assumption regarding the space of strategies is an essential ingredient of the method, cf. Section 2.2.

In connection with the two-step scheme, there are some open questions. The error bound obtained under general assumptions is a worst-case estimate of the rate of convergence, but it is not clear whether it is sharp. Due to the structure of the scheme, none of the error bounds can be improved beyond the rate of convergence attained by the Euler scheme for the corresponding uncontrolled system unless the cost functional has some special form.^2

As far as the numerical solution of the discrete-time control problems of degree $(N, M)$ is concerned, a lot is still to be done. On the one hand, there is the question of the complexity of the problem in the sense of information based complexity, which depends on the error criterion adopted. On the other hand, there is the question of how to implement the scheme of Section 3.5.

2 For stochastic systems with general cost functionals, we have the strong rate of convergence of the corresponding Euler scheme as bound on the rate of convergence of the two-step scheme. For special cost functionals, the scheme might attain the weak rate of convergence of the Euler scheme.
Observe that, even if the problem is subject to a dimensional curse in the discretisation degree $N$, an approximate Dynamic Programming algorithm can still be useful, as it will produce a first rough approximation to the value function of the original problem. Such an approximation, in turn, can serve as an initial guess of the value function for algorithms of suboptimal control like limited lookahead or rollout (cf. Bertsekas, 2005: Ch. 6).

The two-step discretisation scheme should be applicable to other types of optimal control problems with delay. Instead of a finite deterministic time, the random time of first exit from a compact set as in Section 2.3 may be taken as time horizon. Other interesting systems are those with reflection at the boundary of a compact polyhedron. The state process would, in both cases, take values in a bounded subset of $\mathbb{R}^d$, which is reasonable also from the point of view of numerical computation. What has to be established is, again, not so much whether the scheme converges, but how fast.

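Appendix A

A.1 On the Principle of Dynamic Programming

Let $((\Omega, \mathcal{F}, P), (\mathcal{F}_t), W)$ be a Wiener basis of dimension $d_1$. Let $U$ be the associated set of control processes. For $n \in \mathbb{N}$, define the set $U_n \subset U$ of piecewise constant strategies according to (3.1) at the beginning of Section 3.3. Let $\tilde U$ be either $U$ or $U_n$ for some $n \in \mathbb{N}$. Let $r > 0$ and set $C := C([-r, 0], \mathbb{R}^d)$. If $Y$ is an $\mathbb{R}^d$-valued process, then the notation $Y_t$ in this subsection denotes the segment of length $r$. Let $b$, $\sigma$, $f$, $g$ be functions satisfying the following hypotheses:

(H1) Measurability: $b \colon [0, \infty) \times C \times \Gamma \to \mathbb{R}^d$, $\sigma \colon [0, \infty) \times C \times \Gamma \to \mathbb{R}^{d \times d_1}$, $f \colon [0, \infty) \times C \times \Gamma \to \mathbb{R}$, $g \colon C \to \mathbb{R}$ are Borel measurable functions.

(H2) Boundedness: $|b|$, $|\sigma|$, $|f|$, $|g|$ are bounded by some positive constant $K$.

(H3) Uniform Lipschitz condition: there is a constant $L > 0$ such that for all $\varphi, \psi \in C$, all $t \ge 0$, all $\gamma \in \Gamma$,

\[ |b(t, \varphi, \gamma) - b(t, \psi, \gamma)| \vee |\sigma(t, \varphi, \gamma) - \sigma(t, \psi, \gamma)| \le L \|\varphi - \psi\|, \qquad |f(t, \varphi, \gamma) - f(t, \psi, \gamma)| \vee |g(\varphi) - g(\psi)| \le L \|\varphi - \psi\|. \]

Let $T > 0$. Define a cost functional $J \colon [0, T] \times C \times U \to \mathbb{R}$ by

\[ J(t_0, \psi, u) := E\Big( \int_0^{T - t_0} f\big(t_0 + s, Y_s, u(s)\big)\,ds + g\big(Y_{T - t_0}\big) \Big), \]

where $Y = Y^{t_0, \psi, u}$ is the solution to the controlled SDDE

\[ Y(t) = \begin{cases} \psi(0) + \int_0^t b\big(t_0 + s, Y_s, u(s)\big)\,ds + \int_0^t \sigma\big(t_0 + s, Y_s, u(s)\big)\,dW(s), & t > 0, \\ \psi(t), & t \in [-r, 0]. \end{cases} \tag{A.1} \]

Define the associated value function $\tilde V \colon [0, T] \times C \to \mathbb{R}$ by

\[ \tilde V(t_0, \psi) := \inf\{ J(t_0, \psi, u) \mid u \in \tilde U \}. \]

Depending on the choice of $\tilde U$, the function $\tilde V$ thus defined gives the minimal costs over the set $U$ of all control processes or just over a set of strategies which are piecewise constant relative to the grid $\{k \frac{r}{n} \mid k \in \mathbb{N}_0\}$ for some $n \in \mathbb{N}$. The following property of $\tilde V$ is useful.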

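Proposition A.1. Assume (H1)-(H3). Let $\tilde V$ be the value function defined above. Then $\tilde V$ is bounded and Lipschitz continuous in the segment variable uniformly in the time variable. More precisely, $|\tilde V|$ is bounded by $K(T+1)$ and for all $t_0 \in [0, T]$, all $\varphi, \psi \in C$,

\[ |\tilde V(t_0, \varphi) - \tilde V(t_0, \psi)| \le 2\sqrt{2}\, L (T+1) \exp\big(3T(T + 4d_1)L^2\big) \|\varphi - \psi\|. \]

Proof. Boundedness of $\tilde V$ is an immediate consequence of its definition and Hypothesis (H2). Let $t_0 \in [0, T]$, let $\varphi, \psi \in C$. Recall the inclusion $\tilde U \subseteq U$ and observe that, in virtue of the definition of $\tilde V$, we have

\[ |\tilde V(t_0, \varphi) - \tilde V(t_0, \psi)| \le \sup_{u \in U} |J(t_0, \varphi, u) - J(t_0, \psi, u)|. \]

By Hypothesis (H3), for all $u \in U$ we get

\[ \begin{aligned} |J(t_0, \varphi, u) - J(t_0, \psi, u)| &\le E\Big( \int_0^{T - t_0} \big| f(t_0 + s, X^u_s, u(s)) - f(t_0 + s, Y^u_s, u(s)) \big|\,ds + \big| g(X^u_{T - t_0}) - g(Y^u_{T - t_0}) \big| \Big) \\ &\le L(1 + T - t_0)\, E\Big( \sup_{t \in [-r, T]} |X^u(t) - Y^u(t)|^2 \Big)^{\frac{1}{2}}, \end{aligned} \]

where $X^u$, $Y^u$ are the solutions to Equation (A.1) under control process $u$ with initial conditions $(t_0, \varphi)$ and $(t_0, \psi)$, respectively. Now, for every $\tilde T \in [0, T]$,

\[ E \sup_{t \in [-r, \tilde T]} |X^u(t) - Y^u(t)|^2 \le 2\, E \sup_{t \in [0, \tilde T]} |X^u(t) - Y^u(t)|^2 + 2 \|\varphi - \psi\|^2, \]

while Hölder's inequality, Doob's maximal inequality, Itô's isometry, Fubini's theorem and Hypothesis (H3) together yield

\[ \begin{aligned} E \sup_{t \in [0, \tilde T]} |X^u(t) - Y^u(t)|^2 &\le 3 |\varphi(0) - \psi(0)|^2 + 3T\, E \int_0^{\tilde T} \big| b(t_0 + s, X^u_s, u(s)) - b(t_0 + s, Y^u_s, u(s)) \big|^2 ds \\ &\quad + 3 d_1 \sum_{i=1}^{d} \sum_{j=1}^{d_1} E \sup_{t \in [0, \tilde T]} \Big| \int_0^t \big( \sigma_{ij}(t_0 + s, X^u_s, u(s)) - \sigma_{ij}(t_0 + s, Y^u_s, u(s)) \big)\,dW^j(s) \Big|^2 \\ &\le 3 |\varphi(0) - \psi(0)|^2 + 3T L^2 \int_0^{\tilde T} E \|X^u_s - Y^u_s\|^2 ds \\ &\quad + 12 d_1 \sum_{i=1}^{d} \sum_{j=1}^{d_1} E \int_0^{\tilde T} \big| \sigma_{ij}(t_0 + s, X^u_s, u(s)) - \sigma_{ij}(t_0 + s, Y^u_s, u(s)) \big|^2 ds \\ &\le 3 |\varphi(0) - \psi(0)|^2 + 3(T + 4d_1)L^2 \int_0^{\tilde T} E \sup_{t \in [-r, s]} |X^u(t) - Y^u(t)|^2 ds. \end{aligned} \]

Since $|\varphi(0) - \psi(0)| \le \|\varphi - \psi\|$, Gronwall's lemma implies that

\[ E \sup_{t \in [-r, T]} |X^u(t) - Y^u(t)|^2 \le 8 \|\varphi - \psi\|^2 \exp\big(6T(T + 4d_1)L^2\big). \]

Putting the estimates together, we obtain the assertion.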
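The way the estimates of the proof combine can be spelled out as follows. This is a sketch of the bookkeeping only; the shorthand $m(\tilde T)$ and $c$ is introduced here for illustration and is not notation from the text.

\[ \begin{aligned} m(\tilde T) &:= E \sup_{t \in [-r, \tilde T]} |X^u(t) - Y^u(t)|^2, \qquad c := 6(T + 4d_1)L^2, \\ m(\tilde T) &\le 2\Big( 3\|\varphi - \psi\|^2 + \tfrac{c}{2} \int_0^{\tilde T} m(s)\,ds \Big) + 2\|\varphi - \psi\|^2 = 8\|\varphi - \psi\|^2 + c \int_0^{\tilde T} m(s)\,ds, \\ m(T) &\le 8\|\varphi - \psi\|^2 e^{cT} \quad \text{(Gronwall)}, \\ |J(t_0, \varphi, u) - J(t_0, \psi, u)| &\le L(1 + T)\sqrt{m(T)} \le 2\sqrt{2}\,L(T+1) \exp\big(3T(T + 4d_1)L^2\big)\|\varphi - \psi\|. \end{aligned} \]

The factor $2\sqrt{2}$ is the square root of $8$, and the exponent is halved by taking the square root.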

Recall that the value function $\tilde V$ has been defined over the set of strategies $\tilde U$. If $\tilde U = U$, set $\tilde I := [0, \infty)$, else if $\tilde U = U_n$, set $\tilde I := \{k \frac{r}{n} \mid k \in \mathbb{N}_0\}$. The following version of Bellman's Principle of Optimality or Principle of Dynamic Programming holds.

Theorem A.1 (PDP). Assume (H1)-(H3). Then for all $t_0 \in [0, T]$, all $t \in \tilde I \cap [0, T - t_0]$, all $\psi \in C$,

\[ \tilde V(t_0, \psi) = \inf_{u \in \tilde U} E\Big( \int_0^t f\big(t_0 + s, Y^u_s, u(s)\big)\,ds + \tilde V\big(t_0 + t, Y^u_t\big) \Big), \]

where $Y^u$ is the solution to Equation (A.1) under control process $u$ with initial condition $(t_0, \psi)$.

Theorem A.1 is proved in the same way as Theorem 4.2 in Larssen (2002), also see the proof of the corresponding theorem in Yong and Zhou (1999: p. 18). We merely point out the differences in the problem formulation and the hypotheses. Here, all coefficients, those of the dynamics and those of the cost functional, are bounded, while Larssen (2002) also allows for sub-linear growth. Since Equation (A.1) has unique solutions, boundedness of the coefficients guarantees that the cost functional $J$ as well as the value function $\tilde V$ are well defined. Notice that we express dependence on the initial time in a different, but equivalent way in comparison with Larssen (2002). Notice further that in Theorem A.1 only deterministic times appear.

We have stated the control problem and given Bellman's principle in the strong Wiener formulation, cf. Section 3.1. Although the weak Wiener formulation is essential for the proof, the resulting value functions are the same for both versions. This is due to the fact that weak uniqueness holds for Equation (A.1). Also the infimum in the Dynamic Programming equation can be taken over all Wiener control bases or just over all control processes associated with a fixed Wiener basis.

There are two respects in which our hypotheses are more general than those of Theorem 4.2 in Larssen (2002). The first is that we do not require the integrand $f$ of the cost functional to be uniformly continuous in its three variables.
This assumption is not needed for the Dynamic Programming equation, while it is important for versions of the Hamilton-Jacobi-Bellman partial differential equation. The second is that we allow the optimisation problem to be formulated for certain subclasses of admissible strategies, namely the subclasses $U_n$ of piecewise constant strategies. The set $\tilde I$ and thus the set of allowed intermediate times must be chosen accordingly.

A.2 On the modulus of continuity of Itô diffusions

A typical trajectory of standard Brownian motion is Hölder continuous of any order less than one half. If such a trajectory is evaluated at two different time points $t_1, t_2 \in [0, T]$ with $|t_1 - t_2| \le h$ small, then the difference between the values at $t_1$ and $t_2$ is not greater than a multiple of $\sqrt{h \ln(\frac{1}{h})}$, where the proportionality factor depends on the trajectory and the time horizon $T$, but not on the choice of the time points $t_1$, $t_2$. This is a consequence of Lévy's exact modulus of continuity for Brownian motion. The modulus of continuity of a stochastic process is a random element. Lemma A.1 below shows that the modulus of

continuity of Brownian motion and, more generally, that of any Itô diffusion with bounded coefficients has finite moments of any order.

Lemma A.1, which treats the case of Itô diffusions with bounded coefficients, can be found in Słomiński (2001), cf. Lemma A.4 there. It is enough to prove Lemma A.1 for the special case of one-dimensional Brownian motion. The full statement is then derived by a component-wise estimate and a time-change argument (the Dambis-Dubins-Schwarz theorem, cf. Karatzas and Shreve, 1991: p. 174, for example). One way of proving the assertion for Brownian motion, different from the proof in Słomiński (2001), is to follow the derivation of Lévy's exact modulus of continuity as suggested in an exercise of Stroock and Varadhan (1979). The main ingredient there is an inequality due to Garsia, Rodemich, and Rumsey, see Stroock and Varadhan (1979: p. 47) and Garsia et al. For the sake of completeness, we give the two proofs in full detail.

Lemma A.1 (Słomiński). Let $W$ be a $d_1$-dimensional Wiener process living on the probability space $(\Omega, \mathcal{F}, P)$. Let $Y = (Y^1, \dots, Y^d)^{\mathsf T}$ be an Itô diffusion of the form

\[ Y(t) = y_0 + \int_0^t b(s)\,ds + \int_0^t \sigma(s)\,dW(s), \quad t \ge 0, \]

where $y_0 \in \mathbb{R}^d$ and $b$, $\sigma$ are $(\mathcal{F}_t)$-adapted processes with values in $\mathbb{R}^d$ and $\mathbb{R}^{d \times d_1}$, respectively. If $|b|$, $|\sigma|$ are bounded by some positive constant $K$, then it holds that for every $p > 0$, every $T > 0$ there is a constant $C_{p,T}$ depending only on $K$, the dimensions, $p$ and $T$ such that

\[ E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} |Y(t) - Y(s)|^p \Big) \le C_{p,T} \big( h \ln(\tfrac{1}{h}) \big)^{\frac{p}{2}} \quad \text{for all } h \in (0, \tfrac{1}{2}]. \]

Proof. Let $T > 0$, $p > 0$. Then for all $t, s \in [0, T]$,

\[ |Y(t) - Y(s)|^p \le d^{\frac{p}{2}} \big( |Y^1(t) - Y^1(s)|^p + \dots + |Y^d(t) - Y^d(s)|^p \big), \]

and for the $i$-th component we have

\[ |Y^i(t) - Y^i(s)|^p = \Big| \int_s^t b_i(\tilde s)\,d\tilde s + \sum_{j=1}^{d_1} \int_s^t \sigma_{ij}(\tilde s)\,dW^j(\tilde s) \Big|^p \le (d_1 + 1)^p \Big( K^p |t - s|^p + \sum_{j=1}^{d_1} \Big| \int_s^t \sigma_{ij}(\tilde s)\,dW^j(\tilde s) \Big|^p \Big). \]

Hence, for $h \in (0, \frac{1}{2}]$,

\[ E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} |Y(t) - Y(s)|^p \Big) \le d^{\frac{p}{2}} (d_1 + 1)^p \Big( d\, K^p h^p + \sum_{i=1}^{d} \sum_{j=1}^{d_1} E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} \Big| \int_s^t \sigma_{ij}(\tilde s)\,dW^j(\tilde s) \Big|^p \Big) \Big). \]

To prove the assertion, it is enough to show that the $d \cdot d_1$ expectations on the right-hand side of the last inequality are of the right order. Let $i \in \{1, \dots, d\}$, $j \in \{1, \dots, d_1\}$, and

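define the one-dimensional process $M = M^{(i,j)}$ by

\[ M(t) := \begin{cases} \int_0^t \sigma_{ij}(s)\,dW^j(s) & \text{if } t \in [0, T], \\ M(T) + W^j(t) - W^j(T) & \text{if } t > T. \end{cases} \]

Since $\sigma_{ij}$ is bounded, the process $M$ is a martingale and can be represented as a time-changed Brownian motion. More precisely, by the Dambis-Dubins-Schwarz theorem, see Karatzas and Shreve (1991: p. 174), for example, there is a standard one-dimensional Brownian motion $\tilde W$ living on $(\Omega, \mathcal{F}, P)$ such that, $P$-almost surely, $M(t) = \tilde W(\langle M \rangle_t)$ for all $t \ge 0$, where $\langle M \rangle$ is the quadratic variation process associated with $M$, that is,

\[ \langle M \rangle_t = \begin{cases} \int_0^t \sigma_{ij}^2(s)\,ds & \text{if } t \in [0, T], \\ \int_0^T \sigma_{ij}^2(s)\,ds + t - T & \text{if } t > T. \end{cases} \]

Consequently,

\[ \begin{aligned} E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} \Big| \int_s^t \sigma_{ij}(\tilde s)\,dW^j(\tilde s) \Big|^p \Big) &= E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} |M(t) - M(s)|^p \Big) \\ &= E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} \big| \tilde W(\langle M \rangle_t) - \tilde W(\langle M \rangle_s) \big|^p \Big) \\ &\le E\Big( \sup_{t, s \in [0, (K^2 + 1)T],\, |t - s| \le (K^2 + 1)h} |\tilde W(t) - \tilde W(s)|^p \Big), \end{aligned} \]

as it holds that, $P$-almost surely, $|\langle M \rangle_t - \langle M \rangle_s| \le K^2 |t - s| \le (K^2 + 1)|t - s|$ for all $t, s \in [0, T]$. The assertion is now a consequence of Lemma A.2, which gives an upper bound for the moments of the modulus of continuity for standard one-dimensional Brownian motion.

Lemma A.2. Let $W$ be a standard one-dimensional Brownian motion living on the probability space $(\Omega, \mathcal{F}, P)$. Then for every $p > 0$, every $T > 0$ there is a constant $C_{p,T}$ such that

\[ E\Big( \sup_{t, s \in [0, T],\, |t - s| \le h} |W(t) - W(s)|^p \Big) \le C_{p,T} \big( h \ln(\tfrac{1}{h}) \big)^{\frac{p}{2}} \quad \text{for all } h \in (0, \tfrac{1}{2}]. \]

Proof. As announced above, the main ingredient in the proof is an inequality due to Garsia, Rodemich, and Rumsey; it allows us to get an upper bound for $|W(t)(\omega) - W(s)(\omega)|^p$ in terms of $\omega \in \Omega$, $T$ and the distance $|t - s|$. To this end, we define two strictly increasing functions $\Psi$, $\mu$ on $[0, \infty)$ by

\[ \Psi(x) := \exp\big(\tfrac{x^2}{2}\big) - 1, \qquad \mu(x) := \sqrt{2x}, \quad x \in [0, \infty). \]

Instead of $\mu$ we could have taken any function of the form $x \mapsto \sqrt{c\,x}$ provided $c > 1$; as one may expect, the resulting constant $C_{p,T}$ would be different. Clearly, $\Psi(0) = 0 = \mu(0)$, $\Psi^{-1}(y) = \sqrt{2 \ln(y + 1)}$ for all $y \ge 0$, and $d\mu(x) = \mu(dx) = \frac{dx}{\sqrt{2x}}$.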

96 84 APPENDIX A. APPENDIX In ode to pepae fo the application of the Gasia-Rodemich-Rumsey inequality, we set T T ξω := Ψ W tω W sω ds dt, ω Ω, µ t s thus defining an F-measuable andom vaiable with values in [, ]. Since W t W s has nomal distibution with mean zeo and vaiance t s, we see that Eξ = T T E exp t W s 2 ds dt T 2 4 t s = T T E exp t W s 2 ds dt T 2 4 t s 1 T T 1 u 2 = exp 2π t s 4 t s u2 du ds dt T 2 2 t s = 1 T 2π T 1 2π 2 t s ds dt T 2 = 2 1 T 2 <, t s that is, ξ has finite expectation. In paticula, ξω < fo P-almost all ω Ω. The Gasia-Rodemich-Rumsey inequality now implies that fo all ω Ω, all t, s [, T ], t s W tω W sω 8 Ψ 1 4ξω x 2 µdx = 8 t s 2 ln 4ξω dx +1. x 2 2x Notice that if ξω = then the above inequality is tivially satisfied. With h, 1 2 ] we have sup W tω W sω 8 t,s [,T ], t s h 8 h ln4ξω+1 dx x h h = 16 h h ln4ξω h ln 1 h ln4ξω+1 h ln 1 h + 2 ln2 32 ln4ξω h ln 1 h. ln 4ξω+x ln 1 x dx x ln 1 x ln 1x ln 1 x dx x ln 1 x dx x Consequently, fo all p >, all h, 1 2 ], E sup W t W s p t,s [,T ], t s h 32 p E ln4ξ p h ln 1 h p 2.

97 A.3. PROOFS OF CONSTANT COEFFICIENTS ERROR BOUNDS 85 The above inequality yields the assetion povided we can show that the expectation on the ight-hand side is finite. But this is the case, because p ln4ξ+1 E ln4ξ p p 2 E + 4 p and the expectation on the ight-hand side of the last inequality is finite, as Eξ < and lnx+1 x 2 p fo all x big enough. Moe pecisely, if p 1, then lnx lnp+1 x 1 p fo all x e p p, whence ln4ξ+1 p 2 E 1+lnp E 4ξ+1 + p lnp+p p 2 1+lnp T 2 + p lnp+p p 2 2 p lnp+p p T. Theefoe, the asseted inequality follows fo p 1, whee the constant C p,t need not be geate than 256 p p lnp+p p T. On the othe hand, if p, 1, then clealy p E ln4ξ E ln4ξ Eξ T + 1, and the constant C p,t, p, 1, need not be geate than 2 32 p 1 + T. A.3 Poofs of constant coefficients eo bounds The fist esult we give hee is a educed vesion, adapted to ou notation, of Theoem 2.7 in Kylov 21. It povides an estimate of the eo in appoximating constant-coefficient contolled Itô diffusions by diffusions with piecewise constant stategies. The eo is measued in tems of cost-functional-like expectations with Lipschitz o Hölde coefficients; see Section 1 in Kylov 21 fo a discussion of vaious eo citeia. In the deteministic case, bette eo bounds can be obtained, see Lemmata A.3 and A.4 below. Let Ω, F, P, F t, W be a Wiene basis of dimension d 1 in the sense of Definition 3.1. As above, let Γ, ρ be a complete and sepaable metic space, and denote by U the set of all F t -pogessively measuable pocesses [, Ω Γ. Fo n N, let U n be the subset of U given by 3.1. Thus, if ū U n, then ū is ight-continuous and piecewise constant in time elative to the gid {k n k N } and ūt is measuable with espect to the σ-algeba geneated by W k n, k =,..., t n. We have incopoated the delay length in the patition in ode to be coheent with the notation of Section 3.3. In the oiginal wok by Kylov 21, thee is no delay and the time gid has mesh size 1 n instead of n. 
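Let $b \colon \Gamma \to \mathbb{R}^d$, $\sigma \colon \Gamma \to \mathbb{R}^{d \times d_1}$ be continuous functions with $|b|$, $|\sigma|$ bounded by $K$. For $u \in U$ denote by $X^u$ the process

\[ X^u(t) := \int_0^t b\big(u(s)\big)\,ds + \int_0^t \sigma\big(u(s)\big)\,dW(s), \quad t \ge 0. \]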
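For a strategy that is piecewise constant on the time grid, the process above can be simulated exactly at the grid points, because the coefficients are frozen between grid points and the Wiener increments are Gaussian. The following is a minimal sketch for the scalar case $d = d_1 = 1$; the function name and the `policy` interface (a callable that sees the step index and the Wiener path on the grid so far) are assumptions made for the example.

```python
import numpy as np


def simulate_on_grid(policy, b, sigma, T, r, n, rng):
    """Simulate X^u and W at the grid points k*r/n for a grid-adapted strategy."""
    h = r / n                                  # mesh size r/n
    n_steps = int(round(T / h))
    w = np.zeros(n_steps + 1)                  # Wiener path on the grid
    x = np.zeros(n_steps + 1)                  # state path, X^u(0) = 0
    for k in range(n_steps):
        gamma = policy(k, w[: k + 1])          # action may depend on W up to now
        dw = np.sqrt(h) * rng.standard_normal()
        x[k + 1] = x[k] + b(gamma) * h + sigma(gamma) * dw
        w[k + 1] = w[k] + dw
    return x, w


rng = np.random.default_rng(0)
# Hypothetical feedback rule: push against the sign of the noise seen so far.
x_path, w_path = simulate_on_grid(
    policy=lambda k, w: -np.sign(w[-1]),
    b=lambda g: g,
    sigma=lambda g: 1.0,
    T=1.0, r=1.0, n=4, rng=rng,
)
```

Averaging a cost functional over many such paths would give a Monte Carlo estimate of the expectations compared in the error bounds; that step is left out here.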

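Let us write $\|.\|_\Gamma$ for the supremum norm of a real-valued function over $\Gamma$. Let us write $\|.\|_1$ for the Lipschitz norm of a real-valued function defined on $\mathbb{R}^d$. Thus, if $g$ is a Lipschitz continuous function $\mathbb{R}^d \to \mathbb{R}$, then

\[ \|g\|_1 := \sup_{x, y \in \mathbb{R}^d,\, x \ne y} \frac{|g(x) - g(y)|}{|x - y|}. \]

The following theorem provides an error estimate for the approximation of a process $X^u$, where $u \in U$, by processes $X^{u_n}$, $n \in \mathbb{N}$, where $u_n \in U_n$, in terms of suitable cost functionals.

Theorem A.2 (Krylov). Let $T > 0$. There is a constant $C > 0$ depending only on $K$ and the dimensions such that the following holds: For any $n \in \mathbb{N}$ such that $n \ge r$, any bounded continuous function $f \colon \Gamma \to \mathbb{R}$, any bounded Lipschitz continuous function $g \colon \mathbb{R}^d \to \mathbb{R}$, any $u \in U$ there exists $u_n \in U_n$ such that

\[ E\Big( \int_0^T f\big(u_n(s)\big)\,ds + g\big(X^{u_n}(T)\big) \Big) - E\Big( \int_0^T f\big(u(s)\big)\,ds + g\big(X^u(T)\big) \Big) \le C(1 + T) \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \Big( \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1 \Big). \]

Note that in Theorem A.2 the difference between the two expectations may be inverted, since we can take $-f$ in place of $f$ and $-g$ in place of $g$.

Proof. Let $n \in \mathbb{N}$ be such that $\frac{r}{n} \le 1$. Define an extended cost functional $J$ on $\mathbb{R} \times \mathbb{R}^d \times U$ by

\[ J(t, x, u) := \begin{cases} E\Big( \int_0^{T - t} f\big(u(s)\big)\,ds + g\big(x + X^u(T - t)\big) \Big) & \text{if } t < T, \\ g(x) & \text{if } t \ge T. \end{cases} \]

Let $V_n$ be the value function arising from minimising $J$ over $U_n$, that is,

\[ V_n(t, x) := \inf_{u \in U_n} J(t, x, u), \quad (t, x) \in \mathbb{R} \times \mathbb{R}^d. \]

To prove the assertion it is enough to show that for all $u \in U$, $x \in \mathbb{R}^d$,

\[ V_n(0, x) - J(0, x, u) \le C(1 + T) \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \Big( \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1 \Big). \]

Indeed, it suffices to verify the above inequality for $x = 0 \in \mathbb{R}^d$, because we may consider the translated problem with $g(x + .)$ in place of $g(.)$, leaving the other functions $f$, $b$, $\sigma$ unchanged. Hence, it suffices to show that

\[ V_n(0, 0) \le J(0, 0, u) + C(1 + T) \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \Big( \big(\tfrac{r}{n}\big)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1 \Big) \quad \text{for all } u \in U. \]

We take note of the following properties of the discrete value function $V_n$, cf. Lemma 3.1 in Krylov (2001).

1. Lipschitz continuity in space: for all $t \in \mathbb{R}$, $x, y \in \mathbb{R}^d$,

\[ |V_n(t, x) - V_n(t, y)| \le \|g\|_1 |x - y|. \]

This is clear from the observation that $|V_n(t, x) - V_n(t, y)|$ is bounded by the supremum of $|J(t, x, u) - J(t, y, u)|$ over $u \in U_n$ and the definition of $J$.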

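2. One-step Principle of Dynamic Programming: for all $t \le T - \frac{r}{n}$, $x \in \mathbb{R}^d$,

\[ V_n(t, x) = \inf_{\gamma \in \Gamma} E\Big( \tfrac{r}{n} f(\gamma) + V_n\big(t + \tfrac{r}{n}, x + X^\gamma(\tfrac{r}{n})\big) \Big). \]

This is a consequence of Theorem A.1 in Appendix A.1. As will be seen below, it is actually enough to have an upper bound for $V_n$, that is, to have the one-step Dynamic Programming inequality with $\le$ in place of $=$.

3. Hölder continuity in time: for all $t, s \le T$, $x \in \mathbb{R}^d$,

\[ |V_n(t, x) - V_n(s, x)| \le \|f\|_\Gamma |t - s| + K \|g\|_1 \Big( |t - s| + \sqrt{d_1 |t - s|} \Big). \]

To check this property, notice that $|V_n(t, x) - V_n(s, x)|$ is bounded by the supremum of $|J(t, x, u) - J(s, x, u)|$ over $u \in U_n$. Now, for $u \in U$, it holds that

\[ |J(t, x, u) - J(s, x, u)| \le \|f\|_\Gamma |t - s| + \|g\|_1\, E\big| X^u(T - t) - X^u(T - s) \big| \le \|f\|_\Gamma |t - s| + \|g\|_1 \Big( K |t - s| + K \sqrt{d_1 |t - s|} \Big). \]

A main difficulty in estimating the error arising from time-discretisation of the strategies is due to the fact that neither the discrete value function $V_n$ nor the original value function $V$ are necessarily differentiable. Krylov's idea for overcoming this problem is to consider a family $(V_n^{(\varepsilon)})_{\varepsilon \in (0, 1]}$ of mollified functions in place of $V_n$. The Hölder and Lipschitz regularity of $V_n$ translate into bounds on the partial derivatives of $V_n^{(\varepsilon)}$, which in turn serve to estimate the discretisation error for the mollified value functions; because of the smoothness of the functions $V_n^{(\varepsilon)}$, Itô's formula can be applied. Also the error between $V_n^{(\varepsilon)}$ and $V_n$ has to be estimated. Finally, to equate the two error bounds, one chooses the mollification parameter $\varepsilon$ of the right order in $n$. The idea of using the Principle of Dynamic Programming to get from a local to a global error bound re-appears.

Let $\eta \in C^\infty(\mathbb{R})$, $\xi \in C^\infty(\mathbb{R}^d)$ be non-negative real-valued functions with unit integral and compact support; assume that $\eta(t) = 0$ for $t \in \mathbb{R} \setminus (0, 1)$. For $\varepsilon \in (0, 1]$ define

\[ \eta_\varepsilon(t) := \varepsilon^{-1} \eta\big(\tfrac{1}{\varepsilon} t\big), \quad t \in \mathbb{R}, \qquad \xi_\varepsilon(x) := \varepsilon^{-d} \xi\big(\tfrac{1}{\varepsilon} x\big), \quad x \in \mathbb{R}^d, \qquad \zeta_\varepsilon(t, x) := \eta_{\varepsilon^2}(t)\, \xi_\varepsilon(x), \quad (t, x) \in \mathbb{R} \times \mathbb{R}^d. \]

Notice the different scaling in time and space as regards the functions $\zeta_\varepsilon$. Define the mollified discrete value function with parameter $\varepsilon$ as $V_n^{(\varepsilon)} := V_n * \zeta_\varepsilon$, i.e.,

\[ V_n^{(\varepsilon)}(t, x) = \int_{\mathbb{R}} \int_{\mathbb{R}^d} \zeta_\varepsilon(t - s, x - y)\, V_n(s, y)\,dy\,ds, \quad (t, x) \in \mathbb{R} \times \mathbb{R}^d. \]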
Denote by $\bar V_n^{(\varepsilon)}$ the discrete value function with parameter $\varepsilon$ which is mollified only in the space variable, that is,

\[ \bar V_n^{(\varepsilon)}(t, x) = \int_{\mathbb{R}^d} \xi_\varepsilon(x - y)\, V_n(t, y)\,dy, \quad (t, x) \in \mathbb{R} \times \mathbb{R}^d. \]

The function $V_n^{(\varepsilon)}$, i.e. the mollification of $V_n$ in time and space, is in $C^\infty(\mathbb{R} \times \mathbb{R}^d)$ and has bounded partial derivatives of all orders. The following estimates on the partial derivatives will be needed. The constants $C_1, \dots, C_6$ that will appear in the estimates below depend

100 88 APPENDIX A. APPENDIX only on K, the dimensions d, d 1 and the choice of the mollifies η and ξ. Recall that ε, 1] and that η and ξ ae C -functions with unit integal and compact suppot, the suppot of η being contained in [, 1]. This implies, in paticula, that the integals 1 η sds, 1 η sds, suppξ Dl ξydy all equal zeo, whee l > is the ode of any patial deivatives in space. 1. Patial deivative in time of second ode: fo all t T, x R d, t 2 ε V 2 = ε 6 = ε 6 ε 2 n t, x = 2 t 2 ε 2 η t s ε V ε 2 n s, x ds ds t ε 2 η t s ε 2 R d ξ ε x y V n s, y dy ε 2 ε 6 η s ε 2 R d ξ ε x y Vn t s, y V n t, y dy η s ds f ε 2 Γ s + 1+ d 1 K g 1 s ξ ε ydy ds R d ε 6 ε f Γ + 1+ ε 2 d 1 K g 1 η s ε 2 s ds ε 6 ε f Γ d 1 K g 1 ε 3 η s s ds C 1 ε 3 ε f Γ + g Patial deivatives in space of ode l {1, 2, 3, 4}: fo all t, x R R d, D l ε V n t, x sup D l ε V n s, x s R = sup ε l d D s R R l ξ 1 d ε y Vn s, x y V n s, x dy ε R l d D l ξ 1 d ε y y g 1 dy = ε l d g 1 ε d ε l g 1 ε sup D l ξy y supp ξ supp ξ suppξ y dy C 2 ε 1 l g 1. D l ξy ε y dy 3. Mixed patial deivatives of fist ode in time and ode l {1, 2} in space: fo all t, x R R d, ε 2 t Dl ε V n t, x = ε 4 η s ε D l ε V 2 n t s, x ds 1 ε 4 C 2 ε 1 l g 1 ε 2 η s ds =: C 3 ε l 1 g 1. ε Itô s fomula will pesently be applied to get an uppe bound fo V n,. To this pupose, fo γ Γ, let L γ be the second ode patial diffeential opeato t + d σ σ T 2 ij γ + x i x j i,j=1 d i=1 bi γ x i

101 A.3. PROOFS OF CONSTANT COEFFICIENTS ERROR BOUNDS 89 acting on functions in C 2 R R d. Let u U be any stategy. Itô s o Dynkin s fomula then yields E V ε n T n, Xu T n = inf γ Γ ε V n, + E T n L ut ε V n t, X u t dt Let t T n, x Rd. As a consequence of the one-step PDP fo V n, Fatou s lemma and Fubini s theoem, we have V n ε t, x = ζ ε t s, x y V n s, y dy ds R R d = ζ ε t s, x y inf E R R d γ Γ n fγ + V n s+ n, y+xγ n dy ds { } inf γ Γ n fγ + E ζ ε t s, x y V n s+ R R d n, y+xγ n dy ds { n fγ + E V ε n t+ n, x+xγ n }. Let γ Γ. Itô s fomula and Fubini s theoem yield. E V ε n t+ n, x+xγ n = V ε n t, x + n E L γ V ε n t+s, x+x γ s ds. This, togethe with the above Dynamic Pogamming inequality, implies that n E L γ V ε n t+s, x+x γ s ds n fγ. Applying Itô s fomula to L γ V ε n t +., x +. we see that, fo all s, E L γ ε V n t+s, x+x γ s = L γ V ε n s t, x + E Theefoe, fo all γ Γ, t T n, x Rd it holds that L γ ε V n t, x + n n s E L γ L γ ε V n t+ s, x+x γ s d s. L γ L γ ε V n t+ s, x+x γ s d s ds fγ. The diffeential opeato L γ L γ is composed of the following patial deivatives: deivative in time of second ode, second to fouth ode deivatives in space, mixed deivatives of fist ode in time and fist and second ode in space. The above bounds on the patial deivatives of V ε theefoe imply that, fo all γ Γ, s T, y R d, L γ L γ ε V n s, y C4 ε 3 ε f Γ + g 1, whee C 4 := max{c 1, C 2, C 3 }. Notice that ε 3 ε l fo all l 3 since ε, 1]. Using the above inequality, we obtain L γ ε V n t, x fγ n C 4 ε 3 ε f Γ + g 1 fo all γ Γ, t T n, x Rd.

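Recall that

$$V_n^{\varepsilon}(0,0) = -E\Bigl(\int_0^{T-\frac{r}{n}} L^{u(t)} V_n^{\varepsilon}\bigl(t, X^u(t)\bigr)\,dt\Bigr) + E\Bigl(V_n^{\varepsilon}\bigl(T-\tfrac{r}{n}, X^u(T-\tfrac{r}{n})\bigr)\Bigr),$$

where $u \in \mathcal{U}$ is an arbitrary strategy. The above lower bound for $L^\gamma V_n^{\varepsilon}$ translates into

$$V_n^{\varepsilon}(0,0) \le E\Bigl(\int_0^{T-\frac{r}{n}} f\bigl(u(t)\bigr)\,dt\Bigr) + E\Bigl(V_n^{\varepsilon}\bigl(T-\tfrac{r}{n}, X^u(T-\tfrac{r}{n})\bigr)\Bigr) + T\,\tfrac{r}{n}\,C_4\,\varepsilon^{-3}\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr).$$

On the other hand, $V_n^{\varepsilon}$ is close to $V_n$; more precisely, for $(t,x) \in \mathbb{R}\times\mathbb{R}^d$,

$$\begin{aligned}
\bigl|V_n^{\varepsilon}(t,x) - V_n(t,x)\bigr|
&\le \int_{\mathbb{R}}\int_{\mathbb{R}^d} \zeta_\varepsilon(s,y)\,\bigl|V_n(t-s, x-y) - V_n(t,x)\bigr|\,dy\,ds \\
&\le \int_{\mathbb{R}}\int_{\mathbb{R}^d} \zeta_\varepsilon(s,y)\,\Bigl(\|g\|_1\,|y| + \|f\|_\Gamma\, s + K\,\|g\|_1\,\bigl(\sqrt{d_1} + \sqrt{s}\bigr)\sqrt{s}\Bigr)\,dy\,ds \\
&\le C_5\,\varepsilon\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr).
\end{aligned}$$

Combining the last two inequalities we get

$$\begin{aligned}
V_n(0,0) \le{}& E\Bigl(\int_0^{T} f\bigl(u(t)\bigr)\,dt\Bigr) + E\Bigl(V_n\bigl(T-\tfrac{r}{n}, X^u(T-\tfrac{r}{n})\bigr)\Bigr) + \tfrac{r}{n}\,\|f\|_\Gamma \\
&+ 2C_5\,\varepsilon\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr) + T\,\tfrac{r}{n}\,C_4\,\varepsilon^{-3}\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr).
\end{aligned}$$

Now observe that, for $x, y \in \mathbb{R}^d$,

$$\bigl|V_n(T-\tfrac{r}{n}, y) - g(x)\bigr| = \bigl|V_n(T-\tfrac{r}{n}, y) - V_n(T,x)\bigr| \le \|g\|_1\,|x-y| + \tfrac{r}{n}\,\|f\|_\Gamma + (1+\sqrt{d_1})\,K\,\|g\|_1 \sqrt{\tfrac{r}{n}},$$

whence

$$\begin{aligned}
E\Bigl(V_n\bigl(T-\tfrac{r}{n}, X^u(T-\tfrac{r}{n})\bigr)\Bigr) - E\Bigl(g\bigl(X^u(T)\bigr)\Bigr)
&\le \|g\|_1\,E\bigl|X^u(T-\tfrac{r}{n}) - X^u(T)\bigr| + \tfrac{r}{n}\,\|f\|_\Gamma + (1+\sqrt{d_1})\,K\,\|g\|_1 \sqrt{\tfrac{r}{n}} \\
&\le \tfrac{r}{n}\,\|f\|_\Gamma + 2\,(1+\sqrt{d_1})\,K\,\|g\|_1 \bigl(\tfrac{r}{n}\bigr)^{\frac{1}{2}}.
\end{aligned}$$

Consequently, we have

$$\begin{aligned}
V_n(0,0) \le{}& E\Bigl(\int_0^{T} f\bigl(u(t)\bigr)\,dt\Bigr) + E\Bigl(g\bigl(X^u(T)\bigr)\Bigr) + C_6 \sqrt{\tfrac{r}{n}}\,\Bigl(\sqrt{\tfrac{r}{n}}\,\|f\|_\Gamma + \|g\|_1\Bigr) \\
&+ 2C_5\,\varepsilon\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr) + T\,\tfrac{r}{n}\,C_4\,\varepsilon^{-3}\,\bigl(\varepsilon\,\|f\|_\Gamma + \|g\|_1\bigr).
\end{aligned}$$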

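In order to equate the order of the error in the last two summands, set $\varepsilon := \bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}}$. With this choice of $\varepsilon$ and recalling the definition of $J$, we find that

$$\begin{aligned}
V_n(0,0) \le{}& J(0,0,u) + C_6 \sqrt{\tfrac{r}{n}}\,\Bigl(\sqrt{\tfrac{r}{n}}\,\|f\|_\Gamma + \|g\|_1\Bigr) + 2C_5\,\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \Bigl(\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1\Bigr) \\
&+ T\,C_4\,\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \Bigl(\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1\Bigr) \\
\le{}& J(0,0,u) + C\,(T+1)\,\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \Bigl(\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{4}} \|f\|_\Gamma + \|g\|_1\Bigr),
\end{aligned}$$

where $u \in \mathcal{U}$ is arbitrary and the constant $C$ can be chosen as $\max\{C_4, 2C_5, C_6\}$. Hence the asserted inequality holds.

We now turn to the deterministic case. Let $\hat{\mathcal{U}}$ denote the set of all deterministic strategies, that is, $\hat{\mathcal{U}}$ is the set of all measurable functions $[0,\infty) \to \Gamma$. For $n \in \mathbb{N}$, let $\hat{\mathcal{U}}_n$ be the subset of $\hat{\mathcal{U}}$ consisting of all right-continuous functions $[0,\infty) \to \Gamma$ which are piecewise constant relative to the grid $\{k\,\tfrac{r}{n} \mid k \in \mathbb{N}_0\}$. Again, we have incorporated the delay length $r$ in the partition in order to be coherent with the notation of Section 3.3. Let $b\colon \Gamma \to \mathbb{R}^d$ be a measurable function with $|b|$ bounded by $K$. For $u \in \hat{\mathcal{U}}$, denote by $x^u$ the function

$$x^u(t) := \int_0^t b\bigl(u(s)\bigr)\,ds, \qquad t \ge 0.$$

The following results provide error estimates for the approximation of a function $x^u$, where $u \in \hat{\mathcal{U}}$, by functions $x^{u_n}$, $n \in \mathbb{N}$, where $u_n \in \hat{\mathcal{U}}_n$, in terms of suitable cost functionals. The result we state first should be compared to Theorem 2.1 in Falcone and Giorgi (1999) and also to Theorem A.2 above. Recall that the error bound in Theorem A.2 is of order $h^{1/4}$ in the time step $h = \tfrac{r}{n}$, while the bound for deterministic problems automatically improves to $h^{1/2}$.

Lemma A.3. Let $T > 0$. There is a constant $C > 0$ depending only on $K$ and the dimension $d$ such that the following holds: For any $n \in \mathbb{N}$ such that $n \ge r$, any bounded measurable function $f\colon \Gamma \to \mathbb{R}$, any bounded Lipschitz continuous function $g\colon \mathbb{R}^d \to \mathbb{R}$, any $u \in \hat{\mathcal{U}}$ there exists $u_n \in \hat{\mathcal{U}}_n$ such that

$$\int_0^T f\bigl(u_n(s)\bigr)\,ds + g\bigl(x^{u_n}(T)\bigr) - \Bigl(\int_0^T f\bigl(u(s)\bigr)\,ds + g\bigl(x^u(T)\bigr)\Bigr) \le C\,(1+T)\,\bigl(\tfrac{r}{n}\bigr)^{\frac{1}{2}} \bigl(\|f\|_\Gamma + \|g\|_1\bigr).$$

The proof of Lemma A.3 is mutatis mutandis completely parallel to the proof of Theorem A.2. Itô's formula has to be replaced by the usual change-of-variable formula, and the scaling relation between smoothing in time and smoothing in space must be modified, as would be expected, from $\varepsilon^2$ vs. $\varepsilon$ to $\varepsilon$ vs. $\varepsilon$. Observe, however, that the proof of Theorem 2.1 in Falcone and Giorgi (1999) is different, as it relies on the theory of viscosity solutions.

If the space of control actions $\Gamma$ is finite, then the following elementary arguments show that the approximation error is of order $h$ in the length $h = \tfrac{r}{n}$ of the time step.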
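The orders $h^{1/4}$ and $h^{1/2}$ recalled above both arise from equating a smoothing error with a discretisation error. As a rough illustration only, the following Python sketch assumes an error of the generic form $\varepsilon + h\,\varepsilon^{-p}$ with all constants set to one. The exponent $p = 3$ corresponds to the last two summands in the proof of Theorem A.2; the exponent $p = 1$ is our reading of the modified scaling mentioned for Lemma A.3 and is an assumption, as are the function names.

```python
def balanced_eps(h: float, p: float) -> float:
    """Return eps solving eps = h * eps**(-p), that is eps = h**(1/(p+1))."""
    return h ** (1.0 / (p + 1.0))


def error_terms(h: float, p: float, eps: float) -> tuple:
    """Smoothing error eps and discretisation error h * eps**(-p).

    All constants are set to one; this is a simplification made for
    illustration, not the bound proved in the text.
    """
    return eps, h * eps ** (-p)


if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        for p, label in ((3, "p = 3 (stochastic case)"), (1, "p = 1 (assumed deterministic case)")):
            eps = balanced_eps(h, p)
            smoothing, discretisation = error_terms(h, p, eps)
            # With the balanced choice both terms coincide and equal h**(1/(p+1)).
            print(f"h = {h:g}, {label}: eps = {eps:.6f}, "
                  f"terms = ({smoothing:.6f}, {discretisation:.6f})")
```

With $p = 3$ the balanced choice is $\varepsilon = h^{1/4}$, which is the choice $\varepsilon = (r/n)^{1/4}$ made in the proof of Theorem A.2; with $p = 1$ it is $\varepsilon = h^{1/2}$.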

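Lemma A.4. Assume that $\Gamma$ is finite with cardinality $N_\Gamma$. Let $T > 0$. Then for any $n \in \mathbb{N}$ such that $n \ge \tfrac{r}{T} N_\Gamma$, any bounded measurable function $f\colon \Gamma \to \mathbb{R}$, any bounded Lipschitz continuous function $g\colon \mathbb{R}^d \to \mathbb{R}$, any $u \in \hat{\mathcal{U}}$ there exists $u_n \in \hat{\mathcal{U}}_n$ such that

$$\Bigl|\int_0^T f\bigl(u_n(s)\bigr)\,ds + g\bigl(x^{u_n}(T)\bigr) - \Bigl(\int_0^T f\bigl(u(s)\bigr)\,ds + g\bigl(x^u(T)\bigr)\Bigr)\Bigr| \le \tfrac{r}{n}\,(1+N_\Gamma)\,\bigl(\|f\|_\Gamma + K\,\|g\|_1\bigr).$$

Proof. By hypothesis, $\Gamma$ has $N_\Gamma$ elements, say $\Gamma = \{\gamma_1,\ldots,\gamma_{N_\Gamma}\}$. Let $n \in \mathbb{N}$ be such that $n \ge \tfrac{r}{T} N_\Gamma$. Clearly, for arbitrary $u \in \hat{\mathcal{U}}$, all $\bar{u} \in \hat{\mathcal{U}}_n$,

$$\begin{aligned}
&\Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds + g\bigl(x^{\bar{u}}(T)\bigr) - \int_0^T f\bigl(u(s)\bigr)\,ds - g\bigl(x^{u}(T)\bigr)\Bigr| \\
&\qquad\le \Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T f\bigl(u(s)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T b\bigl(u(s)\bigr)\,ds\Bigr|.
\end{aligned}$$

Denoting by $\lambda_1$ Lebesgue measure on $\mathbb{R}$, we set

$$a_k := \lambda_1\bigl\{ s \in [0,T] \mid u(s) = \gamma_k \bigr\}, \qquad k \in \{1,\ldots,N_\Gamma\}.$$

Then, by definition of the Lebesgue integral,

$$\int_0^T f\bigl(u(s)\bigr)\,ds = \sum_{k=1}^{N_\Gamma} a_k\,f(\gamma_k), \qquad \int_0^T b\bigl(u(s)\bigr)\,ds = \sum_{k=1}^{N_\Gamma} a_k\,b(\gamma_k).$$

Notice that the integral over $f$ is just a real number, while the integral over $b$ is a point in $\mathbb{R}^d$. On the other hand, setting

$$j_k := \#\bigl\{ i \in \{0,\ldots,\lfloor \tfrac{n}{r}T \rfloor - 1\} \mid \bar{u}(i\,\tfrac{r}{n}) = \gamma_k \bigr\}, \qquad k \in \{1,\ldots,N_\Gamma\},$$

we have

$$\begin{aligned}
\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds &= \sum_{k=1}^{N_\Gamma} j_k\,\tfrac{r}{n}\,f(\gamma_k) + f\Bigl(\bar{u}\bigl(\lfloor \tfrac{n}{r}T \rfloor \tfrac{r}{n}\bigr)\Bigr)\Bigl(T - \lfloor \tfrac{n}{r}T \rfloor \tfrac{r}{n}\Bigr), \\
\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds &= \sum_{k=1}^{N_\Gamma} j_k\,\tfrac{r}{n}\,b(\gamma_k) + b\Bigl(\bar{u}\bigl(\lfloor \tfrac{n}{r}T \rfloor \tfrac{r}{n}\bigr)\Bigr)\Bigl(T - \lfloor \tfrac{n}{r}T \rfloor \tfrac{r}{n}\Bigr).
\end{aligned}$$

Consequently,

$$\Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T f\bigl(u(s)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T b\bigl(u(s)\bigr)\,ds\Bigr| \le \bigl(\|f\|_\Gamma + \|g\|_1 K\bigr)\Bigl(\tfrac{r}{n} + \sum_{k=1}^{N_\Gamma} \bigl|a_k - \tfrac{r}{n}\,j_k\bigr|\Bigr),$$

where the hypothesis that $|b| \le K$ has been used. Recall that $a_1,\ldots,a_{N_\Gamma}$ depend on $u \in \hat{\mathcal{U}}$, while $j_1,\ldots,j_{N_\Gamma}$ depend on the choice of $\bar{u} \in \hat{\mathcal{U}}_n$.

Let us fix $u \in \hat{\mathcal{U}}$. Clearly, $a_k \ge 0$ and $\sum_{k=1}^{N_\Gamma} a_k = T$. Define numbers $j_1,\ldots,j_{N_\Gamma}$ recursively by setting $j_1 := \lfloor \tfrac{n}{r}\,a_1 \rfloor$ and, if $N_\Gamma \ge 2$,

$$j_l := \Bigl\lfloor \tfrac{n}{r} \sum_{k=1}^{l} a_k \Bigr\rfloor - \sum_{k=1}^{l-1} j_k, \qquad l \in \{2,\ldots,N_\Gamma\}.$$

With this definition, the numbers $j_1,\ldots,j_{N_\Gamma}$ are in $\{0,\ldots,\lfloor \tfrac{n}{r}T \rfloor\}$ and

$$\sum_{k=1}^{N_\Gamma} j_k = j_{N_\Gamma} + \sum_{k=1}^{N_\Gamma-1} j_k = \Bigl\lfloor \tfrac{n}{r} \sum_{k=1}^{N_\Gamma} a_k \Bigr\rfloor = \bigl\lfloor \tfrac{n}{r}T \bigr\rfloor.$$

To estimate the difference between $a_l$ and $\tfrac{r}{n}\,j_l$, $l \in \{1,\ldots,N_\Gamma\}$, note that

$$\bigl|a_1 - \tfrac{r}{n}\,j_1\bigr| = \tfrac{r}{n}\,\bigl|\tfrac{n}{r}\,a_1 - \lfloor \tfrac{n}{r}\,a_1 \rfloor\bigr| < \tfrac{r}{n},$$

and observe that for all $a, \hat{a} \ge 0$,

$$\bigl|\hat{a} - \lfloor a+\hat{a} \rfloor + \lfloor a \rfloor\bigr| = \bigl|\bigl(a + \hat{a} - \lfloor a+\hat{a} \rfloor\bigr) - \bigl(a - \lfloor a \rfloor\bigr)\bigr| < 1.$$

Therefore, for all $l \in \{2,\ldots,N_\Gamma\}$,

$$\bigl|a_l - \tfrac{r}{n}\,j_l\bigr| = \tfrac{r}{n}\,\Bigl|\tfrac{n}{r}\,a_l - \Bigl\lfloor \tfrac{n}{r} \sum_{k=1}^{l} a_k \Bigr\rfloor + \sum_{k=1}^{l-1} j_k\Bigr| = \tfrac{r}{n}\,\Bigl|\tfrac{n}{r}\,a_l - \Bigl\lfloor \tfrac{n}{r} \sum_{k=1}^{l} a_k \Bigr\rfloor + \Bigl\lfloor \tfrac{n}{r} \sum_{k=1}^{l-1} a_k \Bigr\rfloor\Bigr| < \tfrac{r}{n}.$$

It is clear that we can choose $\bar{u} \in \hat{\mathcal{U}}_n$ such that

$$j_k = \#\bigl\{ i \in \{0,\ldots,\lfloor \tfrac{n}{r}T \rfloor - 1\} \mid \bar{u}(i\,\tfrac{r}{n}) = \gamma_k \bigr\} \qquad \text{for all } k \in \{1,\ldots,N_\Gamma\}.$$

For example, we may define $\bar{u}$ to be equal to $\gamma_1$ on the interval $[0, \tfrac{r}{n}\,j_1)$, then to be equal to $\gamma_2$ on the interval $[\tfrac{r}{n}\,j_1, \tfrac{r}{n}(j_1+j_2))$ and so on. In this way, given $u \in \hat{\mathcal{U}}$, we find $\bar{u} \in \hat{\mathcal{U}}_n$ such that

$$\Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T f\bigl(u(s)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T b\bigl(u(s)\bigr)\,ds\Bigr| \le \bigl(\|f\|_\Gamma + \|g\|_1 K\bigr)\Bigl(\tfrac{r}{n} + N_\Gamma\,\tfrac{r}{n}\Bigr),$$

which yields the assertion.

Let us return a last time to the stochastic setting. We are interested in the case when the diffusion matrix is constant and the space of control actions $\Gamma$ is finite. Let $((\Omega,\mathcal{F},P),(\mathcal{F}_t),W)$ be a Wiener basis of dimension $d_1$, $\mathcal{U}$ the set of all $(\mathcal{F}_t)$-progressively measurable processes $[0,\infty)\times\Omega \to \Gamma$, and $\mathcal{U}_n$ be the subset of strategies which are right-continuous and piecewise constant in time relative to the grid $\{k\,\tfrac{r}{n} \mid k \in \mathbb{N}_0\}$ and measurable with respect to the $\sigma$-algebra generated by $W(k\,\tfrac{r}{n})$, $k \in \mathbb{N}_0$, as above. Let $b\colon \Gamma \to \mathbb{R}^d$ be a continuous function with $|b|$ bounded by $K$, and let $\sigma$ be a $d\times d_1$ matrix. For $u \in \mathcal{U}$, denote by $X^u$ the $\mathbb{R}^d$-valued process

$$X^u(t) := \int_0^t b\bigl(u(s)\bigr)\,ds + \sigma\,W(t), \qquad t \ge 0.$$

The following result gives a bound on the discretisation error which is of order $\sqrt{h}$ in the time step $h = \tfrac{r}{n}$.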

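Lemma A.5. Assume that $\Gamma$ is finite with cardinality $N_\Gamma$ and that the diffusion coefficient $\sigma$ is a constant matrix. Let $T > 0$. Then for any square number $n \in \mathbb{N}$ such that $n \ge \tfrac{r}{T} N_\Gamma$, any bounded measurable function $f\colon \Gamma \to \mathbb{R}$, any bounded Lipschitz continuous function $g\colon \mathbb{R}^d \to \mathbb{R}$, any $u \in \mathcal{U}$ there exists $u_n \in \mathcal{U}_n$ such that

$$\Bigl|E\Bigl(\int_0^T f\bigl(u_n(s)\bigr)\,ds + g\bigl(X^{u_n}(T)\bigr)\Bigr) - E\Bigl(\int_0^T f\bigl(u(s)\bigr)\,ds + g\bigl(X^{u}(T)\bigr)\Bigr)\Bigr| \le \frac{4r + T\,(1+N_\Gamma)}{\sqrt{n}}\,\bigl(\|f\|_\Gamma + K\,\|g\|_1\bigr).$$

Proof. Let $n \in \mathbb{N}$ be a square number such that $n \ge \tfrac{r}{T} N_\Gamma$. Since $\sigma$ is constant, we have for arbitrary $u \in \mathcal{U}$, all $\bar{u} \in \mathcal{U}_n$,

$$\begin{aligned}
&\Bigl|E\Bigl(\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds + g\bigl(X^{\bar{u}}(T)\bigr)\Bigr) - E\Bigl(\int_0^T f\bigl(u(s)\bigr)\,ds + g\bigl(X^{u}(T)\bigr)\Bigr)\Bigr| \\
&\qquad\le E\Bigl(\Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T f\bigl(u(s)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T b\bigl(u(s)\bigr)\,ds\Bigr|\Bigr).
\end{aligned}$$

Let $\omega \in \Omega$. Clearly,

$$\begin{aligned}
\int_0^T f\bigl(u(s,\omega)\bigr)\,ds &= \sum_{k=1}^{\lfloor \frac{\sqrt{n}}{r}T \rfloor} \int_{(k-1)\frac{r}{\sqrt{n}}}^{k\frac{r}{\sqrt{n}}} f\bigl(u(s,\omega)\bigr)\,ds + \int_{\lfloor \frac{\sqrt{n}}{r}T \rfloor \frac{r}{\sqrt{n}}}^{T} f\bigl(u(s,\omega)\bigr)\,ds, \\
\int_0^T b\bigl(u(s,\omega)\bigr)\,ds &= \sum_{k=1}^{\lfloor \frac{\sqrt{n}}{r}T \rfloor} \int_{(k-1)\frac{r}{\sqrt{n}}}^{k\frac{r}{\sqrt{n}}} b\bigl(u(s,\omega)\bigr)\,ds + \int_{\lfloor \frac{\sqrt{n}}{r}T \rfloor \frac{r}{\sqrt{n}}}^{T} b\bigl(u(s,\omega)\bigr)\,ds.
\end{aligned}$$

By Lemma A.4 and its proof, we can find a deterministic function $\hat{u}_\omega \in \hat{\mathcal{U}}_n$ such that for all $k \in \{1,\ldots,\lfloor \tfrac{\sqrt{n}}{r}T \rfloor - 1\}$,

$$\begin{aligned}
&\Bigl|\int_{k\frac{r}{\sqrt{n}}}^{(k+1)\frac{r}{\sqrt{n}}} f\bigl(\hat{u}_\omega(s)\bigr)\,ds - \int_{(k-1)\frac{r}{\sqrt{n}}}^{k\frac{r}{\sqrt{n}}} f\bigl(u(s,\omega)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_{k\frac{r}{\sqrt{n}}}^{(k+1)\frac{r}{\sqrt{n}}} b\bigl(\hat{u}_\omega(s)\bigr)\,ds - \int_{(k-1)\frac{r}{\sqrt{n}}}^{k\frac{r}{\sqrt{n}}} b\bigl(u(s,\omega)\bigr)\,ds\Bigr| \\
&\qquad\le \tfrac{r}{n}\,(1+N_\Gamma)\,\bigl(\|f\|_\Gamma + K\,\|g\|_1\bigr).
\end{aligned}$$

Notice that, since $n$ is a square number, the points of the grid of mesh size $\tfrac{r}{\sqrt{n}}$ are also part of the fine grid of mesh size $\tfrac{r}{n}$. The functions $\hat{u}_\omega$, $\omega \in \Omega$, can now be chosen in such a way that

$$\bar{u}(t,\omega) := \hat{u}_\omega(t), \qquad t \ge 0,\ \omega \in \Omega,$$

defines an $(\mathcal{F}_t)$-progressively measurable piecewise constant $\Gamma$-valued process which is also measurable with respect to the $\sigma$-algebra generated by $W(k\,\tfrac{r}{n})$, $k \in \mathbb{N}_0$. Thus, $\bar{u}$ is a strategy in $\mathcal{U}_n$, and it holds that

$$\begin{aligned}
&E\Bigl(\Bigl|\int_0^T f\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T f\bigl(u(s)\bigr)\,ds\Bigr| + \|g\|_1\,\Bigl|\int_0^T b\bigl(\bar{u}(s)\bigr)\,ds - \int_0^T b\bigl(u(s)\bigr)\,ds\Bigr|\Bigr) \\
&\qquad\le \bigl\lfloor \tfrac{\sqrt{n}}{r}T \bigr\rfloor\,\tfrac{r}{n}\,(1+N_\Gamma)\,\bigl(\|f\|_\Gamma + K\,\|g\|_1\bigr) + 4\,\tfrac{r}{\sqrt{n}}\,\bigl(\|f\|_\Gamma + K\,\|g\|_1\bigr).
\end{aligned}$$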
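The recursive rounding used in the proof of Lemma A.4, and reused on each coarse interval in the proof of Lemma A.5, can be checked numerically. The following Python sketch is an illustration only: the function name `grid_counts` and the example occupation times are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here so that the floor operations are not affected by floating-point rounding. It computes $j_l = \lfloor \tfrac{n}{r} \sum_{k \le l} a_k \rfloor - \sum_{k < l} j_k$ and verifies the two properties used in the proof, namely $\sum_k j_k = \lfloor \tfrac{n}{r} T \rfloor$ and $|a_l - \tfrac{r}{n} j_l| < \tfrac{r}{n}$.

```python
from fractions import Fraction
from math import floor


def grid_counts(a, r, n):
    """Number of grid cells (mesh r/n) assigned to each control action.

    a[k] is the total time the original strategy spends in action k
    (the numbers a_k of the proof). The counts are defined recursively by
    j_l = floor((n/r) * (a_1 + ... + a_l)) - (j_1 + ... + j_{l-1}).
    """
    scale = Fraction(n) / Fraction(r)
    counts = []
    partial_sum = Fraction(0)   # a_1 + ... + a_l
    assigned = 0                # j_1 + ... + j_{l-1}
    for a_k in a:
        partial_sum += Fraction(a_k)
        j = floor(scale * partial_sum) - assigned
        counts.append(j)
        assigned += j
    return counts


if __name__ == "__main__":
    r, n = Fraction(1), 10                      # hypothetical delay length and grid parameter
    a = [Fraction(7, 20), Fraction(11, 10), Fraction(11, 20)]  # occupation times, T = 2
    T = sum(a)
    h = r / n                                   # mesh size of the grid
    j = grid_counts(a, r, n)
    print(j)                                    # prints [3, 11, 6]
    # The counts use exactly floor(n*T/r) grid cells ...
    assert sum(j) == floor(n * T / r)
    # ... and each occupation time is matched up to less than one mesh size.
    assert all(abs(a_k - h * j_k) < h for a_k, j_k in zip(a, j))
```

A piecewise constant strategy as in the proof is then obtained by using $\gamma_1$ on the first $j_1$ grid cells, $\gamma_2$ on the next $j_2$ cells, and so on.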

Bibliography

S. Ankirchner, P. Imkeller, and A. Popier. Optimal cross hedging for insurance derivatives. Humboldt University, 2007.
K. Atkinson and W. Han. Theoretical Numerical Analysis, volume 39 of Texts in Applied Mathematics. Springer-Verlag, New York, 2001.
M. Bardi and I. Capuzzo Dolcetta. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman equations, volume 4 of Systems & Control: Foundations & Applications. Birkhäuser, Boston.
G. Barles and E. R. Jakobsen. Error bounds for monotone approximation schemes for Hamilton-Jacobi-Bellman equations. SIAM J. Numer. Anal., 43(2):540–558, 2005.
G. Barles and E. R. Jakobsen. Error bounds for monotone approximation schemes for parabolic Hamilton-Jacobi-Bellman equations. Math. Comput., 76(260), 2007.
G. Barles and P. E. Souganidis. Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Anal., 4.
H. Bauer and U. Rieder. Stochastic control problems with delay. Math. Meth. Oper. Res., 62(3), 2005.
R. Bellman and R. Kalaba. Dynamic Programming and Modern Control Theory. Academic Press, New York.
A. Bensoussan, G. Da Prato, M. C. Delfour, and S. K. Mitter. Representation and Control of Infinite-Dimensional Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston, 2nd edition, 2007.
D. P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, Massachusetts, 3rd edition, 2005.
D. P. Bertsekas. Dynamic Programming and Optimal Control, volume 2. Athena Scientific, Belmont, Massachusetts, 3rd edition, 2007.
D. P. Bertsekas and S. E. Shreve. Stochastic Optimal Control: The Discrete-Time Case. Athena Scientific, Belmont, Massachusetts, reprint of the 1978 edition.
P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 2nd edition.
R. Boucekkine, O. Licandro, L. A. Puch, and F. del Rio. Vintage capital and the dynamics of the AK model. J. Econ. Theory, 120:39–72, 2005.
E. Buckwar. Introduction to the numerical analysis of stochastic delay differential equations. J. Comput. Appl. Math., pages 297–307, 2000.

A. Calzolari, P. Florchinger, and G. Nappo. Convergence in nonlinear filtering for stochastic delay systems. SIAM J. Control Optim., 46(5), 2007.
I. Capuzzo Dolcetta and M. Falcone. Discrete dynamic programming and viscosity solutions of the Bellman equation. Ann. Inst. Henri Poincaré, Anal. Non Linéaire, 6(Suppl.).
I. Capuzzo Dolcetta and H. Ishii. Approximate solutions of the Bellman equation of deterministic control theory. Appl. Math. Optim., 11(2).
M.-H. Chang, T. Pang, and M. Pemy. Stochastic optimal control problems with a bounded memory. In X. Zhang, D. Liu, and L. Wu, editors, Operations Research and Its Applications. Papers from the Sixth International Symposium, ISORA'06, Xinjiang, China, August 8-12, pages 82–94, Beijing, 2006. World Publishing Corporation.
C.-S. Chow and J. N. Tsitsiklis. The complexity of dynamic programming. J. Complexity, 5.
G. Da Prato and J. Zabczyk. Stochastic Equations in Infinite Dimensions, volume 45 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge.
I. Elsanosi, B. Øksendal, and A. Sulem. Some solvable stochastic control problems with delay. Stochastics Stochastics Rep., 71(1-2):69–89, 2000.
S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. Wiley Series in Probability and Statistics. John Wiley & Sons, New York.
M. Falcone and R. Ferretti. Discrete time high-order schemes for viscosity solutions of Hamilton-Jacobi-Bellman equations. Numer. Math., 67(3).
M. Falcone and T. Giorgi. An approximation scheme for evolutive Hamilton-Jacobi equations. In W. McEneaney, G. Yin, and Q. Zhang, editors, Stochastic Analysis, Control, Optimization and Applications. A Volume in Honor of W. H. Fleming, Boston, 1999. Birkhäuser.
M. Falcone and R. Rosace. Discrete-time approximation of optimal control problems for delayed equations. Control Cybern., 25(3).
M. Fischer and G. Nappo. Time discretisation and rate of convergence for the optimal control of continuous-time stochastic systems with delay. Appl. Math. Optim., to appear, 2007.
M. Fischer and M. Reiß. Discretisation of stochastic control problems for continuous time dynamics with delay. J. Comput. Appl. Math., 205(2), 2007.
W. H. Fleming and H. M. Soner. Controlled Markov Processes and Viscosity Solutions, volume 25 of Applications of Mathematics. Springer-Verlag, New York, 2nd edition, 2006.
R. Gabasov and F. M. Kirillova. Method of optimal control. J. Math. Sci. NY, 7(5).
A. M. Garsia, E. Rodemich, and H. Rumsey, Jr. A real variable lemma and the continuity of paths of some Gaussian processes. Indiana Math. J., 20(6), 1970.
I. I. Gihman and A. V. Skorokhod. Controlled Stochastic Processes. Springer-Verlag, New York, 1979.

O. Hernández-Lerma and J. B. Lasserre. Discrete-Time Markov Control Processes: Basic Optimality Criteria, volume 30 of Applications of Mathematics. Springer-Verlag, New York.
Y. Hu, S.-E. A. Mohammed, and F. Yan. Discrete-time approximations of stochastic delay equations: the Milstein scheme. Ann. Probab., 32(1A), 2004.
J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes, volume 288 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin.
I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, 2nd edition.
V. B. Kolmanovskiǐ and L. E. Shaǐkhet. Control of Systems with Aftereffect, volume 157 of Translations of Mathematical Monographs. American Mathematical Society, Providence, Rhode Island.
N. V. Krylov. On the rate of convergence of finite-difference approximations for Bellman's equations with variable coefficients. Probab. Theory Relat. Fields, 117(1):1–16, 2000.
N. V. Krylov. Mean value theorems for stochastic integrals. Ann. Probab., 29(1):385–410, 2001.
N. V. Krylov. The rate of convergence of finite-difference approximations for Bellman equations with Lipschitz coefficients. Appl. Math. Optim., 52, 2005.
N. V. Krylov. Controlled Diffusion Processes, volume 14 of Applications of Mathematics. Springer-Verlag, New York, 1980.
N. V. Krylov. Approximating value functions for controlled degenerate diffusion processes by using piece-wise constant policies. Electron. J. Probab., 4(Paper No. 2):1–19.
T. G. Kurtz and P. Protter. Weak convergence of stochastic integrals and differential equations. Lecture notes for the 1995 CIME School in Probability, October 2004.
T. G. Kurtz and P. Protter. Weak limit theorems for stochastic integrals and stochastic differential equations. Ann. Probab., 19(3):1035–1070.
H. J. Kushner. Numerical approximations for nonlinear stochastic systems with delays. Stochastics, 77(3):211–240, 2005.
H. J. Kushner. Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, volume 3 of Systems & Control: Foundations & Applications. Birkhäuser, Boston, 1990.
H. J. Kushner and P. Dupuis. Numerical Methods for Stochastic Control Problems in Continuous Time, volume 24 of Applications of Mathematics. Springer-Verlag, New York, 2nd edition, 2001.
B. Larssen. Dynamic programming in stochastic control of systems with delay. Stochastics Stochastics Rep., 74(3-4), 2002.
B. Larssen and N. H. Risebro. When are HJB-equations for control problems with stochastic delay equations finite dimensional? Stochastic Anal. Appl., 21(3), 2003.
X. Mao. Numerical solutions of stochastic functional differential equations. LMS J. Comput. Math., 6, 2003.
S.-E. A. Mohammed. Stochastic Functional Differential Equations. Pitman Publishing, London, 1984.

B. Øksendal and A. Sulem. A maximum principle for optimal control of stochastic systems with delay, with applications to finance. In J. Menaldi, E. Rofman, and A. Sulem, editors, Optimal Control and Partial Differential Equations. In Honour of Professor Alain Bensoussan's 60th Birthday. Proceedings of the Conference, Paris, December 4, 2000. IOS Press, Amsterdam, 2001.
P. E. Protter. Stochastic Integration and Differential Equations, volume 21 of Applications of Mathematics. Springer-Verlag, Berlin, 2nd edition, 2003.
L. C. G. Rogers. Pathwise stochastic optimal control. SIAM J. Control Optim., 46(3), 2007.
L. Słomiński. Euler's approximations of solutions of SDEs with reflecting boundary. Stochastic Processes Appl., 94, 2001.
D. W. Stroock and S. R. S. Varadhan. Multidimensional Diffusion Processes, volume 233 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin.
J. F. Traub and A. G. Werschulz. Complexity and Information. Lezioni Lincee. Cambridge University Press, Cambridge.
N. M. van Dijk. Controlled Markov processes: time discretization, volume 11 of CWI Tracts. Centrum voor Wiskunde en Informatica, Amsterdam.
F. Yan and S.-E. A. Mohammed. A stochastic calculus for systems with memory. Stochastic Anal. Appl., 23, 2005.
J. Yong and X. Y. Zhou. Stochastic Controls. Hamiltonian Systems and HJB Equations, volume 43 of Applications of Mathematics. Springer-Verlag, New York, 1999.