Real-Time Scheduling via Reinforcement Learning


Robert Glaubius, Terry Tidwell, Christopher Gill, and William D. Smart
Department of Computer Science and Engineering
Washington University in St. Louis

Abstract

Cyber-physical systems, such as mobile robots, must respond adaptively to dynamic operating conditions. Effective operation of these systems requires that sensing and actuation tasks are performed in a timely manner. Additionally, execution of mission-specific tasks such as imaging a room must be balanced against the need to perform more general tasks such as obstacle avoidance. This problem has been addressed by maintaining relative utilization of shared resources among tasks near a user-specified target level. Producing optimal scheduling strategies requires complete prior knowledge of task behavior, which is unlikely to be available in practice. Instead, suitable scheduling strategies must be learned online through interaction with the system. We consider the sample complexity of reinforcement learning in this domain, and demonstrate that while the problem state space is countably infinite, we may leverage the problem's structure to guarantee efficient learning.

1 Introduction

In cyber-physical systems such as mobile robots, setting and enforcing a utilization target for shared resources is a useful mechanism for striking a balance between general and mission-specific goals while ensuring timely execution of tasks. However, classical scheduling approaches are inapplicable to tasks in the domains we consider. First, some tasks are not efficiently preemptable: for example, actuation tasks involve moving a physical resource, such as a robotic arm or pan-tilt unit. Restoring the actuator state after a preemption would be essentially the same as restarting that task. Therefore, once an instance of a task acquires the resource, it should retain the resource until completion. Second, the duration for which a task holds the resource may be stochastic. This is true for actuation tasks, which often involve one or more variable mechanical processes.
Classical real-time scheduling approaches model tasks deterministically by treating a task's worst-case execution time (WCET) as its execution budget. This is inappropriate in our domain, as a task's WCET may be many orders of magnitude larger than its typical duration. To account for this variability, we assume that each task's duration obeys some underlying but unknown stationary distribution. Behaving optimally under these conditions requires that we account for this uncertainty in order to anticipate common events while exploiting early resource availability and hedging against delays.

In previous work (Glaubius et al., 2008, 2009), we have proposed methods for solving scheduling problems with these concerns, provided that accurate task models are available. One straightforward approach for employing these methods is via certainty equivalence: constructing and solving an approximate model from observations of the system. However, this is less effective than interleaving modeling and solution with execution, since interleaving learning allows the controller to adapt to conditions observed during execution, which may differ from conditions observed in a distinct modeling phase.

Interleaving modeling and execution raises the exploration/exploitation dilemma (Kaelbling et al., 1996): the controller must balance optimal behavior with respect to available information against the long-term benefit of choosing apparently suboptimal exploratory actions that will improve that information. This dilemma is particularly relevant in the real-time systems domain, as sustained suboptimal behavior translates directly into poor quality of service.

In this paper we consider the problem of learning near-optimal schedules when the system model is not known

in advance. We provide PAC bounds on the computational complexity of learning a near-optimal policy using balanced wandering. Our result is novel, as it extends established methods for learning in finite Markov decision processes to a domain with a countably infinite state space with unbounded costs. We also provide an empirical comparison of several exploration methods, and observe that the structure of the task scheduling problem enforces effective exploration.

2 Background

2.1 System Model

As in Glaubius et al. (2008, 2009), the task scheduling model consists of n tasks (T_i)_{i=1}^n that require mutually exclusive use of a single common resource. Each task T_i consists of an infinite sequence of jobs (T_{i,j})_{j=0}^infinity. Job T_{i,0} is available at time 0, while each job T_{i,j+1} becomes available immediately upon completion of job T_{i,j}. Jobs cannot be preempted, so whenever a job is granted the resource, it occupies that resource for some stochastic duration until completion. Two simplifying assumptions are made regarding the distribution of job durations:

(A1) Inter-task job durations are independently distributed.

(A2) Intra-task job durations are independently and identically distributed.

When A1 holds, the duration of job T_{i,j} always obeys the same distribution regardless of what job preceded it. This means that the system history is not necessary to predict the behavior of a particular job. When A2 holds, consecutive jobs of the same task obey the same distribution. Thus, every task T_i has a duration distribution P(.|i) from which the duration of every job of T_i is drawn. The actuator example in the previous section does not immediately satisfy these assumptions, since a job's duration depends on the state of the actuator when the job starts executing. These may be enforced in actuator-sharing, however, by requiring that each job leaves the actuator in a static reference position before relinquishing control.
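To make assumptions A1 and A2 concrete, the following sketch (the class and method names are hypothetical, not from the paper) models each task as a fixed duration distribution with bounded integer support, and samples each job's duration independently of system history:

```python
import random

class Task:
    """A task with a stationary job-duration distribution (A1/A2).

    `durations` maps integer durations (quanta) to probabilities; the
    support is bounded, so the task has a finite WCET as required later.
    """
    def __init__(self, durations):
        assert abs(sum(durations.values()) - 1.0) < 1e-9
        self.durations = durations
        self.wcet = max(durations)

    def sample_duration(self, rng):
        # A2: every job of this task draws from the same distribution;
        # A1: the draw ignores which jobs ran before it.
        r, acc = rng.random(), 0.0
        for t, p in sorted(self.durations.items()):
            acc += p
            if r <= acc:
                return t
        return self.wcet

rng = random.Random(0)
task = Task({1: 0.7, 2: 0.2, 5: 0.1})
samples = [task.sample_duration(rng) for _ in range(1000)]
assert all(1 <= t <= task.wcet for t in samples)
```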
In addition to the assumptions stated above, each duration distribution must have bounded support on the positive integers: that is, every task T_i has an integer-valued WCET W_i such that Σ_{t=1}^{W_i} P(t|i) = 1. For simplicity, W denotes the maximum among all W_i, and the WCETs of individual tasks are ignored.

Our goal is to schedule jobs in order to preserve temporal isolation (Srinivasan and Anderson, 2005) among tasks. We specify some target utilization u_i for each task that describes its intended resource share at any temporal resolution. More specifically, let x_i(t) denote the number of quanta during which task T_i held the resource in the interval [0, t). Our objective is to keep |(t' - t)u_i - (x_i(t') - x_i(t))| as small as possible over every time interval [t, t') for each task T_i. We require that each task's utilization target u_i is rational and that the resource is completely divided among all tasks, so that Σ_{i=1}^n u_i = 1.

2.2 MDP Formulation

Following Glaubius et al. (2008, 2009), this problem is modeled as a Markov decision process (MDP) (Puterman, 1994). An MDP consists of a set of states X, a set of actions A, a transition system P, and a cost function C. At each discrete decision epoch k, a controller observes the current MDP state x_k and selects an action i_k. The MDP then transitions to state x_{k+1} distributed according to P(.|x_k, i_k) and incurs cost c_k = C(x_{k+1}). The value V^π of a policy π is the expected long-term γ-discounted cost of following π, where γ is a discount factor in (0, 1). V^π satisfies the recurrence

V^π(x) = Σ_{y in X} P(y|x, π(x))[γV^π(y) - C(y)].

It is often convenient to compare alternative actions using the state-action value function Q^π(x, i),

Q^π(x, i) = Σ_{y in X} P(y|x, i)[γV^π(y) - C(y)].

The objective is to find an optimal policy π* such that V^{π*}(x) ≥ V^π(x) among all states x and policies π. For brevity, V and Q are used to denote V^{π*} and Q^{π*}. V satisfies the Bellman equation (Puterman, 1994)

V(x) = max_{i in A} Σ_{y in X} P(y|x, i)[γV(y) - C(y)],    (1)

or equivalently V(x) = max_i {Q(x, i)}. An optimal policy is obtained by behaving greedily with respect to Q, π*(x) in argmax_i {Q(x, i)}.
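Equation 1 can be made concrete with a minimal value-iteration sketch. The MDP below is a hypothetical two-state example, not the scheduling MDP, and the value convention follows the recurrence above: successor costs are subtracted, so values are non-positive and maximized.

```python
# Value iteration for Equation 1 on a tiny generic MDP (illustrative only).
def value_iteration(states, actions, P, C, gamma, iters=500):
    """P[(x, i)] is a list of (prob, y); V(x) = max_i sum_y p*(gamma*V(y) - C(y))."""
    V = {x: 0.0 for x in states}
    for _ in range(iters):
        V = {x: max(sum(p * (gamma * V[y] - C[y]) for p, y in P[(x, i)])
                    for i in actions)
             for x in states}
    return V

states, actions = [0, 1], [0, 1]
# Action 0 keeps state 0 in place; state 1 is absorbing and costly.
P = {(0, 0): [(1.0, 0)], (0, 1): [(1.0, 1)],
     (1, 0): [(1.0, 1)], (1, 1): [(1.0, 1)]}
C = {0: 0.0, 1: 1.0}
V = value_iteration(states, actions, P, C, gamma=0.5)
assert abs(V[0]) < 1e-9   # staying in the zero-cost state is optimal
assert V[1] < 0           # the costly absorbing state has negative value
```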
Thus, computing the optimal control can be reduced to computing the optimal value function. Several dynamic programming and linear programming approaches have been developed to solve such problems when X and A are finite (Puterman, 1994).

The task scheduling problem is modeled as an MDP over a set of utilization states X = N^n. Each state x is an n-vector x = (x_1, ..., x_n) where each component x_i

is the total number of quanta during which task T_i occupied the shared resource since system initialization. τ(x) denotes the total elapsed time in state x,

τ(x) = Σ_{i=1}^n x_i.    (2)

Each action i in this MDP corresponds to the decision to run task T_i. Transitions are determined according to task duration distributions, so that

P(y|x, i) = P(t|i) if y = x + t e_i, and 0 otherwise,    (3)

where e_i is the zero vector except that component i is equal to one, i.e., executing task T_i alters just one dimension of the system state. The cost of a state is its L1-distance from target utilization within the hyperplane of states with equal elapsed time τ(x),

C(x) = Σ_{i=1}^n |x_i - τ(x)u_i|.    (4)

Figure 1: The utilization state model for a two-task problem instance. T_1 (grey, open arrowheads) stochastically transitions to the right, while T_2 (black, closed arrowheads) deterministically transitions upward. The dashed ray indicates the utilization target.

Figure 1 illustrates the utilization state model for a problem with two tasks and a target utilization u = (1, 2)/3 (that is, task T_1 should receive 1/3 of the processor, and task T_2 should receive the rest). The target utilization defines a target utilization ray {λu : λ ≥ 0}. When the components of u are rational, this ray regularly passes through many utilization states. In Figure 1, for example, the utilization ray passes through integer multiples of (1, 2). Every state on this ray has zero cost, and states with the same displacement from the target utilization ray have equal cost.

This task scheduling MDP has an infinite state space and unbounded costs, but because of repeated transition and cost structure, states that are collinear along rays parallel to the utilization ray may be aggregated. The resulting problem still has infinitely many states, but an optimal policy can be estimated accurately using a finite state approximation (Glaubius et al., 2008). Applying this model minimization approach (Givan et al., 2003) does require prior knowledge of the task parameters, which is often unavailable in practice. In this paper, we use reinforcement learning to integrate model and policy estimation. An important question is how much experience is necessary before we can trust learned policies. We address this question by deriving a PAC bound on the sample complexity of obtaining a near-optimal policy. To the best of our knowledge, this is the first such guarantee for problems with infinite state spaces and unbounded costs.

2.3 Related Work

A principle that unifies many successful methods for efficient exploration is optimism in the face of uncertainty (Kaelbling et al., 1996; Szita and Lőrincz, 2008). When presented with a choice between two actions with similar estimated value, methods using this principle tend to select the action that has been tried less frequently. Optimism can take the form of optimistic initialization (Even-Dar and Mansour, 2001), i.e., bootstrapping initial approximations of the value function with large values (Brafman and Tennenholtz, 2003; Strehl and Littman, 2008). Interval estimation techniques instead bias action selection towards exploration by maintaining confidence intervals on model parameters (Strehl and Littman, 2008; Auer et al., 2009) or value estimates (Even-Dar et al., 2002). Interval estimation techniques have been developed for solving single-state bandit problems (Auer et al., 2002; Even-Dar et al., 2002; Mannor and Tsitsiklis, 2004; Mnih et al., 2008), as they can be extended to the general MDP setting by treating each state as a distinct bandit problem.

Heuristic exploration strategies are often employed due to their relative simplicity. ε-greedy exploration and Boltzmann action selection methods (Kaelbling et al., 1996) are randomization strategies that bias action selection toward exploitation. Perhaps the most commonly used strategy, ε-greedy exploration, simply chooses an action uniformly at random with probability ε_k at epoch k, and otherwise it selects the apparent best action. By decaying ε_k appropriately this strategy asymptotically approaches the optimal policy (Even-Dar et al., 2002).
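The utilization-state dynamics and cost in Equations 2 through 4 are straightforward to express in code. This sketch uses hypothetical helper names; exact rational targets via `Fraction` keep the zero-cost checks exact:

```python
from fractions import Fraction

def tau(x):
    """Total elapsed time in state x (Equation 2)."""
    return sum(x)

def cost(x, u):
    """Equation 4: C(x) = sum_i |x_i - tau(x)*u_i|, the L1 distance from
    the target utilization ray at equal elapsed time."""
    t = tau(x)
    return sum(abs(xi - t * ui) for xi, ui in zip(x, u))

def step(x, i, t):
    """Equation 3: running task i for t quanta changes only component i."""
    y = list(x)
    y[i] += t
    return tuple(y)

# Two tasks with target utilization u = (1, 2)/3, as in Figure 1.
u = (Fraction(1, 3), Fraction(2, 3))
assert cost((1, 2), u) == 0               # states on the target ray are free
assert cost(step((1, 2), 0, 3), u) == 4   # over-running T_1 incurs cost
```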
We are interested in quantifying the sample complexity of learning good policies in terms of the number of observations necessary to compute a near-optimal policy with high probability, i.e., probably

approximately correct (PAC) learning (Valiant, 1984). Kakade (2003) has considered the question of PAC learning in MDPs in detail. Several PAC reinforcement learning algorithms have been developed, including E^3 (Kearns and Singh, 2002), R-Max (Brafman and Tennenholtz, 2003), MBIE (Strehl and Littman, 2008), and OIM (Szita and Lőrincz, 2008). These algorithms are limited to the finite state case, and assume bounded rewards. Metric E^3 (Kakade et al., 2003) is a PAC learner for MDPs with continuous but compact state spaces.

3 Online Learning Results

We consider the difficulty of learning good scheduling policies in this section. We approach this question both analytically and empirically. In Section 3.1, we derive a PAC bound (Valiant, 1984) on a balanced wandering approach to exploration (Kearns and Singh, 2002; Even-Dar et al., 2002; Brafman and Tennenholtz, 2003) in the scheduling domain. Our result is novel, as it extends results derived for the finite-state, bounded cost setting to a domain with a countably infinite state space and unbounded costs. These results rely on a specific Lipschitz-like condition that restricts the growth rate of the value function under our cost function (see Lemmas 3 and 4 in the appendix), and finite support of the duration distributions, i.e., finite worst-case execution times of tasks. In Section 3.2, we present results from simulations comparing alternative exploration strategies.

We estimate task duration distributions using the empirical probability measure. We suppose a collection of m observations {(i_k, t_k) : k = 1, ..., m}, where task T_{i_k} ran for t_k ~ P(.|i_k) quanta at decision epoch k. Then let ω_m(i) be the number of observations involving task T_i, and let ω_m(i, t) be the number of those observations in which T_i ran for t quanta,

ω_m(i) = Σ_{k=1}^m I{i_k = i},    (5)

ω_m(i, t) = Σ_{k=1}^m I{i_k = i and t_k = t},    (6)

where I{.} is the indicator function. Then our task duration model P_m(.|i) is just

P_m(t|i) = ω_m(i, t)/ω_m(i).    (7)

Since cost is completely determined by the system state, the transition model is the sole source of uncertainty in this problem.
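The empirical model of Equations 5 through 7 is simple counting. A minimal sketch (hypothetical function name):

```python
from collections import Counter

def empirical_model(observations):
    """Empirical duration model from observations [(i_k, t_k), ...]
    (Equations 5-7): P_m(t|i) = omega_m(i, t) / omega_m(i)."""
    per_task = Counter(i for i, _ in observations)   # omega_m(i)
    per_pair = Counter(observations)                 # omega_m(i, t)
    return {(i, t): n / per_task[i] for (i, t), n in per_pair.items()}

obs = [(0, 2), (0, 2), (0, 4), (0, 4), (1, 1)]
P_m = empirical_model(obs)
assert P_m[(0, 2)] == 0.5 and P_m[(0, 4)] == 0.5
assert P_m[(1, 1)] == 1.0
```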
3.1 Analytical PAC Bound

We consider the sample complexity of estimating a near-optimal policy with high confidence by bounding the number of low value exploratory actions taken (Kakade, 2003). Our analysis proceeds in three parts. First, we derive bounds on the value estimation error as a function of the model accuracy. Next, we determine the number of observations needed to guarantee that model accuracy. Finally, we use these results to determine how many observations suffice to arrive at a near-optimal policy with high certainty. We focus on estimating the state-action value function Q,

Q(x, i) = Σ_{t=1}^W P(t|i)[γV(x + t e_i) - C(x + t e_i)].    (8)

We use V_m to denote the optimal state value function and Q_m to denote the state-action value function of the estimated MDP with transition dynamics P_m. To establish our main result constraining the sample complexity of learning in our scheduling domain, we first provide the following simulation lemma, which is proven in the appendix.

Lemma 1. If there is a constant β such that for all tasks T_i,

Σ_{t=1}^W |P_m(t|i) - P(t|i)| ≤ β,    (9)

where the worst-case execution time W is finite, then

||Q_m - Q||_inf ≤ 2Wβ/(1 - γ)^2.    (10)

This result serves an identical role to the Simulation Lemma of Kearns and Singh (2002), relating model estimation error to value estimation error. Our bound replaces the quadratic dependence on the number of states in that result with a dependence on the WCET W. This is consistent with observations indicating that the sample complexity of obtaining a good approximation should depend polynomially on the number of parameters of the transition model (Kakade, 2003; Leffler et al., 2007), which is O(|X|^2 |A|) for general MDPs, but is O(W|A|) in this scheduling domain.

Theorem 1 provides a PAC bound on the number of observations needed to arrive at an accurate estimate of the value function. For the sake of simplicity we assume balanced wandering here, as this result can be easily used to guide offline modeling as well as employed during online learning.

Theorem 1.
Under balanced exploration, if

m ≥ (32W^3 n / (ε^2 (1 - γ)^4)) log(2Wn/δ),    (11)

then ||Q_m - Q||_inf ≤ ε with probability at least 1 - δ.

Proof. According to Lemma 1, model accuracy β ≤ ε(1 - γ)^2/(2W) is sufficient to guarantee that ||Q_m - Q||_inf ≤ ε. Thus, demonstrating the bound in Equation 11 is a matter of guaranteeing with high certainty that P_m is near P; specifically, we require that

P{ ∪_{i=1}^n ( Σ_{t=1}^W |P_m(t|i) - P(t|i)| > β ) } ≤ δ,

which we can enforce using the union bound by requiring P{ Σ_{t=1}^W |P_m(t|i) - P(t|i)| ≤ β } ≥ 1 - δ/n for every task. By a lemma from Kakade's dissertation (Kakade, 2003), ω_m(i) ≥ (8W/β^2) log(2Wn/δ) is sufficient to guarantee with probability 1 - δ/n that P_m(.|i) is accurate. If we assume balanced wandering, so that ω_m(i) = m/n for each task T_i, then we require

m ≥ (8Wn/β^2) log(2Wn/δ)    (12)

observations. Substituting the least accuracy β = ε(1 - γ)^2/(2W) that will still guarantee an ε-approximation to Q produces the stated result,

m ≥ (32W^3 n / (ε^2 (1 - γ)^4)) log(2Wn/δ).

Theorem 1 provides a PAC bound on the number of observations needed to learn an ε-approximation to Q. However, we are principally interested in discovering the number of observations we need to trust our learned policies. Corollary 1 establishes the sample complexity of using balanced wandering to learn good scheduling policies.

Corollary 1. Assuming each action is tried an equal number of times, if

m ≥ (128W^3 γ^2 n / (ε^2 (1 - γ)^6)) log(2Wn/δ),

then the optimal policy π_m of the estimated task scheduling MDP is within ε of the optimal policy π* with probability at least 1 - δ.

A classical result due to Singh and Yee (1994) demonstrates that, in general, a policy π̂ that is greedy with respect to a value function approximation V̂ is within 2γ||V̂ - V||_inf/(1 - γ) of optimal. Corollary 1 follows by noting that ||V̂ - V||_inf ≤ ||Q̂ - Q||_inf, so we require that 2γ||Q̂ - Q||_inf/(1 - γ) ≤ ε. Substituting this constraint on Q_m into Theorem 1 establishes the corollary.

As with existing bounds, the sample complexity scales polynomially in the parameters 1/(1 - γ), 1/δ, 1/ε, and the number of actions. Unlike bounds for general MDPs, there is no dependence on the number of states; instead, the complexity of learning is determined by the worst-case execution time W.
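As a quick numeric sanity check (not from the paper), the bounds in Theorem 1 and Corollary 1 can be evaluated directly:

```python
from math import ceil, log

def theorem1_bound(W, n, eps, delta, gamma):
    """Observations sufficient for ||Q_m - Q|| <= eps w.p. >= 1 - delta
    (Equation 11)."""
    return ceil(32 * W**3 * n / (eps**2 * (1 - gamma)**4)
                * log(2 * W * n / delta))

def corollary1_bound(W, n, eps, delta, gamma):
    """Observations sufficient for an eps-optimal greedy policy (Corollary 1)."""
    return ceil(128 * W**3 * gamma**2 * n / (eps**2 * (1 - gamma)**6)
                * log(2 * W * n / delta))

m_q = theorem1_bound(W=16, n=2, eps=1.0, delta=0.05, gamma=0.95)
m_pi = corollary1_bound(W=16, n=2, eps=1.0, delta=0.05, gamma=0.95)
assert m_pi > m_q                                     # policy guarantee needs more data
assert theorem1_bound(32, 2, 1.0, 0.05, 0.95) > m_q   # cost grows with W
```

As the code makes plain, the bounds are loose for practical discount factors: the (1 - γ) powers dominate.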
This result is similar to bounds for relocatable action models (Leffler et al., 2007), in which the state space can be partitioned into a relatively small number of classes. Transition models can be generalized among states in the same class, so the sample complexity of learning depends on the number of classes rather than the number of states. Our scheduling MDP is a special case of the relocatable action model in which there is only one class of states. While relocatable action models have been used to address infinite state spaces (Brunskill et al., 2009), existing sample complexity results do not address the unbounded reward case. We are able to handle unbounded costs here by taking advantage of the slow growth rate of the value function relative to the discount factor. Specifically, the distance between consecutive states is bounded, so while costs grow polynomially with distance from the resource share target (cf. Lemma 3 in the appendix), since costs are exponentially discounted the value of any particular state is finite. These observations enable the bound in Lemma 1, suggesting that sample complexity bounds may be possible in general for infinite state, unbounded cost models as long as the number of classes is finite and individual state values can be bounded. Of course, for these results to be useful good policies must be represented compactly, which is possible for the scheduling domain considered here (Glaubius et al., 2008), but is not generally the case.

3.2 Empirical Evaluation

The PAC bound in the previous section gives a sense of the finite sample performance for learning a good policy; however, it requires several simplifying assumptions, such as balanced wandering, so the bound may not be tight. In practice, alternative exploration strategies may yield better performance than our bound would indicate. We compare the performance of several exploration strategies in the context of the task scheduling problem by conducting experiments comparing ε-greedy, balanced wandering, and an interval-based exploration strategy.
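The compared strategies reduce to simple action-selection rules. The sketch below uses hypothetical function names; the decay schedule ε_k = ε_0/k and the observation-count-adjusted interval width follow the forms described in the text:

```python
import math, random

def epsilon_greedy(Q, x, actions, k, eps0, rng):
    """ε-greedy with the decay schedule eps_k = eps0 / k."""
    if rng.random() < eps0 / k:
        return rng.choice(actions)
    return max(actions, key=lambda i: Q[(x, i)])

def optimistic(Q, x, actions, omega, n, c):
    """Interval-based optimistic selection: argmax_i Q(x, i) + alpha_i,
    with the interval width adjusted for each action's observation count."""
    def alpha(i):
        return math.sqrt(math.log(n * omega[i]**2 * c) / omega[i])
    return max(actions, key=lambda i: Q[(x, i)] + alpha(i))

Q = {((0, 0), 0): 1.0, ((0, 0), 1): 0.9}
rng = random.Random(1)
assert epsilon_greedy(Q, (0, 0), [0, 1], k=10, eps0=0.0, rng=rng) == 0
# A rarely tried action can win on optimism despite a lower estimate:
assert optimistic(Q, (0, 0), [0, 1], omega={0: 500, 1: 5}, n=2, c=4.0) == 1
```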
For interval-based optimistic exploration, we use the confidence intervals derived for the multi-armed bandit case by Even-Dar et al. (2002) for the Successive Elimination algorithm. That algorithm constructs intervals of the form α_k = sqrt(log(n k^2 c)/k) about the expected cost of each action at decision epoch k, then eliminates actions that appear worse than the apparent best using an overlap test. The parameter c controls the sensitivity of the intervals.

Figure 2: Simulation comparison of exploration techniques: (a) Optimistic, (b) ε-greedy, (c) Balanced. Note the differing scales on the vertical axes.

We use these intervals to select actions optimistically according to

argmax_{i in A} {Q_m(x, i) + α_{k,i}},

where we have adjusted the confidence intervals according to the potentially different number of observations of each task,

α_{k,i} = sqrt(log(n ω_k(i)^2 c)/ω_k(i)).

We vary c to control the chance of taking exploratory actions. As c shrinks, these intervals narrow, increasing the tendency to exploit the estimated model. In our experiments with ε-greedy, we set the random selection rate at decision epoch k, ε_k = ε_0/k, for varying values of ε_0; this strategy always exploits when ε_0 = 0. Balanced wandering simply executes each task a fixed number of times m prior to exploiting. We vary this parameter to determine its impact on the learning rate. When m = 0, this strategy always exploits its current model knowledge.

To compare the performance of these exploration strategies, we generated 400 random problem instances with two tasks. Duration distributions for these tasks were generated by first selecting a worst-case execution time W uniformly at random from the interval [8, 32], then choosing a normal distribution with mean and variance selected uniformly at random from the respective intervals [1, W] and [1, 4]; this distribution was then truncated and discretized over the interval [1, W]. Utilization targets for each task were chosen according to u = (u'_1, u'_2)/(u'_1 + u'_2), where u'_1 and u'_2 were integers selected uniformly at random between [1, 64]. We used a discount factor of γ = 0.95 in our tests.

We conducted experiments by initializing the model in the state x = (0, 0). The controller simulated a single trajectory over 20,000 decision epochs in each problem instance with each exploration strategy. In order to avoid enumerating arbitrarily large numbers of states, we reinitialized the state whenever a state with cost greater than 50 was encountered.
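The instance-generation procedure above can be sketched as follows (hypothetical helper; the parameter ranges follow the experimental setup, and the truncated, discretized normal is formed by normalizing Gaussian weights over {1, ..., W}):

```python
import math, random

def random_instance(rng):
    """Generate one random two-task instance as described in the text."""
    tasks = []
    for _ in range(2):
        W = rng.randint(8, 32)        # WCET ~ U[8, 32]
        mu = rng.uniform(1, W)        # mean ~ U[1, W]
        var = rng.uniform(1, 4)       # variance ~ U[1, 4]
        # Truncate and discretize the normal over {1, ..., W}.
        weights = [math.exp(-(t - mu)**2 / (2 * var)) for t in range(1, W + 1)]
        z = sum(weights)
        tasks.append({t: w / z for t, w in zip(range(1, W + 1), weights)})
    u1, u2 = rng.randint(1, 64), rng.randint(1, 64)
    target = (u1 / (u1 + u2), u2 / (u1 + u2))
    return tasks, target

tasks, target = random_instance(random.Random(0))
assert all(abs(sum(d.values()) - 1) < 1e-9 for d in tasks)
assert abs(sum(target) - 1) < 1e-9
```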
These high cost states were treated as absorbing states in the approximate model to avoid degenerate policies that exploit the reset. We report the number of mistakes: the number of times the exploration strategy chooses a suboptimal action i that has value V(x) - Q(x, i) > 10^-6. The results of these experiments are shown in Figure 2.

3.3 Evaluation Results

In Figure 2, we report 90% confidence intervals on the mean number of mistakes each exploration strategy makes, averaged across the problem instances described above. Note that these plots have different scales due to the variation in mistake rates among exploration strategies.

Figure 2(a) compares the performance of interval-based optimistic action selection to that of Exploit, the policy that greedily follows the optimal policy of the approximate model at each decision epoch. All of the interval-based exploration settings we considered exhibited statistically similar performance. Interestingly, the exploitative strategy yields better performance than the explorative strategies despite its lack of an explicit exploration mechanism. This observation holds true for ε-greedy exploration and balanced wandering as well.

Figure 2(b) illustrates the performance of ε-greedy exploration. Notice that the mistake rate decreases along with the likelihood of taking exploratory actions, that is, as ε_0 approaches zero. Explicit exploration may not improve performance in this domain. This is further supported by our results for balanced wandering. The theory behind balanced wandering is that making a few initial mistakes early on will pay off in the long run due to more uniformly accurate models. Figure 2(c) shows that this is not the case in our scheduling domain, as

a purely exploitative strategy (m = 0) outperforms each balanced wandering approach. These results suggest that the exploitative strategy may be the best available exploration method in our task scheduling problem domain. One plausible explanation is that the environment itself enforces rational exploration: if some task is never dispatched, the system will enter progressively more costly states as that task becomes more and more underused. Thus, eventually the estimated benefit of running that task will be substantial enough that the exploitative strategy must use it. It is interesting to note that all of the explorative policies considered have quite low mistake rates despite the tight threshold of 10^-6 used to distinguish suboptimal actions.

4 Conclusions

In this paper we have considered the problem of learning near-optimal schedules when the system model is not fully known in advance. We have presented analytical results that bound the number of suboptimal actions taken prior to arriving at a near-optimal policy with high certainty. Interestingly, the transition system's portability results in bounds that are similar to those for estimating the underlying model in a single state. This naturally leads to a comparison to the multi-armed bandit model (see, for example, Even-Dar et al. (2002)), in which there is a single state with several available actions. Each action causes the emission of a reward according to a corresponding unknown, stationary random process. However, a bandit model does not appear to apply directly because while the duration distributions are stationary processes that are invariant between states, the payoff associated with each action is state-dependent.

We have focused on the PAC model of learning rather than deriving bounds on regret, the loss in value incurred due to suboptimal behavior during learning (Auer et al., 2009). Regret bounds may translate more readily into guarantees about transient real-time performance effects during learning, as guarantees regarding cost (and hence value) translate into guarantees about task timeliness.
We have presented empirical results which suggest that a learner that always exploits its current information outperforms agents that explicitly encourage exploration in this domain. This occurs because any policy that consistently ignores some action will get progressively farther from the utilization target, resulting in arbitrarily large costs. Thus the domain itself appears to enforce an appropriate level of exploration, perhaps obviating the need for an explicit exploration mechanism. It is an open question whether a more general class of MDPs that exhibit this behavior can be identified.

Acknowledgements

This research has been supported in part by NSF grants CNS (Cybertrust) and CCF (CAREER).

Appendix: Proof of Lemma 1

Lemma 1 states that the error in approximating Q is bounded, ||Q_m - Q||_inf ≤ 2Wβ/(1 - γ)^2, when the transition model estimation error is bounded by β (cf. Equation 9), where W is the maximum worst-case execution time among all tasks. We introduce lemmas prior to demonstrating this result. The first provides a bound on the expected successor state value of a function with a Lipschitz-like speed limit on its growth. Subsequent lemmas establish that both costs and values exhibit this property.

Lemma 2. Suppose p and p̂ are distributions over {1, ..., W} that satisfy Σ_{t=1}^W |p(t) - p̂(t)| ≤ β, and that for any i, the function f : X → R satisfies |f(x + t e_i) - f(x)| ≤ λt for some λ ≥ 0. Then

| Σ_{t=1}^W [p(t) - p̂(t)] f(x + t e_i) | ≤ λWβ.

Proof. Since we can decompose f(x + t e_i) into an f(x) term and a λt term, we have

| Σ_t [p(t) - p̂(t)] f(x + t e_i) | ≤ | Σ_t [p(t) - p̂(t)] f(x) | + λ Σ_t t |p(t) - p̂(t)|.

Since f(x) does not depend on t, the first term on the right-hand side vanishes. Since t ≤ W, we have λ Σ_t t |p(t) - p̂(t)| ≤ λWβ.

We now show that the cost function C and the optimal value V satisfy the conditions of Lemma 2.

Lemma 3. For any state x, task T_i, and duration t,

|C(x) - C(x + t e_i)| ≤ C(t e_i).    (13)

Proof. Since C(x) is the L1-norm between x and τ(x)u (cf. Equation 4), we can use the triangle inequality and scalability to derive the upper bound C(x + t e_i) ≤ C(x) + C(t e_i). We can also use the triangle inequality to obtain the lower bound, since C(x) ≤ C(x + t e_i) + C(t e_i); rearranging the terms yields the intended result.

It is straightforward to show that C(t e_i) < 2t for any task T_i. We make use of this fact and Lemma 3 to derive a related limit on the growth of V.

Lemma 4. For any state x, task T_i, and duration t, |V(x + t e_i) - V(x)| ≤ 2t/(1 - γ).

Proof. Let y = x + t e_i. We can bound the difference in values at x and y in terms of the difference in Q-values, since

|V(y) - V(x)| ≤ max_j |Q(y, j) - Q(x, j)|.    (14)

By expanding Q according to Equation 8 and rearranging terms,

|Q(y, j) - Q(x, j)|
= | Σ_s P(s|j)(γV(y + s e_j) - γV(x + s e_j) - C(y + s e_j) + C(x + s e_j)) |
≤ γ Σ_s P(s|j) |V(y + s e_j) - V(x + s e_j)| + Σ_s P(s|j) |C(y + s e_j) - C(x + s e_j)|
≤ 2t + γ Σ_s P(s|j) |V(y + s e_j) - V(x + s e_j)|.

Recurring this argument on the absolute value in the right-hand side results in accumulating a residual γ^k C(t e_i) ≤ γ^k 2t for the k-th repetition. Therefore,

|V(x + t e_i) - V(x)| ≤ Σ_{k=0}^inf γ^k 2t = 2t/(1 - γ).

We are ready now to prove Lemma 1.

Proof of Lemma 1. We begin bounding |Q(x, i) - Q_m(x, i)| by expanding according to Equation 8, rearranging terms to group costs and values, then decomposing the sum by using the superadditivity of the absolute value:

|Q(x, i) - Q_m(x, i)| ≤ γ | Σ_t P(t|i)V(x + t e_i) - Σ_t P_m(t|i)V_m(x + t e_i) | + | Σ_t [P(t|i) - P_m(t|i)] C(x + t e_i) |.    (15)

Applying Lemmas 2 and 3, we have

| Σ_t [P(t|i) - P_m(t|i)] C(x + t e_i) | ≤ 2Wβ.

We can apply the triangle inequality to obtain

| Σ_t P(t|i)V(x + t e_i) - Σ_t P_m(t|i)V_m(x + t e_i) | ≤ | Σ_t [P(t|i) - P_m(t|i)] V(x + t e_i) | + Σ_t P_m(t|i) |V(x + t e_i) - V_m(x + t e_i)|.

Using Lemmas 2 and 4 yields

| Σ_t [P(t|i) - P_m(t|i)] V(x + t e_i) | ≤ 2Wβ/(1 - γ).

Substituting back into Equation 15 allows us to write

|Q(x, i) - Q_m(x, i)| ≤ 2Wβ + γ 2Wβ/(1 - γ) + γ Σ_t P_m(t|i) |V(x + t e_i) - V_m(x + t e_i)|.
Finally, we can use Equation 14 to express |V(x + t e_i) - V_m(x + t e_i)| in terms of Q, then recur this argument to produce the stated bound,

|Q(x, i) - Q_m(x, i)| ≤ Σ_{k=0}^inf γ^k 2Wβ/(1 - γ) = 2Wβ/(1 - γ)^2.

References

P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3), 2002.

P. Auer, T. Jaksch, and R. Ortner. Near-optimal regret bounds for reinforcement learning. In Advances in Neural Information Processing Systems, volume 21, pages 89-96, 2009.

R. I. Brafman and M. Tennenholtz. R-MAX: a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 2003.

E. Brunskill, B. R. Leffler, L. Li, M. L. Littman, and N. Roy. Provably efficient learning with typed parametric models. Journal of Machine Learning Research, 10, 2009.

E. Even-Dar and Y. Mansour. Convergence of optimistic and incremental Q-learning. In Advances in Neural Information Processing Systems, volume 13, 2001.

E. Even-Dar, S. Mannor, and Y. Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In COLT '02: Proceedings of the 15th Annual Conference on Computational Learning Theory, 2002.

R. Givan, T. Dean, and M. Greig. Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence, 147(1-2), 2003.

R. Glaubius, T. Tidwell, W. D. Smart, and C. Gill. Scheduling design and verification for open soft real-time systems. In RTSS '08: Proceedings of the 2008 Real-Time Systems Symposium, 2008.

R. Glaubius, T. Tidwell, C. Gill, and W. D. Smart. Scheduling policy design for autonomic systems. International Journal on Autonomous and Adaptive Communications Systems, 2(3), 2009.

L. P. Kaelbling, M. Littman, and A. Moore. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 1996.

S. M. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, London, UK, 2003.

S. M. Kakade, M. Kearns, and J. Langford. Exploration in metric state spaces. In ICML '03: Proceedings of the 20th International Conference on Machine Learning, 2003.

M. J. Kearns and S. P. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49(2-3), 2002.

B. R. Leffler, M. L. Littman, and T. Edmunds. Efficient reinforcement learning with relocatable action models. In AAAI '07: Proceedings of the 22nd National Conference on Artificial Intelligence, 2007.

S. Mannor and J. N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5, 2004.

V. Mnih, C. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, 2008.

M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience, 1994.

S. P. Singh and R. C. Yee. An upper bound on the loss from approximate optimal-value functions. Machine Learning, 16(3), 1994.

A. Srinivasan and J. H. Anderson. Efficient scheduling of soft real-time applications on multiprocessors. Journal of Embedded Computing, 1(2), 2005.

A. L. Strehl and M. L. Littman. An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 74(8), 2008.

I. Szita and A. Lőrincz. The many faces of optimism: a unifying approach. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, 2008.

L. G. Valiant. A theory of the learnable. In STOC '84: Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, 1984.

More information

Present Value Methodology

Present Value Methodology Presen Value Mehodology Econ 422 Invesmen, Capial & Finance Universiy of Washingon Eric Zivo Las updaed: April 11, 2010 Presen Value Concep Wealh in Fisher Model: W = Y 0 + Y 1 /(1+r) The consumer/producer

More information

Capacitors and inductors

Capacitors and inductors Capaciors and inducors We coninue wih our analysis of linear circuis by inroducing wo new passive and linear elemens: he capacior and he inducor. All he mehods developed so far for he analysis of linear

More information

Table of contents Chapter 1 Interest rates and factors Chapter 2 Level annuities Chapter 3 Varying annuities

Table of contents Chapter 1 Interest rates and factors Chapter 2 Level annuities Chapter 3 Varying annuities Table of conens Chaper 1 Ineres raes and facors 1 1.1 Ineres 2 1.2 Simple ineres 4 1.3 Compound ineres 6 1.4 Accumulaed value 10 1.5 Presen value 11 1.6 Rae of discoun 13 1.7 Consan force of ineres 17

More information

Understanding the Profit and Loss Distribution of Trading Algorithms

Understanding the Profit and Loss Distribution of Trading Algorithms Undersanding he Profi and Loss Disribuion of Trading Algorihms Rober Kissell Vice Presiden, JPMorgan Rober.Kissell@JPMChase.com Robero Malamu, PhD Vice Presiden, JPMorgan Robero.Malamu@JPMChase.com February

More information