Fast gradient descent method for mean-CVaR optimization

Garud Iyengar
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027. Email: garud@ieor.columbia.edu

Alfred Ka Chun Ma
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027. Email: km2207@columbia.edu

February 27, 2009

Abstract

We propose an iterative gradient descent procedure for computing approximate solutions for the scenario-based mean-CVaR portfolio selection problem. This procedure is based on an algorithm proposed by Nesterov [13] for solving non-smooth convex optimization problems. Our procedure does not require any linear programming solver and in many cases the iterative steps can be solved in closed form. We show that this method is significantly superior to the linear programming approach as the number of scenarios becomes large.

1 Introduction

The goal of portfolio selection is to distribute a fixed amount of capital over a given set of investment opportunities to maximize return while managing the risk. Although the benefits of diversifying were well-known, the first mathematical model for portfolio selection was proposed by Markowitz [10]. In the Markowitz model, the return of a portfolio is given by the expected return of the portfolio and the risk of the portfolio is measured by the variance of the return of the portfolio.

The variance is a good measure of risk only if the returns are symmetric. The returns on equity, at least for short time horizons, can be approximated by a Normal random variable; consequently, the variance is an adequate measure for the risk in the portfolio. However, when the distribution of the returns of the underlying assets is not symmetric, variance is not an adequate risk measure.

Recently, Conditional Value-at-Risk (CVaR) [15] has been proposed as a risk measure for asset classes that have asymmetric return distributions. CVaR has many nice properties: it is a coherent risk measure [4]; Rockafellar and Uryasev [14] show that the CVaR of
a portfolio can be computed from scenarios by solving a linear program (LP); using LP duality, CVaR upper-bound constraints can be formulated as linear constraints; and empirical studies suggest that the mean-CVaR approach, where the portfolio return is given by its expected return and the portfolio risk is given by the CVaR of the portfolio, is more appropriate than the mean-variance approach if the risk-return relation is nonlinear [1].

From the results in Rockafellar and Uryasev [14], it follows that the mean-CVaR portfolio selection problem reduces to an LP. However, the resulting LP is very ill-conditioned, and solving such an LP, particularly when the number of scenarios is large, is very difficult in practice [2].

We adapt a gradient descent method proposed by Nesterov [13] to solve the mean-CVaR optimization problem. The method we propose does not require solving an LP and therefore it can potentially handle a very large number of scenarios. In addition, the method can be easily implemented. These features imply that a portfolio manager can use our method without installing any third-party LP solvers. We also show how to incorporate analysts' views into the mean-CVaR portfolio selection problem [5, 6].

2 Mean-CVaR optimization

Suppose there are $n$ assets in the market. Let $R \in \mathbb{R}^n$ denote the random returns on the $n$ assets. Let $w \in \mathbb{R}^n$ denote the portfolio of the investor, i.e., $\mathbf{1}^T w = \sum_{i=1}^n w_i = 1$. The quantity $\mathrm{CVaR}_{1-\beta}(-Rw)$ of the portfolio $w$ at the probability level $\beta \in (0, 1)$ is defined as

\[
\mathrm{CVaR}_{1-\beta}(-Rw) = E_P\left[ -Rw \;\middle|\; Rw \le F^{-1}_{Rw}(\beta) \right],
\]

where $-Rw$ denotes the loss on the portfolio $w$, and $F_{Rw}$ denotes the cumulative distribution function (CDF) of the random variable $Rw$. Thus, the CVaR is the expected loss conditional on the portfolio return falling in its lowest $\beta$-quantile.

The mean-CVaR portfolio selection problem we consider is as follows:

\[
\min_{w \in W} \; \mathrm{CVaR}_{1-\beta}(-Rw), \tag{1}
\]

where the set $W$ is the set of all feasible portfolios $w$. For example, by setting

\[
W = \left\{ w : E_P[R]^T w = r,\; \mathbf{1}^T w = 1 \right\},
\]
where $E_P[R]$ denotes the expected returns on the assets, one recovers the canonical mean-CVaR portfolio selection problem where the goal is to select the minimum CVaR portfolio that has a target return $r$.

Rockafellar and Uryasev [14, 15] show that

\[
\mathrm{CVaR}_{1-\beta}(-Rw) = \min_{\tau} \left\{ \tau + \frac{1}{\beta} E_P(-Rw - \tau)^+ \right\}, \tag{2}
\]

where $1 - \beta$ is the confidence level and the function $(x)^+ = \max(x, 0)$. It is typically very hard to explicitly characterize the distribution of the returns $R$, and therefore, in practice, $E_P(-Rw - \tau)^+$ is approximated by using return vectors $R_i$ generated by some scenario generator [8]. Let $\{R_i : i = 1, \ldots, N\}$ denote $N$ scenarios and let $p_i$, $i = 1, \ldots, N$, denote the probability of the $i$-th scenario. Then the expectation in (2) can be approximated as follows:

\[
E_P(-Rw - \tau)^+ \approx \sum_{i=1}^N p_i \left( -R_i^T w - \tau \right)^+ .
\]

By introducing new variables $a_i \ge (-R_i^T w - \tau)^+$, $i = 1, \ldots, N$, the optimization problem (1) can be reformulated into the linear program (LP)

\[
\begin{array}{ll}
\min & \tau + \dfrac{1}{\beta} \displaystyle\sum_{i=1}^N p_i a_i \\[2mm]
\text{s.t.} & a_i \ge -R_i^T w - \tau, \quad i = 1, \ldots, N, \\
& Aw = b, \\
& a \ge 0,
\end{array} \tag{3}
\]

for $W$ of the form $W = \{ w : Aw = b \}$. The LP (3) is large: it has $O(N)$ constraints and is often very ill-conditioned [2]. Thus, solving the LP (3) becomes very hard as the number of samples $N$ becomes large. See Section 4 for further evidence of the numerical instability of the LP formulation.

Our solution method for the optimization problem (1) is based on the following variational characterization of CVaR [4, 16, 9]:

\[
\mathrm{CVaR}_{1-\beta}(-Rw) = \max_{Q \in \mathcal{Q}} E_Q(-Rw), \tag{4}
\]

where $Q$ denotes a probability measure on the returns $R$ and the set of measures

\[
\mathcal{Q} = \left\{ Q : 0 \le Q \le \frac{1}{\beta} P \right\}.
\]
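To make the equivalence between (2) and (4) concrete on scenarios, the following minimal Python sketch evaluates both forms for a fixed portfolio and checks that they agree. The function names, the random toy data, and the equal-weight portfolio are illustrative assumptions and are not taken from the paper; `R` here stores scenario $R_i^T$ as row $i$.

```python
import numpy as np

def cvar_ru(losses, p, beta):
    """CVaR via (2): minimize tau + E[(loss - tau)^+] / beta over tau.

    The objective is convex and piecewise linear in tau with kinks at the
    scenario losses, so it suffices to evaluate it at those points.
    """
    values = [tau + p @ np.maximum(losses - tau, 0.0) / beta for tau in losses]
    return float(min(values))

def cvar_dual(losses, p, beta):
    """CVaR via (4): maximize q^T losses over 1^T q = 1, 0 <= q <= p / beta.

    The maximizer puts as much mass as allowed on the largest losses.
    """
    q = np.zeros_like(p, dtype=float)
    remaining = 1.0
    for i in np.argsort(-losses):          # largest loss first
        q[i] = min(p[i] / beta, remaining)
        remaining -= q[i]
        if remaining <= 0.0:
            break
    return float(q @ losses), q

# Toy data (assumed): N = 500 equally likely scenarios, n = 4 assets.
rng = np.random.default_rng(0)
N, n, beta = 500, 4, 0.05
R = 0.01 * rng.standard_normal((N, n))    # row i is the scenario R_i^T
p = np.full(N, 1.0 / N)
w = np.full(n, 1.0 / n)
losses = -(R @ w)                          # scenario losses -R_i^T w

primal = cvar_ru(losses, p, beta)
dual, q_star = cvar_dual(losses, p, beta)
print(primal, dual)                        # the two values should agree
```

The greedy maximizer in `cvar_dual` is nature's exact (unsmoothed) best response to a fixed portfolio; the scan over all scenario losses in `cvar_ru` costs $O(N^2)$ and is only meant for small illustrations.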
Thus, the mean-CVaR portfolio selection problem (1) can be formulated as the following min-max problem:

\[
\min_{w \in W} \max_{Q \in \mathcal{Q}} E_Q(-Rw). \tag{5}
\]

This formulation can be thought of as a game played by nature and the portfolio manager. It is then natural to consider iterative methods to solve the mean-CVaR portfolio selection problem. When the distribution $P$ is approximated by $N$ scenarios, the set of measures $\mathcal{Q}$ is given by

\[
\mathcal{Q}_N = \left\{ q \in \mathbb{R}^N : \mathbf{1}^T q = 1,\; 0 \le q \le \frac{1}{\beta} p \right\}, \tag{6}
\]

where $p = (p_1, \ldots, p_N)^T$ and the inequalities are interpreted component-wise. From now on, we let $R^T = [R_1, \ldots, R_N] \in \mathbb{R}^{n \times N}$ denote the matrix whose $i$-th column is the asset return in the $i$-th scenario, $i = 1, \ldots, N$. Thus, the scenario-based mean-CVaR problem reduces to the saddle-point problem

\[
\min_{w \in W} \max_{q \in \mathcal{Q}_N} \left\{ -q^T R w \right\}. \tag{7}
\]

3 An iterative algorithm

We solve the minimax problem (7) using a gradient-based procedure proposed by Nesterov [13]. This procedure requires that the admissible set of portfolios $W$ be bounded. In practice, there is always a margin requirement on the short positions in the portfolio. Such a margin requirement can be modeled as follows:

\[
(1 + M) \sum_{j=1}^n (-w_j)^+ \le \sum_{j=1}^n (w_j)^+, \tag{8}
\]

for some $M > 0$. Since the portfolio weights sum to one, we have

\[
1 = \sum_{j=1}^n (w_j)^+ - \sum_{j=1}^n (-w_j)^+ \ge (1 + M) \sum_{j=1}^n (-w_j)^+ - \sum_{j=1}^n (-w_j)^+ = M \sum_{j=1}^n (-w_j)^+. \tag{9}
\]

Therefore, we have

\[
\|w\|_1 = \sum_{j=1}^n (w_j)^+ + \sum_{j=1}^n (-w_j)^+ = 1 + 2 \sum_{j=1}^n (-w_j)^+ \le 1 + \frac{2}{M}. \tag{10}
\]

In order to keep the portfolios $w$ bounded, we will impose constraints of the form $\|w\|_1 \le 1 + 2/M$ or $\|w\|_2 \le \|w\|_1 \le 1 + 2/M$.
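As a quick sanity check on the bound (10), consider a hypothetical two-asset portfolio that is long $1 + s$ in one asset and short $s \ge 0$ in the other, so that the weights sum to one. The numbers below are purely illustrative and do not come from the experiments. The margin requirement (8) and the resulting norm read

\[
(1 + M)\, s \le 1 + s \iff s \le \frac{1}{M},
\qquad
\|w\|_1 = (1 + s) + s = 1 + 2s \le 1 + \frac{2}{M},
\]

with equality at $s = 1/M$. For example, $M = 0.5$ permits $s = 2$, i.e., $w = (3, -2)^T$ with $\|w\|_1 = 5 = 1 + 2/0.5$, so the bound (10) is attained and cannot be tightened in general.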
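A naive approach to solving the modified minimax problem (7) would involve generating iterates $\{(w^{(k)}, q^{(k)})\}$, where $w^{(k)}$ is the best response to nature's move $q^{(k-1)}$, i.e.,

\[
w^{(k)} = \arg\min_{w \in W} \left\{ -(q^{(k-1)})^T R w \right\},
\]

and $q^{(k)}$ is the best response to the investor's move $w^{(k-1)}$, i.e.,

\[
q^{(k)} = \arg\min_{q \in \mathcal{Q}_N} \left\{ q^T R w^{(k-1)} \right\}.
\]

The resulting objective $\max_{q \in \mathcal{Q}_N} \{-q^T R w\}$ is not smooth in $w$; consequently, this iterative scheme converges very slowly. Nesterov [13] devised a procedure that is able to escape this convergence bottleneck.

The Nesterov procedure consists of two steps. The first step is smoothing the optimization in $q$. Let $w^{(k)}$ denote the $k$-th iterate. Then the smoothed best response of nature is given by

\[
q^{(k)} = \arg\max_{q \in \mathcal{Q}_N} \left\{ -q^T R w^{(k)} - \mu d_2(q) \right\}, \tag{11}
\]

where $\mu > 0$ and $d_2(q)$ is any strongly convex function. We choose

\[
d_2(q) = \sum_{i=1}^N \left[ q_i \log q_i + \left( \frac{p_i}{\beta} - q_i \right) \log\left( \frac{p_i}{\beta} - q_i \right) \right]. \tag{12}
\]

In Appendix A, we show that $d_2(q)$ is strongly convex with parameter $\sigma_2 = \frac{1}{1 - \beta}$ with respect to the $\ell_1$-norm.

The Lagrangian function $L$ for the optimization in $q$ is given by

\[
L(q) = -q^T R w^{(k)} - \mu d_2(q) - \alpha (\mathbf{1}^T q - 1) + \sum_{i=1}^N \mu_i q_i - \sum_{i=1}^N \nu_i \left( q_i - \frac{p_i}{\beta} \right),
\]

where the multipliers $\mu_i$ are distinct from the smoothing parameter $\mu$. Setting $\nabla_q L = 0$, we have that $q^{(k)}$ must satisfy

\[
-R_i^T w^{(k)} - \mu \ln\left( \frac{q_i^{(k)}}{p_i/\beta - q_i^{(k)}} \right) - \alpha + \mu_i - \nu_i = 0,
\]

i.e.,

\[
\frac{q_i^{(k)}}{p_i/\beta - q_i^{(k)}} = \exp\left( \frac{-R_i^T w^{(k)} - \alpha + \mu_i - \nu_i}{\mu} \right).
\]

Thus, it follows that for all values of $(\alpha, \mu_i, \nu_i)$, we have that $0 < q_i^{(k)} < p_i/\beta$. Therefore, complementary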
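slackness implies that the multipliers $\mu_i = \nu_i = 0$, and

\[
q_i^{(k)} = \frac{\beta^{-1} p_i}{1 + e^{\frac{1}{\mu}\left( R_i^T w^{(k)} + \alpha \right)}}, \quad i = 1, \ldots, N, \tag{13}
\]

where $\alpha$ is the solution of the equation

\[
\sum_{i=1}^N \frac{\beta^{-1} p_i}{1 + e^{\frac{1}{\mu}\left( R_i^T w^{(k)} + \alpha \right)}} = 1. \tag{14}
\]

The second step in the Nesterov procedure is to compute the update $w^{(k)}$ using a convex combination of two updates $z^{(k)}$ and $y^{(k)}$ defined as follows:

\[
y^{(k)} = \arg\min_{y \in W} \left\{ -(q^{(k-1)})^T R y + \frac{\Omega}{2 \mu \sigma_2} \left\| y - w^{(k-1)} \right\|_2^2 \right\}, \tag{15}
\]

\[
z^{(k)} = \arg\min_{z \in W} \left\{ -\sum_{t=0}^{k-1} \left( \frac{t + 1}{2} \right) (q^{(t)})^T R z + \frac{\Omega}{2 \mu \sigma_2} \| z \|_2^2 \right\}, \tag{16}
\]

\[
w^{(k)} = \left( \frac{2}{k + 3} \right) z^{(k)} + \left( \frac{k + 1}{k + 3} \right) y^{(k)}, \tag{17}
\]

where

\[
\Omega = \max_{\|q\|_1 \le 1} \; \max_{\|w\|_2 \le 1} \left( q^T R w \right)^2 = \max_i \| R_i \|_2^2
\]

and $\sigma_2$ is the convexity parameter of the strongly convex function $d_2(q)$. The iterate $y^{(k)}$ is a modified best response in which one penalizes large movements from the last response $w^{(k-1)}$. The iterate $z^{(k)}$ in (16) considers all the previous responses $\{ q^{(t)} : t = 0, \ldots, k - 1 \}$ to compute the response. The weight on $y^{(k)}$ increases as the iteration count $k$ increases.

When the set $W$ is described by linear equalities, i.e., $W = \{ w : Aw = b \}$, we add the additional constraint $\|w\|_2 \le 1 + 2/M$, and in this case it is easy to show that (15) and (16) can be solved in closed form. When the set $W$ is described by linear inequality constraints, we impose the constraint $\|w\|_1 \le 1 + 2/M$. Then (15) and (16) are quadratic programs that can, in practice, be solved very efficiently using active set methods. Note that each quadratic problem encountered in the course of our proposed iterative procedure has $n$ variables and $O(m)$ constraints, where $m$ denotes the number of components in $b$.

Nesterov [13] proves that after $K$ steps the output $(\hat{w}, \bar{q})$ of the algorithm displayed in Figure 1 satisfies

\[
\max_{w \in W} \left\{ \bar{q}^T R w \right\} - \min_{q \in \mathcal{Q}_N} \left\{ q^T R \hat{w} \right\} < \delta_N = \frac{1}{K} \left( \frac{D_1 D_2 \Omega}{\sigma_2} \right)^{\frac{1}{2}}, \tag{18}
\]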
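i.e., after $K$ iterations the algorithm produces a pair $(\hat{w}, \bar{q})$ of $\delta_N$-optimal policies for the investor and nature. One can, therefore, terminate the algorithm once one is satisfied with the quality of the portfolio. Moreover, choosing $K \ge \left( \frac{D_1 D_2 \Omega}{\sigma_2} \right)^{\frac{1}{2}} \frac{1}{\varepsilon}$ ensures that the output of the algorithm is $\varepsilon$-optimal. In our numerical calculations we found that terminating on the gap in (18) stops the algorithm much sooner than running for the number of iterations given by the upper bound.

Nesterov Procedure

    $D_1 \leftarrow \frac{1}{2} \left( 1 + \frac{2}{M} \right)^2$, $\quad D_2 \leftarrow \frac{1}{\beta} \left( -\beta \ln(\beta) - (1 - \beta) \ln(1 - \beta) \right)$, $\quad \sigma_2 \leftarrow \frac{1}{1 - \beta}$
    $\Omega \leftarrow \max_i \| R_i \|_2^2$, $\quad K \leftarrow \frac{1}{\varepsilon} \sqrt{\frac{\Omega D_1 D_2}{\sigma_2}}$, $\quad \mu \leftarrow \frac{\varepsilon}{2 D_2}$, $\quad w^{(0)} \leftarrow \frac{1}{n} \mathbf{1}$
    for $k \leftarrow 0$ to $K$ do
        $q^{(k)} \leftarrow \arg\max_{q \in \mathcal{Q}_N} \left\{ -q^T R w^{(k)} - \mu d_2(q) \right\}$
        $y^{(k+1)} \leftarrow \arg\min_{y \in W} \left\{ -(q^{(k)})^T R y + \frac{\Omega}{2 \mu \sigma_2} \| y - w^{(k)} \|_2^2 \right\}$
        $z^{(k+1)} \leftarrow \arg\min_{z \in W} \left\{ -\sum_{t=0}^{k} \left( \frac{t + 1}{2} \right) (q^{(t)})^T R z + \frac{\Omega}{2 \mu \sigma_2} \| z \|_2^2 \right\}$
        $w^{(k+1)} \leftarrow \left( \frac{2}{k + 3} \right) z^{(k+1)} + \left( \frac{k + 1}{k + 3} \right) y^{(k+1)}$
    return $\hat{w} = y^{(K)}$, $\quad \bar{q} = \sum_{k=0}^{K} \frac{2 (k + 1)}{(K + 1)(K + 2)} q^{(k)}$

Figure 1: Nesterov Procedure

The main features of this algorithm are as follows.

(a) The modified best responses $y^{(k)}$ and $z^{(k)}$ of the investor are computed by solving a separable quadratic optimization problem that is similar to mean-variance portfolio selection with uncorrelated assets. This implies that the technology for mean-variance optimization can be used to solve the mean-CVaR problem.

(b) The iterates $(\hat{w}, \bar{q})$ are at least $\delta_N$-optimal, and often the quality of the solution is significantly superior to that implied by the bound. Thus, one can terminate the algorithm at any stage where one obtains a solution of sufficient quality.

(c) In Section 4, we show that this algorithm converges to a reasonably accurate solution with the error $\varepsilon = 10^{-3}$ very quickly even when the number of scenarios $N = 10^6$. Since the scenario-based mean-CVaR problem is itself an approximation to the original problem, solving the scenario-based CVaR problem very accurately does not serve any purpose.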
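The smoothed best response (13)-(14) is the only step of the procedure whose cost grows with the number of scenarios, and it reduces to a one-dimensional root search for $\alpha$. The sketch below is one possible way to carry it out in Python; the bisection solver, its tolerance, and the toy inputs are assumptions made for illustration, since the paper does not say how (14) is solved.

```python
import numpy as np

def smoothed_best_response(r, p, beta, mu, tol=1e-12, max_iter=200):
    """Solve (13)-(14) for q given scenario returns r_i = R_i^T w.

    q_i(alpha) = (p_i / beta) / (1 + exp((r_i + alpha) / mu)) and alpha is
    found by bisection so that the q_i sum to one.
    """
    cap = p / beta

    def q_of(alpha):
        # 1 / (1 + e^x) computed as exp(-log(1 + e^x)) to avoid overflow.
        return cap * np.exp(-np.logaddexp(0.0, (r + alpha) / mu))

    # sum(q_of(alpha)) decreases in alpha; this bracket contains the root.
    c = abs(np.log((1.0 - beta) / beta))
    lo, hi = -r.max() - c * mu, -r.min() + c * mu
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if q_of(mid).sum() > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return q_of(0.5 * (lo + hi))

# Toy inputs (assumed): equally likely scenarios and random portfolio returns.
rng = np.random.default_rng(1)
N, beta, mu = 1000, 0.05, 0.01
p = np.full(N, 1.0 / N)
r = 0.01 * rng.standard_normal(N)

q = smoothed_best_response(r, p, beta, mu)
q_flat = smoothed_best_response(np.zeros(N), p, beta, mu)
print(q.sum(), np.abs(q_flat - p).max())
```

When all scenario returns coincide, the response collapses to $q = p$, which matches the minimizer of $d_2$ derived in Appendix A; otherwise scenarios with lower portfolio returns receive more weight, up to the cap $p_i/\beta$.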
4 Numerical results

We tested our procedure on the example in [12]. Our asset universe consisted of Treasury bonds with 2, 5, 10, and 30 years to maturity. As in the example in [12], we approximated the returns on the assets by a Delta-Gamma approximation using the yields on bonds with 6 months, 2 years, 5 years, 10 years, 20 years, and 30 years to maturity as the risk factors. We simulated $N$ scenarios for the risk factors and then used the Delta-Gamma approximation to compute $N$ return scenarios. We refer the reader to [11] for a detailed discussion of the simulation procedure.

In Table 1, we display the optimal solution to the LP formulation (3) of the mean-CVaR problem with $\beta = 0.05$ and $N = 10^6$. We use MOSEK [3] to solve these LPs. Table 2 shows the optimal portfolio computed by our proposed algorithm with the error tolerance $\varepsilon = 10^{-3}$. The portfolios produced by our algorithm and the LP formulation (3) are quite different, although the CVaR values are close. These results only imply that the LP approach and our proposed iterative approach are consistent, i.e., both approaches are able to solve the mean-CVaR problem; these results are not able to differentiate between the two approaches.

The most important results of this section are in Tables 3 and 4. In Table 3 we display the CPU time for solving the LP formulation using ILOG CPLEX [7] and MOSEK, and the CPU time for computing an $\varepsilon = 0.001$ optimal solution using our algorithm, as a function of the number of scenarios $N$. It is clear that the industry-leading LP solver CPLEX performs very poorly on this problem. MOSEK performs much better, but the run times for this commercial solver are an order of magnitude higher than those of our MATLAB-based code.

Table 4 displays the run times and the number of iterations required by our algorithm as a function of the accuracy $\varepsilon$. The performance of our algorithm degrades very quickly as $\varepsilon$ decreases. Therefore, this algorithm is only suited for applications where one wants to compute a reasonably accurate solution very quickly.
An example of such an application is high-frequency trading. The data in high-frequency trading is typically very noisy; therefore, it is pointless to compute a very accurate solution. Note that the LP approach does not allow any flexibility in setting the accuracy level.

Next, we show how to use analysts' views to bias the sample probability mass function $p$. We restrict ourselves to views of the form

\[
\nu^T R \sim g,
\]

where $\nu \in \mathbb{R}^n$ is a vector that determines the particular linear combination of the return vector $R$, and $g$ is a probability density on $\mathbb{R}$. We convert this view on the distribution of the random return $R$ to a view on
the distribution of the $N$ sample returns $R_i$, $i = 1, \ldots, N$, by defining a view probability vector

\[
\bar{p}_i = \frac{g(\nu^T R_i)}{\sum_{k=1}^N g(\nu^T R_k)}, \quad i = 1, \ldots, N.
\]

Suppose we have $m$ different views, i.e., there are $m$ different view probability vectors $\bar{p}^{(j)}$, $j = 1, \ldots, m$. We combine these vectors into a single sample probability vector $p$ as follows:

\[
p = \sum_{j=1}^m u_j \bar{p}^{(j)} + u_0 \bar{p}^{(0)}, \tag{19}
\]

where $\bar{p}^{(0)} = \frac{1}{N} \mathbf{1}$ denotes the empirical measure, and $u_j$ denotes the confidence weight on view $\bar{p}^{(j)}$. Since $p$ is a probability vector, we require that $\sum_{j=0}^m u_j = 1$. Next, we solve the mean-CVaR problem with scenario probability vector $p$. Our algorithm also works with other techniques for combining views; see, for example, [5, 6, 12].

For our numerical experiments, we set $m = 2$. The two views were chosen to be

\[
\nu^{(1)} = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 & 0 \end{bmatrix}^T, \quad g^{(1)} = \mathrm{unif}[0, 0.001],
\]
\[
\nu^{(2)} = \begin{bmatrix} 0 & 0.5 & 1 & 0.5 & 0 & 0 \end{bmatrix}^T, \quad g^{(2)} = \mathrm{unif}[0, 0.0005].
\]

The weight vector was set such that $u_0 = 0.9$, $u_1 = u_2 = 0.05$, i.e., we assumed that we had 90% confidence in the empirical distribution and 5% confidence in each of the two views. Table 5 shows the optimal portfolio computed using the LP formulation (3). As in the previous case, the LP was solved using MOSEK. Table 6 shows the results computed using our algorithm with $\varepsilon = 0.001$.

5 Conclusion

In this paper, we propose an efficient algorithm for solving the mean-CVaR portfolio selection problem without using an LP solver. As shown in the numerical experiments, the algorithm is a useful alternative to the LP approach when one wants a very fast solver that guarantees an accuracy of about $\varepsilon = 10^{-3}$. This technique can also be extended to solve many other types of portfolio selection problems.
bond / target return (r)    0.0020     0.0035     0.0045     0.0050
2y                          0.1856     0.7919     1.4413     1.7684
5y                          0.9591     2.1602     2.9548     3.3541
10y                         0.1857     0.4601     0.6379     0.7244
30y                         0.0409     0.0917     0.1244     0.1387
CVaR                        0.00994    0.01597    0.02003    0.02206

Table 1: Optimal portfolio and CVaR for the mean-CVaR problem solved by the LP approach.

bond / target return (r)    0.0020     0.0035     0.0045     0.0050
2y                          0.4739     0.2455     0.0932     0.0171
5y                          0.3282     0.2484     0.1953     0.1687
10y                         0.1888     0.2512     0.2928     0.3136
30y                         0.0090     0.2548     0.4187     0.5006
CVaR                        0.0101     0.0171     0.0218     0.0244
Error                       0.0001     0.0011     0.0017     0.0024

Table 2: Optimal portfolio and CVaR for the mean-CVaR problem solved by our algorithm, and the absolute error compared with the LP approach.

References

[1] V. Agarwal and N.Y. Naik. Risks and portfolio decisions involving hedge funds. Review of Financial Studies, 17(1):63-98, Spring 2004.

[2] S. Alexander, T.F. Coleman, and Y. Li. Minimizing CVaR and VaR for a portfolio of derivatives. Journal of Banking and Finance, 30(2):583-605, February 2006.

[3] E. D. Andersen and K. D. Andersen. The MOSEK optimization toolbox for MATLAB manual, Version 4.0. http://www.mosek.com/products/4_0/tools/help/index.html, 2006.

[4] P. Artzner, F. Delbaen, J.M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9(3):203-228, July 1999.

[5] F. Black and R. Litterman. Asset allocation: combining investor views with market equilibrium. Goldman Sachs Fixed Income Research, 1990.

[6] F. Black and R. Litterman. Asset allocation: combining investor views with market expectations. Journal of Fixed Income, 1(1):7-18, September 1991.

[7] ILOG. ILOG CPLEX 11.1. http://www.ilog.com/products/cplex/, 2008.

[8] Y.K. Koskosidis and A.M. Duarte Jr. A scenario-based approach to active asset allocation. Journal of Portfolio Management, 23:74-85, Winter 1997.
N          CPLEX       MOSEK     Our algorithm (Iterations)
10000      1.42        0.76      0.34 (1)
50000      39.25       3.00      0.56 (1)
100000     155.71      4.41      1.06 (1)
500000     6633.09     25.08     2.76 (1)
1000000    44439.79    50.62     5.54 (1)

Table 3: CPU time in seconds for both methods, and the number of iterations required by our algorithm.

ε         η        CPU time    Iterations    CVaR      Error
0.001     0.01     4.06        1             0.0244    0.0023
0.0005    0.005    4.77        1             0.0244    0.0023
0.0002    0.002    3331.4      703           0.0240    0.0019
0.0001    0.001    9457.8      2192          0.0230    0.00094

Table 4: CPU time and iteration counts for our algorithm.

[9] H. Lüthi and J. Doege. Convex risk measures for portfolio optimization and concepts of flexibility. Mathematical Programming, 104(2):541-559, November 2005.

[10] H.M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77-91, March 1952.

[11] A. Meucci. Risk and asset allocation. Springer, 2005.

[12] A. Meucci. Beyond Black-Litterman: Views on non-normal markets. Risk Magazine, 19:87-92, 2006.

[13] Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127-152, May 2005.

[14] R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of Risk, 2(3):21-41, 2000.

[15] R.T. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distributions. Journal of Banking and Finance, 26(7):1443-1471, July 2002.

[16] R.T. Rockafellar, S. Uryasev, and M. Zabarankin. Deviation measures in risk analysis and optimization. Technical report, Department of Industrial and Systems Engineering, University of Florida, 2002.

Appendix

A Details of the parameters in the Nesterov algorithm

The Hessian $\nabla^2 d_2(q)$ of the smoothing function $d_2(q) = \sum_{i=1}^N \left[ q_i \log q_i + (\beta^{-1} p_i - q_i) \log(\beta^{-1} p_i - q_i) \right]$ is given by

\[
\nabla^2 d_2(q) = \mathrm{diag}\left( [q_1^{-1}, \ldots, q_N^{-1}] \right) + \mathrm{diag}\left( [(\beta^{-1} p_1 - q_1)^{-1}, \ldots, (\beta^{-1} p_N - q_N)^{-1}] \right).
\]
r        0.0020              0.0035              0.0045              0.0050
Bond     New      Change     New      Change     New      Change     New      Change
2y       0.0328   0.1528     1.0998   0.3079     1.8520   0.4107     2.2297   0.4613
5y       1.1395   0.1804     2.5328   0.3726     3.4553   0.5005     3.9186   0.5645
10y      0.1923   0.0066     0.4862   0.0261     0.6785   0.0406     0.7745   0.0501
30y      0.0199   0.021      0.0533   0.0384     0.0753   0.0491     0.0855   0.0532
CVaR     0.0110   0.00106    0.0180   0.00203    0.0228   0.00277    0.0252   0.00314

Table 5: Optimal portfolio and CVaR for the mean-CVaR problem with weights on views $u_0 = 0.9$, $u_1 = u_2 = 0.05$, solved by the LP approach.

r        0.0020              0.0035              0.0045              0.0050
Bond     New      Change     New      Change     New      Change     New      Change
2y       0.4457   0.0282     0.1860   0.0595     0.0128   0.0804     0.0737   0.0901
5y       0.3175   0.0107     0.2280   0.0204     0.1683   0.027      0.1384   0.0303
10y      0.1960   0.0072     0.2677   0.0162     0.3155   0.0227     0.3394   0.0258
30y      0.0408   0.0318     0.3184   0.0636     0.5034   0.0847     0.5959   0.0953
CVaR     0.0113   0.0012     0.0196   0.0025     0.0254   0.0036     0.0283   0.0039
Error    0.0003              0.0016              0.0026              0.0031

Table 6: Optimal portfolio and CVaR for the mean-CVaR problem with weights on views $u_0 = 0.9$, $u_1 = u_2 = 0.05$, solved by our algorithm.

Therefore,

\begin{align*}
h^T \nabla^2 d_2(q)\, h
&= \sum_{i=1}^N \frac{h_i^2}{q_i} + \sum_{i=1}^N \frac{h_i^2}{\beta^{-1} p_i - q_i} \\
&= \left( \sum_{i=1}^N \frac{h_i^2}{q_i} \right) \left( \sum_{i=1}^N q_i \right) + \frac{1}{\beta^{-1} - 1} \left( \sum_{i=1}^N \frac{h_i^2}{\beta^{-1} p_i - q_i} \right) \left( \sum_{i=1}^N (\beta^{-1} p_i - q_i) \right) \tag{20} \\
&\ge \left( \sum_{i=1}^N \frac{|h_i|}{\sqrt{q_i}} \sqrt{q_i} \right)^2 + \frac{1}{\beta^{-1} - 1} \left( \sum_{i=1}^N \frac{|h_i|}{\sqrt{\beta^{-1} p_i - q_i}} \sqrt{\beta^{-1} p_i - q_i} \right)^2 \tag{21} \\
&= \left( 1 + \frac{1}{\beta^{-1} - 1} \right) \|h\|_1^2 = \frac{1}{1 - \beta} \|h\|_1^2,
\end{align*}

where (20) follows from the fact that $\sum_{i=1}^N q_i = 1$ and $\sum_{i=1}^N (\beta^{-1} p_i - q_i) = \beta^{-1} - 1$, and (21) follows from the Cauchy-Schwarz inequality.

By setting $w^{(k)} = 0$ in (11), it follows that $q^{\min} = \arg\min_{q \in \mathcal{Q}_N} \{ d_2(q) \}$ satisfies

\[
q_i^{\min} = \frac{\beta^{-1} p_i}{1 + e^{\alpha/\mu}}, \quad i = 1, \ldots, N,
\]
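where $\alpha$ is chosen to ensure that $\mathbf{1}^T q^{\min} = 1$. Therefore, it follows that $q^{\min} = p$, and

\[
\min_{q \in \mathcal{Q}_N} d_2(q) = \sum_{i=1}^N p_i \log p_i + \sum_{i=1}^N p_i (\beta^{-1} - 1) \left( \log p_i + \log(\beta^{-1} - 1) \right).
\]

Since $d_2(q)$ is a convex function, $\max_{q \in \mathcal{Q}_N} d_2(q)$ occurs at extreme points of the polytope $\mathcal{Q}_N$. The extreme points of the polytope $\mathcal{Q}_N$ are of the form

\[
q_i = \begin{cases}
\beta^{-1} p_i, & i \in \{ \pi(1), \ldots, \pi(k) \}, \\
0, & i \in \{ \pi(k + 2), \ldots, \pi(N) \},
\end{cases}
\]

where $\pi$ is a permutation of the set $\{1, \ldots, N\}$ and $q_{\pi(k+1)} \in [0, \beta^{-1} p_{\pi(k+1)}]$ is chosen to ensure that $\sum_{i=1}^N q_i = 1$. The value

\begin{align*}
d_2(q) &= \sum_{i : i \ne k+1} \beta^{-1} p_{\pi(i)} \ln(\beta^{-1} p_{\pi(i)}) + q_{\pi(k+1)} \ln(q_{\pi(k+1)}) + (\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \ln(\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \\
&\le \sum_{i=1}^N \beta^{-1} p_{\pi(i)} \ln(\beta^{-1} p_{\pi(i)}),
\end{align*}

where the last inequality follows from

\[
q_{\pi(k+1)} \ln(q_{\pi(k+1)}) + (\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \ln(\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \le \beta^{-1} p_{\pi(k+1)} \ln(\beta^{-1} p_{\pi(k+1)}).
\]

Thus,

\begin{align*}
D_2 &= \max_{q \in \mathcal{Q}_N} d_2(q) - \min_{q \in \mathcal{Q}_N} d_2(q) \\
&\le \sum_{i=1}^N \beta^{-1} p_i \log(\beta^{-1} p_i) - \sum_{i=1}^N p_i \log p_i - \sum_{i=1}^N p_i (\beta^{-1} - 1) \left( \log p_i + \log(\beta^{-1} - 1) \right) \\
&= -\beta^{-1} \left( \beta \log \beta + (1 - \beta) \log(1 - \beta) \right). \tag{22}
\end{align*}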
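The two constants derived in this appendix can be checked numerically. The Python sketch below evaluates $d_2$ at randomly generated extreme points of $\mathcal{Q}_N$ and at $q = p$, compares the spread with the bound (22), and tests the strong convexity inequality at $q = p$ for a random direction $h$. The helper names, the Dirichlet-distributed scenario probabilities, and the sample sizes are assumptions made for illustration only.

```python
import numpy as np

def xlogx(x):
    """x * log(x) with the convention 0 * log(0) = 0."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    safe = np.where(x > 0.0, x, 1.0)
    return x * np.log(safe)

def d2(q, p, beta):
    """Smoothing function (12)."""
    cap = p / beta
    return float(np.sum(xlogx(q) + xlogx(cap - q)))

def d2_bound(beta):
    """Right-hand side of (22)."""
    return -(beta * np.log(beta) + (1.0 - beta) * np.log(1.0 - beta)) / beta

def random_vertex(p, beta, rng):
    """An extreme point of Q_N: fill coordinates to their caps in random order."""
    q = np.zeros_like(p)
    remaining = 1.0
    for i in rng.permutation(len(p)):
        q[i] = min(p[i] / beta, remaining)
        remaining -= q[i]
        if remaining <= 0.0:
            break
    return q

rng = np.random.default_rng(2)
N, beta = 50, 0.1
p = rng.dirichlet(np.ones(N))             # assumed scenario probabilities

# Spread of d2 between extreme points and the minimizer q = p, versus (22).
gaps = [d2(random_vertex(p, beta, rng), p, beta) - d2(p, p, beta)
        for _ in range(200)]
print(max(gaps), d2_bound(beta))

# Strong convexity at q = p: h^T Hessian h >= ||h||_1^2 / (1 - beta).
h = rng.standard_normal(N)
quad = np.sum(h**2 / p) + np.sum(h**2 / (p / beta - p))
print(quad, np.sum(np.abs(h))**2 / (1.0 - beta))
```

With equally likely scenarios and $1/\beta$ equal to the number of scenarios, each cap $\beta^{-1} p_i$ equals one, the extreme points are unit vectors, and the bound (22) holds with equality.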