Agent-based Micro-Storage Management for the Smart Grid

Perukrishnen Vytelingum, Thomas D. Voice, Sarvapali D. Ramchurn, Alex Rogers, and Nicholas R. Jennings
Intelligence, Agents, Multimedia Group, School of Electronics and Computer Science, University of Southampton, UK
{pv,tdv,sdr,acr,nrj}@ecs.soton.ac.uk

ABSTRACT
The use of energy storage devices in homes has been advocated as one of the main ways of saving energy and reducing the reliance on fossil fuels in the future Smart Grid. However, if micro-storage devices are all charged at the same time using power from the electricity grid, it means a higher demand and, hence, requires more generation capacity, results in more carbon emissions, and, in the worst case, breaks down the system due to over-demand. To alleviate such issues, in this paper, we present a novel agent-based micro-storage management technique that allows all (individually-owned) storage devices in the system to converge to profitable, efficient behaviour. Specifically, we provide a general framework within which to analyse the Nash equilibrium of an electricity grid and devise new agent-based storage learning strategies that adapt to market conditions. Taken altogether, our solution shows that, specifically, in the UK electricity market, it is possible to achieve savings of up to 13% on average for a consumer on his electricity bill with a storage device of 4 kWh. Moreover, we show that there exists an equilibrium where only 38% of UK households would own storage devices and where social welfare would be also maximised (with an overall annual savings of nearly GBP 1.5B at that equilibrium).

Categories and Subject Descriptors
I.2.11 [Distributed Artificial Intelligence]: Multi-agent Systems

General Terms
Algorithms, Management, Economics

Keywords
Agent-based simulation, Smart Grid, Energy, Micro-storage

Cite as: Agent-based Micro-Storage Management for the Smart Grid, Vytelingum et al., Proc. of 9th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2010), van der Hoek, Kaminka, Lespérance, Luck and Sen (eds.), May 10-14, 2010, Toronto, Canada, pp. 39-46. Copyright © 2010, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

1. INTRODUCTION
Energy storage is one of the key underpinnings of the vision of the Smart Grid which aims to support sustainable energy provisioning across the world [2, 4, 8]. Given this, research has been focused on designing new efficient low cost storage devices that would be able to efficiently store electricity for long periods of time and allow a sufficient number of charging/discharging cycles without significant degradation in performance [8] (see footnote 1). By using such devices, it is expected that energy usage can be improved in a number of ways. If storage devices can be used to supply home devices at peak electricity consumption times (typically in the morning and evening), then it should be possible to lower peak demand such that fewer carbon-intensive and expensive peaking plant generators are required, thus reducing both energy costs and carbon emissions. Furthermore, storage devices can be used to compensate for the variability of typical renewable electricity generation (e.g., wind, wave, solar), thus making the integration of such generation facilities into the existing grid more viable in practice [8]. Such energy storage may even take the form of electric vehicles (EVs) or plug-in hybrid electric vehicles (PHEVs); this vision is sometimes referred to as vehicle to grid (V2G).

There are, however, a number of potential challenges in this setting. For example, consider individual homes (among the 26M UK households) storing electricity according to their own needs and all deciding to charge their batteries at the same time (e.g. incentivised by cheaper prices). Now, not only would this cause a higher peak in demand in the electricity market, and hence higher carbon emissions and more costly electricity, but, in the worst case, it could cause blackouts and infrastructure damage if this demand were to exceed network capacity.
Moreover, if individuals were only charging their batteries according to the amount they use, they may be paying for electricity at a higher price than if they did not have the device when the cost of the battery is added to the mix. Finally, if most homes in the system start using storage and manage to shave off peak demand, electricity prices may become lower than the price of storing electricity.

Footnote 1: See batteries recently developed by Ceramatec: http://www.ceramatec.com

To address such issues, the multi-agent systems paradigm has been advocated as both a solution and a framework to analyse the properties of systems in which multiple self-interested parties interact [3, 6, 9]. In particular, with the advent of smart meters that can monitor and control devices in the home, it is now possible to envisage that smart software agents could be installed on these devices. These agents would then be able to optimise the usage and storage profile of the house using information from various sources (e.g., weather data to predict energy and hence heating costs or price plan data from suppliers). Now, most of the existing approaches to applying intelligent agents typically study how individual homes could optimise the way they store energy or how storage devices could coordinate with renewable energy generation facilities to maximise energy used from such sources (see Section 2 for more details). However, their approach ignores the individual preferences of each home and does not exactly model the real impact of agents learning to adapt to the constraints that they themselves impose on the system. Thus, an approach that focuses on the system dynamics, where all agents in the system are given the freedom to buy electricity whenever they see fit, would address these issues.

In this paper we address this shortcoming and provide a game-theoretic framework for modelling storage devices in large-scale systems where each storage device is owned by a self-interested agent that aims to maximise its monetary profits. Using this framework, under certain assumptions, we are able to predict the behaviour of the system given that each agent behaves rationally (i.e. always adopts a storage profile that minimises its costs) and only reacts to a price signal. Building on this, we then go on to devise intelligent agent-based storage strategies that can learn the best storage profile given the market prices that keep changing as a result of consumers using storage. In more detail, this work advances the state of the art in the following ways:

1. We provide a novel game-theoretic framework to study storage strategies that agents might adopt. Given the normal electricity usage profile of all users in the system, we are then able to compute the Nash equilibria which describe when agents are going to charge their batteries, use their stored electricity, or use electricity from the grid.
2. We provide new agent-based micro-storage strategies that are able to learn the best storage profile to adopt when agents in the system may not have exactly the same storage capacities or efficiencies. Our strategies are shown to converge to the same Nash equilibria as those predicted by our framework.
3. Given our agent-based learning strategies, we are able to show how agents could learn to buy the most profitable storage capacity and, using evolutionary game-theoretic analysis, we are able to predict the portion of the population that would actually acquire storage capacity to maximise their savings.

In short, this is the first attempt at modelling, predicting equilibria, and building intelligent strategies for the problem of electricity storage on a large scale.

The rest of this paper is structured as follows. In Section 2 we discuss related work in the area of electricity storage and electricity markets. Section 3 discusses the key features of the electricity markets and lays down the general assumptions upon which we build our framework. Section 4 presents our game-theoretic framework and shows how the Nash equilibria of the system can be computed. Building on this, Section 5 describes the dynamics of a market where agents are given the ability to learn their best storage profile, and Section 6 empirically studies this system through simulations. Finally, Section 7 concludes.

2. BACKGROUND
Very little work exists on the application of agent-based techniques to storage management in electricity grids. Typically, electricity storage has mainly been a concern of the energy suppliers using large chemical batteries to store energy from intermittent renewable energy sources (e.g., wind or solar) [7]. The effect of such large scale storage on commodity markets in general is a mature area of study (see [5, 11] for some state of the art energy/fuel market specific results). However, the electricity market is unusual in that it has large daily cycles of demand (see Figure 1 for the average UK daily consumer load profile), which make electricity storage potentially profitable even on an individual household scale. With the advent of new types of batteries that charge up to 20 kWh of energy a day (i.e., sufficient to power all the devices in a house for a day), it is now possible to envisage that micro-storage devices will be widely used.
Indeed, the energy storage requirements of a typical home are well aligned with the storage required in a feasible EV or PHEV. Moreover, with the advent of smart meters, it will be possible to manage the storage and usage of electricity within a single home using software agents residing on such meters. Thus, decentralised autonomous agent-based approaches are strong candidates for managing energy storage in future electricity networks. In this context, we note the seminal work of Daryanian et al. [1] which illustrated how individual smart meters could optimise, through iterative algorithms, the storage profile of a house. Their approach was, however, limited to considering very basic battery properties and did not consider wider issues for the grid. In the same vein, more recently, [6] provide algorithms for agents to optimise storage using CHPs (Combined Heat and Power) but ignore how populations of such agents would impact on the grid. On the other hand, [3, 9] have studied the application of storage devices on a wider scale. They show that using demand-side management (i.e., directly controlling the storage profile of a number of homes) coupled with storage can increase savings made in the system. If not properly managed, storage systems can be unprofitable [8], so in the setting we consider, it is important to know whether small scale storage can be individually beneficial, and what strategies maximise this profit. It is also important to understand the system-wide effects of such strategies, in particular, quantifying the limits on the usefulness of small scale storage from a social welfare point of view. These are the key open questions that are addressed by this paper.

Figure 1: Representative Load Profile in UK (the Domestic Unconstrained profile).

3. MODEL DESCRIPTION
This section describes the models used in this paper. Our analysis considers fixed time intervals consisting of single days, each separated into T = 48 settlement periods of half an hour. Each day, agents consume electricity which is bought from suppliers through an electricity market. This market operates for each time interval in the day, so that variations in demand over time can be met. We proceed with a description of our models of behaviour for the agents, followed by a description of the market and a definition of the relevant social welfare metrics we consider.

3.1 Agents
We consider a set of consumers $A$ which we define as selfish agents that always aim to minimise their individual costs. Each agent $a \in A$ has a load profile $l_i^a$, $i \in I = \{1, \dots, T\}$, such that $l_i^a$ is the amount of electricity required by agent $a$ for time interval $i$ during each day. The aggregate load profile of the system is given by $\sum_{a \in A} l_i^a = d_i$. We consider this load profile to be fixed over different days (although there are seasonal variations in demand in practice, there is a high degree of consistency from day to day).

Each agent $a \in A$ may also have some storage available to it, with capacity $e^a$, efficiency $\alpha^a$ and running costs $c^a$. Here, the cost $c^a$ may represent ongoing storage costs (for example, some battery devices expend energy through heating while they are in use) or may incorporate a fixed capital investment by $a$, to be paid off over time. The storage efficiency $\alpha^a$ and cost $c^a$ are modelled to be such that if $q$ amount of energy is stored, then $\alpha^a q$ may be discharged and the storage cost is $c^a q$.

In order to minimise costs, $a$ can attempt to strategise over its storage profile, $b_i^a$, $i \in I$, where $-b_-^a \le b_i^a \le b_+^a$, where $b_-^a$ is the discharging capacity of the storage, and $b_+^a$ the charging capacity. For all $i \in I$ we have $b_i^a = b_i^{a+} - b_i^{a-}$, where $b_i^{a+}$ is the charging profile and $b_i^{a-}$ the discharging profile. Since we are attempting to model the effect of the widespread adoption of small scale household storage devices, we can assume that $l_i^a$, $b_+^a$, and $b_-^a$ are small in comparison to $d_i$. We denote the total storage capacity as $e = \sum_{a \in A} e^a$, and the net storage profile as $b_i$, $i \in I$, where $b_i = \sum_{a \in A} b_i^a$.
The net charging and discharging capacities are defined as $b_+ = \sum_{a \in A} b_+^a$ and $b_- = \sum_{a \in A} b_-^a$. To supply its load profile and energy charging needs each agent must purchase electricity from the available market. The next subsection contains our market model.

3.2 The Electricity Market
We consider a macro-model of the electricity market; a black box that abstracts the market mechanism and trading, as well as transmission power flow security involved in an actual electricity market mechanism. Given the characteristics of the market, our model gives us the market prices based on the economics of demand and supply (see Figure 2). The supply curve in this case is generated from UK National Grid prices for the period of August and September 2009.

The behaviour of electricity suppliers is specified by the supply curve $s_i(\cdot)$ for every time point $i \in I$. The supply curve $s_i(\cdot)$ indicates the cost of electricity that generators experience, or minimum price they are willing to sell at, when producing a certain quantity. For our model we assume that $s_i(\cdot)$ is continuous and strictly increasing. As defined above, each time interval $i \in I$ has an inelastic demand (see footnote 2) quantity $d_i$, representing the total amount of electricity consumed by agents, and a net storage effect, $b_i$, representing the aggregated effect of storage. Thus, the total amount of electricity bought from suppliers at time interval $i \in I$ is $q_i = d_i + b_i$.

Footnote 2: The market demand is modelled as inelastic to reflect the currently inelastic demand of individual consumers.

Under our model, for each time interval $i \in I$, the market sets a price for electricity $p_i = s_i(q_i)$. Each agent pays $p_i (l_i^a + b_i^a)$ and the total cost for all agents is $p_i q_i$.

Figure 2: Supply is modelled from actual UK market prices and demand is assumed to be inelastic.

3.3 Social Welfare Metrics
A key aim of this paper is to study the effect of storage on the system and whether the global social welfare of the system improves as agents adopt storage.
In more detail, we measure social welfare by considering the following standard metrics of an electricity market:

Diversity factor (DF) is the ratio of the sum of the individual maximum demands of various consumers of the system to the maximum demand of the complete system. The diversity factor is usually greater than 1.

The Load Factor (LF) is the average power divided by peak power, over a period of time and, ideally, is 1. A low LF suggests peak demands in the system.

The Grid Carbon Content intensity is the carbon produced to generate the required electricity. It is expressed as g of CO2 per kWh and is ideally as low as possible. The carbon emission from electricity generation in the UK is given in Figure 3 and is calculated from the generation supply mix from the UK National Grid for the period of August and September 2009.

Figure 3: UK Carbon emission from Electricity Generation.

4. A GAME-THEORETIC ANALYSIS
In this section we analyse the models given above from a game-theoretic point of view. For tractability we assume that agents have homogeneous efficiency and running costs, that is $\alpha^a = \alpha$ and $c^a = c$ for all $a \in A$ for some $\alpha$ and $c$. Formally, the game we consider has players which coincide with our agents, $a \in A$, and the game describes the outcome of a single 24 hour interval. The pay-off an agent receives is equal to minus the total costs that agent experiences when purchasing electricity that day, $\sum_{i \in I} p_i (l_i^a + b_i^a)$. The strategy space available to each agent is the set of feasible storage profiles, $b_i^a$, $i \in I$, where $-b_-^a \le b_i^a \le b_+^a$.

Here we also make two further restrictions on feasible storage profiles. Firstly, for all $a \in A$ we require that the amount of energy discharged is equal to the amount of energy charged multiplied by the efficiency, that is $\sum_{i \in I} b_i^{a-} = \alpha \sum_{i \in I} b_i^{a+}$. Secondly, for all $a \in A$, we require that $\sum_{i \in I} b_i^{a+} \le e^a$, that is the total amount charged is less than the storage capacity. This is a stricter constraint than simply requiring that the capacity is never exceeded at any time. However, it is a reasonable gauge of storage capacity limitations for a day-long time period, where demand typically goes through a single cycle of low to high to low, implying that storage devices would go through a corresponding cycle of charging to discharging to charging. We now proceed to characterise the deterministic Nash equilibria for this game.

4.1 Nash Equilibria as Global Optimisers
Suppose agents have chosen some strategy profiles, and let us consider the effect of a feasible change in strategy for one agent. That is, some $a \in A$ considers going from $b_i^a$, $i \in I$, to $b_i^a + \Delta b_i$, $i \in I$, for some values $\Delta b_i$. The change in payoff for agent $a$ would be:

$$\sum_{i \in I} \Big[ \big( s_i(q_i) - s_i(q_i + \Delta b_i) \big) (l_i^a + b_i^a) - s_i(q_i + \Delta b_i)\, \Delta b_i \Big].$$

As noted in the previous section, since we are examining widespread micro-storage devices, we can assume that for all $i \in I$ and $a \in A$, $l_i^a$, $b_+^a$, and $b_-^a$ are small in comparison to $d_i$ and $q_i$. This means that the $(l_i^a + b_i^a)$ term in the above will be small, and $s_i(q_i + \Delta b_i)$ will be close to $s_i(q_i)$. Using these approximations, the deterministic Nash equilibria for this game correspond to strategy profiles which minimise $\sum_{i \in I} \int_0^{q_i} s_i(x)\, dx$. These approximations reduce our search for the Nash equilibrium of a complex multi-player game to a relatively straightforward global optimisation problem: that of minimising global generators' costs. We now proceed to find solutions to this optimisation problem.

4.2 Characterisation of Nash Equilibria
We seek to find an aggregate storage profile $b_i$, $i \in I$, with $-b_- \le b_i \le b_+$, $\sum_{i \in I} (b_i)^+ \le e$, and $\alpha \sum_{i \in I} (b_i)^+ = \sum_{i \in I} (b_i)^-$, which minimises $\sum_{i \in I} \int_0^{d_i + b_i} s_i(x)\, dx$. Here, and throughout the paper, we use the notation $(\cdot)^+$ to denote positive part, i.e. $y = (x)^+$ means $y = x$ if $x > 0$, $y = 0$ otherwise. Likewise we use $(x)^-$ to denote $(-x)^+$. We begin with a definition.

Definition 1. For a storage system as described, we define the discharging price point, $p_d$, to be the maximum of the solution to $\sum_{i \in I} q_i^d(p_d) = \alpha \sum_{i \in I} q_i^c(\alpha p_d - c)$, and the solution to $\sum_{i \in I} q_i^d(p_d) = \alpha e$, if one exists.
Here we define, for each interval $i$, $q_i^d(p) = \min\big(b_-, (d_i - s_i^{-1}(p))^+\big)$, and $q_i^c(p) = \min\big(b_+, (d_i - s_i^{-1}(p))^-\big)$. We define the charging price point, $p_c$, to be the minimum of $\alpha p_d - c$ and the solution to $\sum_{i \in I} q_i^c(p_c) = e$ if one exists.

This definition is well defined by Lemma 1 in the Appendix. We can now state the main result of this analysis.

Theorem 1. For a storage system as described, the set of Nash equilibria for the system is precisely the set of agent strategies where, for all $i \in I$, $b_i = -q_i^d(p_d) + q_i^c(p_c)$.

Proof. See the Appendix.

4.3 Idealised Scenarios
For special idealised scenarios, we have the following two corollaries.

Corollary 1. If $b_+$ and $b_-$ are sufficiently large, then for all $i$, $p_c \le s_i(q_i) \le p_d$. Furthermore, if for any $i \in I$, $p_c < p_i < p_d$, then $b_i = 0$.

Proof. If, for all $a \in A$ we let $b_+^a$ and $b_-^a$ be equal to $e^a$, then this does not break our smallness assumption, and, furthermore, for all $i \in I$, we will have $q_i^d(p_d) < b_-$ and $q_i^c(p_c) < b_+$. Thus, for all $i$, if $b_i$ is non zero then either $b_i = -q_i^d(p_d) = s_i^{-1}(p_d) - d_i < 0$, in which case $q_i = s_i^{-1}(p_d)$ and so $p_i = p_d$, or else $b_i = q_i^c(p_c) = s_i^{-1}(p_c) - d_i > 0$, in which case $q_i = s_i^{-1}(p_c)$ and so $p_i = p_c$. If $b_i = 0$ then $s_i^{-1}(p_c) \le d_i \le s_i^{-1}(p_d)$ and so $p_c \le p_i \le p_d$ as required.

So, if the charge and discharge rates are sufficiently high, then we could expect prices to always lie within $p_c$ and $p_d$.

Corollary 2. If $b_+$ and $b_-$ are sufficiently large, capacity $e$ is sufficiently high, $c = 0$ and $\alpha = 1$, then for all $i$, $p_c = s_i(q_i) = p_d$.

Proof. This follows directly from the previous corollary.

Hence, in an idealised scenario, with perfectly efficient, cost free, and high capacity storage, we would expect the market prices over time to flatten to a single value. This is because perfect storage capability would allow agents to transport energy from any time interval to any other time interval free of charge. Thus, different suppliers in different time intervals would have to compete with each other, resulting in convergence to a single market price.

4.4 Rationality Assumption
Theorem 1 gives the aggregate storage behaviour when our game is in a deterministic Nash equilibrium.
We can use this result to specify limits of the social welfare benefit that can result from adopting small-scale storage. If the actions of such selfish agents are to result in stable aggregate behaviour, then we can do no better than the outcome described above. However, in using game theory, we have made some implicit assumptions, specifically that agents are rational and have complete information about the market throughout the time period. In reality, information available to those owning storage devices will not be perfect. Furthermore, even with perfect information, it might not be apparent to an automated agent which strategies are preferable. Instead, the agents themselves must adapt over time, to become aware of the repeating daily patterns of supply and demand, and learn which storage strategies are preferable. This is a difficult problem and it is not guaranteed that selfish learning behaviour can converge. For example, if agents over-react to perceived opportunities in the market, cycles of price fluctuations could develop.

In the next section, we provide a novel adaptive storage strategy for agents to maximise their savings. Under this scheme, agents change their storage profiles each day to be closer to their perceived optimal strategy. In Section 6 we show that provided the adaptation is not too fast, it results in the aggregate convergence predicted by Theorem 1.
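The idealised scenario of Corollary 2 in Section 4.3 can be illustrated numerically. The Python sketch below is only an illustration under stated assumptions: the linear supply curve, its coefficients and the six-period demand profile are hypothetical (they are not the UK data used in the paper), and every interval is assumed to share the same supply curve. With $\alpha = 1$, $c = 0$ and no capacity or rate limits, minimising total generation cost then amounts to spreading demand evenly over the day, which flattens prices and pushes the load factor to 1.

```python
from statistics import mean

def supply_price(quantity, slope=0.5, intercept=10.0):
    """Hypothetical linear supply curve s(q); strictly increasing, as the model requires."""
    return intercept + slope * quantity

def generation_cost(quantity, slope=0.5, intercept=10.0):
    """Integral of the hypothetical supply curve from 0 to quantity."""
    return intercept * quantity + 0.5 * slope * quantity ** 2

def ideal_storage_profile(demand):
    """Net storage profile b_i for alpha = 1, c = 0 and unlimited capacity and rates,
    assuming one common supply curve: b_i = mean(d) - d_i, so charging equals discharging."""
    flat = mean(demand)
    return [flat - d for d in demand]

def load_factor(load):
    """Load Factor: average load divided by peak load."""
    return mean(load) / max(load)

demand = [20.0, 15.0, 30.0, 45.0, 40.0, 30.0]   # toy demand profile, not UK data
storage = ideal_storage_profile(demand)
net_load = [d + b for d, b in zip(demand, storage)]          # q_i = d_i + b_i
prices = [supply_price(q) for q in net_load]                 # p_i = s(q_i)
cost_before = sum(generation_cost(d) for d in demand)
cost_after = sum(generation_cost(q) for q in net_load)

print("prices with ideal storage:", prices)
print("load factor before/after:", load_factor(demand), load_factor(net_load))
print("generation cost before/after:", cost_before, cost_after)
```

With realistic storage (limited capacity, $\alpha < 1$ or $c > 0$) the prices would not flatten completely; they would instead be confined between the charging and discharging price points of Corollary 1.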

5. AN ADAPTIVE STORAGE STRATEGY
As discussed above, the next step of our work is to design a novel adaptive storage strategy that an agent can use to decide on when to store energy and when to use the stored energy. Now, because market prices are continuously changing as a result of changing demand (due to consumers using storage devices), we design a learning mechanism that adapts to these changing market prices.

In more detail, our strategy is based on a day-ahead best-response storage. Because the market prices are unknown a priori, we can only calculate the storage profile on a day-ahead basis, as a best-response to the predicted market prices. To mitigate prediction errors, the consumer gradually adapts her storage towards the best-response storage. In this section, we first describe how we calculate the day-ahead best-response storage profile and, second, we describe our learning mechanism, that is, how the consumer adapts her storage.

5.1 The Day-Ahead Best-Response Storage
The objective of an agent $a$ is to minimise its costs by storing energy when prices are low and using that energy when prices are high. Now, because market prices are unknown until the aggregated load of all consumers, $\sum_{a \in A} l_i^a$, is known, the agent needs to decide on its storage profiles based on the predicted market prices of the following day. Note that in our work, we assume that market prices do not move significantly over days and use a weighted moving average to predict future market prices (see footnote 3).
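The text does not specify how the weighted moving average is weighted, so the following Python sketch should be read as one possible choice rather than the authors' predictor. It assumes geometrically decaying weights (the hypothetical `decay` parameter), so that recent days count more than older ones, and predicts one price per settlement period.

```python
def predict_prices(history, decay=0.5):
    """Predict tomorrow's price for each settlement period as a weighted moving
    average of past daily price profiles (oldest first, all of equal length).
    The geometric weighting is an assumption made for this sketch."""
    if not history:
        raise ValueError("need at least one day of prices")
    periods = len(history[0])
    # The newest day has age 0 and weight 1; older days are discounted by decay**age.
    weights = [decay ** age for age in range(len(history) - 1, -1, -1)]
    total = sum(weights)
    return [
        sum(w * day[i] for w, day in zip(weights, history)) / total
        for i in range(periods)
    ]

# Toy two-period example with made-up prices.
past_days = [[10.0, 20.0], [20.0, 40.0]]
print(predict_prices(past_days))
```

Because the prediction is a convex combination of past prices, a price profile that stays constant over days is predicted exactly, which matches the assumption that prices do not move significantly from day to day.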
Footnote 3: As we will demonstrate later on, this is not very sensitive in our work as the price movements are generally small. However, a number of more sophisticated prediction algorithms, such as Gaussian Processes, could be used instead.

We compute the storage profile, $b_i^a = b_i^{a+} - b_i^{a-}$, at every time-slot $i$ during the day as the solution to an optimisation problem where we minimise the following cost function (see footnote 4):

$$\arg\min_{b^a} \Big( \sum_{i \in I} p_i \,(b_i^{a+} - b_i^{a-} + l_i^a) + c^a e^a \Big) \qquad (1)$$

subject to the following constraints:

Constraint 1: storage efficiency
$$\sum_{i \in I} b_i^{a-} = \alpha^a \sum_{i \in I} b_i^{a+}$$

Constraint 2: within charging and discharging capacity
$$b_i^{a-} \le b_-^a \quad \text{and} \quad b_i^{a+} \le b_+^a, \quad \forall i \in I$$

Constraint 3: energy that can be stored or used at a time-slot
$$b_i^{a-} \le \alpha^a \Big( e_0^a + \sum_{j=1}^{i-1} (b_j^{a+} - b_j^{a-}) \Big), \quad \forall i \in I$$
$$b_i^{a+} \le e^a - \Big( e_0^a + \sum_{j=1}^{i-1} (b_j^{a+} - b_j^{a-}) \Big), \quad \forall i \in I$$

Constraint 4: no re-selling allowed
$$l_i^a - b_i^{a-} \ge 0, \quad \forall i \in I$$

Footnote 4: We used IBM ILOG CPLEX 9.1 to implement and solve the optimisation problem.

The last constraint can be removed in a system where consumers are allowed to sell power to the grid. Note also that $e^a$ can be fixed or left unconstrained. In line with our model (see Section 3), $c^a$ is the relatively small discounted running cost of using storage, $e^a$ is the storage capacity, $\alpha^a$ is the efficiency of the agent's storage, $b_+^a$ is its maximum charging rate and $b_-^a$ its maximum discharging rate, and $e_0^a$ is the storage at the beginning of the day, which equals the storage at the end of the day (i.e., charging at the end of a day for the next day).

Because market prices move over trading days, the agent needs to continuously adapt its storage profile to reflect these changes. Now, because of the relatively high cost of storage, it is more sensible and realistic for the agent to gradually change its capacity by analysing the trend of market prices. To this end, we develop a novel learning mechanism to adapt storage profiles in an electricity market.

5.2 Learning in the Market
Our learning mechanism is based on a two-pass approach. Initially, the agent computes the maximum storage capacity, $e_U^a$, it would require to minimise its costs. $e_U^a$ is the cost-minimising capacity obtained by optimising over $e^a$ (see Equation 1).
Now, $e_U^a$ constitutes a desired capacity towards which the agent learns its storage capacity, i.e. it adapts its storage capacity progressively to follow the changing market trends. The storage capacity of the agent is defined by Equation 2 as $e^a(t)$ that follows the desired unconstrained storage capacity $e_U^a$ such that:

$$e^a(t+1) = e^a(t) + \beta_1 \big( e_U^a - e^a(t) \big) \qquad (2)$$

where $e^a(0) = 0$ by default and $\beta_1$ is the learning rate (see footnote 5) of the storage capacity of agent $a$. Given its storage capacity, the agent then computes its optimal storage profile for the following day by fixing $e^a$ at $e^a(t+1)$ in Equation 1. On the second pass, given its current storage profile, the agent adapts its storage profile (to mitigate any risk of having poorly predicted market prices) as follows:

$$b_i^a(t+1) = b_i^a(t) + \beta_2 \big( b_i^{a*} - b_i^a(t) \big), \quad \forall i \in I \qquad (3)$$

where $b_i^{a*}$ is the desired storage profile, given as the optimal storage profile subject to a fixed storage capacity of $e^a(t+1)$, and $\beta_2$ is the learning rate of the storage profile. Note that we analyse in more detail the sensitivity of the learning parameters as part of the empirical study of the system in the next section.

6. AN EMPIRICAL ANALYSIS OF THE UK MARKET
In this section, we empirically analyse the effect of storage on the UK market. To this end, given our macro-model of the UK market, we set up individual consumers with typical UK load profiles (see footnote 6), different learning rates and different storage types. Learning rates, as are charging and discharging capacities, are uniformly distributed (see footnote 7), to represent consumers with different learning attitudes and storage types, around means that are based on current technologies (see Section 2). Now, because our macro-model is general enough, our framework can be applied to any electricity market around the world, and our results and insights broadly generalise.
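The two learning passes of Section 5.2 (Equations 2 and 3) are simple enough to sketch directly. In the Python illustration below, the desired capacity and desired profile are hypothetical values held fixed over time purely to show the update rule; in the strategy described above they would be recomputed every day from Equation 1 and the predicted prices. The learning rate of 0.05 is the mean value used in the experiments.

```python
def adapt_capacity(current, desired, beta1=0.05):
    """Equation 2: move the capacity a fraction beta1 towards the desired capacity."""
    return current + beta1 * (desired - current)

def adapt_profile(current, desired, beta2=0.05):
    """Equation 3: apply the same kind of update to every settlement period."""
    return [b + beta2 * (target - b) for b, target in zip(current, desired)]

capacity = 0.0                               # e^a(0) = 0 by default
desired_capacity = 4.0                       # hypothetical e_U^a in kWh, held fixed here
profile = [0.0, 0.0, 0.0, 0.0]               # toy four-period storage profile
desired_profile = [1.0, 0.5, -0.8, -0.7]     # hypothetical best-response profile

capacity_trace = []
for day in range(100):
    capacity = adapt_capacity(capacity, desired_capacity)
    profile = adapt_profile(profile, desired_profile)
    capacity_trace.append(capacity)

print("capacity after day 1:", capacity_trace[0])
print("capacity after day 100:", capacity)
print("profile after day 100:", profile)
```

With a fixed target the gap to the desired value shrinks by a factor of $(1 - \beta)$ each day, which is why a small learning rate gives slow but smooth adaptation.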
Given this setup, we first provide a game-theoretic solution, identifying the Nash equilibrium of the system and, second, we provide a dynamic analysis as to whether or not such an equilibrium can be reached if a proportion of the population were to acquire storage devices as well as use our adaptive storage strategy with the aim of maximising their individual savings. Finally, we analyse how the social welfare of the system evolves with a large number of agents adapting their storage and whether the social welfare of the market improves while consumers are able to make a saving.

Footnote 5: As we will empirically demonstrate further on, the choice of the learning rates determines the evolutionary stability of the system and has to be reasonably small.

Footnote 6: We do so by adding random noise to the average UK load profile: a Poisson distribution over demand timings and a uniform distribution of noise over demanded quantities.

Footnote 7: In all experiments except for when we analyse the effect of learning rates, we use a mean value of 0.05.

6.1 A Game-Theoretic Solution
Given the game-theoretic framework outlined in Section 4, we first calculate the Nash equilibrium given a typical domestic average unconstrained profile (see Figure 4). It is clear that equilibrium behaviour for a consumer is to charge at off-peak hours (at night) and use the stored energy during peak hours (after working hours) when the consumer's load is highest.

Figure 4: Nash equilibrium (with a storage capacity of 3.55 kWh).

6.2 Evaluation of the Adaptive Storage
Given the adaptive storage strategy we designed in the paper, we now analyse how the system evolves as agents are changing their behaviours within a realistic setting and whether the system converges to the Nash equilibrium. As we can see from Figures 4 and 5, our average storage profile indeed converges to the Nash equilibrium of our game-theoretic analysis. Figure 6 shows how the average storage profile evolved towards an equilibrium as market prices were flattened in the system (see Figure 7). Given these results (i.e., that we converge to the Nash equilibrium and hence the optimal solution, see Section 4), we can claim our adaptive strategy sets the benchmark for any learning strategy in this system.

Figure 5: Convergence of the average strategy profile to the Nash equilibrium.

Figure 6: Average Storage Profile converging to Nash Equilibrium.

Figure 7: Changing Market Prices (market prices eventually flatten).

Now, it is also important to analyse how the social welfare of the system evolves as the system is evolving to the Nash equilibrium to ensure that agents adopting storage does not break the system (i.e. social welfare does not decrease).
To this end, we analyse the market diversity factor, load factor and carbon content reduction (see Section 3). For the system efficiency given in Figure 8, we considered a population with around 38% with storage capability (our choice of 38% will become clearer further on). As can be clearly observed, the system efficiency improves and gradually converges as agents adapt their storage profiles and market prices are flattened. In more detail, the average maximum storage capacity required converges to around 3.55 kWh after several trading days while the market load factor converges to around 0.94 where the load in the market is nearly flattened. Furthermore, the diversity factor increases suggesting that, because of storage, consumers now have less correlated demand requirements from the electricity market (which generally reduces peaks in a system).

Figure 8: Social Welfare of System.

A significant benefit of storage at a micro-level is that if a sufficient proportion of the population does adopt storage, the carbon intensity of the electricity market would decrease appreciably as peak demands are reduced. Indeed, in Figure 9, we show how the carbon content is reduced by up to 7% for different proportions of population adopting storage (by extrapolating our results to the 26M UK households). Furthermore, from Figure 10, it is clear that there is a financial incentive for consumers to adopt storage, with a maximum saving of around 13% (based on the current system with no storage). As expected, because storage flattens the market prices, other consumers, even without storage, also indirectly benefit. Now, as storage becomes more and more popular (as consumers become aware that they can save on their electricity bills), we observe a decrease in their savings, reaching a point where a consumer can save more by not having storage than having storage (see average savings in Figure 10). In the next subsection, we analyse in more detail this social trend.

Figure 9: Social welfare for different proportion of the population using storage.

Figure 10: Savings with and without storage. Replicator dynamics (arrows on the x-axis) converge to the Nash Equilibrium at 0.382 with a saving of 8.5% for using and not using storage.

6.3 When to Adopt Storage
Here, we formulate the problem as a game where agents have a mixed strategy $x_r \in (0, 1)$, i.e. a probability that they have storage capability, and are only motivated by financial gains. By analysing how $x_r$ evolves as the payoffs change for different $x_r$, we want to study how the proportion of the population using storage evolves. To this end, we use the classical evolutionary game theory (EGT) [10] based on the following equations:

$$\dot{x}_r = \big[ u(e^r, x) - u(x, x) \big]\, x_r \quad \text{where} \quad u(x, x) = \sum_{r \in S} u(e^r, x)\, x_r$$

$$x_{nash} = \arg\min_{x \in (0,1)} \sum_{j \in S} \big( \max[\, u(e^j, x) - u(x, x),\ 0 \,] \big)^2$$

First, we compute a heuristic payoff table (based on simulations) to calculate the payoffs for using and not using storage for different $x_r$. The replicator dynamics $\dot{x}_r$ describes the dynamics of the population, i.e. how $x_r$ is evolving, and whether it converges to any Nash equilibrium $x_{nash}$. Figure 10 shows our EGT analysis, with $x_{nash}$ at 0.382 adopting storage, and the replicator dynamics all converging towards that equilibrium. Surprisingly enough, this means that the population will gradually settle at an equilibrium where only 38% of the population use storage. At that equilibrium, all consumers make an average savings of 8.54% (i.e. an annual saving of GBP60 per household based on an average annual electricity bill of GBP675).
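The replicator dynamics of Section 6.3 can be sketched for the two pure strategies considered here (owning storage or not). The payoff functions in the Python sketch below are hypothetical linear stand-ins, constructed so that the two payoffs cross at a storage share of 0.382; they are not the heuristic payoff table obtained from the simulations. The sketch integrates $\dot{x}_r = [u(e^r, x) - u(x, x)]\, x_r$ with a simple Euler step (the step size is also an assumption).

```python
CROSSING = 0.382   # equilibrium share of storage owners reported in the text

def payoff_with_storage(x):
    """Hypothetical saving (%) of a storage owner when a share x owns storage."""
    return 8.5 - 10.0 * (x - CROSSING)

def payoff_without_storage(x):
    """Hypothetical saving (%) of a consumer without storage at share x."""
    return 8.5 + 2.0 * (x - CROSSING)

def replicator_step(x, dt=0.01):
    """One Euler step of x_dot = [u(e^r, x) - u(x, x)] * x for the storage strategy."""
    u_storage = payoff_with_storage(x)
    u_none = payoff_without_storage(x)
    u_mean = x * u_storage + (1.0 - x) * u_none   # population average payoff u(x, x)
    return x + dt * (u_storage - u_mean) * x

def simulate(x0, steps=5000):
    """Iterate the replicator dynamics from an initial storage share x0."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x)
    return x

for start in (0.05, 0.5, 0.9):
    print(start, "->", round(simulate(start), 4))
```

Under these assumed payoffs, owning storage pays more than average below the crossing point and less above it, so the share of storage owners drifts towards the crossing point from either side, which is the qualitative behaviour indicated by the arrows in Figure 10.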
Now, the equilibrium suggests that too many consumers storing can be counterproductive for the system. This is because there is a point beyond which additional storage adds more volatility to already flattened market prices (seen from a decrease in the load factor in Figure 8), and those agents that store are more exposed to this volatility. Finally, around the equilibrium point, we also observe that the social welfare of the system peaks, such that the individual goals of the agents (to save on their electricity bills) are well aligned with maximisation of the social welfare, with the diversity factor DF decreasing as too many households start storing energy.

6.4 Sensitivity of the Learning Mechanism
Finally, we analyse the sensitivity of the learning mechanism against the social welfare and the agents' self-interested objectives. Figure 11 shows, as expected, that the smaller the learning rate, the more efficient the system (with a higher load factor) and the better the average savings of the individual agents. Now, because an infinitely small learning rate is infeasible, as it implies an infinitely long time to reach the equilibrium, a trade-off is required. Specifically, because the system is not very sensitive to the learning parameters when they are small, a value of 0.05 to 0.20 would be reasonable. A high learning rate, on the other hand, would result in agents adopting their optimal storage profile immediately rather than adapting gradually, which clearly results in poor savings and poor system efficiency. In that case, there is no convergence, as implied by the load factor dropping below the value of 0.7 obtained without any storage, to 0.59, such that there are now more peaks in the system (as everyone is charging at the same time).

7. CONCLUSIONS
In this paper, we developed a framework to analyse agent-based micro-storage management for the smart grid. Specifically, we designed a storage strategy (with an adaptive mechanism based on predicted market prices) for consumers and empirically demonstrated that the average storage profile converges towards a Nash equilibrium.
At that point, peak demands are reduced, reducing the requirements for more costly and carbon-intensive generation plant. Moreover, in our analysis of the social welfare at this equilibrium, we show that, while being stable, it results in reduced costs and carbon emissions. This also shows that the objective of buying storage to save on electricity bills is aligned with maximising social welfare. Finally, we show that the population would adopt storage until an equilibrium of 38% is reached, around which the social welfare is maximised.

For future work, we intend to integrate a more accurate model of the electricity market mechanism in our work, as well as models of deferrable loads and of how an agent can control such loads in parallel with its storage for more efficient cost-saving behaviours. Furthermore, we would like to explore how consumers with preferences for low-carbon electricity, not just low-cost electricity, could interact within this model.

Figure 11: The effect of the learning rate.

8. REFERENCES
[1] B. Daryanian, R. Bohn, and R. Tabors. Optimal demand-side response to electricity spot prices for storage-type customers. Power Systems, IEEE Transactions on, 4(3):897-903, 1989.
[2] U. S. Department-Of-Energy. Grid 2030: A National Vision For Electricity's Second 100 Years, 2003.
[3] L. Exarchakos, M. Leach, and G. Exarchakos. Modelling electricity storage systems management under the influence of demand-side management programmes. International Journal of Energy Research, 33(1):62-76, 2009.
[4] R. Galvin and K. Yeager. Perfect Power: How the MicroGrid Revolution Will Unleash Cleaner, Greener, More Abundant Energy. McGraw-Hill Professional, 2008.
[5] A. Holland. Welfare losses in commodity storage games. In Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems, pages 1253-1254, Budapest, 2009.
[6] M. Houwing, R. R. Negenborn, P. W. Heijnen, B. D. Schutter, and H. Hellendoorn. Least-cost model predictive control of residential energy resources when applying μCHP. In Power Tech, pages 425-430, London, UK, 2007.
[7] M. Korpaas, A. T. Holen, and R. Hildrum. Operation and sizing of energy storage for wind power plants in a market system. International Journal of Electrical Power & Energy Systems, 25(8):599-606, October 2003.
[8] D. MacKay. Sustainable energy: without the hot air. UIT, Cambridge, 2009.
[9] K. H. van Dam, M. Houwing, and I.
Bouwmans. Agent-based control of distributed electricity generation with micro combined heat and power: cross-sectoral learning for process and infrastructure engineers. Computers & Chemical Engineering, 32(1-2):205-217, 2008.
[10] J. W. Weibull. Evolutionary Game Theory. MIT Press, Cambridge, MA, 1995.
[11] J. Williams and B. Wright. Storage and Commodity Markets. UIT, Cambridge, 1991.

Appendix
This appendix contains the lemmas and proofs left out of the main body of the paper. We begin with a lemma which justifies the definition of charging and discharging price points given in Section 4.

Lemma 1. There always exists a solution to
\[ \sum_{i=1}^{n} q_i^d(p) = \alpha \sum_{i=1}^{n} q_i^c(\alpha p - c). \]
Furthermore, if $p$ is the solution then $p^d = p$ and $p^c = \alpha p - c$, unless $\sum_{i=1}^{n} q_i^d(p) > \alpha e$, in which case $p^d$ is the solution to $\sum_{i=1}^{n} q_i^d(p^d) = \alpha e$, and $p^c$ is the solution to $\sum_{i=1}^{n} q_i^c(p^c) = e$.

Proof. If $p$ is sufficiently small then for all $i \in I$ we will have $s_i^{-1}(p) < d_i$ and hence $q_i^d(p)$ will be strictly positive and, since $\alpha p - c < p$, $q_i^c(\alpha p - c)$ will be zero. Likewise, if $p$ is sufficiently large then for all $i \in I$ we will have $s_i^{-1}(p) > d_i$ and so $q_i^c(p)$ will be strictly positive and $q_i^d((p + c)/\alpha)$ will be zero. Since the functions $q_i^d(\cdot)$ are decreasing for all $i$ and $q_i^c(\cdot)$ are increasing for all $i$, we can conclude that $\sum_{i=1}^{n} q_i^d(p) - \alpha \sum_{i=1}^{n} q_i^c(\alpha p - c)$ is a continuous decreasing function in $p$ which is positive for sufficiently small $p$ and negative for sufficiently large $p$. This implies the existence of some solution $\hat{p}$ such that $\sum_{i=1}^{n} q_i^d(\hat{p}) = \alpha \sum_{i=1}^{n} q_i^c(\alpha \hat{p} - c)$.

Now if $\sum_{i=1}^{n} q_i^d(\hat{p}) \leq \alpha e$ then $p^d = \hat{p}$ and $\sum_{i=1}^{n} q_i^c(\alpha \hat{p} - c) \leq e$, hence $p^c = \alpha \hat{p} - c$. If $\sum_{i=1}^{n} q_i^d(\hat{p}) > \alpha e$ then, since we know that $\sum_{i=1}^{n} q_i^d(p) = 0$ for large enough $p$, there must exist some $\check{p} \geq \hat{p}$ such that $\sum_{i=1}^{n} q_i^d(\check{p}) = \alpha e$. By definition, $p^d$ must be equal to this $\check{p}$. Similarly, since $\sum_{i=1}^{n} q_i^c(\alpha \hat{p} - c) > e$, we can deduce that $p^c \leq \alpha \hat{p} - c$ and $\sum_{i=1}^{n} q_i^c(p^c) = e$, as required.

We now prove Theorem 1 from Section 4.

Proof of Theorem 1. We seek to find an aggregate storage profile $b = \{b_i\}$ which minimises $f(b)$, where
\[ f(b) = \sum_{i \in I} \int_{0}^{d_i + b_i} s_i(x)\, dx. \]
If, for all $i \in I$, we extend the definition of $s_i(x)$ to be 0 for negative $x$, then we can see that $f(\cdot)$ tends to infinity for large feasible $b$. Thus, $f(\cdot)$ must have at least one local minimum over the feasible domain, one of which has to be the global minimum. To find these allocations, we seek feasible $b$ for which the derivative of $f(b)$ is non-negative in every direction that leads to another feasible allocation. The gradient of $f(b)$ is $\{p_i\}$; thus it remains to characterise all $b$ such that $\sum_i p_i \Delta b_i \geq 0$ for every $\Delta b$ where $b + \Delta b$ is feasible.

Now suppose we have some $b$ which locally minimises $f(\cdot)$. If there are $i, j$ with $b^+ > b_i > 0$ and $b^+ > b_j > 0$, then it would be feasible to increase $b_i$ and decrease $b_j$ by an equal quantity (or vice versa), hence we must have $p_i = p_j$. From this we can deduce that if for some $i, j$, $p_i < p_j$, $b_i > 0$ and $b_j > 0$, then we must have $b_i = b^+$. This means that there will be some price $\hat{p}^c$ such that if $b_i > 0$, for any $i$, then $p_i \leq \hat{p}^c$, with equality if $b_i < b^+$. Similarly, we can show that there will be some price $\hat{p}^d$ such that if $b_i < 0$ for $i$, then $p_i \geq \hat{p}^d$, with equality if $b_i > -b^-$.

Furthermore, there cannot be $i, j$ such that $p_i + c > \alpha p_j$ and $b_i > 0$ and $b_j < 0$, for then it would be feasible to decrease $b_i$ by some $\Delta b_i$ and increase $b_j$ by $\Delta b_j = \alpha \Delta b_i$. Hence, we must have $\hat{p}^c + c \leq \alpha \hat{p}^d$. This implies that for all $i$, $b_i = q_i^c(\hat{p}^c) - q_i^d(\hat{p}^d)$.

Furthermore, if for some $i, j$ with $b_i > 0$ and $b_j < 0$ we have $p_i + c < \alpha p_j$, then it would be profitable to increase $b_i$ by some $\Delta b_i$ and decrease $b_j$ by $\alpha \Delta b_i$. So, either $\hat{p}^c + c = \alpha \hat{p}^d$ and this does not happen, or else $\hat{p}^c + c < \alpha \hat{p}^d$ but this change is never feasible due to the capacity constraint, in which case $\sum_i (b_i)^+ = e$ and $\sum_i (b_i)^- = \alpha e$. Thus, we have that $\hat{p}^c = p^c$ and $\hat{p}^d = p^d$, and so all local minimisers of $f(\cdot)$ must be as in the statement of the theorem. However, this precisely specifies the storage profile, and so there can only be one local minimum of $f(\cdot)$, which is also the global minimum, and is given by the statement of the theorem, as required.
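The price points characterised in Lemma 1 can be approximated numerically by bisection, since each equation involves a monotone function of the price. The Python sketch below is a minimal illustration under several assumptions that are not taken from the paper: every time slot shares one linear supply function $s_i(q) = \text{slope} \cdot q$, the quantities are taken as $q_i^d(p) = \max(d_i - s_i^{-1}(p), 0)$ and $q_i^c(p) = \max(s_i^{-1}(p) - d_i, 0)$ (forms consistent with the proof above, but the definitions of Section 4 are not reproduced here), charging and discharging rate limits are ignored, and the function names and parameter values are hypothetical.

```python
def solve_price_points(demand, slope, alpha, c, e, tol=1e-10):
    """Return (p_d, p_c) for an assumed linear supply function s(q) = slope * q."""

    def inv_supply(p):
        # Quantity at which the assumed supply function reaches price p.
        return p / slope

    def total_discharge(p):
        # Sum over slots of q_i^d(p) = max(d_i - s^{-1}(p), 0).
        return sum(max(d - inv_supply(p), 0.0) for d in demand)

    def total_charge(p):
        # Sum over slots of q_i^c(p) = max(s^{-1}(p) - d_i, 0).
        return sum(max(inv_supply(p) - d, 0.0) for d in demand)

    def bisect_decreasing(f, lo, hi):
        # Root of a non-increasing function with f(lo) > 0 >= f(hi).
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Upper bracket: no slot discharges and every slot charges at this price.
    hi = (slope * max(demand) + c) / alpha + 1.0

    # Uncapped case: discharged energy equals alpha times charged energy.
    p_hat = bisect_decreasing(
        lambda p: total_discharge(p) - alpha * total_charge(alpha * p - c), 0.0, hi
    )
    if total_discharge(p_hat) <= alpha * e:
        return p_hat, alpha * p_hat - c

    # Capped case: the storage capacity e binds.
    p_d = bisect_decreasing(lambda p: total_discharge(p) - alpha * e, 0.0, hi)
    p_c = bisect_decreasing(lambda p: e - total_charge(p), 0.0, hi)
    return p_d, p_c


def storage_profile(demand, slope, p_d, p_c):
    """Aggregate profile b_i: positive when charging, negative when discharging."""
    return [max(p_c / slope - d, 0.0) - max(d - p_d / slope, 0.0) for d in demand]
```

For example, with the illustrative inputs `demand=[2.0, 4.0, 8.0, 6.0]`, `slope=1.0`, `alpha=0.9` and `c=0.1`, a large capacity such as `e=10.0` leaves the capacity constraint slack, so the returned prices satisfy $p^c = \alpha p^d - c$. A small capacity such as `e=1.0` makes it bind, in which case the profile from `storage_profile` charges a total of $e$ and discharges a total of $\alpha e$.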