Residential Demand Response under Uncertainty

Paul Scott and Sylvie Thiébaux and Menkes van den Briel
Australian National University
NICTA, Australia
{firstname.lastname}@nicta.com.au

Pascal Van Hentenryck
University of Melbourne
NICTA, Australia
pvh@nicta.com.au

Abstract

This paper considers a residential market with real-time electricity pricing and flexible electricity consumption profiles for customers. Such a market raises an optimisation problem for home automation systems where they need to schedule consumption activities to reduce costs, whilst maintaining a base level of comfort and convenience. This optimisation problem faces uncertainty in real-time prices, weather conditions, and occupant behaviour. The paper presents an online stochastic combinatorial optimisation algorithm that produces fast, high-quality solutions to this problem. This algorithm is compared with reactive control strategies and an approach using an expected scenario. Our results demonstrate the value of stochastic information and online stochastic optimisation in residential demand response.

1 Introduction

Electricity consumption in residential markets will undergo fundamental changes in the next decade due to the availability of solar panels and novel pricing mechanisms, progress in batteries and electric cars, and the emergence of smart appliances and home automation. These technologies provide residential customers with the ability to actively participate in smart grid activities such as demand response where loads are shifted to times favourable for the network as a whole.

Having an intelligent Home Automation System (HAS) within each home is a key component in this vision. The HAS receives information about device operating characteristics, usage requests and network signals, and can send control actions back to smart devices. Apart from providing useful feedback to occupants on their consumption habits, it can also autonomously make control decisions. Through this control the HAS can target one or more of the following objectives:

1. Improve occupant comfort,
2. Reduce overall electricity consumption,
3. Perform demand response for the network.

These objectives are often conflicting, so in order to get the right balance, occupants need to indicate how they value comfort against cost savings. The task of the HAS is then to decide on a series of control actions to take over time, which produces an optimal solution for the weighted combination of objectives.

The HAS can implement simple policies to try and meet these conflicting objectives. Or, more interestingly, it can use sophisticated stochastic optimisation technology which exploits forecasts and observed patterns in prices, weather, residential activities and smart device usage. This paper aims to determine the benefits of online stochastic optimisation for a HAS that is exposed to Real-Time Pricing (RTP) as a demand response mechanism.

A number of research projects have started examining this very issue (see the related work section) but they often give an incomplete picture of the benefits of optimisation and the value of stochastic information. These projects often consider simpler uncertainty models, which give a partial understanding of the true benefits that optimisation can bring to this setting. In contrast, this paper makes two primary contributions: one conceptual and one algorithmic.

At the conceptual level, the paper presents a compositional architecture for HAS optimisation, where each device can be modelled independently in terms of a collection of functions that encapsulate its behaviour. These devices are then assembled into a model of a home, from which the HAS optimisation problems can be derived.

At the algorithmic level, the paper presents a comprehensive study of the value of HAS optimisation in the presence of uncertainty about future prices, occupant behaviour, and environmental conditions. Our formulation uses models representative of physical devices and stochastic models trained on real weather and network demand data. These device and stochastic models are used in two online stochastic optimisation algorithms which are compared to simple control systems based on reactive policies.
The experimental results not only show the value of stochastic information, but also that stochastic optimisation provides solutions that are close to the clairvoyant solutions which have perfect knowledge of the future. The online stochastic algorithms using MILP technology are fast and they produce significantly better solutions than the reactive controllers. Also of interest is the comparison between the two online stochastic algorithms, and an experiment that shows the distance into the future that needs to be considered when making a decision.

The rest of the paper presents the deterministic HAS optimisation problem, its stochastic version, the stochastic models, and the experimental results.

2 Deterministic HAS Optimisation

A house contains a collection of controllable devices which influence the amount of power consumed in the house and the level of comfort that residents experience. We consider the operation of these devices over discrete time steps$^1$ $\forall i \in \mathbb{Z} : t_i \in \mathbb{R}$ where $t_i > t_{i-1}$ and $\forall i \in \mathbb{Z} : t^{stp}_i = t_i - t_{i-1}$. Given a real time price for electricity and other input parameters (e.g., external temperatures and device requests), optimal operation of these devices is achieved by minimising the sum of monetary and comfort costs. The optimisation problem decision variables are the device actions at each time step, which are constrained by device characteristics and total power limits on the house.

2.1 Formal Definition

We start with a new formal definition of a device, which is a collection of functions that govern the device operation. These include functions for permissible device actions, state updates, the electrical power transferred with the house, and any non-power related operation costs. Operation costs are always positive and may include any occupant comfort costs, fuel consumption or wear and tear on the equipment. By convention power consumed by the device is negative, and power generated, e.g., by a rooftop photovoltaic system, is positive.
Definition 1 (Device) A device is a tuple $d = (A_d, S_d, R_d, q_d, g_d, f_d, l_d)$, where:

$A_d \subseteq \mathbb{R}^{m_d} \times \mathbb{Z}^{m_d}$ is the set of device actions
$S_d \subseteq \mathbb{R}^{k_d} \times \mathbb{Z}^{k_d}$ is the set of device states
$R_d \subseteq \mathbb{R}^{w_d} \times \mathbb{Z}^{w_d}$ is the set of device input parameters
$q_d : S_d \times R_d \to \mathcal{P}(A_d)$ is the permissible action function
$g_d : A_d \times S_d \times R_d \to S_d$ is the state update function
$f_d : A_d \times S_d \to \mathbb{R}$ is the electrical power function
$l_d : A_d \times S_d \times R_d \times \mathbb{R} \to \mathbb{R}$ is the operational cost function

A house is simply a set of devices, together with bounds on the instantaneous amount of power the house can transfer to or from the grid:

Definition 2 (House) A house is a tuple $h = (D_h, \underline{p}_h, \overline{p}_h)$, where:

$D_h$ is the set of devices
$\underline{p}_h, \overline{p}_h \in \mathbb{R}$ are the lower and upper power limits

We now turn to the deterministic formulation of the HAS optimisation problem which will be later used as a building block for our stochastic formulation. The deterministic formulation assumes that the input parameters are known over a horizon of $n$ time steps. The task is to choose device actions at each step to reduce the total cost over the horizon.

Inputs include the device initial states, the RTP, the house background power$^2$ and the device input parameters at each step. We account for the fact that the RTP is often different depending on whether power is bought from or sold to the grid. The optimisation variables at each time step include the device actions and states, and the device and house power consumptions and costs. These variables and inputs are linked together via the device functions in Definition 1 and house power limit constraints in Definition 2. We use the following notation: $(a)^+ = a$ if $a > 0$ and 0 otherwise, and similarly $(a)^- = -a$ if $a < 0$ and 0 otherwise, where $a \in \mathbb{R}$.

Definition 3 (Deterministic HAS Optimisation Problem) Let $h = (D_h, \underline{p}_h, \overline{p}_h)$ be a house. The HAS optimisation problem over a horizon $n \in \mathbb{N}$ for $h$ is the following:

Inputs:
for each device $d = (A_d, S_d, R_d, q_d, g_d, f_d, l_d) \in D_h$
$s_{d,0} \in S_d$ is the device initial state
for each device $d \in D_h$ and time step $i \in \{1 \ldots n\}$
$r_{d,i} \in R_d$ are the device input parameters
for each time step $i \in \{1 \ldots n\}$
$p^b_{h,i} \in \mathbb{R}$ is the house background power
$v_i \in \mathbb{R}^2$ is the real-time price (buying, selling)

Decision variables:
for each device $d \in D_h$ and time step $i \in \{1 \ldots n\}$
$a_{d,i} \in A_d$ are the device action variables

Other variables:
for each device $d \in D_h$ and time step $i \in \{1 \ldots n\}$
$s_{d,i} \in S_d$ are the device state variables
$p_{d,i} \in \mathbb{R}$ is the device power
$c_{d,i} \in \mathbb{R}^+$ is the device operation cost
for each time step $i \in \{1 \ldots n\}$
$p_{h,i} \in [\underline{p}_h, \overline{p}_h]$ is the total power
$c_{h,i} \in \mathbb{R}$ is the total cost

Constraints:
for each device $d \in D_h$ and time step $i \in \{1 \ldots n\}$
$a_{d,i} \in q_d(s_{d,i-1}, r_{d,i})$ is the action permissibility constraint
$s_{d,i} = g_d(a_{d,i}, s_{d,i-1}, r_{d,i})$ is the state update constraint
$p_{d,i} = f_d(a_{d,i}, s_{d,i})$ is the device power constraint
$c_{d,i} = l_d(a_{d,i}, s_{d,i}, r_{d,i}, t^{stp}_i)$ is the device cost constraint
for each time step $i \in \{1 \ldots n\}$
$p_{h,i} = \sum_{d \in D_h} p_{d,i} + p^b_{h,i}$ is the house power constraint
$\underline{p}_h \le p_{h,i} \le \overline{p}_h$ is the house power limits constraint
$c_{h,i} = \sum_{d \in D_h} c_{d,i} + t^{stp}_i v_{i,1} (p_{h,i})^- - t^{stp}_i v_{i,2} (p_{h,i})^+$ is the house cost constraint

Objective:
$\min \sum_{i=1}^{n} c_{h,i}$

1 Variable time step sizes will be used to focus computational time where most needed.
2 This aggregates uncontrollable electrical consumption, e.g., lighting, entertainment and cooking.
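To make the house power and house cost constraints of Definition 3 concrete, the following is a minimal Python sketch of the per-step calculation. The function names (`pos`, `neg`, `house_step_cost`) and the units (kW, hours, dollars per kWh) are assumptions made for illustration and do not come from the paper; the sketch evaluates the cost of given device powers and does not solve the optimisation problem.

```python
from typing import Sequence, Tuple


def pos(a: float) -> float:
    """(a)^+ : a if a > 0, otherwise 0."""
    return a if a > 0 else 0.0


def neg(a: float) -> float:
    """(a)^- : magnitude of a if a < 0, otherwise 0."""
    return -a if a < 0 else 0.0


def house_step_cost(
    device_powers: Sequence[float],  # p_{d,i}; negative means consumption
    device_costs: Sequence[float],   # c_{d,i}; non-power operation costs
    p_background: float,             # p^b_{h,i}; uncontrollable load (negative)
    price_buy: float,                # v_{i,1}
    price_sell: float,               # v_{i,2}
    t_stp: float,                    # step length t^stp_i
) -> Tuple[float, float]:
    """Return (p_{h,i}, c_{h,i}) for one time step."""
    p_h = sum(device_powers) + p_background
    # Energy bought is charged at the buying price; energy sold earns the selling price.
    c_h = (sum(device_costs)
           + t_stp * price_buy * neg(p_h)
           - t_stp * price_sell * pos(p_h))
    return p_h, c_h


if __name__ == "__main__":
    # A 2 kW load, a 0.5 kW generator and 1 kW of background load over half an hour.
    print(house_step_cost([-2.0, 0.5], [0.1, 0.0], -1.0, 0.30, 0.10, 0.5))
```

Summing `c_h` over all steps of a horizon gives the objective value for a candidate schedule; the house power limit would be checked separately against `p_h`.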

2.2 Modelled Devices

In our experiments we consider a modern house with electrical HVAC, hot water heating, solar panels, washing machine, clothes dryer and dish washer. We also include two devices that are expected to become standard within the next decades, an electric vehicle (EV) and a dedicated battery bank for storing electrical energy. Descriptions of these devices are given in this section. Some liberty has been used in these descriptions to aid understanding, but with slight reformulation they all fit into the rigorous device definition. Device electrical powers and operational costs are consistently represented by the variables $p_i$ and $c_i$, where a negative power represents power consumed by a device.

The physical behaviour of devices has been approximated by linearising their physical equations and discretising time. Only significant steps of this process are mentioned in the device descriptions. For the experiments parameters were selected to be representative of typical devices. For example, the EV battery capacity is equal to that of a Nissan Leaf, and the house floor area for heating purposes is typical of an average-sized house. Due to the difficulty in obtaining some parameters, for example the charging efficiency of a Nissan Leaf, and since our models are an approximation of these real systems anyway, some estimates had to be made.

Battery. A battery has a stored energy state $E_i \in [0, \overline{E}]$ and a charge/discharge power $p_i \in [\underline{p}, \overline{p}]$ action variable. Energy is lost through a fixed efficiency $\eta$ when power is charged into the battery. The stored energy state update function is given by:

$$E_i = E_{i-1} + t^{stp}_i \left( \eta (p_i)^- - (p_i)^+ \right) \tag{1}$$

A battery lifetime cost $c_i$ is associated with any power that is discharged from the battery through a lifetime price $v$:

$$c_i = v (p_i)^+ \tag{2}$$

Electric Vehicle. An electric vehicle (EV) is essentially the same as the battery just presented, but with a few additional constraints.
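The battery model of equations (1) and (2), which the EV model builds on, can be sketched as a single state-update function. This is an illustrative Python sketch only: the function name `battery_step`, its argument names and the choice to raise an error on infeasible actions are assumptions, not part of the paper's implementation.

```python
from typing import Tuple


def battery_step(
    e_prev: float,          # E_{i-1}: stored energy before the step
    p: float,               # p_i: negative charges the battery, positive discharges it
    t_stp: float,           # t^stp_i: step length
    eta: float,             # charging efficiency in [0, 1]
    e_max: float,           # upper bound on stored energy
    lifetime_price: float,  # v: price applied to discharged power
) -> Tuple[float, float]:
    """Return (E_i, c_i) following equations (1) and (2)."""
    charge = -p if p < 0 else 0.0     # (p_i)^-
    discharge = p if p > 0 else 0.0   # (p_i)^+
    e_next = e_prev + t_stp * (eta * charge - discharge)
    if not 0.0 <= e_next <= e_max:
        # Assumed behaviour: reject actions that leave the feasible energy range.
        raise ValueError("action would take the stored energy outside [0, e_max]")
    cost = lifetime_price * discharge
    return e_next, cost


if __name__ == "__main__":
    print(battery_step(5.0, -2.0, 0.5, 0.9, 10.0, 0.05))  # charging at 2 kW
    print(battery_step(5.0, 3.0, 0.5, 0.9, 10.0, 0.05))   # discharging at 3 kW
```

Losses apply only on the charging side, so a round trip through the battery returns a fraction `eta` of the energy put in.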
Firstly, the EV battery can only be charged/discharged when it is at home, which is indicated by the input parameter $x^h_i \in \{0, 1\}$:

$$x^h_i = 0 \implies p_i = 0 \tag{3}$$

The input parameter $p^d_i \in \mathbb{R}^+$ represents the power drawn from the battery whilst it is driving. This modifies the state update function as follows:

$$E_i = E_{i-1} + t^{stp}_i \left( \eta (p_i)^- - (p_i)^+ - p^d_i \right) \tag{4}$$

The final constraint is on the amount of energy stored in the battery. The house occupants provide an input parameter $E^m_i \in [0, \overline{E}]$ that represents the minimum energy that the EV battery should have in it at each point in time. This value represents how much energy the occupant expects to need if they drive away in the car at a given time. This is not a hard constraint as the draw from driving can bring the battery charge below this limit, but it ensures that if the stored energy does fall below this limit, then the battery charges back up as fast as possible.

$$x^h_i = 1 \implies E_i \ge \min \left[ E_{i-1} + t^{stp}_i \left( -\eta \underline{p} - p^d_i \right), E^m_i \right] \tag{5}$$

Hot Water Heating. The hot water system is made up of a storage tank and an electric heating element. We ignore the details of the interaction between hot and cold water in the tank and consider the state of the tank as being the amount of energy $E_i \in [0, \overline{E}]$ it contains above the inlet cold water temperature. The tank is considered empty of hot water when this value is zero. The action variable is the power setting of the electric heater $p_i \in [\underline{p}, 0]$ at each time step. An amount of power, given by the input parameter $p^d_i \in \mathbb{R}^+$, is drawn from the tank at each time step in order to meet occupant demand. The energy state update function is given by:

$$E_i = E_{i-1} + t^{stp}_i \left( -p_i - p^d_i - p^l_i + p^u_i \right) \tag{6}$$

The variable $p^l_i \in \mathbb{R}^+$ represents thermal losses from the tank to the outdoor environment. The rate of loss depends on how full the tank is and the difference in temperature between the water set point $T^s \in \mathbb{R}$ and the outdoor temperature $T^o_i \in \mathbb{R}$ through a resistivity $R \in \mathbb{R}^+$:

$$p^l_i = \frac{1}{R} \frac{E_i}{\overline{E}} \left( T^s - T^o_i \right) \tag{7}$$

The variable $p^u_i \in \mathbb{R}^+$ is a recourse variable that is used to indicate the amount of hot water demand which goes unmet, i.e., water drawn from the tank when it is empty.
This is heavily penalised as a cost $c_i$ through an unmet demand price $v$:

$$c_i = v p^u_i \tag{8}$$

The hot water system has a minimum stored energy level $E^m \in [0, \overline{E}]$, much like the electric vehicle. If drawn water brings the energy level of the tank below this value then the heater must work as hard as possible to bring the energy back up. This value is fixed in time and is used to represent a safety margin that the occupants impose in order to reduce the likelihood of running out of hot water.

$$E_i \ge \min \left[ E_{i-1} + t^{stp}_i \left( -\underline{p} - p^d_i - p^l_i + p^u_i \right), E^m \right] \tag{9}$$

Under Floor Heating/Cooling. The heating system of the house includes a heat pump which heats/cools water which is then pumped through piping embedded in the floor of the house. The temperatures of the floor and the air in the room $T^f_i, T^a_i \in \mathbb{R}$ are the device states. The action variable is the amount of thermal energy that is supplied to the floor of the house $p^t_i \in \mathbb{R}$. This is limited by the heat pump electrical power consumption $p_i \in [\underline{p}, 0]$ through heating and cooling Coefficients of Performance (COP) $\eta^h_i \in [\underline{\eta}^h, \overline{\eta}^h]$, $\eta^c_i \in [\underline{\eta}^c, \overline{\eta}^c]$:

$$p_i = -\frac{1}{\eta^h_i} (p^t_i)^+ - \frac{1}{\eta^c_i} (p^t_i)^- \tag{10}$$

The COPs depend on the temperatures of the two thermal wells between which the heat pump is operating. We assume the internal thermal well is at a constant temperature, and the external well is at the outdoor temperature $T^o_i \in \mathbb{R}$. We have the COPs as linear functions of $T^o_i$ for some constants $a^h, a^c \in \mathbb{R}^+$ and $b^h, b^c \in \mathbb{R}$, with hard upper and lower limits:

$$\eta^h_i = \min \left[ \max \left[ a^h T^o_i + b^h, \underline{\eta}^h \right], \overline{\eta}^h \right] \tag{11}$$

$$\eta^c_i = \min \left[ \max \left[ a^c T^o_i + b^c, \underline{\eta}^c \right], \overline{\eta}^c \right] \tag{12}$$

Heat can transfer between the floor and the outdoor environment $p^{fo}_i \in \mathbb{R}$, the floor and the air in the room $p^{fa}_i \in \mathbb{R}$, and the air in the room and the outdoor environment $p^{ao}_i \in \mathbb{R}$. We use simple lumped thermal resistivities $R^{fo}, R^{fa}, R^{ao} \in \mathbb{R}^+$ to govern these heat flows:

$$p^{fo}_i = \frac{1}{R^{fo}} \left( T^f_i - T^o_i \right) \tag{13}$$

$$p^{fa}_i = \frac{1}{R^{fa}} \left( T^f_i - T^a_i \right) \tag{14}$$

$$p^{ao}_i = \frac{1}{R^{ao}} \left( T^a_i - T^o_i \right) \tag{15}$$

The temperature state update functions are given by:

$$T^f_i = T^f_{i-1} + \frac{t^{stp}_i}{m^f \kappa^f} \left( p^t_i - p^{fo}_i - p^{fa}_i + A^f I_i \right) \tag{16}$$

$$T^a_i = T^a_{i-1} + \frac{t^{stp}_i}{m^a \kappa^a} \left( p^{fa}_i - p^{ao}_i + p^g_i \right) \tag{17}$$

where $m^f, m^a, \kappa^f, \kappa^a \in \mathbb{R}^+$ are the floor and air mass and specific heat capacity coefficients respectively. Sunlight enters through the windows at an irradiance of $I_i$, and lands on a floor area of $A^f$. The input $p^g_i \in \mathbb{R}^+$ is thermal power generated by occupant metabolisms and background electric appliances, that contributes to heating the air in the room.

The final relation we have is for the comfort cost $c_i$ which depends on the difference between the air temperature and an occupant specified set point temperature $T^s_i \in \mathbb{R}$. Two occupant specified time varying comfort prices $v^a_i, v^b_i$ are used, one of which is only included after a threshold temperature difference $T^b$:

$$c_i = \begin{cases} v^a_i \left| T^a_i - T^s_i \right| & \text{if } \left| T^a_i - T^s_i \right| < T^b \\ (v^a_i + v^b_i) \left| T^a_i - T^s_i \right| & \text{otherwise} \end{cases} \tag{18}$$

Shiftable Loads. Shiftable loads are devices that need to run once within a time window. An occupant sets two input parameters: a start time $s$ and a last allowed start time $l$, between which the controller must schedule the device to run. Examples of this kind of device include washing machines, clothes dryers and dish washers. We model non-preemptive shiftable loads which can have time varying power consumptions. The start of run indicators $x_i \in \{0, 1\}$ act as both the device action and state variables. A shiftable load has a cumulative energy consumption function $\psi : \mathbb{R}^+ \to \mathbb{R}^+$ which takes a run duration and returns the cumulative amount of energy that the device has consumed for that duration.
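As an illustration of how a cumulative energy function translates into a per-step power profile once a start step has been chosen, consider the Python sketch below. The function `load_power_profile`, the example appliance `washer_psi` and its numbers are hypothetical; the sketch assumes a run scheduled for step $k$ begins at time $t_{k-1}$ and follows the convention that consumed power is negative.

```python
from typing import Callable, List, Sequence


def load_power_profile(
    psi: Callable[[float], float],  # cumulative energy (kWh) after a given run duration (h)
    times: Sequence[float],         # t_0, t_1, ..., t_n
    start_step: int,                # step k in which the run starts (1-based)
) -> List[float]:
    """Average power in each step 1..n for a run that begins at time t_{k-1}."""
    t_start = times[start_step - 1]
    powers = []
    for i in range(1, len(times)):
        if i < start_step:
            powers.append(0.0)  # the device has not started yet
            continue
        t_stp = times[i] - times[i - 1]
        energy = psi(times[i] - t_start) - psi(times[i - 1] - t_start)
        powers.append(-energy / t_stp)  # consumption is negative
    return powers


def washer_psi(duration: float) -> float:
    """Hypothetical appliance: 2 kW for the first 0.5 h, then 0.5 kW for 1 h."""
    heating = 2.0 * min(duration, 0.5)
    washing = 0.5 * min(max(duration - 0.5, 0.0), 1.0)
    return heating + washing


if __name__ == "__main__":
    print(load_power_profile(washer_psi, [0.0, 0.5, 1.0, 1.5, 2.0, 2.5], start_step=2))
```

Because the profile is derived from differences of `psi`, the same function handles time steps of different lengths without any change.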
Constraints on the run indicator variables and the device power $p_i \in \mathbb{R}$ are given by:

$$\sum_{k=s}^{l} x_k = 1 \tag{19}$$

$$p_i = -\sum_{k=s}^{i} x_k \frac{\psi(t_i - t_{k-1}) - \psi(t_{i-1} - t_{k-1})}{t^{stp}_i} \tag{20}$$

Photovoltaics. The photovoltaic (PV) panels have no action variables; the amount of electricity they generate is purely determined by the solar irradiance input parameter. We model a PV system ignoring temperature and shading effects and by assuming the panels lie on a horizontal surface. The generated electric power $p_i \in \mathbb{R}^+$ is then a simple function of the panel area $A \in \mathbb{R}^+$, efficiency $\eta \in [0, 1]$ and global irradiance input parameter $I_i \in \mathbb{R}^+$:

$$p_i = \eta A I_i \tag{21}$$

3 Stochastic HAS Optimisation

So far we have considered the deterministic home automation formulation that requires perfect foresight about what will happen in the future. However in practice, almost all the input parameters are uncertain, and their uncertainty is only revealed in real time (e.g., outdoor temperature) or in some cases a few time steps in advance (e.g., RTP). This motivates the use of online stochastic optimisation (Van Hentenryck and Bent 2006), which exploits statistical models of the uncertain parameters in order to make the best decisions on average.

3.1 The Stochastic Model

In the stochastic HAS problem, the RTP $v_i$, background house power $p^b_{h,i}$, and device input parameters $r_{d,i}$ are random variables. We denote their real-world realisations (i.e., their values when the uncertainty is revealed) with the symbol $\hat{\ }$. For instance, $\hat{T}^o_i$ denotes the real outdoor temperature at time step $i$. For notational convenience, all inputs are combined into one vector

$$z_i = (v_i, p^b_{h,i}, r_{d_1,i}, r_{d_2,i}, \ldots)^T \tag{22}$$

where we index elements with a $k$ (e.g., $z_{i,k}$). Random variables at time step $i$ may be dependent on each other and on the variables at previous time steps. Therefore the joint distribution for random variables up to time step $i$ is given by:

$$P(z_i, z_{i-1}, \ldots) \tag{23}$$

Let $\hat{t}$ represent the current real world time. Each input $z_{i,k}$ is revealed a fixed amount of time $t^{rev}_k \in \mathbb{R}^+$ in advance (or in real time if $t^{rev}_k = 0$).
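The reveal rule can be sketched in a few lines of Python. Under one reading of the rule, an input for time step $i$ is treated as known once $t_i$ is no later than the current time plus that input's lead time. The function name `known_inputs` and the input names used in the example are assumptions for illustration.

```python
from typing import Dict, List, Sequence, Set


def known_inputs(
    step_times: Sequence[float],    # t_1, ..., t_n for the steps of interest
    reveal_lead: Dict[str, float],  # t^rev_k for each input k (0 means real time)
    now: float,                     # current real world time
) -> List[Set[str]]:
    """For each time step, return the set of inputs already revealed at time `now`."""
    return [
        {k for k, lead in reveal_lead.items() if t_i <= now + lead}
        for t_i in step_times
    ]


if __name__ == "__main__":
    # Hypothetical inputs: the price is revealed 30 minutes ahead, temperature in real time.
    leads = {"rtp": 0.5, "outdoor_temp": 0.0}
    print(known_inputs([0.25, 0.5, 1.0], leads, now=0.0))
```

Any input outside the returned set for a step remains a random variable and has to be forecast or sampled.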
This means that for a given $\hat{t}$ an input $z_{i,k}$ is known to be $\hat{z}_{i,k}$ if $t_i \le \hat{t} + t^{rev}_k$, otherwise it is a random variable. Given $i$ and $\hat{t}$ we use $K_{i,\hat{t}} = \{k \mid t_i \le \hat{t} + t^{rev}_k\}$ to denote the set of known input indices.

3.2 Online Stochastic Optimisation

In an online stochastic algorithm decisions are made one step at a time using stochastic information about future events. After each time step the uncertainty and the effect of all actions is revealed, updating the state of the system. Decisions for the next period are computed and the process is repeated. Online stochastic optimisation has been used successfully on a wide variety of problems (e.g., (Powell, Simao, and Bouzaiene-Ayari 2012; Van Hentenryck and Bent 2006)).

Our algorithms use a rolling finite horizon as illustrated in Figure 1, where the time steps $1, \ldots, n$ are aligned to each horizon with $t_0 = \hat{t}$. Optimisation is performed within each horizon using stochastic information for any unrevealed inputs, and then the actions for the first time step are executed in the real world.

Figure 1: Rolling horizon for 3 consecutive iterations.

It might not be possible to execute actions produced by the optimisation if the real world input parameters $\hat{z}_1$ differ from what the optimisation anticipated. For example, if the optimisation decides to run the hot water heater at full power, and the tank unexpectedly reaches its capacity (due to less demand for hot water than expected), then the power of the heating action will need to be reduced so as to remain within the tank's capacity. Our HAS handles this automatically in the execution step, by using very simple executives for each device which select the closest feasible action.

In the following sections we introduce two approaches to solving the stochastic optimisation problem within each horizon: the expectation and the 2-stage algorithms.

3.3 Expectation Formulation

The expectation online stochastic algorithm takes the conditional expected value of any unrevealed inputs in the optimisation horizon, and solves the deterministic version of the problem given in Definition 3. We use the term expected value loosely because in truth we calculate the expected value only where it makes sense, which is typically for continuous inputs. For the rest of the inputs the most likely value is calculated instead. For example, expected value is used for outdoor temperatures and most likely value for the washing machine requests. Both of these calculations are performed using the joint distribution for inputs in the horizon, conditioned on any known inputs in and prior to the horizon:

$$P\left(z_n, z_{n-1}, \ldots, z_1 \mid (\hat{z}_{n,k}, k \in K_{n,t_0}), \ldots, (\hat{z}_{1,k}, k \in K_{1,t_0}), \hat{z}_0, \ldots\right) \tag{24}$$

3.4 2-Stage Formulation

In this algorithm 2-stage stochastic programming is used within each horizon. This provides an approximation to a full multi-stage stochastic program, which is, in general, known to be extremely challenging computationally (Shapiro 2006).
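The rolling-horizon procedure of Section 3.2, which both formulations share, can be sketched as a short Python loop. Everything here is a toy stand-in: the names `rolling_horizon`, `forecast`, `plan` and `execute`, the price list and the storage capacity are assumptions for illustration, and the planner is a simple rule in place of the MILP solved within each horizon.

```python
from typing import Callable, List, Tuple


def rolling_horizon(
    state: float,
    n_steps: int,
    horizon: int,
    forecast: Callable[[int, int], List[float]],
    plan: Callable[[float, List[float]], List[float]],
    execute: Callable[[float, float, int], Tuple[float, float]],
) -> Tuple[float, float]:
    """Plan over a window, execute only the first action, then roll forward."""
    total_cost = 0.0
    for step in range(n_steps):
        inputs = forecast(step, horizon)                 # known or predicted inputs
        actions = plan(state, inputs)                    # optimise over the whole window
        state, cost = execute(state, actions[0], step)   # apply the first action only
        total_cost += cost
    return state, total_cost


# Toy instance: a store with 2 kWh capacity that buys 1 kWh per step when it looks cheap.
PRICES = [3.0, 1.0, 2.0, 1.0, 4.0]
CAPACITY = 2.0


def forecast(step: int, horizon: int) -> List[float]:
    return PRICES[step:step + horizon]  # stand-in for a real forecast


def plan(state: float, prices: List[float]) -> List[float]:
    first = 1.0 if prices[0] == min(prices) else 0.0
    return [first] + [0.0] * (len(prices) - 1)


def execute(state: float, action: float, step: int) -> Tuple[float, float]:
    feasible = min(action, CAPACITY - state)  # executive picks the closest feasible action
    return state + feasible, PRICES[step] * feasible


final_state, total = rolling_horizon(0.0, len(PRICES), 2, forecast, plan, execute)
```

In the last step of this toy run the planner asks to charge a full store, and the executive reduces the action to zero, mirroring the closest-feasible-action behaviour described for the hot water tank.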
The first stage includes time step 1, and the second stage time steps $2, \ldots, n$. Traditionally, in 2-stage stochastic programming there is no uncertainty in the first stage (Shapiro, Dentcheva, and Ruszczyński 2009). However in our problem, we are required to make decisions before some inputs in the first stage are revealed. To resolve this, first stage inputs are set to their real values if known, otherwise their conditional expected value is taken (as described in Section 3.3).

The second stage uses sampled scenarios to represent the uncertainty in the input parameters. We define a second stage scenario $s$ as being a sample from the joint distribution of random variables in the second stage, conditioned on any revealed inputs in the second stage, and inputs in and prior to the first stage:

$$s \sim P\left(z_n, z_{n-1}, \ldots, z_2 \mid (\hat{z}_{n,k}, k \in K_{n,t_0}), \ldots, (\hat{z}_{2,k}, k \in K_{2,t_0}), z_1, \hat{z}_0, \ldots\right) \tag{25}$$

We use the Sample Average Approximation (SAA) (Shapiro, Dentcheva, and Ruszczyński 2009) to limit the number of scenarios $|S| \in \mathbb{N}$ that we need to consider in the second stage. Each scenario in the second stage needs to have its own set of variables in the optimisation problem. For example we denote the power of device $d$ at time step $i$ in scenario $s$ by $p_{d,i,s}$. The 2-stage objective function is given by:

$$\min \; c_1 + \frac{1}{|S|} \sum_{s \in \{s_1, \ldots, s_{|S|}\}} \sum_{i=2}^{n} c_{i,s} \tag{26}$$

3.5 Stochastic Inputs

Stochastic inputs include the real-time pricing (RTP), outdoor temperature, solar irradiance, background power, internal heat generation, hot water demand, EV usage and shiftable load requests. Accurately modelling any of these random processes is a significant undertaking in itself. The models we developed, while not the most sophisticated, suit the purposes of our experiments by capturing the fundamental nature of these stochastic processes. We investigated a number of different model types before settling on Generalised Additive Models (GAM) (Hastie and Tibshirani 1990) for the continuous variables like temperature, and Markov Models for the more discrete occupant driven behaviours such as shiftable device requests.

Generalised Additive Models.
In order to predict future values, the GAMs take advantage of weather forecasts that can be readily obtained from national weather services. These forecast values include daily maximum and minimum temperatures, as well as morning and afternoon cloud cover and wind speed. They also take in the value from the previous time step and temporal information. The models were trained on data obtained from the Bureau of Meteorology$^3$ and Australian Energy Market Operator$^4$ relevant to the states of New South Wales (NSW) and the Australian Capital Territory (ACT) in Australia.

The best way of implementing RTP in retail markets is still an open question and so is worth particular mention. It is unlikely that it will be a simple replication of the wholesale spot market price due to its high volatility. More likely it will be controlled by the retailer, but have a shape representative of the wholesale market. We designed our RTP to be a quadratic function$^5$ of the amount of power that fossil fuel sources must supply to meet total network load. This is the total network demand minus the generation from renewable sources such as wind and solar. We used a GAM for the total network demand. The generation from renewables is a function of wind speed and solar irradiance. The RTP is only revealed to a house 30 minutes in advance.

3 Bureau of Meteorology, www.bom.gov.au
4 Australian Energy Market Operator, www.aemo.com.au
5 The quadratic is representative of an increasing marginal supply price (Ramchurn et al. 2011).

Markov Models. Input parameters arising from the behaviour of house occupants were captured via semi-Markov models representing the activities and consumption patterns (e.g., hot water, shiftable load requests and EV usage) of the four occupants of a specific house in the ACT. Each model identifies the key activities of an occupant (e.g., sleeping, taking a shower and leaving for work), and specifies the probabilities of transitioning from one activity to the next within certain time windows. Each activity is associated with a series of actions (e.g., watching TV, requesting the dish washer to operate) that trigger changes in input parameters. Conditional sampling through these models is used to generate scenarios. Whilst this scheme was convenient for our experiments, other more data-driven options are possible: we could simply gather and use a database of raw scenarios, or learn model parameters from disaggregated demand data (Kolter, Batra, and Ng 2010; Parson et al. 2012).

4 Experiments

We implemented the 2-stage and expectation online algorithms using Gurobi as a backend to solve the MILP within each horizon. The devices in Section 2.2 were implemented and included in the experimental house, and conditional samplers were created for the uncertain input parameters in Section 3.5. We created a simple simulator that uses the same physical equations as the optimisation to simulate the execution of actions in the real world.

We compare the performance of the 2-stage and expectation controllers with naive and smart reactive controllers, and a controller that has perfect information. The Naive reactive controller represents a household that either has no ability or no desire to respond to an RTP.
It starts shiftable devices as soon as a request is received, fills up the hot water tank in off-peak hours, charges the EV only if it is below the requested minimum level, maintains the room at the set point temperature and never uses the battery bank.

The Smart reactive controller uses simple device action policies to decide how to respond to changes in RTP. It delays running a shiftable device until it reaches a cheap price or the last available start time, uses thresholds about a moving average of the RTP to decide when to charge or discharge energy from the batteries, EV and hot water system, and maintains the room at the set point temperature like the naive controller.

The Perfect controller has perfect foresight about what will happen in the future. It optimises the deterministic problem in Definition 3 over the whole experiment duration with full knowledge of $\hat{z}$. This controller (which is infeasible in practice) is used to give a lower bound on the objective that can be achieved by the other controllers.

4.1 Controller Comparison

Nine sets of input parameters typical for the month of February were generated. These were used in 9 separate experimental runs, each with a duration of 7 days. The online algorithms had 16 hour optimisation horizons, with 15 minutes for the first two time steps and 30 minute time steps for the remainder of the horizon$^6$. The reactive and perfect controllers had 15 minute time steps. The 2-stage algorithm sampled 30 scenarios in its second stages.

The controller costs are plotted in Figure 2a for each of the 9 experimental runs. These results are adjusted to account for any energy that remains in the battery, EV, or hot water system at the end of an experimental run. This is done by valuing the left-over energy at the average RTP for the last 24 hours. Without this adjustment it would not be fair to compare the controllers, since any controller that anticipates the need to store energy for a future purpose would perform poorly if it does so just before the experiment ends.
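The left-over energy adjustment amounts to a small piece of arithmetic, sketched below in Python. The function name `adjusted_cost` and the example numbers are assumptions; the sketch also ignores any conversion losses, which the text does not specify.

```python
from typing import Sequence


def adjusted_cost(
    raw_cost: float,                  # cost accumulated over the run (dollars)
    leftover_energy: Sequence[float],  # kWh left in each store, e.g. battery, EV, hot water
    rtp_last_24h: Sequence[float],     # RTP samples from the final 24 hours (dollars per kWh)
) -> float:
    """Credit stored energy at the average RTP of the last 24 hours."""
    average_price = sum(rtp_last_24h) / len(rtp_last_24h)
    return raw_cost - average_price * sum(leftover_energy)


if __name__ == "__main__":
    # 3 kWh left over in total, valued at an average price of 0.30 dollars per kWh.
    print(adjusted_cost(10.0, [2.0, 1.0], [0.2, 0.4]))
```

A controller that finishes the run with more stored energy therefore receives a proportionally larger credit.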
This is an artefact of the finite length of our experimental runs; with very long durations this problem goes away as the costs associated with left-over stored energy become insignificant.

We see that the 2-stage and expectation algorithms get quite close to the performance of the controller with perfect foresight. They produce significant cost reductions over the two reactive controllers. In run 5 the performance of the 2-stage and expectation controllers is drastically reduced. This is caused by large spikes in hot water consumption that occur during this run. The expectation algorithm fails to account for these spikes. The 2-stage anticipates them, but their effects are heavily discounted because of their low likelihood. When the spikes eventuate both algorithms are hit by a large cost for not being able to supply enough hot water. The 2-stage fares better because it has done some preparation, which is the amount that on average will give the optimal outcome. The two reactive controllers are not significantly affected by these unlikely spikes because they have to plan for the worst case and always keep the tank relatively full.

The expectation algorithm does as well as or better than the 2-stage controller for a majority of the experimental runs, but on average the 2-stage outperforms it by 7.4%. The expectation algorithm performs poorly when it fails to anticipate an important scenario, such as the possibility of large spikes in hot water consumption.

The amount of time spent optimising in Gurobi per day is on average 64 seconds for the 2-stage algorithm and 0.9 seconds for the expectation algorithm (using a single core of an Intel i7-2600 3.4GHz CPU). Whilst the 2-stage is much slower than the expectation algorithm, its computational time remains insignificant when spread out over a day.

Figure 2b shows the costs for each controller in run 1 disaggregated into those associated with each device, ignoring the PV. We see cost savings for all devices when using the online stochastic controllers over the reactive ones. The greatest area where the 2-stage and expectation controllers can be improved is in hot water heating; in this experiment the smart reactive controller is even outperforming them. Aside from this their performance is close to that of the perfect controller.

6 By using larger time steps for more uncertain values further into the future we reduce the computational burden with only a minor reduction to solution quality.

Figure 2: Costs for each experimental run and disaggregated costs for run 1. (a) Controller comparison: cost in dollars for runs 1 to 9 for the Perfect, 2-Stage, Expect, Smart and Naive controllers. (b) Disaggregated costs: cost in dollars per device (Battery, EV, HVAC, Water, Dish, Wash, Dryer) for the same controllers.

Figure 3a gives an example of the power exchanged between the house and the grid for one day, along with the RTP. As expected most consumption occurs when the prices are low, and when the price is high power is sold back to the grid from the battery, EV and PV. The 2-stage and expectation controllers follow the general trend of the perfect controller with some small divergences.

4.2 Parameter Tuning

Figure 3b shows the results of an experiment where we investigated how performance changes with the horizon duration. This plot shows the performance of the perfect controller running as an online algorithm where it is restricted to only having perfect foresight a certain distance into the future. The experiment is performed on run 1 for a number of different horizon durations, and the results are compared to the original perfect controller that can see the full 7 days. The results show that there is little to be gained by looking any further into the future than 20 hours.

We reran the 2-stage experiments 3 times using different starting sampler seeds. The standard deviation of the results was typically less than 2 cents for each run except run 5 where we had a standard deviation of 42 cents. This seems to suggest the 2-stage algorithm could have benefited from more samples in its second stage for run 5. Indeed initial experiments show some improvements when using more samples but more comprehensive testing is still required.

5 Related Work

Much of the existing literature on residential demand response focuses on deterministic formulations over fixed horizons where the scheduler has perfect foresight (Ramchurn et al. 2011; Gatsis and Giannakis 2012). Those that have considered uncertainty in the problem typically focus on just one aspect (e.g., real-time pricing) (Mohsenian-Rad and Leon-Garcia 2010), or use very simple models for random variables (Tischer and Verbic 2011).

Model-predictive control has been used to account for the uncertainty of estimated device model parameters and measurement noise (Yu et al. 2012), but not the uncertainty of the type we model. In general, model-predictive control is best suited to unconstrained, purely continuous settings with limited uncertainty.

Dynamic programming (Tischer and Verbic 2011; Kim and Poor 2011) and Q-learning (Levorato, Goldsmith, and Mitra 2010) have been used in conjunction with Markov Decision Process (MDP) formulations of the residential load scheduling problem, to generate policies that allocate power to each device. MDP approaches suffer from severe scalability issues, especially since the state space needs to be discretised. Moreover, MDPs seem somewhat excessive for our problem, given that uncertainty does not depend on the decisions taken. Our stochastic programming approach which uses scenario sampling is more scalable and more natural in the presence of exogenous uncertainty.

One paper (Tischer and Verbic 2011) found that acting on the basis of the optimal dynamic programming solution did not provide any benefit over acting on expectations. For many of our experiments we found this to be true, but we did identify certain cases where the more cautious nature of the 2-stage algorithm is superior. The difference in our two results is thought to be due to our use of more sophisticated uncertainty models and differences in device models.

The paper closest to ours compares two-stage stochastic programming and robust optimisation techniques for scheduling residential loads (Chen, Wu, and Fu 2012). Uncertainty is restricted to the RTP which is known for the first stage but becomes uncertain thereafter. The objective includes minimising expected price and the probability mass of risky scenarios whose price exceeds a certain threshold.
Comfort is handled by imposing hard constraints under which appliances must run, rather than by inclusion into the objective. In this setting, two-stage stochastic programming

was observed to provide benefits over robust scheduling. The scope of our analysis goes significantly beyond these results, by exploring uncertainty from a large range of sources and by identifying the value of stochastic information. We enable richer sources of uncertainty to be considered in our framework, by allowing inputs to be revealed at arbitrary points in time.

Figure 3: House power profiles over one day and performance for different horizon durations. (a) House power profiles. (b) Perfect horizon. [Plots not reproduced. Panel (a) plots Power (kW) and Price (c/kWh) against Time (Hours) for the Price, Perfect, 2-Stage and Expect series. Panel (b) plots Cost ($) against Horizon Duration (Hours) for the Horizon and Perfect series.]

6 Conclusion and Future Work

This paper contributes to the growing body of work on residential control of loads and storage under real-time pricing, by developing a framework that accounts for uncertainty. To our knowledge, it is the first work that provides a scalable and accurate solution in the presence of uncertainty about future prices, occupant behaviour and environmental conditions. Using models representative of physical devices and random processes, we have shown the monetary and comfort cost savings that can be achieved by using online stochastic algorithms over reactive control, and the performance increase of a 2-stage approach over acting on expectations. Studies such as the one in this paper are important for rallying industry and customers towards more effective energy management schemes.

Further research is needed to investigate how closely reality can be modelled with random processes, and if in turn they are suitable for online learning. We also need to further investigate how time step sizes and the number of second stage scenarios influence performance, and to conduct more experiments for different months of the year.

The experimental set up we have developed can be used to experiment with and compare different pricing schemes. Examples include time-of-use pricing and schemes where the price offered for generation is lower than that for consumption.
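As a rough illustration of the two pricing schemes just mentioned, the following Python sketch computes the cost of a net house-to-grid power profile under a two-band time-of-use tariff with a lower price for exported energy. It is not the paper's experimental code: the function names, the tariff values, the 50% export rate and the sign convention (positive power means import from the grid) are all assumptions made for this example.

```python
from typing import List, Sequence


def tou_prices(hours: Sequence[float], peak_start: float, peak_end: float,
               peak: float, off_peak: float) -> List[float]:
    """Hypothetical two-band time-of-use tariff in $/kWh.

    A time step starting at hour h is charged the peak price when
    peak_start <= (h mod 24) < peak_end, and the off-peak price otherwise.
    """
    return [peak if peak_start <= h % 24 < peak_end else off_peak
            for h in hours]


def exchange_cost(net_kw: Sequence[float], buy: Sequence[float],
                  sell: Sequence[float], step_hours: float = 1.0) -> float:
    """Cost in dollars of a net grid-exchange profile.

    Assumed convention: net_kw[t] > 0 is power imported from the grid and
    net_kw[t] < 0 is power exported. Imports are charged at buy[t] and
    exports are credited at sell[t], both in $/kWh.
    """
    if not (len(net_kw) == len(buy) == len(sell)):
        raise ValueError("profiles must have the same length")
    total = 0.0
    for power, buy_price, sell_price in zip(net_kw, buy, sell):
        energy_kwh = power * step_hours
        price = buy_price if energy_kwh >= 0 else sell_price
        total += energy_kwh * price  # exports add a negative term (a credit)
    return total


# Illustrative numbers only: four 6-hour steps, peak band from 14:00 to 20:00.
hours = [0, 6, 12, 18]
buy = tou_prices(hours, peak_start=14, peak_end=20, peak=0.40, off_peak=0.20)
sell = [0.5 * p for p in buy]      # assumed: exports earn half the buy price
net_kw = [1.0, 2.0, -3.0, 1.5]     # the house exports during the third step
cost = exchange_cost(net_kw, buy, sell, step_hours=6.0)
```

Setting `sell` equal to `buy` recovers a symmetric price, so the same function can be used to compare how a given schedule fares under each scheme.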
We also plan on investigating how multiple houses react to an RTP and what sort of emergent behaviour develops when they are all learning their statistical models online.

Commercially available residential DR solutions[7] typically focus on direct load control or simple reactive policies. Such systems could achieve better DR performance and greater residential customer satisfaction by using our algorithms. However, there are two practical challenges that need to be addressed before our technology can have widespread adoption in homes. Firstly, there needs to be a standard smart device interface for communication with HASs. The proposed Australian demand response appliance standards in AS4755 are a move in the right direction, but true two-way communication would enable better control outcomes. All new appliances should come with such an interface, while some existing appliances could be retrofitted. The second requirement is for utilities to start offering RTP services to residential customers. The HAS could receive the RTP signal directly[8] or use existing smart meters as an intermediary. Customers will be attracted to these schemes and to the purchase of a HAS with DR capabilities by the potential electricity bill savings.

Acknowledgements

This work is supported by NICTA's Optimisation Research Group as part of the Future Energy Systems project. We thank our project members and reviewers for useful discussions and helpful suggestions. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.

[7] e.g., Comverge: www.comverge.com, Nest: www.nest.com and Cooper Power Systems: www.cooperindustries.com
[8] e.g., using OpenADR Alliance protocols: www.openadr.org

References

Chen, Z.; Wu, L.; and Fu, Y. 2012. Real-time price-based demand response management for residential appliances via stochastic optimization and robust optimization. Smart Grid, IEEE Transactions on 3(4):1822-1831.

Gatsis, N., and Giannakis, G. B. 2012. Residential load control: Distributed scheduling and convergence with lost AMI messages. Smart Grid, IEEE Transactions on PP(99):1-17.

Hastie, T., and Tibshirani, R. 1990. Generalized Additive Models. Monographs on Statistics and Applied Probability Series. Chapman & Hall, CRC Press.

Kim, T., and Poor, H. 2011. Scheduling power consumption with price uncertainty. Smart Grid, IEEE Transactions on 2(3):519-527.

Kolter, J. Z.; Batra, S.; and Ng, A. Y. 2010. Energy disaggregation via discriminative sparse coding. In 24th Annual Conference on Neural Information Processing Systems (NIPS), 1153-1161.

Levorato, M.; Goldsmith, A.; and Mitra, U. 2010. Residential demand response using reinforcement learning. In Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on, 409-414.

Mohsenian-Rad, A.-H., and Leon-Garcia, A. 2010. Optimal residential load control with price prediction in real-time electricity pricing environments. Smart Grid, IEEE Transactions on 1(2):120-133.

Parson, O.; Ghosh, S.; Weal, M.; and Rogers, A. 2012. Non-intrusive load monitoring using prior models of general appliance types. In Proceedings of Twenty-Sixth Conference on Artificial Intelligence (AAAI-12).

Powell, W.; Simao, H.; and Bouzaiene-Ayari, B. 2012. Approximate dynamic programming in transportation and logistics: a unified framework. EURO Journal on Transportation and Logistics 1(3):237-284.

Ramchurn, S. D.; Vytelingum, P.; Rogers, A.; and Jennings, N. R. 2011. Agent-based control for decentralised demand side management in the smart grid. In Tumer; Yolum; Sonenberg; and Stone, eds., Proc. of 10th Int. Conf. on Autonomous Agents and Multiagent Systems - Innovative Applications Track (AAMAS 2011), 330-331.

Shapiro, A.; Dentcheva, D.; and Ruszczyński, A. 2009. Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM series on optimization. Society for Industrial and Applied Mathematics (SIAM, 3600 Market Street, Floor 6, Philadelphia, PA 19104).

Shapiro, A. 2006. On Complexity of Multistage Stochastic Programs. Operations Research Letters 34(1):1-8.

Tischer, H., and Verbic, G. 2011. Towards a smart home energy management system - a dynamic programming approach. In Innovative Smart Grid Technologies Asia (ISGT), 2011 IEEE PES, 1-7.

Van Hentenryck, P., and Bent, R. 2006. Online Stochastic Combinatorial Optimization. Cambridge, Mass.: The MIT Press.

Yu, Z.; McLaughlin, L.; Jia, L.; Murphy-Hoye, M. C.; Pratt, A.; and Tong, L. 2012. Modeling and stochastic control for home energy management. In 2012 Power and Energy Society general meeting.