Sensor placement for leak detection and location in water distribution networks

R. Sarrate*, J. Blesa, F. Nejjari, J. Quevedo
Automatic Control Department, Universitat Politècnica de Catalunya, Rambla de Sant Nebridi, 10, 08222 Terrassa, Spain
*Corresponding author, e-mail ramon.sarrate@upc.edu

ABSTRACT
The performance of a leak detection and location algorithm depends on the set of measurements that are available in the network. This work presents an optimization strategy that maximizes the leak diagnosability performance of the network. The goal is to characterize and determine a sensor configuration that guarantees a maximum degree of diagnosability while the sensor configuration cost satisfies a budgetary constraint. To handle the complexity of the distribution network, an efficient branch and bound search strategy based on a structural model is used. However, in order to further reduce the size and complexity of the problem, the present work proposes to combine this methodology with clustering techniques. The strategy developed in this work is successfully applied to determine the optimal set of pressure sensors that should be installed in a District Metered Area of the Barcelona Water Distribution Network.

KEYWORDS: Leak detection and location, sensor placement, structural analysis, water distribution network

1. INTRODUCTION
An important matter concerning water distribution networks is system water loss, which has a meaningful effect on both water resource savings and costs of operation (Farley and Trow, 2003). Continuous improvements in water loss management are being applied. New technologies are developed to achieve higher levels of efficiency, intended to reduce losses to acceptable levels considering technical and economic aspects. Usually a leakage detection method in a District Metered Area (DMA) starts by analyzing input flow data, such as minimum night flows and consumer metering data. Once a water distribution district is identified as having a leakage, techniques are used to locate the leakage for pipe replacement or repair.
The whole process could take weeks or months, with a significant volume of water wasted. To overcome this problem, different leakage detection and location techniques are carried out in the field. Fault diagnosis techniques are applied by Brdys and Ulanicki (1994). A mathematical model is used which permits comparing the data gathered by the sensors installed in the network with the data obtained from a model of this network. If a difference is detected between these data sets, an abnormal event is flagged. Thus, modeling is paramount in order to achieve successful results. This model is the mathematical tool linking the real sensor data gathered from the network to the decision making procedure. The tool provides leak detection as well as its approximate location in the network.

Fault diagnosis systems are an increasingly important topic in many industrial processes. The number of publications devoted to fault diagnosis has increased notably in recent years (Blanke et al., 2006). In model-based fault diagnosis, diagnosis is basically performed based on the responses of residual generators. These are functions obtained from the model which perform the task of comparing the process model and on-line process information. Since process information is usually obtained by means of the sensors installed in the process, it is important to develop methodologies to place the correct sensor set in the process in order to guarantee some diagnosis specifications.

Some results devoted to sensor placement for diagnosis can be found in (Travé-Massuyès et al., 2006; Krysander and Frisk, 2008; Sarrate et al., 2012). All these works use a structural model-based approach and define different diagnosis specifications to solve the sensor placement problem. A structural model is a coarse model description, based on a bipartite graph, which can be obtained early in the development process without major engineering efforts. This kind of model is suitable for handling large scale systems, since efficient graph-based tools can be used and it does not have numerical problems. Structural analysis is a powerful tool for early determination of fault diagnosis performances (Blanke et al., 2006).

In (Sarrate et al., 2012) an algorithm is developed to determine where to install a specific number of pressure sensors in a DMA in order to maximize the capability of detecting and isolating leaks. The number of sensors to install is limited in order to satisfy a budgetary constraint. Although an efficient branch and bound search strategy based on a structural model is used, its applicability is still limited to medium-sized networks. To overcome this drawback by further reducing the size and complexity of the problem, the present work proposes to combine this methodology with clustering techniques.
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines (Jain et al., 1999). It is a mature and active research area (Xu and Wunsch, 2005) and many efficient algorithms have been developed in the literature.

The main contribution of this paper consists in combining a clustering technique with a branch and bound search based on a structural model to solve the sensor placement problem. This methodology is applied to a DMA network in Barcelona to determine the best location of pressure sensors for leak detection and location.

This paper is organized as follows. In Section 2, some model-based fault diagnosis concepts are reviewed. Section 3 formally states the sensor placement problem. In Section 4 the structural approach to sensor placement is recalled, whereas the clustering approach is described in Section 5. Next, the whole methodology is applied to a DMA network in Section 6. Finally some conclusions are given in Section 7.

2. MODEL-BASED FAULT DIAGNOSIS
Model-based fault diagnosis is a consolidated research area (Blanke et al., 2006). Most approaches to detect and isolate faults are based on consistency checking. The basic idea behind all these works is the comparison between the observed behavior of the process and its corresponding model. This is performed by means of consistency relations, which can be roughly described as a function of the form

$$h(y(t), u(t)) = 0, \tag{1}$$

where $y(t)$ and $u(t)$ are vectors of known variables, denoting respectively process measurements and process control inputs. Function $h$ is obtained from the model and is the basis to generate a residual

$$r(t) = h(y(t), u(t)). \tag{2}$$

A residual is a temporal signal indicating how closely the process is behaving compared with its expected behavior predicted by the model. In the absence of faults, a residual equals zero. In practice, a threshold-based test is usually implemented in order to cope with noise and model uncertainty effects. Otherwise, when a fault is present the model is no longer consistent with the observations (known process variables) and the residual exceeds the prefixed threshold.

Detecting faults is possible with only one residual sensitive to all faults. However, fault isolation is usually required rather than just detecting the presence of a fault. The fault isolation task is performed by designing a set of residuals based on several consistency relations. Each residual is sensitive to different faults such that the residual fault signature is unique for each fault. Therefore, distinguishing the actual fault from other faults is possible by looking at the residual fault signature.

These fault signatures are usually collected in matrix form. Given a set of $n$ residuals $R$ and a set of $m$ faults $F$, the fault sensitivity matrix is defined as

$$\Omega = \begin{pmatrix} \dfrac{\partial r_1}{\partial f_1} & \cdots & \dfrac{\partial r_1}{\partial f_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial r_n}{\partial f_1} & \cdots & \dfrac{\partial r_n}{\partial f_m} \end{pmatrix}. \tag{3}$$

When an element $\omega_{ij}$ is close to zero, residual $r_i \in R$ is weakly sensitive to fault $f_j \in F$, whereas when it diverges from zero the residual is strongly sensitive to the fault $f_j$.

This matrix can be obtained by convenient manipulation of the model equations as long as fault effects are included in them (Blesa et al., 2012). Alternatively, it can be obtained by sensitivity analysis through simulation (Pérez et al., 2011). The latter approach will be used in the present paper.
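The threshold test and the signature lookup described above can be illustrated with a minimal Python sketch. It assumes a binary signature matrix (rows are residuals, columns are faults) and an exact match between the fired residuals and a column; the function name `detect_and_isolate`, the thresholds and the data are hypothetical and not taken from the paper.

```python
import numpy as np

def detect_and_isolate(residuals, thresholds, signatures):
    """Threshold test followed by signature matching.

    residuals  : (n,) residual values at one time instant
    thresholds : (n,) positive detection thresholds (assumed given)
    signatures : (n, m) binary matrix, column j = expected pattern of fault j
    Returns (fault_detected, list of fault indices whose signature matches).
    """
    fired = np.abs(residuals) > thresholds        # which residuals are inconsistent
    if not fired.any():
        return False, []                          # model consistent with observations
    expected = signatures.astype(bool)
    matches = [j for j in range(expected.shape[1])
               if np.array_equal(expected[:, j], fired)]
    return True, matches

# Illustrative data: 3 residuals, 3 faults, each fault with a unique signature.
signatures = np.array([[1, 0, 1],
                       [1, 1, 0],
                       [0, 1, 1]])
thresholds = np.full(3, 0.1)
observed = np.array([0.9, 0.02, 0.7])             # residuals 1 and 3 exceed the threshold
result = detect_and_isolate(observed, thresholds, signatures)
```

Exact matching is a simplification chosen for brevity; with noisy residuals a practical scheme would typically tolerate partial matches.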
Given a set of possible measurable variables $x_1, x_2, \ldots, x_n$, the fault sensitivity matrix will collect those (primary) residuals that are obtained by comparing each real measurement $x_i$ to the corresponding signal $\hat{x}_i$ obtained through simulation in the fault free case, i.e. $r_i = x_i - \hat{x}_i$. An approximate procedure to obtain the fault sensitivity matrix involves using a simulator to get an estimation of measurement $x_i$ in the fault free case, $\hat{x}_i^{0}$, as well as in every faulty situation, $\hat{x}_i^{f_j}$, $f_j \in F$, i.e. $\omega_{ij} \approx \hat{x}_i^{f_j} - \hat{x}_i^{0}$.

Sometimes a binary version of the fault sensitivity matrix is used. Then the corresponding binary residuals are usually called structured residuals, whereas in the non-binary matrix they are referred to as directional residuals.
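As a hedged illustration of the simulation-based procedure, the sketch below builds an approximate sensitivity matrix from simulated measurements and derives its binary version. The sign convention (faulty minus fault free), the binarization threshold and the numerical values are assumptions made for this example only.

```python
import numpy as np

def sensitivity_matrix(x_fault_free, x_faulty):
    """Approximate fault sensitivity matrix from simulated measurements.

    x_fault_free : (n,) simulated measurements without any fault
    x_faulty     : (n, m) column j holds the simulated measurements under fault j
    Entry (i, j) is the deviation of measurement i caused by fault j.
    """
    x0 = np.asarray(x_fault_free, dtype=float)
    xf = np.asarray(x_faulty, dtype=float)
    return xf - x0[:, None]

def binarize(omega, threshold):
    """Binary (structured) version: 1 where the deviation exceeds the threshold."""
    return (np.abs(omega) > threshold).astype(int)

# Illustrative numbers: 2 measurements, 2 faults.
x0 = [50.0, 48.0]
xf = [[49.0, 49.9],
      [47.95, 46.5]]
omega = sensitivity_matrix(x0, xf)
omega_bin = binarize(omega, threshold=0.5)
```

Here each fault mainly affects one measurement, so the binary matrix reduces to the identity pattern.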

3. PROBLEM FORMULATION
Let $\mathbf{S}$ be the candidate sensor set and $m$ the number of sensors that will be installed in the system. Then, the problem can be roughly stated as the choice of a combination of $m$ sensors in $\mathbf{S}$ such that the diagnosis performance is maximized. It is assumed that a bounded budget is assigned to instrumentation and that all sensors to be installed have equal cost. This is the case in the DMA application, since all candidate sensors will be pressure sensors.

Let $F$ be the set of faults that must be monitored. In a water distribution domain a leak is an example of a fault, but other damages could be considered, such as pipe blocking or tank overflow. In this work, the single fault assumption will hold (i.e., multiple faults will not be covered) and no candidate sensor fault will be considered.

In model-based diagnosis, fault detectability and fault isolability are the main objectives (Blanke et al., 2006). Assuming structured residuals, a fault is detectable if its occurrence can be monitored, whereas a fault $f_i \in F$ is isolable from a fault $f_j \in F$ if the occurrence of $f_i$ can be detected independently of the occurrence of $f_j$.

Assuming that a sensor configuration $S \subseteq \mathbf{S}$ is installed in the process, $F_D(S) \subseteq F$ will denote the detectable fault set. Fault isolability will be characterized by means of fault pairs. Let $\mathbb{F} := F \times F$ be all fault pairs; then $F_I(S) \subseteq \mathbb{F}$ will denote the set of isolable fault pairs (i.e., $(f_i, f_j) \in F_I(S)$ means that fault $f_i$ is isolable from $f_j$ when the sensor configuration $S$ is installed in the system). Based on the set of isolable fault pairs, the isolability index is defined as the number of isolable fault pairs when a sensor configuration $S$ is installed, i.e. $I(S) = |F_I(S)|$, where $|\cdot|$ denotes the cardinality of the set.

To solve the sensor placement problem proposed in this paper, a system description is also required. Such a description will allow the computation of the detectable faults and the isolability index for a given sensor configuration.
Hence, the sensor placement for fault diagnosis can be formally stated as follows:

GIVEN a candidate sensor set $\mathbf{S}$, a system description, a fault set $F$, and the number $m$ of sensors to be installed.
FIND the $m$-sensor configuration $S^* \subseteq \mathbf{S}$ such that $F_D(S^*) = F$ and $I(S^*) \ge I(S)$, $\forall S \subseteq \mathbf{S} : |S| = m$.

This problem was already solved in (Sarrate et al., 2012), using a branch and bound search strategy based on a structural model of the process. However, the complexity of such an algorithm critically depends on the cardinality of $\mathbf{S}$. In order to overcome this constraint, a preprocessing step is proposed in the present paper. Clustering techniques will be applied to reduce the candidate sensor set before solving the sensor placement problem as in (Sarrate et al., 2012).
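To make the GIVEN/FIND statement concrete, the following Python sketch solves a toy instance by exhaustive enumeration. It is not the structural method of the paper: as a simplifying assumption, the system description is replaced by a small binary signature matrix (rows are candidate sensors, columns are faults), a fault is taken as detectable when at least one selected sensor reacts to it, and two faults are counted as an isolable pair when their patterns over the selected sensors differ. All names and data are hypothetical.

```python
from itertools import combinations

def detectable_faults(signature, sensors):
    """Faults to which at least one selected sensor is sensitive."""
    n_faults = len(signature[0])
    return {j for j in range(n_faults) if any(signature[s][j] for s in sensors)}

def isolability_index(signature, sensors):
    """Number of fault pairs with different patterns over the selected sensors."""
    rows = sorted(sensors)
    n_faults = len(signature[0])
    patterns = [tuple(signature[s][j] for s in rows) for j in range(n_faults)]
    return sum(1 for a, b in combinations(patterns, 2) if a != b)

def best_configuration(signature, m):
    """Exhaustive baseline: best m-sensor set among those detecting every fault."""
    n_faults = len(signature[0])
    best, best_index = None, -1
    for config in combinations(range(len(signature)), m):
        if len(detectable_faults(signature, config)) < n_faults:
            continue                      # some fault would go undetected
        index = isolability_index(signature, config)
        if index > best_index:
            best, best_index = config, index
    return best, best_index

# Toy instance: 4 candidate sensors, 4 faults, install m = 2 sensors.
signature = [[1, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 1, 1],
             [1, 1, 1, 1]]
solution = best_configuration(signature, 2)   # ((0, 2), 5)
```

Enumeration is only viable for tiny instances, which is why the number of candidate sensors drives the cost of any exact search.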

4. STRUCTURAL APPROACH TO SENSOR PLACEMENT

4.1 Fault diagnosis based on structural analysis
The analysis of the model structure has been widely used in the area of model-based fault diagnosis (Blanke et al., 2006). Therefore, consistent tools exist in order to perform diagnosability analysis and consequently compute the set of detectable and isolable faults.

The structural model is often defined as a bipartite graph $G(M, X, A)$, where $M$ is a set of model equations, $X$ a set of unknown variables and $A$ a set of edges, such that $(e_i, x_j) \in A$ as long as equation $e_i \in M$ depends on variable $x_j \in X$. A structural model is a graph representation of the analytical model structure, since only the relation between variables and equations is taken into account, neglecting the mathematical expression of this relation.

Structural modeling is suitable for an early stage of the system design, when the precise model parameters are not known yet, but it is possible to determine which variables are related to each equation. Furthermore, the diagnosis analysis based on structural models is performed by means of graph-based methods which have no numerical problems and are more efficient, in general, than analytical methods. However, due to its simple description, it cannot be ensured that the diagnosis performance obtained from structural models will hold for the real system. Thus, only best case results can be computed.

It is well-known that the over-determined part of the model is the only useful part for system monitoring (Blanke et al., 2006). The Dulmage-Mendelsohn decomposition (Dulmage and Mendelsohn, 1958) is a bipartite graph decomposition that defines a partition on the set of model equations $M$. It turns out that one of these parts is the over-determined part of the model and is represented as $M^+$.

Fault detectability and isolability can be defined as properties of the over-determined part of the model (Krysander and Frisk, 2008). First, it is assumed that a single fault $f \in F$ can only violate one equation (known as fault equation), denoted by $e_f \in M$.
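A rough Python sketch of how the over-determined part $M^+$ can be extracted from a structural model is given here. It is an illustration under assumptions, not the tooling used by the authors: the model is a plain dictionary mapping each equation to its unknown variables, a maximum matching is found with simple augmenting paths, and $M^+$ is taken as the set of equations reachable from unmatched equations through alternating paths. Equation and variable names are hypothetical.

```python
def overdetermined_part(model):
    """Equations in the over-determined part M+ of a structural model.

    model: dict mapping each equation name to the set of unknown
    variables that appear in it (the edges of the bipartite graph).
    """
    match_var = {}                        # variable -> equation matched to it

    def try_match(eq, seen):
        # Augmenting-path step of a maximum bipartite matching.
        for var in model[eq]:
            if var in seen:
                continue
            seen.add(var)
            if var not in match_var or try_match(match_var[var], seen):
                match_var[var] = eq
                return True
        return False

    for eq in model:
        try_match(eq, set())

    matched = set(match_var.values())
    plus = set()
    stack = [eq for eq in model if eq not in matched]   # redundant equations
    while stack:                          # alternating search from unmatched equations
        eq = stack.pop()
        if eq in plus:
            continue
        plus.add(eq)
        for var in model[eq]:
            partner = match_var.get(var)
            if partner is not None and partner not in plus:
                stack.append(partner)
    return plus

# Just-determined model: 3 equations, 3 unknowns, so nothing can be monitored.
base = {"e1": {"x1"}, "e2": {"x1", "x2"}, "e3": {"x2", "x3"}}
# One extra (hypothetical) equation m1 involving only x3 adds redundancy.
extended = dict(base, m1={"x3"})
```

In the extended model every equation ends up in $M^+$, so a fault violating any of them would be monitorable in the structural sense.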
Without loss of generality, it is also assumed that a sensor $s \in \mathbf{S}$ can only measure one single unknown variable $x_s \in X$. In the structural framework, such a sensor will be represented by one single equation (known as sensor equation), denoted as $e_s$. Given a set of sensors $S$, the set of sensor equations is denoted as $M_S$. Thus, given a candidate sensor configuration $S$ and a model $M$, the updated system model corresponds to $M_S \cup M$. Hence, the detectable fault set and the set of isolable fault pairs can be determined as

$$F_D(S) = \{ f \in F \mid e_f \in (M_S \cup M)^+ \}, \tag{4}$$

$$F_I(S) = \{ (f_i, f_j) \in \mathbb{F} \mid e_{f_i} \in (M_S \cup (M \setminus \{e_{f_j}\}))^+ \}. \tag{5}$$

4.2 Optimal sensor placement algorithm
The sensor placement algorithm developed in (Sarrate et al., 2012) is briefly recalled in this section. Algorithm 1 is based on a depth-first branch and bound search. The search involves building a node tree by recursively calling function searchOp_m, beginning at the root node down to the leaf nodes. Each node corresponds to a sensor configuration (node.S) and child nodes are built by removing sensors from their corresponding parent node. Set node.R specifies those sensors that are allowed to be removed.

Algorithm 1  S* = searchOp_m(node, S*)
    childNode.R := node.R
    for m - (|S| - |R|) + 1 iterations do
        Take s ∈ childNode.R at random
        childNode.S := node.S \ {s}
        childNode.R := childNode.R \ {s}
        if I(childNode.S) > I(S*) and F_D(childNode.S) = F then
            if |childNode.S| > m then
                S* := searchOp_m(childNode, S*)
            else
                S* := childNode.S
                if I(node.S) = I(S*) then
                    return S*
                end if
            end if
        end if
    end for
    return S*

Throughout the search, the best solution is updated in $S^*$ whenever an $m$-sensor configuration with a higher fault isolability index than the current best one is found, given that all faults are detectable. The search is initialized as follows: node.S = node.R = $\mathbf{S}$ and $S^* = \emptyset$. During the search only those branches that can be further expanded to an $m$-sensor configuration are visited. Tree expansion is aborted whenever the fault isolability index is not improved or some faults are not detectable.

5. CLUSTERING APPROACH TO SENSOR PLACEMENT

5.1 Clustering techniques
Given a set of elements $x_1, x_2, \ldots, x_n$, clustering consists in partitioning the $n$ observations into $l$ sets $\phi_1, \phi_2, \ldots, \phi_l$ ($l \le n$) in such a way that objects in the same group (called cluster) are more similar (in some sense) to each other than to those in other groups (clusters). For example, the k-means clustering algorithm (MacQueen, 1967) minimizes the within-cluster sum of distances by solving the optimization problem

$$\arg\min_{\phi_1, \ldots, \phi_l} \sum_{i=1}^{l} \sum_{x_j \in \phi_i} d(x_j, \mu_i), \tag{6}$$

where $d$ is a distance and $\mu_i$ is the centroid of cluster $\phi_i$ (i.e. it is the mean of the observations in $\phi_i$ according to metric $d$). In the original algorithm, $d$ is the squared Euclidean distance, but other distance measures are possible. Problem (6) is nonconvex and obtaining the solution is NP-hard, but there are efficient heuristic algorithms that converge quickly to a local optimum.
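One such heuristic is the classical Lloyd iteration, sketched below in Python with NumPy. This is an illustrative sketch rather than the implementation used in the paper: the function names, the random initialization from data points and the handling of the cosine metric (rows are normalized to unit length so that the arithmetic mean can serve as centroid) are assumptions, and rows are assumed to be nonzero when the cosine metric is selected.

```python
import numpy as np

def _distances(X, C, metric):
    """Distance from every row of X to every row of C."""
    if metric == "sqeuclidean":
        return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    if metric == "cosine":
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
        return 1.0 - Xn @ Cn.T
    raise ValueError(f"unknown metric: {metric}")

def kmeans(X, l, metric="sqeuclidean", max_iter=100, seed=0):
    """Lloyd-type heuristic for problem (6); returns (labels, centroids)."""
    X = np.asarray(X, dtype=float)
    if metric == "cosine":
        X = X / np.linalg.norm(X, axis=1, keepdims=True)   # compare directions only
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=l, replace=False)].copy()
    for _ in range(max_iter):
        # Assignment step: each observation goes to its nearest centroid.
        labels = _distances(X, centroids, metric).argmin(axis=1)
        # Update step: each centroid becomes the mean of its members.
        updated = centroids.copy()
        for i in range(l):
            members = X[labels == i]
            if len(members) > 0:
                updated[i] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break                                          # local optimum reached
        centroids = updated
    return labels, centroids

# Two well separated groups of points, and two groups of directions.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, _ = kmeans(points, 2)
directions = [[1, 0], [2, 0.1], [0, 1], [0.1, 3]]
labels_cos, _ = kmeans(directions, 2, metric="cosine")
```

Because the result depends on the initial centroids, the `seed` argument makes it easy to rerun the heuristic from several starting points and keep the partition with the lowest cost.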

5.2 Problem reduction through clustering techniques
In this paper, we propose a reduction in the number of candidate sensors by grouping the $n$ initial sensors into $l$ groups ($l < n$) by applying the k-means algorithm. Then, a representative sensor will be selected for each cluster, setting up the new candidate sensor set.

In this case, the criterion used for determining the similarity between elements (sensors) is the sensitivity pattern of their primary residuals to faults. In particular, according to the procedure described in Section 2, this is given by every row of the fault sensitivity matrix defined in (3). So, choosing $x_i = \omega_i$, $i = 1, \ldots, n$ (where $\omega_i$ is the $i$-th row vector of matrix $\Omega$) and applying the k-means algorithm defined in (6), a set of $l$ clusters of sensors with a similar fault sensitivity pattern will be obtained. Since residuals are directional, the cosine distance is chosen for the k-means algorithm.

Once the elements $x_i$ (sensors) have been grouped in $l$ clusters, the most representative sensors $c_i$, $i = 1, \ldots, l$, can be chosen as the nearest ones to the cluster centroids among the elements of each cluster.

6. APPLICATION TO A WATER DISTRIBUTION NETWORK

6.1 Water network description
The sensor placement methodology is applied to a DMA located in the Barcelona area (see Figure 2). It has 883 nodes and 927 pipes. The network consists of 311 nodes with demand (RM type), 60 terminal nodes with no demand (EC type), 48 hydrant nodes without demand (HI type), 14 dummy valve nodes without demand (VT type) and 448 dummy nodes without demand (XX type). The network has two inflow inputs modeled as reservoir nodes.

Leakage detection is based on the premise that damage (leakage) in one or more locations of the piping network involves local liquid outflow at the leakage location, which will change the flow characteristics (pressure heads, flow rates, acoustic signals, etc.) at the monitoring locations of the piping network.

In this work, it is assumed that leaks might only occur at XX type nodes, so there are 448 potential leaks to be detected and located. In reality, leaks could occur at any network node or pipe. However, leak locations have been restricted to nodes of a certain type in order to limit the size and complexity of the problem. A similar practical reason applies when defining the possible location of the network monitoring points. Pressure sensors at RM type nodes will be used as network monitoring points, so there are 311 candidate sensors that could be chosen for installation. Although measuring flow rate could also be useful for leak detection, collecting pressure data is cheaper and easier, and pressure transducers give instantaneous readings whereas most flow meters do not react instantaneously to flow changes (de Schaetzen et al., 2000).

6.2 Water network model
Solving the sensor placement problem defined in Section 3 requires a structural model of the water network (as described in Section 4.1) and a fault sensitivity matrix (see Equation (3)). The model of the DMA comprises 883 flow balance equations

$$\sum_{q_i \in Q_n} q_i = d_n, \tag{7}$$

where $Q_n$ represents all flows corresponding to incident edges to node $n$ and $d_n$ is the known flow demand of node $n$, and 927 pipe flow equations

$$q_e(t) = f(\Delta p_e), \tag{8}$$

where $q_e$ is the flow corresponding to edge $e$ and $f$ is a nonlinear function of the pressure drop $\Delta p_e$ between the adjacent nodes of edge $e$. These equations depend on 927 unknown flow variables and 883 unknown pressure variables. The resulting structural model is depicted in Figure 1. A dot $(e_i, x_j)$ in the figure indicates that variable $x_j$ appears in equation $e_i$.

Figure 1. Structural DMA model (axes: pipe flow equations and flow balance equations; flow variables and pressure variables).

A leak in a node involves violating a flow balance equation, so fault equations are Equation (7) for type XX nodes.

A fault sensitivity matrix has also been obtained using the EPANET hydraulic simulator. Given a set of boundary conditions (such as water demands), EPANET has first been used to estimate the steady-state pressure at the 311 RM type nodes. Next, 448 leaks have been simulated in the XX type nodes and the steady-state pressure has been estimated again at the 311 RM type nodes. Finally, a $311 \times 448$ fault sensitivity matrix has been obtained as the pressure difference between the fault free case and each faulty situation, according to the procedure described in Section 2. Although the fault sensitivity matrix depends on the leakage size, the diagnosability properties are robust against this uncertainty. So in this work the minimum detectable leakage size has been considered in the simulations.

6.3 Sensor placement analysis and results
In principle, to fully isolate all 448 possible leaks, the required isolability index should be $\binom{448}{2} = 100128$. However, according to the structural analysis, even installing all 311 candidate sensors the isolability index would just be 100099. Achieving a better performance would require installing more sensors than those designated in the candidate sensor set. Therefore, there is a trade-off between the diagnosis performance and the number of installed sensors.
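The isolability figures quoted above can be checked with a few lines of Python. The variable names are illustrative; the numbers are those given in the text.

```python
from math import comb

n_leaks = 448                                    # candidate leak locations (XX type nodes)
full_isolability = comb(n_leaks, 2)              # every pair of leaks distinguishable
all_sensors_index = 100099                       # reported with all 311 candidate sensors
non_isolable_pairs = full_isolability - all_sensors_index
```

So even with every candidate pressure sensor installed, 29 of the 100128 leak pairs would remain structurally indistinguishable.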
Assume that the water distribution company has established a maximum budget for investment in instrumentation that makes it possible to install up to 8 pressure sensors. Hence, the water distribution company wants to install 8 sensors maximizing the resulting diagnosis performance. Applying Algorithm 1 to the initial candidate sensor set is not feasible since it would require a huge amount of computation time. So first of all the candidate sensor set will be reduced by applying the clustering approach described in Section 5.2.

The k-means algorithm has been applied to partition the initial candidate sensor set into 31 clusters, and a representative sensor for each cluster has been found. So, the new candidate sensor set now has 31 pressure sensors (see the blue circled nodes in Figure 2). Next, applying Algorithm 1, the 8-sensor configuration pointed to by red arrows in Figure 2 is obtained. With these 8 sensors all leaks can be detected and the isolability index amounts to 100092.

Figure 2. DMA network sensor placement results.

Regarding performance issues, the clustering step takes around 22 s, whereas the branch and bound search takes more than 7 h. Bearing in mind this computation time difference, it might sound appealing to directly apply the clustering step to obtain the 8-sensor configuration. However, this would not necessarily produce the same results, for several reasons.

First, it is well known that although the k-means algorithm finds a locally optimal partition, it does not necessarily find the global optimum. In fact, the algorithm is significantly sensitive to the initial randomly selected cluster centres. To alleviate this drawback the algorithm is commonly run multiple times with different initial conditions.

Secondly, notice that only a reduced set of directional residuals (the primary residuals) are represented in the fault sensitivity matrix according to the simulation method proposed in (Pérez et al., 2011). In fact, the full set of directional residuals that could be designed based on the model equations would be much bigger, but computationally harder to obtain. The structural analysis approach, in contrast, takes all structured residuals into account. Thus its results are more complete.

Therefore, the sole purpose of the clustering step is complexity reduction, while the branch and bound step remains desirable since it produces sound and complete results.
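The depth-first branch and bound search of Algorithm 1 can be sketched in Python as follows. Several simplifications are assumptions of this sketch and not part of the paper: the structural evaluation of Equations (4) and (5) is replaced by a toy binary signature matrix (rows are candidate sensors, columns are faults), the sensor to remove is picked in a fixed order instead of at random so that runs are reproducible, and the iteration count is read as m - (|node.S| - |node.R|) + 1. The search also assumes there are more candidates than sensors to install.

```python
from itertools import combinations

def all_detectable(sig, sensors):
    """Toy stand-in for F_D(S) = F: every fault is seen by some selected sensor."""
    return all(any(sig[s][j] for s in sensors) for j in range(len(sig[0])))

def isolability_index(sig, sensors):
    """Toy stand-in for I(S): fault pairs with different patterns over the sensors."""
    rows = sorted(sensors)
    patterns = [tuple(sig[s][j] for s in rows) for j in range(len(sig[0]))]
    return sum(a != b for a, b in combinations(patterns, 2))

def search(sig, m, node_S, node_R, best):
    """One node of the search tree; returns the best m-sensor set found so far."""
    child_R = set(node_R)
    for _ in range(m - (len(node_S) - len(node_R)) + 1):
        s = min(child_R)                  # fixed order instead of a random pick
        child_R.discard(s)
        child_S = node_S - {s}            # child keeps earlier siblings' sensors fixed
        if (isolability_index(sig, child_S) > isolability_index(sig, best)
                and all_detectable(sig, child_S)):
            if len(child_S) > m:
                best = search(sig, m, child_S, frozenset(child_R), best)
            else:
                best = child_S
                # The parent's index bounds its subtree: stop once it is reached.
                if isolability_index(sig, node_S) == isolability_index(sig, best):
                    return best
    return best

def place_sensors(sig, m):
    candidates = frozenset(range(len(sig)))
    return search(sig, m, candidates, candidates, frozenset())

# Toy instance: 4 candidate sensors, 4 faults, install m = 2 sensors.
signature = [[1, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 1, 1],
             [1, 1, 1, 1]]
best = place_sensors(signature, 2)
feasible = [c for c in combinations(range(4), 2) if all_detectable(signature, c)]
```

Pruning is safe here because removing sensors can never increase either toy measure, which mirrors the reason tree expansion can be aborted in the structural setting.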
Although the branch and bound search is time consuming, its performance is much better than that of an exhaustive search. Note that during the branch and bound search, the most demanding operation is evaluating the isolability index through Equation (5), which takes on average 1.24 s in this case. Whereas Algorithm 1 computes it just 17286 times, an exhaustive search would involve evaluating it $\binom{31}{8} = 7888725$ times, which would require more than 100 days.

7. CONCLUSIONS
This work presents an optimal sensor placement strategy that maximizes the water network leak diagnosability. The goal is to characterize and determine a sensor set that guarantees a maximum degree of diagnosability while a budgetary constraint is satisfied. To overcome the complexity of the problem, this work proposes the combination of a branch and bound search based on a structural model of the distribution network and clustering techniques.

The strategy developed is successfully applied to a District Metered Area of the Barcelona Water Distribution Network. The results show that this combined technique manages to solve the sensor placement problem in a reasonable time, which otherwise would not be possible.

Although promising, these are preliminary results and more research has to be done. On the one hand, as already mentioned in Section 6.3, the k-means algorithm does not guarantee a globally optimal solution, so the performance of other clustering techniques should be investigated. On the other hand, applying clustering techniques to reduce the problem complexity by partitioning the fault set could be investigated.

ACKNOWLEDGEMENTS
This work has been funded by the Spanish MINECO through the project CYCYT SHERECS (ref. DPI2011-26243), and by the European Commission through contracts i-Sense (ref. FP7-ICT-2009-6-270428) and EFFINET (ref. FP7-ICT2011-8-318556).

REFERENCES
Blanke M., Kinnaert M., Lunze J. and Staroswiecki M. (2006). Diagnosis and Fault-Tolerant Control. Springer, 2nd ed.
Blesa J., Puig V. and Saludes J. (2012). Robust identification and fault diagnosis based on uncertain multiple input multiple output linear parameter varying parity equations and zonotopes. J. Proc. Cont., 22(10), 1890-1912.
Brdys M.A. and Ulanicki B. (1994). Operational Control of Water Systems: Structures, Algorithms, and Applications. Prentice Hall International.
Dulmage A.L. and Mendelsohn N.S. (1958). Coverings of bipartite graphs. Canad. J. Math., 10, 517-534.
Farley M. and Trow S. (2003). Losses in Water Distribution Networks. IWA Publishing.
Jain A.K., Murty M.N. and Flynn P.J. (1999). Data clustering: a review. ACM Comput. Surv., 31(3), 264-323.
Krysander M. and Frisk E. (2008). Sensor placement for fault diagnosis. IEEE Trans. Syst. Man Cybern. A, 38(6), 1398-1410.
MacQueen J. (1967). Some methods for classification and analysis of multivariate observations. In: Proc. 5th Berkeley Symp. on Math. Statist. and Prob., Berkeley, 1, 281-297. Univ. of Calif. Press.
Pérez R., Puig V., Pascual J., Quevedo J., Landeros E. and Peralta A. (2011). Methodology for leakage isolation using pressure sensitivity analysis in water distribution networks. Contr. Eng. Pract., 19(10), 1157-1167.
Sarrate R., Nejjari F. and Rosich A. (2012). Sensor placement for fault diagnosis performance maximization in distribution networks. In: Proc. 20th Medit. Conf. on Cont. & Autom., Barcelona, Spain, 110-115, July 2012.
de Schaetzen W.B.F., Walters G.A. and Savic D.A. (2000). Optimal sampling design for model calibration using shortest path, genetic and entropy algorithms. Urb. Wat., 2(2), 141-152.
Travé-Massuyès L., Escobet T. and Olive X. (2006). Diagnosability analysis based on component-supported analytical redundancy relations. IEEE Trans. Syst. Man Cybern. A, 36(6), 1146-1160.
Xu R. and Wunsch D. (2005). Survey of clustering algorithms. IEEE Trans. Neur. Net., 16(3), 645-678.