Distributed Monitoring and Aggregation in Wireless Sensor Networks

Changlei Liu and Guohong Cao
Department of Computer Science & Engineering
The Pennsylvania State University
E-mail: {chaliu, gcao}@cse.psu.edu

Abstract

Self-monitoring of sensor statuses such as liveness, node density and residual energy is critical for maintaining the normal operation of the sensor network. When building the monitoring architecture, most existing work focuses on minimizing the number of monitoring nodes. However, with fewer monitoring points, the false alarm rate may increase as a consequence. In this paper, we study the fundamental tradeoff between the number of monitoring nodes and the false alarm rate in wireless sensor networks. Specifically, we propose fully distributed monitoring algorithms to build up a poller-pollee based architecture, with the objective of minimizing the overall number of pollers while bounding the false alarm rate. Based on the established monitoring architecture, we further explore the hop-by-hop aggregation opportunity along the multihop path from the pollee to the poller, with the objective of minimizing the monitoring overhead. We show that the optimal aggregation path problem is NP-hard and propose an opportunistic greedy algorithm, which achieves an approximation ratio of 5/4. As far as we know, this is the first proved constant approximation ratio applied to aggregation path selection schemes over wireless sensor networks.

I. INTRODUCTION

As sensor nodes usually operate in unattended, harsh environments, they are prone to failure and may run out of battery [], []. To make sensor networks reliable as well as adaptable, sensor status (such as liveness, density estimation, residual energy, etc.) has to be closely monitored and made known to the sink, which can promptly react to sensor status changes.

In distributed systems, the only way to learn the status of a node is through receiving messages from the node. For example, in IP networks, the poller-pollee structure [0] has been widely used for network management, where some specialized nodes are called pollers and the other nodes are called pollees.
Each poller monitors its pollees by actively sending a ping message and then waiting for the reply, or by passively waiting for the pollees to send messages periodically.

Compared with wired networks, designing monitoring mechanisms for sensor networks poses more challenges. One major challenge is that instead of operating over a fixed topology, sensors need to self-organize into a monitoring architecture in a distributed manner. Therefore, distributed monitoring is a critical and challenging issue.

In this paper, we propose solutions that address the challenges specific to sensor networks, and design a fault-tolerant, energy-efficient monitoring system in a distributed manner. The whole architecture is built upon the poller-pollee structure, where sensors self-organize into two tiers, with pollees in the lower tier and pollers in the upper tier. The pollees send status reports to the pollers along multihop paths, during which the intermediate nodes aggregate the reports to reduce the message overhead. Each poller makes local decisions based on the received aggregated packets, and forwards its decisions toward the sink.

When building the monitoring architecture, we focus on the fundamental tradeoff between the number of monitoring nodes (i.e., pollers) and the false alarm rate. Most of the previous work aims at minimizing the number of pollers only, because selecting more pollers makes it harder to track the status of each poller and thus increases the network management cost [0]. However, in a lossy environment, the false alarm rate can be adversely affected by a smaller number of pollers. For example, if the number of pollers is too small, some pollees will be too far away from their pollers; the chance of transient link failures along the path is then higher, and the false alarm rate is larger.

This work was supported in part by the National Science Foundation under grant CNS-0.
To balance the tradeoff between the number of pollers and the false alarm rate, we propose a distributed deterministic algorithm, which uses two parameters $k_1, k_2$ to guide a better distribution of pollers and pollees; i.e., no two pollers are less than $k_1$ hops away from each other, and no pollee is more than $k_2$ hops away from its poller. This property enables us to minimize the number of pollers while bounding the maximum false alarm rate. We discuss how to set up these parameters and further reduce the message overhead based on a randomized technique.

To increase the energy efficiency and reduce the monitoring overhead, we exploit the hop-by-hop aggregation opportunities in sensor networks. When pollees, which can be multiple hops away from the poller, send status reports to their poller, the status reports can be aggregated to reduce the number of packets needed. It is nontrivial to determine which aggregation paths should be used in order to achieve better aggregation. Fig. 1 gives an example in which each of the four pollees sends a report to the same poller periodically, under a fixed aggregation ratio ($\gamma$), defined as the number of status reports that can be aggregated into one packet. Fig. 1(a) shows that without aggregation, 4 packets are needed for each period. If the reports are aggregated along the paths shown in Fig. 1(b), all 4 reports can be packed into fewer packets, which lowers the cost in packet-hops. If the aggregation paths of Fig. 1(c) are used, still fewer packets are needed and the total cost in packet-hops drops further. In this paper, we formulate the selection of the optimal aggregation paths to minimize the transmission energy, show that the problem is NP-hard, and prove that an opportunistic greedy forwarding scheme has an approximation ratio of 5/4. We also prove that if the pollees are within 2 hops of the poller, i.e., $k_2 = 2$, the bounded ratio is $1 + (\gamma - 1)/\gamma^2$.

Fig. 1. An example to illustrate the benefit of aggregation: (a) without aggregation, (b) aggregation path I, (c) aggregation path II. Each line denotes a physical link, and each arrow denotes a packet transmission over one hop (a packet-hop).

The rest of the paper is organized as follows. Section II presents the poller-pollee based monitoring architecture. Section III formulates the minimum poller selection problem and proposes the distributed algorithms. Section IV focuses on the optimal aggregation path problem, and studies the greedy algorithm with a constant approximation ratio. Performance evaluations are done in Section V. Section VI overviews the related work and Section VII concludes the paper.

II. THE POLLER-POLLEE BASED MONITORING ARCHITECTURE

Fig. 2. The poller-pollee structure: (a) reactive mode, (b) proactive mode.

Fig. 2 shows the two widely used operational modes of the poller-pollee structure, where each poller can either poll the pollees and wait for a reply (i.e., reactive mode) or let the pollees send reports periodically (i.e., proactive mode). As the proactive mode cuts the monitoring traffic by half compared with the reactive mode, we use the proactive mode in this paper.

Each link in Fig. 2 is a logical link, which could consist of multiple hops of physical links. As a result, the status reports destined to the same poller have the opportunity to be aggregated at every intermediate node. To do aggregation, each intermediate node may have to wait for reports to arrive from the downstream nodes. Due to the regular pattern of the monitoring traffic, the aggregation rule can be well defined without adding extra delay. The poller, the pollees, and the physical links between them form a tree. If a node is at the edge of the tree, it is called an edge node; otherwise it is called a non-edge node. Let $t$ denote the polling time interval. Every $t$, each pollee sends a report to the poller.
Suppose the poller needs time $T_d$, referred to as the detection time, to detect a pollee failure. A false alarm occurs if a pollee does not fail but the poller misses all its reports within $T_d$ [], [].

The Aggregation Aware Monitoring can be simply stated as follows. Each pollee schedules a report to the same poller every $t$; each non-edge pollee collects reports from each of its children and sends an aggregated report (including its own) to the same poller every $t$, subject to the aggregation ratio $\gamma$; each poller makes the decision about each pollee's status every $T_d$, and informs the sink when necessary. It is required that $t \le T_d$.

Fig. 3. An example of the aggregation-assisted monitoring scheme: (a) topology, (b) unsynchronized schedule. The vertical arrowed lines denote the events of packet transmission/reception.

Fig. 3 uses an example to illustrate the poller-pollee based monitoring. Fig. 3(a) shows the topology, where $n_0$ is the poller and $n_1, n_2, n_3$ are the pollees. Fig. 3(b) shows the schedule at each node, where the vertical arrowed lines denote the events of packet transmission/reception. The time axis is divided into slices of the polling interval $t$, and different nodes do not have to be synchronized. Every $t$, two of the pollees send reports to the third, which aggregates the received reports with its own into one packet (assuming the aggregation ratio is at least 3). With detection timer $T_d = 2t$, the poller will receive two reports from each pollee every $T_d$. Suppose one report from a pollee is lost; the poller may still receive another report during the next polling interval and will not raise a false alarm about that pollee. Thus, there is a tradeoff between the false alarm rate and the detection delay. To reduce the false alarm rate, the detection timer $T_d$ should be increased, and vice versa.

False alarms result from the lossy nature of wireless links. The failure characteristics of wireless links have been studied in [] by analyzing the packet traces over a real sensor testbed. Due to the observed bursty and transient error patterns, the wireless link can be modeled as a continuous-time Markov chain [], [8], and we have established the relationship between the false alarm rate and distance [].
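To make the tradeoff between detection delay and false alarms concrete, the following minimal sketch computes a false alarm probability under a deliberately simplified model. It is not the Markov-chain link model referred to above: it assumes that every hop drops a report independently with probability `link_loss`, and that the poller raises a false alarm only when all `reports_per_window` reports (that is, $T_d / t$ of them) from a live pollee are lost. The function name and its parameters are illustrative assumptions.

```python
def false_alarm_rate(hops: int, reports_per_window: int, link_loss: float) -> float:
    """Probability that a live pollee is wrongly declared failed.

    Simplifying assumption (not the paper's Markov-chain model): each hop
    drops a report independently with probability link_loss.
    """
    if hops < 1 or reports_per_window < 1:
        raise ValueError("hops and reports_per_window must be >= 1")
    if not 0.0 <= link_loss <= 1.0:
        raise ValueError("link_loss must be in [0, 1]")
    # A report survives only if every hop on the path delivers it.
    p_report_lost = 1.0 - (1.0 - link_loss) ** hops
    # A false alarm needs every report in the detection window to be lost.
    return p_report_lost ** reports_per_window


if __name__ == "__main__":
    # Hypothetical loss value; rows correspond to T_d = t, 2t and 4t.
    for window in (1, 2, 4):
        row = [false_alarm_rate(h, window, 0.05) for h in range(1, 6)]
        print(window, [f"{p:.4f}" for p in row])
```

Under this assumption the rate grows with the hop count and shrinks as the detection window covers more polling intervals, which is the qualitative behavior the text describes.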
As our work focuses on how to build a poller-pollee architecture, it is independent of the underlying link model. Therefore, we take the results of [] and apply them here. Assume the link failure rate is $f_l$, and use $F(h, T_d)$ to denote the false alarm rate when the pollee is $h$ hops away from the poller and the detection timer at the poller is $T_d$. Fig. 4 shows the false alarm rate as a function of the number of hops from poller to pollee when $f_l = 0.0$. As can be seen, the false alarm rate increases as $h$ increases, since a longer path is more vulnerable to failures. As $T_d$ increases, the false alarm rate decreases. This is also consistent with the results of [].

Fig. 4. Relationship between the false alarm rate and the distance (x-axis: number of hops from poller to pollee, $h$; y-axis: false alarm rate; one curve $F(h, T_d)$ per detection timer, up to $4t$).

III. THE MINIMUM POLLER SELECTION PROBLEM

In this section, we formulate the poller selection problem, show that it is NP-hard, and propose distributed algorithms.

A. Problem Formulation

We consider a network of $n$ sensors, where all sensors are capable of being either pollers or pollees. On the one hand, we want to select the minimum number of nodes as pollers so that the management cost of pollers can be minimized. On the other hand, if the number of pollers is too small, some pollees will be many hops away from a poller, thus increasing the false alarm rate. Therefore, our goal is to strike a balance between the number of pollers and the false alarm rate.

Since the pollers may also fail, we associate each pollee with $\omega$ pollers. Each pollee maintains pointers to the $\omega$ different pollers but sends status reports to only one poller at a time. When that poller fails, the associated pointer is marked as outdated and the next poller on the list is used.

We formulate the poller selection problem minpl, with the objective of minimizing the number of pollers while limiting the maximum false alarm rate, as follows. Given a network graph and the error rate of each link, determine (1) which nodes are pollers and which nodes are pollees, and (2) a many-to-many mapping where each pollee is associated with $\omega$ pollers, so as to minimize the total number of pollers while limiting the maximum false alarm rate of pollees.

Theorem 1: The problem minpl is NP-hard.

Proof: The problem minpl can be proved to be NP-hard via a reduction from the minimum k-hop dominating set problem [4], which can be seen as a special case of minpl when $\omega = 1$ and the link failure rate is constant. This is because, according to Fig. 4, different hop numbers between pollee and poller correspond to different false alarm rates. Therefore, satisfying the constraint on the maximum false alarm rate is equivalent to limiting the maximum number of hops from pollee to poller.

Fig. 5. Relationship between $\rho$ and message overhead (x-axis: the probability $\rho$; y-axis: the number of messages; one curve per network size $n$).

B. Distributed Poller Selection Algorithm

The construction of the poller-pollee structure shares some similarity with traditional clustering schemes, where a poller is similar to a cluster head. However, there are fundamental differences between them.
First, the traditional clustering schemes are single-hop, but a pollee only needs to be within some bounded number of hops of its poller. Second, with multiple hops between the poller and pollee, aggregation is used to reduce the monitoring traffic, which is not considered in clustering schemes. Third, each pollee may be associated with $\omega$ pollers to be fault tolerant, whereas each cluster member has only one cluster head. Thus, the traditional clustering scheme is only a special case, namely the single-hop poller-pollee structure with $\omega = 1$. Below, we first propose a distributed deterministic selection algorithm, and then present a hybrid algorithm to further reduce the message overhead.

1) The Randomized Algorithm: The randomized algorithm is presented as a baseline for comparison. Each node elects itself as a poller with probability $\rho$. Pollers then announce their status within $k_2$ hops. Sensor nodes that did not elect themselves as pollers will be pollees. The randomized algorithm is very simple, yet it may produce some pathological scenarios where multiple pollers cluster together in some areas and no poller exists in some other areas. To address this problem, we propose a deterministic algorithm.

Algorithm 1 Deterministic Poller Selection Algorithm
Input: a graph G(N, E), k_1, k_2
Output: a Poller Set S_er, a Pollee Set S_ee
Procedure: Determine(k_1, k_2)
 1: Initialize the status and the timer
 2: Broadcast locally to get k_1-hop neighborhood information
 3: if timer not expired then
 4:   if id is the smallest among k_1-hop unlabeled neighbors then
 5:     broadcast pollerID = id within k_2 hops, exit
 6:     /* S_er = S_er ∪ {id} */
 7:   end if
 8:   wait until a packet is received or the timer is expired
 9:   if pollerID received then
10:     broadcast polleeID within k_1 hops, exit
11:     /* S_ee = S_ee ∪ {id} */
12:   end if
13:   if polleeID received then
14:     update the unlabeled list within k_1 hops, reset timer and go to 4
15:   end if
16: else
17:   go to 1
18: end if

2) The Deterministic Algorithm: The proposed deterministic algorithm is based on the distributed maximal independent set (MIS) algorithm []. An independent set is a subset of nodes among which there is no edge between any two nodes.
The set is an MIS if no more nodes can be added to generate a bigger independent set. In the deterministic algorithm, the concept of MIS is extended to the multihop environment. Two parameters $k_1, k_2$ are used to govern the distribution of the pollers and pollees, to ensure that no two pollers are less than $k_1$ hops from each other and no pollee is more than $k_2$ hops from its poller. That is, the set $S_{er}$ is a $k_1$-hop MIS, in which no two nodes are less than $k_1$ hops away from each other.

Since the parameters $k_1, k_2$ control the geometrical properties of the poller and pollee distribution, they should be determined beforehand according to user demand. First, given the constraint on the false alarm rate, $k_2$ can be determined based on the relationship between the false alarm rate and the hop distance, such as in Fig. 4. Thus, bounding the maximum false alarm rate becomes equivalent to bounding the maximum distance from pollees to the poller. Second, after $k_2$ is determined, $k_1$ can be selected. A large $k_1$ reduces the number of selected pollers, but some pollees may then fail to find $\omega$ pollers within $k_2$ hops. Therefore, an appropriate $k_1$ should be chosen to strike a balance between the number of pollers and the number of pollees that cannot find $\omega$ pollers. This can be achieved experimentally, as in Section V.

Given $k_1, k_2$, Algorithm 1 lists the pseudo code of the deterministic poller selection algorithm. At the outset, each node needs to obtain the $k_1$-hop neighborhood information by a localized broadcast. Then the algorithm proceeds in rounds. In each round, if a node has the smallest id among its $k_1$-hop unlabeled neighbors, it elects itself as a poller belonging to the set $S_{er}$, and broadcasts pollerID = id within $k_2$ hops. Nodes that are not yet labeled but received the poller declaration will label themselves as pollees, and broadcast polleeID = id within $k_1$ hops. After that, a new round starts, during which the algorithm is executed among the remaining unlabeled nodes. This process repeats until all nodes are labeled either as poller or pollee.

Fig. 6. A numerical example. The deterministic algorithm (above) runs in three rounds, exchanging 22 messages. The hybrid algorithm (below) has a randomized phase and two deterministic phases, exchanging 10 messages.

The upper diagram of Fig. 6 uses a small network of 11 nodes to explain the deterministic algorithm, assuming $k_1 = k_2$. In the first round, two nodes elect themselves as pollers since their ids are the smallest within their $k_1$-hop neighborhoods, and their six neighboring nodes within $k_2$ hops are recruited as pollees. Then the labeling process repeats among the remaining nodes until every node is labeled. Thus, in the second round, one more node is labeled as poller and one node is labeled as pollee. In the third round, the last node is labeled as poller.

The message complexity of Algorithm 1 can be computed as follows. Assume there are $n$ nodes. At first, each node needs to broadcast within $k_1$ hops to get the neighborhood information (line 2). After that, each poller needs to broadcast within $k_2$ hops (line 5) and each pollee needs to broadcast within $k_1$ hops (line 10). Suppose $k_1 = k_2 = k$; then each node broadcasts within $k$ hops exactly twice during the execution of the algorithm. Further suppose $d$ is the node density, and $r$ is the node communication range. A $k$-hop broadcast reaches roughly the $d\,\pi (kr)^2$ nodes inside a disk of radius $kr$, so the message complexity can be estimated by $O(2n \cdot d\,\pi (kr)^2) = O(k^2 n d)$.

It is possible to set $k_1$ as a fractional number.
For example, when the fractional part of $k_1$ is 0.2, each node chooses $\lfloor k_1 \rfloor$ with probability 0.8 and $\lceil k_1 \rceil$ with probability 0.2. As $k_1$ controls the distance between neighboring pollers, allowing $k_1$ to be a fractional number provides more flexibility to control the number of pollers, which can be seen from Section V. Algorithm 1 then needs to be extended as follows. First, each node sets its working value $k_1' = \lfloor k_1 \rfloor$ with probability $\lceil k_1 \rceil - k_1$ and $k_1' = \lceil k_1 \rceil$ with probability $k_1 - \lfloor k_1 \rfloor$. Second, each node elects itself as poller among its $k_1'$-hop unlabeled neighbors (line 4). Third, each node broadcasts its id (lines 5, 10), but updates the unlabeled list within $k_1'$ hops (line 14).

Algorithm 2 Hybrid Poller Selection Algorithm
Input: a graph G(N, E), k_1, k_2, ρ
Output: a Poller Set S_er, a Pollee Set S_ee
Procedure: Hybrid(k_1, k_2)
 1: Initialize the node label, timer, S_er = φ, S_ee = φ
 2: generate a random number σ ∈ (0, 1)
 3: if σ < ρ then
 4:   S_er = S_er ∪ {id}
 5: end if
 6: if id ∈ S_er /* node id is temporarily labeled as poller */ then
 7:   execute the deterministic algorithm Determine(k_1, k_2)
 8:   /* S_er and S_ee are updated */
 9: end if
10: if node id is unlabeled then
11:   wait until a packet is received or the timer is expired
12:   if pollerID received then
13:     label itself as pollee /* S_ee = S_ee ∪ {id} */
14:   end if
15: end if
16: if id ∈ {N − S_er − S_ee} /* node id is not labeled */ then
17:   execute the deterministic algorithm Determine(k_1, k_2)
18: end if

3) The Hybrid Algorithm: In the deterministic algorithm, each node needs to broadcast within $k$ hops twice, assuming $k_1 = k_2 = k$. As $k$ becomes larger, the message complexity increases dramatically. To reduce this message overhead, we propose a hybrid algorithm which combines the randomized algorithm and the deterministic algorithm. As shown in Algorithm 2, the algorithm starts with a randomized phase, during which each node labels itself as a temporary poller with probability $\rho$. After that, the deterministic algorithm is executed among the temporary pollers. If two temporary pollers are less than $k_1$ hops away from each other, one of them will change its role from poller to pollee and the other one will confirm itself as poller.
After receiving the confirmed pollerID, the unlabeled nodes within the $k_2$-hop range of the confirmed pollers change their status to pollee. To implement this, a field in the packet header needs to be reserved to differentiate the confirmed pollerID from the temporary pollerID. Finally, the deterministic algorithm is executed among the set of unlabeled nodes, i.e., $\{N - S_{er} - S_{ee}\}$, to have all the nodes labeled.

Compared with the deterministic algorithm, the hybrid algorithm has much less message overhead. First, due to the use of the randomized phase, each node does not have to make the initial $k_1$-hop broadcast. Second, when the deterministic algorithm is executed for the first time (line 7), only the temporary pollers are involved. Third, after receiving the confirmed pollerID, many unlabeled nodes become pollees, which refrain from broadcasting. Thus, only a small portion of the nodes still remain unlabeled and execute the deterministic algorithm for the second time (line 17).

In the following, we give a detailed analysis of the message overhead, based on which the optimal value of $\rho$ can be derived to minimize the message overhead. For ease of illustration, we first assume $k_1 = k_2 = k$, and then extend it to the general case of $k_1 \neq k_2$. First, the randomized phase introduces zero message overhead. As a result, on average a fraction $\rho$ of the nodes are labeled as temporary pollers and execute the deterministic algorithm for the first round (line 7). Among the $n\rho$ temporary pollers, a fraction of nodes, say $\alpha$, are confirmed as pollers, and a fraction $1 - \alpha$ of the nodes become pollees. During the execution of the first-round deterministic algorithm, the $n\rho\alpha$ confirmed pollers recruit unlabeled nodes within $k$ hops as pollees. After that, only a fraction, say $\beta$, of the $n(1 - \rho)$ unlabeled nodes still remain unlabeled and execute the second round (line 17). Therefore, the total number of nodes involved in the two rounds of the deterministic algorithm is $n\rho + n(1 - \rho)\beta$, with all the other nodes recruited as pollees by the $n\rho\alpha$ confirmed pollers. Further use $M(k)$ to denote the message complexity of a $k$-hop broadcast. Since each participating node needs to broadcast twice within $k$ hops, the overall message complexity is:

$$2\,[\,n\rho + n(1 - \rho)\beta\,]\,M(k) \qquad (1)$$

$\beta$ is the fraction of the $n(1 - \rho)$ unlabeled nodes that do not fall in the $k$-hop range of any of the $n\rho\alpha$ confirmed pollers. For a node $n_i$, the probability that it falls within the $k$-hop range of a node $n_j$ is $\pi k^2 r^2 / a$, where $a$ denotes the area size. Thus,

$$\beta = \left(1 - \frac{\pi k^2 r^2}{a}\right)^{n\rho\alpha} \qquad (2)$$

$\alpha$ denotes the fraction of nodes that are labeled as pollers if $n\rho$ uniformly distributed nodes run the deterministic algorithm. When two nodes are within $k$ hops, one of them will refrain from being a poller with half chance. Since there are $n\rho$ nodes in total, the probability for each node to become a poller is

$$\alpha = \left(1 - \frac{\pi k^2 r^2}{2a}\right)^{n\rho} \qquad (3)$$

Substituting Eqn. 2 and Eqn. 3 into Eqn. 1 gives the message complexity when $k_1 = k_2$. Fig. 5 numerically evaluates the effect of varying the number of nodes $n$ and the probability $\rho$ on the amount of message overhead, for fixed values of $a$, $r$, and $k_1 = k_2$. It is shown that each curve has an optimal $\rho$ corresponding to the minimum message overhead. In addition, the optimal value of $\rho$ decreases as the number of nodes $n$ increases, as can be read from the two network sizes plotted in Fig. 5.

Fig. 6 uses a simple network to compare the deterministic algorithm and the hybrid algorithm. In the hybrid algorithm (lower diagram), the randomized phase selects three nodes as temporary pollers, among which the deterministic algorithm is executed. One of them resigns from the role of poller and becomes a pollee because its id is larger than that of a neighboring node which is also a temporary poller. As a result, two nodes are confirmed as pollers and broadcast locally to recruit six nodes as pollees. After that, only two nodes still remain unlabeled, among which the second round of the deterministic algorithm is executed. Comparing the message overhead, the deterministic algorithm has $11 \times 2 = 22$ message exchanges, but the hybrid algorithm only has $(3 + 2) \times 2 = 10$ message exchanges. The message complexity is reduced by over 50% in this example. As $k_1, k_2$ become larger and the network grows to a larger scale, the message saving will be more significant.

After the pollers are determined, each pollee needs to select up to $\omega$ pollers based on the received poller announcements. The selection could simply be based on criteria such as the distance to the poller. Each pollee maintains a list of pointers to the $\omega$ pollers, and each pointer links to the routing entry (e.g., distance, next hop) of a distinct poller. After a poller fails, each of its pollees updates its active poller list, removes the stale entry and chooses the next poller on the list. As more pollers fail, new pollers can be selected by running the selection algorithm among the neighboring pollees of the failed pollers.

IV. THE OPTIMAL AGGREGATION PATH PROBLEM

A. The Problem Formulation

After the poller-pollee relationship is established, each pollee starts to periodically send a report to its poller. Along the multihop paths from pollees to the poller, each non-edge pollee collects reports from its downstream children and sends an aggregated report (including its own) to the poller. As demonstrated in Fig. 1, a pollee may have multiple paths to the poller, which result in different energy efficiency. Assume transmitting one packet over one hop consumes one unit of energy. The problem of finding the optimal aggregation path (optph) can be formulated as follows.
In a network graph, given the poller-pollee relationship and a constant aggregation ratio $\gamma$, each pollee needs to send a report to its poller periodically. For each poller, find the optimal aggregation paths from all its pollees, such that the total energy consumption is minimized.

When the aggregation ratio is infinite, it is called perfect aggregation. That is, an infinite number of reports can be aggregated into one packet. It is not hard to see that the optimal solution of this special case is the minimum spanning tree. For the general case where the aggregation ratio is finite, it can be proved that the problem optph is NP-hard, which is consistent with the findings in [4], []. The proof is omitted here due to limited space.

B. A Greedy Algorithm and Its Approximation Ratio

Although the problem of selecting the optimal aggregation paths is NP-hard, we can design some heuristics. A simple greedy algorithm is to arbitrarily select the next hop to forward to, as long as it is on a shortest path to the poller, and opportunistically aggregate the status reports at the intermediate nodes. This seemingly simple algorithm has surprisingly good performance, with a tight bound of 5/4. That is, the energy consumption resulting from the greedy algorithm will not be more than 5/4 times that of the optimal solution.

Before proving the approximation bound of the greedy algorithm for the optph problem, we first define some terminology. Within a polling domain, the shortest paths from all the pollees to the poller of the domain form a tree, which may be partitioned into different levels. A node is said to be at level $i$ ($i \ge 0$) if it is $i$ hops away from the sink (i.e., the poller). Therefore, the sink is the only node at level 0. A rooted tree is called a level-connected tree if the nodes belonging to neighboring levels are all connected with each other. Specifically, a $k$-level-connected tree is composed of the nodes at level $i$, $i = 0, \ldots, k$.

Given a tree topology, the set of aggregation paths that consumes the least/most energy is called the best/worst path, denoted as $H_b$ and $H_w$, respectively. Correspondingly, the aggregation strategy that produces the best/worst path is called the best/worst strategy. Given a topology, the ratio of energy consumption between the worst and best paths determines the performance bound/ratio of the topology, denoted as $\delta$. Denoting the energy consumption of the best path $H_b$ and the worst path $H_w$ as $E_b$ and $E_w$, we have $\delta = E_w / E_b$. Given an algorithm, the maximum performance bound/ratio among all possible topologies, denoted as $\delta_{max}$, is called the approximation ratio of the algorithm. The worst-case topology is the topology whose performance bound equals the approximation ratio of the algorithm.

Fig. 7. Worst-case topology with the best strategy (above) and worst strategy (below). (a) 2-level connected tree, $\gamma = 2$. (b) 3-level connected tree, $\gamma = 4$. (c) $k$-level connected tree, $\gamma = 2(k-1)$.

To give an intuitive idea about how to calculate the performance bound, we first show some topologies in Fig. 7, whose performance bound is 5/4. We can later prove that $\delta = 5/4$ is the maximum performance bound among all topologies, so these classes of $k$-level-connected trees are actually the worst-case topologies. In Fig. 7, the worst-case $k$-level connected tree is constructed by having each level $i$, $i = 1, \ldots, k-1$, with 2 nodes and level $k$ with $2(k-1)$ nodes, where Fig. 7(a) and Fig. 7(b) are the special cases when $k = 2, 3$, and Fig. 7(c) is the general case. Note that in each level-connected tree, nodes at neighboring levels are all connected with each other. For clarity, however, not all the links are shown in Fig. 7; only the aggregation paths are depicted. The number shown by each link indicates the number of packet transmissions over that link.
In Fig., the upper diagram repreent the bet aggregation path, and the lower diagram repreent the wort aggregation path. A further calculation how that the energy conumption of the bet and wort path are 4(k ) and (k ) in Fig. (c). With =(k ), wehaveδ = 4. In the ret of thi ection, we will prove that δ = 4 i the maximum performance bound among all topologie, and thu the approximation ratio of the greedy algorithm. A all the hortet path contitute a tree and any tree topology i a ubet of ome level-connected tree, we can retrict our attention to the level-connected tree. The following theorem give the approximation ratio for the -level-connected tree, which will be ued to derive the approximation ratio for the arbitrary level-connected tree. Theorem : For a -level-connected tree, the approximation ratio of the opportunitic greedy algorithm i no greater than +,where i the aggregation ratio. Proof: Given a -level-connected tree, the energy pent by ending the report from the level- node to the level- node i fixed, i.e., equal to the number of level- node. The problem optph become how to minimize the energy pent between the level- node and the root by packing the level- report into a minimum number of level- packet. We give the bet and wort aggregation trategy a follow. Bet trategy: packing/ending every level- report into a ingle level- node until all level- node have a packet of (including it own) report contructed without reidue room left. All the remaining level- report are then packed into a ingle level- node. Wort trategy: packing/ending all level- report into a ingle level- node. Excluding it own report, each level- node ha vacancie in the firt packet. The bet trategy trie to fully utilize all the vacancie, while the wort trategy trie to ue a many packet a poible by packing all tatu report aggreively into a ingle node. There may be other alternative bet or wort trategie, but it i ufficient to find one of them for our proof. Taking Fig. 
8 a an example, where =, the bet trategy pack every level- report into a level- node, and the wort trategy pack all level- report into a ingle level- node. A imple calculation how that δ =. Now aume level ha l node and level ha l node. Then the wort-cae topology mut atify: ( )(l ) < l ( )l. That i, all but one of the level- node need to be fully packed. Otherwie, if l ( )(l ), ome level- node become edge node without children in both the bet and wort trategy, and could be removed to increae the performance ratio. In other word, the original topology i not the wort cae topology when l ( )(l ). On the other hand, if l > ( )l, l ( )l level- edge node can be removed to increae the performance ratio. To ee thi, removing level- node deceae the number of packet packed at level by exactly l ( )l in the bet trategy, but decreae by at mot l ( )l in the wort trategy. Thu, the original topology i not the wort cae topology when l > ( )l. Therefore, we et l =( )l i, i =0.... Iti followed that E b = l + l,e w = l+ +(l )+l. Then, δ = E w E b = l+ l ++( ) E b l + l l + l = ( )l i ( )l = l i l (4) The performance ratio + i tight, which can be een from Fig. 8. Specifically, Fig. 8& how the wort-cae

n n n n n n n n n - node - node - node Total (-) node =, δ=/4 =, δ =/ (c) δ =+(-)/² n n n ni ni Fig. 8. The derived bound i tight for -level-tree: the bet trategy (above) and wort trategy (below). The number by the link indicate the number of packet tranmitted over thi link. topology when =and =. For arbitrary, thewortcae topology can be contructed a in Fig. 8(c), where level ha node and level ha ( ) node, then we have δ =+ l+ =+ ( )+ =+ l + l ( ) + () Theorem : The approximation ratio for the arbitrary levelconnected tree i no greater than 4, i.e., δ max = 4. Proof: The theorem can be proved by the induction of L, the depth of the tree. Firt, Theorem how that the bae cae for -level-connected tree i true, i.e., δ max = max( + )= 4. Auming the claim i true for k-level-connected tree, k L, we want to prove that δ max = 4 for (L+)- level-connected tree. A hown in Fig., we divide the total energy conumption of the (L+)-level-connected tree into two part, namely, the upper part coniting of the link from level-(l-) to level- 0 (Fig. ) and the lower part coniting of the link from level-(l+) to level-(l-) (Fig. (c)). Becaue the node below level-(l-) alo contribute to the energy conumption of the upper part, we need to expand the node below level-(l-) onto level-(l-). Then the level-(l-) node in Fig. are the coalecence of level-(l-), level-l and level-(l+) node in Fig.. On the other hand, The node at level-(l-) and above have no contribution to the energy conumption of the lower part, o they can be afely hrunk into one virtual node. A hown in Fig. (c), the virtual node i the root of the - level tree. Ue Ew up,eup to denote the energy conumption of b n n the wort path and the bet path in the upper part (Fig. ), and ue Ew low,elow b to denote the energy conumption of the wort path and the bet path in the lower part (Fig. (c)). It can be een that the above expanding and hrinking operation hall not reduce the gap between the energy conumption in the bet cae and wort cae. Fig. 
is an (L-1)-level-connected tree. By the inductive hypothesis, $\delta_{max} = 5/4$ for Fig. 9(b). Therefore

$$E_w^{up} - E_b^{up} \le \left(\frac{5}{4} - 1\right) E_b^{up} = \frac{1}{4} E_b^{up}$$

Fig. 9(c) is a 2-level-connected tree. According to Theorem , $\delta_{max} = 5/4$ for Fig. 9(c). Then we have

$$E_w^{low} - E_b^{low} \le \left(\frac{5}{4} - 1\right) E_b^{low} = \frac{1}{4} E_b^{low}$$

Fig. 9. Proof of the performance bound by dividing the energy consumption into two parts: (a) the original connected tree; (b) the upper part consists of the links above level-(L-1), with the nodes at level-L and level-(L+1) expanded onto level-(L-1); (c) the lower part consists of the links below level-(L-1), with the nodes at level-(L-1) and above shrunk into a single virtual node.

Adding the two inequalities above, we finally have

$$\delta = \frac{E_w^{up} + E_w^{low}}{E_b^{up} + E_b^{low}} \le \frac{5}{4}$$

The bound of 5/4 is tight, as seen from Fig. , where the worst-case topologies are shown.

V. PERFORMANCE EVALUATIONS

In this section, we use simulations to test the parameters and evaluate the proposed algorithms. In the simulations, n sensors are randomly deployed in a 0 0 square area. Each pollee is associated with up to ω pollers, which are used for fault tolerance. Each sensor sends a status report to its pollers every t, with the reports aggregated at the intermediate nodes. The sensor transmission range is , and transmitting one packet over one hop consumes one unit of energy. The link failure is modeled as a continuous-time Markov chain with an average failure rate of $f_l$ = 0.0 and a detection period $T_d$ = t. The experiments are done over a customized C++ simulator.

A. Parameter Setting

Three parameters need to be determined: $k_1$ and $k_2$ for the deterministic algorithm, and ρ for the hybrid algorithm. In Section II, Fig. 4 gives the relationship between the false alarm rate and the distance from the pollee to the poller when the average link failure rate is known. As a result, given the constraint of the maximum false alarm rate, $k_1$ can be determined to limit the number of hops from the pollee to the poller.
Suppose the false alarm rate is required to be less than 4%; then the bound on $k_1$ follows from Fig. 4. We will set $k_1$ = in the following experiments unless otherwise specified, based on which $k_2$ and ρ are chosen. Fig. 10 shows the effect of $k_2$. As $k_2$ increases, there are fewer pollers because $k_2$ controls the distance between neighboring pollers. However, the number of pollees that cannot find ω pollers increases as $k_2$ increases. For example, as $k_2$ increases from to .8, the number of pollees that cannot find three pollers (ω = 3) increases from 0 to 0. This is because with fewer pollers, it is less likely for a pollee to find ω pollers within $k_1$ hops. As can be seen, if $k_2$ = , all pollees can find three pollers. However, if $k_2$ = .8, over 0% of the pollees cannot find three pollers. We thus set $k_2$ = in the following experiments, unless otherwise specified. Note that $k_2$ may be a fractional number.
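The count discussed above (pollees that cannot find ω pollers within $k_1$ hops) can be illustrated with a short sketch. This is not the authors' C++ simulator: the function names `hop_distances` and `underserved_pollees`, the adjacency-list input, and the use of integer hop counts found by breadth-first search are all assumptions made for illustration.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """BFS from `source`; returns {node: hops} for nodes within `max_hops`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def underserved_pollees(adj, pollers, k1, omega):
    """Return the pollees that have fewer than `omega` pollers within `k1` hops.

    `adj` maps each node to its one-hop neighbors (hypothetical input format);
    every node that is not in `pollers` is treated as a pollee.
    """
    pollers = set(pollers)
    count = {node: 0 for node in adj if node not in pollers}
    for p in pollers:
        for node in hop_distances(adj, p, k1):
            if node in count:
                count[node] += 1
    return sorted(node for node, c in count.items() if c < omega)


if __name__ == "__main__":
    # A 7-node line topology 0-1-2-3-4-5-6 with pollers at both ends.
    line = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
    print(underserved_pollees(line, {0, 6}, k1=3, omega=2))
```

In the line example only the middle node is within three hops of both pollers, so every other pollee is reported as underserved when ω = 2. Spacing pollers further apart shrinks the set of pollees that can reach ω of them, which is the trend described for Fig. 10.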

Fig. 10. The effect of $k_2$ on the number of pollers and the number of pollees that cannot find ω pollers (n=000).

In the hybrid algorithm, each node first selects itself as a poller with probability ρ. Fig. 11 shows how the value of ρ affects the message overhead when $k_1$ = $k_2$ = . Compared with the analytical result in Fig. , the same trend is observed in Fig. 11, where the message overhead first drops and then rises as ρ increases. The point with the least number of messages corresponds to the optimal ρ. For example, when n = 000, 00, ρ is around 0., 0. in the simulation, and around 0., 0. in the analysis. The slight mismatch with the theoretical result may be explained by the boundary effect of the finite field in reality. That is, the analysis assumes that the sensing range of a node falls within the deployment field, but this is not true for the nodes on the boundary, which affects the accuracy of the analytical result. Similarly, Fig. 12 shows the relationship between the message overhead and ρ when $k_1$ = , $k_2$ = . It can be observed that at the optimal point, ρ is around 0.0, 0.0 when n = 000, 00, which is much smaller than in the case when $k_1$ = $k_2$ = . This is because with larger k, a smaller number of pollers are able to cover most of the unlabeled nodes. In the following, we will use the optimal ρ corresponding to the different n, $k_1$, $k_2$.

B. Comparison of Different Schemes

In this section, we compare the geometrical properties of the hybrid algorithm and the randomized algorithm. Fig. 13 shows snapshots of the poller-pollee distribution after running the randomized and hybrid algorithms among 00 nodes in a field. The whole network is connected, but for clarity we only show the links within each polling domain. Fig. 13(a)
shows that the randomized algorithm produces a scenario where the pollers may be isolated (e.g., node in the upper-left corner), clustered together (e.g., nodes 8, , 4 on top), or the pollee (e.g., node in the bottom-left corner) is too far away from its poller. However, after running the deterministic algorithm in Fig. 13(b), the hybrid algorithm fixes these problems. For instance, in Fig. 13(b) there is no isolated poller, no poller is within one hop of another, and no pollee is more than hops away from its poller. Fig. 14 and Fig. 15 further compare the hybrid algorithm with the randomized algorithm in terms of statistical geometrical properties and false alarm rate. For fair comparison, we set ρ = 0., 0.0, k = , when n = 000, 00, in the randomized algorithm, to select the same number of pollers as in the hybrid algorithm.

Fig. 11. The effect of probability ρ on the message overhead in the hybrid algorithm ($k_1$ = , $k_2$ = ).

Fig. 12. The effect of probability ρ on the message overhead in the hybrid algorithm ($k_1$ = , $k_2$ = ).

As seen in Fig. 14, the distance between poller and pollee is bounded by hops in the hybrid algorithm, but spans up to hops in the randomized algorithm. About 8% of the pollees in Fig. 14(a) are more than hops away from their pollers, so the constraint of the false alarm rate cannot be met in the randomized algorithm. Fig. 15 shows that the hybrid algorithm outperforms the randomized algorithm in terms of the average false alarm rate, which is calculated by averaging the false alarm rates over all the nodes. It can be seen that the average false alarm rate of the hybrid algorithm is about 0% or 0% smaller than that of the randomized algorithm, when n = 000 or 00, respectively. Both the deterministic algorithm and the hybrid algorithm have a provable distribution property: no poller is less than $k_2$ hops away from another poller, and no pollee is more than $k_1$ hops away from its poller. However, Fig. 16 shows that the hybrid algorithm substantially reduces the number of messages.
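The distribution property stated above can be checked on a given poller-pollee assignment with a sketch like the following. It is illustrative only and is not taken from the paper: the names `bfs_hops` and `check_distribution`, the `poller_of` mapping, and the adjacency-list input are assumptions, and distances are integer hop counts compared against $k_1$ and a possibly fractional $k_2$.

```python
from collections import deque


def bfs_hops(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def check_distribution(adj, poller_of, k1, k2):
    """Check the two separation constraints of a poller-pollee assignment.

    `poller_of` maps each pollee to its assigned poller (hypothetical format).
    Returns (too_close, too_far):
      too_close: poller pairs that are fewer than k2 hops apart,
      too_far:   pollees that are more than k1 hops from their poller.
    Both lists are empty when the property holds.
    """
    pollers = sorted(set(poller_of.values()))
    dist = {p: bfs_hops(adj, p) for p in pollers}
    inf = float("inf")  # unreachable nodes count as infinitely far
    too_close = [
        (p, q)
        for i, p in enumerate(pollers)
        for q in pollers[i + 1:]
        if dist[p].get(q, inf) < k2
    ]
    too_far = sorted(
        v for v, p in poller_of.items() if dist[p].get(v, inf) > k1
    )
    return too_close, too_far


if __name__ == "__main__":
    # Line topology 0-1-2-3-4-5-6; nodes 0 and 6 act as pollers.
    line = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
    assignment = {1: 0, 2: 0, 3: 0, 4: 6, 5: 6}
    print(check_distribution(line, assignment, k1=3, k2=2))
```

A check of this kind would flag the isolated, clustered, and distant cases described for the randomized snapshot, and would return two empty lists for an assignment that satisfies both constraints.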
The reduction is about 0% and 80%, when k = and k = , respectively. The benefit comes from the randomized phase adopted by the hybrid algorithm.

Fig. 15. Comparison of the random algorithm with the hybrid algorithm (false alarm rate versus the average link failure rate).

Fig. 16. Comparison of the deterministic algorithm with the hybrid algorithm (the number of messages versus the number of nodes).

VI. RELATED WORK

While a lot of research in sensor networks focuses on field or target monitoring [], [], little attention has been given to the monitoring of the sensor network itself. In [], [], two similar distributed failure detectors were proposed independently for wireless sensor networks, where each node is collaboratively monitored by its one-hop neighbors. These schemes can only detect node failure; more general status such as residue energy and coverage cannot be monitored. In [], data aggregation was used to obtain a global abstraction of the sensor residue energy, but only when specific continuous energy dissipation models are assumed. More recently, a local monitoring infrastructure is proposed

in []. But their goal is to monitor the transmission over the wireless links for security purposes, instead of from the perspective of fault tolerance. By contrast, our poller-pollee based monitoring architecture can respond to queries about a variety of sensor status tailored to the application demand.

Fig. 13. Snapshot of the poller-pollee distribution, where the stars denote the pollers and the dots denote the pollees: (a) randomized algorithm, ρ = 0.; (b) hybrid (randomized + deterministic) algorithm, ρ = 0., $k_1$ = , $k_2$ = .

Fig. 14. Distribution of the distance between poller and pollee: (a) randomized algorithm; (b) hybrid algorithm.

The problem of selecting a subset of nodes to form a backbone has been extensively studied in different contexts, e.g., clustering [], connected dominating set (CDS) [18], [4], relay node placement [], etc. The objective of these works is to minimize the cardinality of the selected subset of nodes, but they ignore the constraint of the false alarm rate, which is crucial in wireless sensor networks. In addition, most of these works are limited to the selection of single-hop, single vantage points (e.g., cluster head, dominator). However, our work focuses on multi-hop multi-poller monitoring architecture construction, where some geometrical properties of the poller-pollee distribution can be guaranteed. The aggregation path selection problem has been proposed in [14], [], wherein some heuristics are developed but without performance guarantee. There are also some other aggregation schemes that target different application scenarios [], [], [20], but none of them guarantees a constant approximation ratio. To the best of our knowledge, our result is the first proved constant approximation ratio applied to the aggregation path selection scheme for wireless sensor networks.

VII.
CONCLUSIONS

In this paper, we focus on the distributed design of monitoring and aggregation algorithms for wireless sensor networks. Based on the poller-pollee structure, we first proposed fully distributed algorithms to select the minimum number of pollers while bounding the false alarm rate. Then a greedy aggregation scheme was proposed to reduce the message overhead due to monitoring. Theoretical analyses and extensive simulations show that the deterministic algorithm can flexibly control the poller-pollee distribution property to bound the false alarm rate, the hybrid algorithm can reduce the message overhead significantly, and the greedy aggregation scheme decreases the monitoring traffic with a constant approximation ratio of 5/4.

REFERENCES

[1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, Wireless Sensor Networks: A Survey, Computer Networks, March 00.
[2] S. Basagni, A distributed algorithm for finding a maximal weighted independent set in wireless networks, in th International Conference on Parallel and Distributed Computing and Systems (PDCS), .
[3] R. Cristescu, B. Beferull-Lozano, and M. Vetterli, On network correlated data gathering, in INFOCOM, 00.
[4] F. Dai and J. Wu, On constructing k-connected k-dominating set in wireless networks, Journal of Parallel and Distributed Computing (JPDC), 00.
[5] Dezun Dong, Yunhao Liu, and Xiangke Liao, Self-monitoring for sensor networks, in MobiHoc, 2008.
[6] Kai-Wei Fan, Sha Liu, and Prasun Sinha, On the potential of structure-free data aggregation in sensor networks, in INFOCOM, 00.
[7] Chih-fan Hsin and Mingyan Liu, A distributed monitoring mechanism for wireless sensor networks, in WISE, 00.
[8] A. Konrad, B. Y. Zhao, A. D. Joseph, and R. Ludwig, A Markov-based channel model algorithm for wireless networks, Wireless Networks, vol. , pp. 8, 00.
[9] H. Lee, A. Cerpa, and P. Levis, Improving wireless simulation through noise modeling, in IPSN, 00.
[10] Li (Erran) Li, Marina Thottan, Bin Yao, and Sanjoy Paul, Distributed network monitoring with bounded link utilization in IP networks, in INFOCOM, San Francisco, April 00.
[11] C.
Liu and G. Cao, Minimizing the cost of mine selection via sensor networks, in INFOCOM, 00.
[12] C. Liu and G. Cao, A multi-poller based energy-efficient monitoring scheme for wireless sensor networks, in INFOCOM mini-conference, 00.
[13] S. Misra, S. Hong, G. Xue, and J. Tang, Constrained relay node placement in wireless sensor networks to meet connectivity and survivability requirements, in INFOCOM, 2008.
[14] S. J. Park and R. Sivakumar, Energy efficient correlated data aggregation for wireless sensor networks, International Journal of Distributed Sensor Networks, 2008.
[15] S. Ross, Stochastic Processes, John Wiley and Sons, .
[16] Stanislav Rost and Hari Balakrishnan, Memento: A health monitoring system for wireless sensor networks, in SECON, 00.
[17] L. Su, C. Liu, and G. Cao, Routing in intermittently connected sensor networks, in ICNP, 2008.
[18] Peng-Jun Wan, Khaled M. Alzoubi, and Ophir Frieder, Distributed construction of connected dominating set in wireless ad hoc networks, Mobile Networks and Applications, vol. , no. , 2004.
[19] O. Younis and S. Fahmy, Distributed clustering in ad-hoc sensor networks: A hybrid, energy-efficient approach, in INFOCOM, 2004.
[20] Bo Yu, Jianzhong Li, and Yingshu Li, Distributed data aggregation scheduling in wireless sensor networks, in INFOCOM, 00.
[21] W. Zhang and G. Cao, DCTC: Dynamic convoy tree-based collaboration for target tracking in sensor networks, IEEE Transactions on Wireless Communications, 2004.
[22] Jerry Zhao, Ramesh Govindan, and Deborah Estrin, Residual energy scan for monitoring wireless sensor networks, in WCNC, 00.
[23] R. Zheng and R. Barton, Toward optimal data aggregation in random wireless sensor networks, in INFOCOM, 00.