The Scientific World Journal
Volume 2013, Article ID 437926
http://dx.doi.org/10.1155/2013/437926

Research Article

QoS and Energy Aware Cooperative Routing Protocol for Wildfire Monitoring Wireless Sensor Networks

Mohamed Maalej, Sofiane Cherif, and Hichem Besbes

Engineering School of Telecommunications of Tunis (Sup'Com), City of Communications Technologies, 2083 El Ghazala, Ariana, Tunisia

Correspondence should be addressed to Sofiane Cherif; sofiane.cherif@supcom.rnu.tn

Received March 2013; Accepted May 2013

Academic Editors: A. E. Cetin, I. Korpeoglu, B. U. Toreyin, and S. Verstockt

Copyright © 2013 Mohamed Maalej et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Wireless sensor networks (WSNs) are presented as a proper solution for wildfire monitoring. However, this application requires a design of WSN taking into account the network lifetime and the shadowing effect generated by the trees in the forest environment. Cooperative communication is a promising solution for WSNs which uses, at each hop, the resources of multiple nodes to transmit its data. Thus, by sharing resources between nodes, the transmission quality is enhanced. In this paper, we use the technique of reinforcement learning by opponent modeling, optimizing a cooperative communication protocol based on RSSI and node energy consumption in a competitive context (RSSI/energy-CC), that is, an energy and quality-of-service aware cooperative communication routing protocol. Simulation results show that the proposed algorithm performs well in terms of network lifetime, packet delay, and energy consumption.

1. Introduction

The automatic monitoring of wildfire generally relies on multimodal observations. This is due to the extent of the areas to be covered and the difficulty of detecting fire.
In fact, most fire detection techniques, for example, those based on video, suffer from false alarms. The use of wireless sensor networks (WSNs) can improve the quality of detection and consequently reduce false alarms. WSNs can be easily deployed and do not require special auxiliary installation. They are mainly used to monitor buildings, houses, or archaeological sites in the forest. However, the forest environment presents the problem of wide covered areas, requiring the transmission of a large amount of information through the network, with the risk of significant energy consumption and hence a limited network lifetime. The energy parameter is particularly crucial for the wildfire application, because of the complexity of maintaining the sensors and replacing dead batteries: the sensors are generally placed in large, hard-to-access covered areas. The second problem which arises in this type of environment is the fading effect due to the presence of trees, leading to an important shadowing phenomenon. To solve these problems, we propose a new methodology to design and optimize WSNs based on both energy conservation and consideration of the quality of transmission for choosing the routing protocol.

Cooperative communication is a promising solution for enhancing WSN lifetime. In recent works, this concept has been proposed to exploit the spatial diversity gains in wireless networks [1-3]. Data aggregation in WSNs often uses multihop transmission techniques. At each hop, the network relies on only one sensor. This often results in a significant decrease in the energy of some sensors and thus limits the lifetime of the network while a large number of sensors are still in working condition. The main idea of cooperative communication consists in relying, at each hop, on the resources of multiple nodes or relays (called cooperative nodes) to transmit data from one sensor to another, instead of using only one sensor as relay. Thus, by sharing resources between nodes, the transmission quality is enhanced.
It is also obvious that the use of a cooperative scheme improves the reliability of communication in case of fire
propagation. Indeed, the presence of several relays for each possible hop ensures the continued communication of information and therefore the possibility of detection and tracking of potential wildfire. The cooperative mechanism is thus the key to the performance of cooperative communication protocols. However, it is challenging to find the optimal cooperative policies in dynamic WSNs, where reinforcement learning (RL) algorithms can be used to find the optimal control policy without the need of centralized control.

Recently, a cooperative communication protocol for quality-of-service (QoS) provisioning has been proposed and named MRL-CC, a multiagent reinforcement learning-based cooperative communication routing algorithm [1]. The RL concept consists in considering the cooperative nodes as multiple agents learning their optimal policy through experiences and rewards. MRL-CC is based on internode distance and packet delay to enhance the QoS metrics. However, it does not take into account energy consumption and network lifetime, which are important components for energy efficiency.

In this paper, we design a cooperative communication routing protocol based on both energy consumption and QoS. The QoS is measured by the absolute received signal strength indicator (RSSI). To integrate these two parameters in the routing protocol, we use a competitive/opponent mechanism implemented at each node by the multiagent reinforcement-learning (MRL) algorithm. Our proposed algorithm (RSSI/energy-CC) is also an energy and QoS aware routing protocol since it ensures better performance in terms of end-to-end delay and packet loss rate, taking into account the consumed energy through the network.

The rest of the paper is organized as follows. Section 2 describes the RL algorithm and the design and implementation of the MRL-CC algorithm and of our algorithm, the RSSI/energy-CC. The performance analysis is presented in Section 3. Finally, Section 4 concludes the paper and gives future research directions.
2. Cooperative Communication in WSN Using Reinforcement Learning

In this section, the background information on RL is provided. Then, we give an overview of the architecture and design issues of our concept of cooperative communication in WSN. Then, we describe the architecture and design issues of MRL-CC, a cooperative communication algorithm using RL. After that, we explain the architecture of the new algorithm, RSSI/energy-CC, taking into account both QoS and energy consumption.

2.1. Reinforcement Learning. RL provides a framework in which an agent can learn control policies based on experiences and rewards. In the standard RL model, an agent is connected to its environment via perception and action, as shown in Figure 1. On each step of interaction, the agent receives as an input, i, some indication of the current state, s, of the environment; the agent then chooses an action, a, to generate as an output. The action changes the state of the environment, and the value of the state transition is communicated to the agent through a scalar RL signal, r. Depending on its behavior, the agent should choose actions that tend to increase the long-term sum of values of the reinforcement signal [4].

[Figure 1: Reinforcement learning model.]

The main idea of RL is to strengthen the good behaviors of the agent while weakening the bad behaviors through rewards given by the environment. The environment of the agent is described by a Markov decision process (MDP). An MDP models an agent acting in an environment with a tuple (S, A, P, R), where S is a set of states and A denotes a set of actions. P(s' | s, a) is the transition model that describes the probability of entering state s' ∈ S after executing action a ∈ A at state s ∈ S. R(s, a, s') is the reward obtained when the agent executes a at s and enters s'. The goal of solving an MDP is to find an optimal policy, π: S → A, that maps states to actions such that the cumulative reward is maximized [4].
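The standard model just described can be sketched in a few lines with tabular Q-learning. The toy two-state MDP below, with its transition table, rewards, and hyperparameters, is an illustrative assumption for this sketch and not part of any protocol in the paper.

```python
# Minimal sketch of the RL loop: a tabular Q-learning agent in a toy 2-state
# MDP. The transition/reward tables and constants are assumed toy values.
import random

random.seed(0)

STATES = [0, 1]
ACTIONS = [0, 1]
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor

# P[(s, a)] -> next state, R[(s, a)] -> scalar reward (toy environment)
P = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1}
R = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 2.0}

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(s, a):
    """Environment returns the next state and reward for action a in state s."""
    return P[(s, a)], R[(s, a)]

s = 0
for _ in range(500):
    a = random.choice(ACTIONS)          # exploratory policy
    s_next, r = step(s, a)
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
    s = s_next

# Greedy policy extracted from the learned Q-values
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(policy)
```

After enough exploratory steps, the greedy policy prefers, in both states, the action that leads to the rewarding state, which is exactly the "strengthen good behaviors" effect described above.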
Multiagent systems (MASs) are systems in which multiple agents are connected to the environment and may take actions to change the state of the environment. The generalization of the Markov decision process to the multiagent case is the stochastic game (SG) [5]. In the MAS case, no agent can assume itself to be the only one that changes the state of the environment; the interactions between itself and the other agents must be considered. Therefore, the state transitions are the result of the joint action of all agents, a = [a_1, ..., a_n], where n is the number of agents. Consequently, the reward of each agent, R_i, i = 1, ..., n, also depends on the joint action. The policies π_i: S → A_i form together the joint policy Π. If R_1 = ... = R_n, all the agents have the same goal (to maximize the same expected return), and the SG is fully cooperative. If n = 2 and R_1 = -R_2, the two agents have opposite goals, and the SG is fully competitive. Mixed games are stochastic games that are neither fully cooperative nor fully competitive.
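The fully cooperative/fully competitive distinction above can be checked mechanically on a toy one-state game for n = 2. The payoff tables below are assumed toy values for illustration only.

```python
# Toy illustration of the stochastic-game taxonomy: a one-state game for two
# agents. A game is fully competitive when R1 = -R2 (zero-sum) and fully
# cooperative when R1 == R2. Payoff values are assumed.
ACTIONS = [0, 1]

# Matching-pennies-style payoffs for agent 1, indexed by joint action (a1, a2)
R1 = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}
R2_comp = {a: -r for a, r in R1.items()}   # fully competitive: R2 = -R1
R2_coop = dict(R1)                          # fully cooperative: R2 = R1

def is_zero_sum(r1, r2):
    """True when the two agents' payoffs cancel for every joint action."""
    return all(abs(r1[a] + r2[a]) < 1e-12 for a in r1)

print(is_zero_sum(R1, R2_comp), is_zero_sum(R1, R2_coop))
```

The same check generalizes to the n-agent case by summing all agents' rewards over each joint action.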
[Figure 2: Multihop mesh cooperative structure for data dissemination in WSNs.]

[Figure 3: Cooperation between adjacent groups of cooperative nodes.]

2.2. Cooperative Communication Concept in WSN

2.2.1. Adopted Architecture. For reliable data dissemination in WSNs, we use a multihop mesh cooperative structure. It consists in forming groups of cooperative nodes (denoted as CN) between the source node and the sink node. The data packets originated from a source node are forwarded towards the sink by these CN groups (Figure 2) using a multihop transmission. When a data packet is received by a CN group, a node from that group will be elected to broadcast the data packet to the adjacent CN group. The other nodes of that CN group will help in the packet forwarding in case the elected node fails in data packet transmission or in case the packet is corrupted. Therefore, we can show the groups of nodes connected to each other in a multihop mesh cooperative structure in Figure 3. In fact, the nth cooperative group (denoted by V_n) is connected with V_{n-1} and V_{n+1}, which are one hop farther from and closer to the sink than V_n, respectively; that is, each node in V_n is connected with all nodes in V_{n-1} and V_{n+1}.

To construct a multihop mesh cooperative structure, a set of nodes, termed reference nodes (denoted as RN), between the source node and the sink node is first selected. After that, a set of nodes around each RN will be selected as CN, and thus a multihop mesh cooperative structure is constructed in this phase [6].

2.2.2. WSN Modeling with RL. From the point of view of RL, we can consider a WSN as a multiagent system. In fact, sensor nodes can be considered as agents interacting with the environment, which can be represented for node i ∈ V_n as follows.

(1) State: the CN groups are modeled to be the environment states: s_i^t ∈ S_n = {k}, where k ∈ {..., V_{n-1}, V_n, V_{n+1}, ...}.

(2) Action: an agent can operate one of these two actions:
    a_f: forwarding of the packet from V_n to V_{n+1};
    a_m: monitoring the forwarded packet;
so

    A = {a_f, a_m}.   (1)
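The construction phase above (reference nodes first, then the CN groups around them) can be sketched as follows. The line deployment, the choice of reference nodes, and the communication range handling below are simplified toy assumptions, not the paper's exact procedure.

```python
# Illustrative sketch of the multihop mesh cooperative structure: reference
# nodes (RN) are picked along the source-sink path, and the nodes within
# communication range of each RN form its cooperative-node (CN) group.
# Node layout and RN choice are toy assumptions.
import math

COMM_RANGE = 30.0  # metres (assumed)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Toy deployment: 10 nodes roughly on a line from source (node 0) to sink
# (node 9), spaced 10 m apart with a small lateral offset.
nodes = {i: (10.0 * i, float(i % 2)) for i in range(10)}

# Reference nodes picked roughly every communication range along the path.
reference_nodes = [0, 3, 6, 9]

# CN group V_n = all nodes within communication range of the n-th RN.
cn_groups = [
    [i for i, p in nodes.items() if dist(p, nodes[rn]) <= COMM_RANGE]
    for rn in reference_nodes
]

for n, group in enumerate(cn_groups):
    print(f"V_{n}: {group}")
```

Note how consecutive groups overlap: each node of V_n can reach nodes of the adjacent groups, which is what enables the mesh cooperation of Figure 3.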
In our study, we have considered two approaches. The first approach is proposed in [1], where the RL strategy (policy, behaviors, and rewards) for the sensor nodes considers the packet delay and the packet loss rate. This technique has been called the MRL-CC algorithm. The goal of MRL-CC is to enhance packet delay and packet loss rate. The second approach is treated in our work in [7], where the RL strategy is based on the link quality between sensor nodes and their amount of energy consumption. Our strategy's goal is to enhance the energy efficiency and lifetime of the WSN, that is, to reduce network energy consumption and to maximize network lifetime.

2.3. Multiagent Reinforcement Learning-Based Cooperative Communication Routing Algorithm (MRL-CC)

2.3.1. MRL-CC Implementation. Node election in the CN group is based on a multiagent RL algorithm, performing a fully cooperative task using a Q-learning algorithm. The strategy is described as follows.

(1) Behavior: each node maintains Q-values of itself and its cooperative partners which reflect the qualities (transmission delay, packet delivery ratio) of the available routes to the sink.

(2) Policy: when a packet is received by the nodes in a CN group, each node will compare its own Q-value with those of other nodes in the CN group; the node which determines that it has the highest Q-value will be elected to forward the data packet to the adjacent CN group towards the sink. The other cooperative nodes will monitor the packet transmission at the next hop.

(3) Reward: the reward function is defined as follows:

    r = (d_{V_n,sink} - d_{V_{n+1},sink}) / d_{V_n,sink} - (T_{V_{n+1}} - T_{V_n}) / T_{rmin},   (2a)

    r = -T_{rf} / T_{rmin}.   (2b)

Equation (2a) is used to calculate the reward when the packet forwarding is successful, where d_{V_n,sink} is the average distance between V_n and the sink, which can be calculated as

    d_{V_n,sink} = (1 / N_{V_n}) Σ_{i ∈ V_n} d_{i,sink},   (3)
where N_{V_n} is the number of cooperative nodes in V_n; T_{V_{n+1}} and T_{V_n} are the packet forwarding times at V_{n+1} and V_n, respectively; and T_{rmin} is the maximum amount of time that can elapse in the remaining path to the sink while still meeting the QoS requirements on end-to-end delay. The positive reward reflects the quality of the packet forwarding.

Equation (2b) is used to calculate the reward when the packet forwarding fails; T_{rf} is the packet reforwarding timer used for failed forwarding packets. The negative reward reflects the delay caused by the unsuccessful packet transmission from V_n to V_{n+1}.

(4) Q-value update: in MRL-CC, for 1-hop forwarding, at iteration t, node i ∈ V_n forwards a packet to V_{n+1}, and then j ∈ V_{n+1} is selected to continue packet forwarding. Therefore, node i updates its Q-value as

    Q_i^{t+1}(s_i^t, a_i^t) = (1 - α) Q_i^t(s_i^t, a_i^t)
                            + α ( r^{t+1}(s^{t+1}) + γ ω(i, j) max_{a_j ∈ A} Q_j^t(s_j^t, a_j^t)
                            + γ Σ_{i' ∈ V_n, i' ≠ i} ω(i, i') max_{a_{i'} ∈ A} Q_{i'}^t(s_{i'}^t, a_{i'}^t) ),   (4)

where γ ∈ [0, 1] is the discount factor, α ∈ [0, 1] is the learning-rate parameter, and ω(i, j) and ω(i, i') are, respectively, factors that weigh the maximum Q-value of node j in V_{n+1} and the maximum Q-value of node i' (neighbor of node i) in V_n.

Equation (4) shows that the Q-value of node i is a weighted sum of the Q-value of node i at the previous state, the action's immediate reward, the maximum Q-value of j, which is elected as the forwarding node in V_{n+1} at the next hop, and the Q-values of all of i's cooperative partners in V_n.

Note that in the initialization phase, each node is assigned an initial Q-value. For node i ∈ V_n, its initial Q-value (denoted as Q_i^0) is calculated based on the relative distance (compared with its cooperative partners in V_n) from node i to the nodes in V_{n+1}, as shown in the following:

    Q_i^0 = d_{V_n,V_{n+1}} / d_{i,V_{n+1}},   (5)

where d_{V_n,V_{n+1}} is the average distance between V_n and V_{n+1}, which can be calculated as

    d_{V_n,V_{n+1}} = (1 / N_{V_n}) Σ_{i ∈ V_n} d_{i,V_{n+1}},   (6)

where N_{V_n} is the number of cooperative nodes in V_n. The average distance between node i and V_{n+1}, denoted by d_{i,V_{n+1}}, can be calculated as

    d_{i,V_{n+1}} = (1 / N_{V_{n+1}}) Σ_{j ∈ V_{n+1}} d_{i,j}.   (7)

2.3.2. Interpretation. We can conclude that the MRL-CC algorithm considers each CN group as one single node because it performs a fully cooperative task. In fact, all nodes of one CN group get the same positive/negative reward after each transmission procedure. The value of that reward represents the quality of packet forwarding in terms of delay and packet loss rate. Besides, the Q-values of the cooperative nodes are initially based on average distance. Therefore, by electing the node with the highest Q-value, we also understand that the policy adopted in MRL-CC elects the node with the shortest distance and the lowest packet delay. Thus, MRL-CC ensures communication reliability. However, it has no information about energy consumption, which could be a useful parameter to consider in RL.

2.4. WSN Modeling with Reinforcement Learning in the RSSI/Energy-CC Algorithm

2.4.1. Main Idea. Nodes in a CN group will be considered as opponents to each other, so that each node will maintain a Q-value which reflects the payoff that would have been received if that node selected the action a_f and the other nodes jointly selected the action a_m. After that, the node with the highest total payoff will be elected to forward the data packet to the next CN group towards the sink.

For the rewarding procedure, there are two cases.

(1) Transmission succeeded: the Q-values of each node will be updated according to its energy consumption compared to its neighbors in its CN group.

(2) Transmission failed: the Q-value of the node that failed to forward the data packet will be updated with a negative reward, whereas for the other nodes, their Q-values will be updated according to an indication of their signal quality. In our work, we have chosen to use the RSSI as an available indication of signal quality for each packet received at a sensor node.

2.4.2. RSSI/Energy-CC Algorithm Strategy. Node election in the CN group is based on a multiagent RL algorithm, performing a fully competitive task using an opponent modeling algorithm [8]. The strategy is described as follows.
(1) Policy: node election, for packet forwarding, of the node with the best link quality and the lowest energy consumption, or a tradeoff between the two criteria.

(2) Behavior: each node maintains Q-values which reflect the payoff that would have been received if that node selected the forwarding action a_f and another node in its CN group selected the monitoring action a_m.
(3) Reward: each time a packet is forwarded, all the nodes will receive immediate rewards from the environment, which represent a tradeoff between energy consumption and quality of the received signal.

2.4.3. Algorithm Initialization Phase. In the initialization phase, each node is assigned an initial value with respect to each of its opponents in V_n. The initial payoff of node i ∈ V_n compared to its neighbor i' is the Q-value calculated based on its absolute RSSI in dBm measured from the next cooperative group V_{n+1}. The Q-value is defined as follows:

    Q_{i,i'}^0 = (RSSI_{i,V_{n+1}} - RSSI_{i',V_{n+1}}) / RSSI_{V_n,V_{n+1}},   (8)

where RSSI_{V_n,V_{n+1}} is the average RSSI between V_n and V_{n+1}, which can be calculated as

    RSSI_{V_n,V_{n+1}} = (1 / N_{V_n}) Σ_{i ∈ V_n} RSSI_{i,V_{n+1}},   (9)

where N_{V_n} is the number of cooperative nodes in V_n. The average RSSI between node i and V_{n+1}, RSSI_{i,V_{n+1}}, can be calculated as

    RSSI_{i,V_{n+1}} = (1 / N_{V_{n+1}}) Σ_{j ∈ V_{n+1}} RSSI_{i,j}.   (10)

2.4.4. Data Dissemination Phase. When a data packet is received by a CN group V_n, each node will compare its own total payoff, with respect to all its opponents, with those of the other cooperative nodes. The node which determines that it has the highest total payoff will forward the data packet to V_{n+1}, and the other nodes in V_n will deduce whether the packet forwarding is successful or not by overhearing the packet transmission from V_{n+1} to V_{n+2}.

(4) Q-value update: the updating of Q-values iterates at each node in each forwarding procedure. For 1-hop forwarding, at iteration t, node i ∈ V_n forwards a packet to V_{n+1}, and the nodes i', neighbors of i in V_n, monitor the packet forwarding. Then, j ∈ V_{n+1} is elected to continue packet forwarding.
Therefore, node i updates its Q-values as

    Q_{i,i'}^{t+1}(s_i^t, a_f^t, a_m^t) = (1 - α) Q_{i,i'}^t(s_i^t, a_f^t, a_m^t)
                                        + α ( r^{t+1}(s^{t+1}) + γ Σ_{i'} ω_{s_{i'}^t} V_{i'}(s_{i'}^t) + γ ω_{s_j^t} V_j(s_j^t) ),   (11)

where ω_{s_{i'}^t} and ω_{s_j^t} are, respectively, factors that weigh the total payoffs in V_n and V_{n+1}, and V_i(s_i^t) is the maximum payoff expressed by

    V_i(s_i^t) = max_{a_f^t} Σ_{a_m^t} ( C_i^t(s_i^t, a_m^t) / N_i(s) ) Q_{i,i'}^t(s_i^t, a_f^t, a_m^t),   (12)

where C_i^t(s_i^t, a_m^t) counts the number of times agent i observed agent i' taking action a_m in state s at packet t, and N_i(s) is the total count for all agents taking action a_m in state s. Therefore, C_i^t(s_i^t, a_m^t) / N_i(s) is the probability with which the nodes other than i will select the joint action a_m for packet t, based on past experience. So, for i' ∈ V_n, if agent i' chooses action a_m, then

    C_i^{t+1}(s^t, a_m^t) = C_i^t(s^t, a_m^t) + 1,   N_i(s) = N_i(s) + 1.   (13)

Equation (11) shows that the Q-value of node i is a weighted sum of the Q-value of node i at the previous state, the action's immediate reward, the maximum payoff of the group V_{n+1}, and the maximum payoff of the group V_n.

(5) Reward function: the reward function is defined as follows:

    r_i = ( (Σ_{j ∈ V_n} E_j / N_{V_n}) - E_i ) / ( (Σ_{j ∈ V_n} E_j / N_{V_n}) - min_{j ∈ V_n} E_j ),   (14a)

    r_i = RSSI_{i,V_{n+1}} / RSSI_{V_n,V_{n+1}} - σ_i N_{V_n}.   (14b)

Equation (14a) is used to calculate the reward when the packet forwarding is successful, where E_i represents the consumed energy of node i of the group V_n. So, nodes with less energy consumption will receive positive rewards, and nodes with more energy consumption will receive negative rewards.

Equation (14b) is used to calculate the reward when the packet forwarding fails. The parameter σ_i takes the value 1 for the node that failed to forward the data packet, whereas for the other nodes it takes the value 0. So, the forwarding node will receive a negative reward. The other nodes in V_n will receive positive rewards according to their RSSI values. In the opponent modeling case, all nodes in V_n are acting in a fully competitive task, so the total sum of the rewards attributed to all cooperative nodes is zero.

After a certain number of iterations, nodes in V_n are able to use the learned policy to take appropriate actions.
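The two reward rules (14a) and (14b), as written above, can be sanity-checked numerically: on toy values they sum to zero over the CN group, consistent with the fully competitive formulation. The node names, energy values, and RSSI values below are assumptions for this sketch.

```python
# Sketch of the RSSI/energy-CC reward functions, eqs. (14a)-(14b), evaluated
# on toy consumed-energy and |RSSI| values (all numbers are illustrative).
V_n = ["a", "b", "c"]
energy = {"a": 1.0, "b": 2.0, "c": 3.0}        # consumed energy E_i (J, toy)
rssi_abs = {"a": 62.0, "b": 71.0, "c": 79.0}   # |RSSI_{i,V_{n+1}}| in dB (toy)
rssi_group = sum(rssi_abs.values()) / len(rssi_abs)

def reward_success(i):
    # eq. (14a): below-average energy consumers get a positive reward
    mean_e = sum(energy.values()) / len(energy)
    return (mean_e - energy[i]) / (mean_e - min(energy.values()))

def reward_failure(i, failed):
    # eq. (14b): sigma_i = 1 for the node whose forwarding failed, else 0
    sigma = 1.0 if i == failed else 0.0
    return rssi_abs[i] / rssi_group - sigma * len(V_n)

r_ok = {i: reward_success(i) for i in V_n}
r_fail = {i: reward_failure(i, failed="b") for i in V_n}
print(r_ok, r_fail)
```

In both cases the group-wide sum of rewards is zero: in (14a) the energies are centered on their mean, and in (14b) the RSSI ratios sum to N_{V_n}, which the single σ_i = 1 term cancels.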
2.5. Complexity Analysis. As noticed in the previous subsections, the RL algorithms are composed of two main phases: (1) the updating phase of the Q-values for each agent; (2) node election for data forwarding.

For the Q-learning algorithm, the updating phase is realized through (4). The complexity of the Q-value updating is then equal to N. For the node election phase, the node with the highest Q-value is elected for data forwarding:

    a_f = arg max_i Q_i.   (15)
So, the complexity of node election equals N. Therefore, the overall complexity of the Q-learning algorithm equals 2N.

For the opponent modeling algorithm, the updating phase is realized through (11). The complexity of the Q-value updating is then equal to N(N - 1). For the node election phase, the node with the highest payoff is elected for data forwarding:

    a_f = arg max_i Σ_{a_m^t} ( C_i(s^t, a_m^t) / N_i(s) ) Q_{i,i'}^t(s^t, a_f^t, a_m^t).   (16)

So, the complexity of node election equals N. Therefore, the overall complexity of the opponent modeling algorithm equals N(N - 1) + N = N^2.

3. Performance Evaluation

3.1. Simulation Environment. For performance evaluation, we use the TOSSIM simulation platform in order to evaluate parameters of interest such as energy consumption. TOSSIM is a discrete event simulator for TinyOS sensor networks that builds directly from the same TinyOS code written for the actual motes. We simulate different topologies, sizes of WSN, and channel environment parameters (path loss and shadowing effects). The sink node is also placed in different positions. Simulation results concern network lifetime, packet delay (average delay to the sink, percentage of delayed packets, and percentage of lost packets), and energy consumption (network energy consumption and maximal energy consumption per node). The performance of the RSSI/energy-CC algorithm is compared each time to the MRL-CC algorithm.

The wildfire application requires the measurement and transmission of temperature. Other parameters, such as moisture, may be useful but are not considered in this paper. The amount of information transmitted is therefore likely to be low data rate. The area to cover, the forest, can be of different shapes. It can even be sparse. In this paper, we consider two different deployment architectures: uniform deployment and circular deployment.
In the forest environment, the transmission of information between different sensors can be significantly affected by the presence of trees. To evaluate the effect of this distortion on the quality of the proposed approach, we have also simulated the network in the presence of a shadowing effect modeling this type of fading. In Table 1, we give the parameters fixed for simulating the different versions of the algorithms.

Table 1: Simulation parameters.
    Packet delivery interval     | (milliseconds)
    Packet size                  | 7 bytes
    Reforwarding time            | (ms)
    Communication range          | 30 m
    Initial battery energy       | Li-ion AA batteries
    Path-loss exponent           | 2
    Shadowing standard deviation | (dB)
    MAC protocol                 | CSMA
    Simulated node platform      | Mica

3.2. Simulation Results

3.2.1. Uniform Deployment. We simulate a WSN where 81 sensor nodes are uniformly distributed in an 80 m x 80 m area (the distance between successive nodes is 10 m). The sink node is placed according to three different topologies (Figure 4).

Table 2: Network lifetime (in days) till the first node dies.
    Network architecture | A  | B  | C
    MRL-CC               | 5  | 69 | 5
    RSSI/energy-CC       | 58 | 77 | 3

(a) Packet Delay Analysis. We compute in Figure 5 the average delay to the sink, the percentage of delayed packets, and the percentage of lost packets. The simulation results show that, for the noncooperative algorithm, the percentage of lost packets is huge compared to the MRL-CC algorithm and the RSSI/energy-CC algorithm. Moreover, in terms of percentage of delayed packets and average delay to the sink, the RSSI/energy-CC algorithm is lower than the MRL-CC algorithm. This is due to the fact that the RSSI/energy-CC algorithm relies on the average link quality between the CN groups while performing at the same time in a competitive context. This competitive task allows a CN group to elect the node with the best RSSI for packet transmission.

(b) Energy Consumption in a Cooperative Node Group. Figure 6 presents the selected CN groups for data transmission from node 4 to the sink node (topology B is considered).
We display the residual battery energy for each selected CN group in Figure 7, and we compare the energy consumption behavior of the MRL-CC algorithm and the RSSI/energy-CC algorithm. Figure 7 shows that the behavior of energy consumption for each CN group is different when comparing the MRL-CC algorithm and the RSSI/energy-CC algorithm. For nodes which belong to the same CN group, the residual energy is more balanced for the RSSI/energy-CC algorithm. Thus, energy consumption is saved for each node in each CN group.

(c) WSN Lifetime. Network lifetime is defined as the time when the first node's battery is out of energy. For our case, we have compared the MRL-CC algorithm to the RSSI/energy-CC algorithm, computing at the same time the total energy consumed in the WSN (in J). Results are given in Table 2. We also present in Table 3 the maximal lifetime during which all sensors can transmit to the sink node. We can notice from Tables 2 and 3 that network lifetime is enhanced when comparing the MRL-CC algorithm to
the RSSI/energy-CC algorithm. This enhancement is certainly due to some energy savings in the network.

[Figure 4: Sink node (in black) placement for topologies (A), (B), and (C).]

[Figure 5: Average delay to the sink, percentage of delayed packets, and percentage of lost packets, averaged over the nodes lying the same number of hops away from the sink node.]

Table 3: Network lifetime (in days) till the WSN cannot transmit to the sink node.
    Network architecture | A  | B  | C
    MRL-CC               | 78 | 5  |
    RSSI/energy-CC       | 87 | 75 |

(d) WSN Energy Consumption. We first investigate energy consumption in the whole network. A comparison between the different network architectures for the two algorithms is presented in Figure 8. Comparing network architectures, we conclude that C has the lowest energy consumption compared to A and B, so the network lifetime for C is the longest. The simulation results also show that, when comparing the network energy consumption of the two algorithms for the same network architecture, network energy consumption is saved for the RSSI/energy-CC algorithm compared to the MRL-CC algorithm. This is because the RSSI is considered in the decision of the node election for packet forwarding.
[Figure 6: Selected CN groups for data transmission from source node 4 (in red) to sink node 76 (in black).]

[Figure 7: Energy consumption comparison for each selected CN group between the MRL-CC algorithm and the RSSI/energy-CC algorithm.]

[Figure 8: Network energy consumption, comparison between network architectures for the MRL-CC and RSSI/energy-CC algorithms.]

Table 4: Network lifetime (in days) till the first node dies.
    Network architecture | 9 x 9 | 13 x 13 | largest grid
    MRL-CC               | 66    | 38      | 6
    RSSI/energy-CC       | 9     | 6       | 46

Table 5: Network lifetime (in days) till the WSN cannot transmit to the sink node.
    Network architecture | 9 x 9 | 13 x 13 | largest grid
    MRL-CC               | 9     | 6       | 5
    RSSI/energy-CC       | 3     | 9       | 5

Network energy consumption is saved from 3.33% to 5.9% for network A, from .8% to 6.3% for network B, and from 5.38% to 9.76% for network C.

At the same time, we compare the maximum energy consumption per node in the network, for the two algorithms. For each architecture, we obtain the charts presented in Figure 9. The simulation results show that the maximum energy consumption per node is reduced for the RSSI/energy-CC algorithm compared to the MRL-CC algorithm. This is due to taking into account the energy consumption of the cooperative group before making the decision for node election. The maximal energy consumption is saved from 9.56% to .6% for network A, from .5% to 3.3% for network B, and from .79% to 4.76% for network C.
So, we can conclude that the network lifetime enhancement is due to the enhancement of the lifetime of the node with maximal energy consumption.

In a second analysis of energy consumption, we propose to show results for extended grid networks where the sink is placed in the center (alike to topology C). Results about lifetime are shown in Tables 4 and 5. We can notice from those tables that the network lifetime is also enhanced for the RSSI/energy-CC algorithm. We also display results about the network energy consumption in Figure 11 and the maximum energy consumption per node in the network in Figure 10. Comparing network architectures, we conclude that the 9 x 9 network has the lowest energy consumption compared to the 13 x 13 and largest grid networks, so the network lifetime of the 9 x 9 network is the longest. The simulation results in Figure 11 also show that, when comparing the network energy consumption of the two algorithms for the same network architecture, the
network energy consumption is saved for the RSSI/energy-CC algorithm compared to the MRL-CC algorithm. Network energy consumption is saved up to 9.49% for the 9 x 9 network, up to 6.78% for the 13 x 13 network, and up to 6.8% for the largest grid network. In Figure 10, the simulation results show that the maximum energy per node is reduced for the RSSI/energy-CC algorithm compared to the MRL-CC algorithm. Thus, the maximal energy consumption is saved up to 7.7% for the 9 x 9 network, up to 4.% for the 13 x 13 network, and up to 4.% for the largest grid network.

[Figure 9: Maximal energy consumption in the whole WSN, comparison between the MRL-CC and RSSI/energy-CC algorithms for different network architectures.]

[Figure 10: Maximal energy consumption in the whole WSN, comparison between the MRL-CC and RSSI/energy-CC algorithms for different network architectures.]

[Figure 11: Network energy consumption, comparison between network architectures for the MRL-CC and RSSI/energy-CC algorithms.]

[Figure 12: WSN topology in circles; the sink node is at the center.]
[Figure 13: Network energy consumption and maximal energy consumption for the network in the form of circles, for the MRL-CC algorithm and the RSSI/energy-CC algorithm.]

[Figure 14: Network lifetime (architecture C) for different path-loss exponents, n, and different shadowing standard deviations, for the MRL-CC and the RSSI/energy-CC algorithms.]

3.2.2. Energy Consumption for the Circular Topology. We also simulated our algorithms on the network in the form of circles presented in Figure 12. The distance between circles is 10 meters. Energy simulations for the network in circles for the two algorithms are presented in Figure 13. The network lifetime for the MRL-CC algorithm is 8 days, whereas for the RSSI/energy-CC algorithm the network lifetime is 47 days. The gain in network lifetime is very valuable due to the special network topology. Network energy consumption savings go from 4.69% up to 39.4%. Also, for maximal energy consumption, savings go from 8.6% up to 35.53%.

3.2.3. Shadowing and Path-Loss Effect. We propose to use network architecture C (uniform deployment) to simulate the network lifetime when the path-loss exponent takes the values n = 3 and 4, and the shadowing standard deviation takes the values σ = 2, 4, and 6 dB. The simulation results are shown in Figure 14. It is clear that the network lifetime is reduced when the path-loss exponent increases and when the shadowing deviation increases. This result holds for both the MRL-CC and the RSSI/energy-CC algorithms.
From that figure, we can also conclude that the RSSI/energy CC algorithm outperforms the MRL-CC algorithm in terms of network lifetime.
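The advantage of RSSI/energy CC comes from scoring candidate relays on both link quality (RSSI) and residual energy, rather than link quality alone. The exact combination used in the protocol is not given in this excerpt; the sketch below is a hypothetical weighted metric, with the names relay_score, select_relay, and the weight alpha all being illustrative assumptions.

```python
def relay_score(rssi_dbm, residual_energy_j, initial_energy_j,
                rssi_min=-100.0, rssi_max=-40.0, alpha=0.5):
    """Hypothetical combined metric: RSSI normalized to [0, 1] over an
    assumed operating range, blended with normalized residual energy.
    alpha weights link quality against energy balance."""
    rssi_norm = (rssi_dbm - rssi_min) / (rssi_max - rssi_min)
    rssi_norm = max(0.0, min(1.0, rssi_norm))  # clamp to [0, 1]
    energy_norm = residual_energy_j / initial_energy_j
    return alpha * rssi_norm + (1.0 - alpha) * energy_norm

def select_relay(candidates, **kw):
    """Pick the best-scoring candidate from a list of tuples
    (node_id, rssi_dbm, residual_energy_j, initial_energy_j)."""
    return max(candidates, key=lambda c: relay_score(c[1], c[2], c[3], **kw))[0]
```

Because depleted nodes score low regardless of their RSSI, traffic is steered away from them, which spreads consumption across the network and delays the death of the first node, the effect visible in the lifetime curves.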
4. Conclusions

To support the automatic monitoring of wildfire, we propose in this paper to deploy a WSN. To design and optimize the routing protocol used for data aggregation in this network, we propose a new algorithm, the RSSI/energy-CC. This algorithm applies a reinforcement learning optimization approach that takes into account energy consumption and link quality, measured by the RSSI, in a competitive task. Simulations have shown that this algorithm is efficient in terms of percentage of lost packets, network energy consumption, maximal energy consumption per node, and network lifetime. In future research, we will consider the case of multiple sinks in the WSN, in order to better balance network energy consumption and further enhance the network lifetime, as well as sparse deployment, which better describes the forest environment.

Acknowledgment

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7-ENV-2009-1) under Grant agreement no. 244088, FIRESENSE (Fire Detection and Management through a Multi-Sensor Network for the Protection of Cultural Heritage Areas from the Risk of Fire and Extreme Weather).

References

[1] X. Liang, M. Chen, Y. Xiao, I. Balasingham, and V. C. M. Leung, "MRL-CC: a novel cooperative communication protocol for QoS provisioning in wireless sensor networks," International Journal of Sensor Networks, vol. 8, no. 2, pp. 98–108, 2010.

[2] C.-K. Tham and J.-C. Renaud, "Multi-agent systems on sensor networks: a distributed reinforcement learning approach," in Proceedings of the Intelligent Sensors, Sensor Networks and Information Processing Conference, pp. 423–429, Melbourne, Australia, December 2005.

[3] A. Nosratinia, T. E. Hunter, and A. Hedayat, "Cooperative communication in wireless networks," IEEE Communications Magazine, vol. 42, no. 10, pp. 74–80, 2004.

[4] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.

[5] L. Buşoniu, R. Babuška, and B. De Schutter, "A comprehensive survey of multiagent reinforcement learning," IEEE Transactions on Systems, Man, and Cybernetics, Part C, vol. 38, no. 2, pp. 156–172, 2008.

[6] M. Chen, T. Kwon, S. Mao, Y. Yuan, and V. Leung, "Reliable and energy-efficient routing protocol in dense wireless sensor networks," International Journal of Sensor Networks, vol. 4, no. 1/2, pp. 104–117, 2008.

[7] M. Maalej, H. Besbes, and S. Cherif, "A cooperative communication protocol for saving energy consumption in WSNs," in Proceedings of the IEEE 3rd International Conference on Communications and Networking (ComNet 2012), pp. 1–5, Hammamet, Tunisia, 2012.

[8] W. Uther and M. Veloso, "Adversarial reinforcement learning," in Proceedings of the AAAI Fall Symposium on Model Directed Autonomous Systems, 1997.