Reinforcement Learning for Quality of Service in Mobile Ad Hoc Network (MANET)

*T. KUMANAN AND **K. DURAISWAMY
*Meenakshi College of Engineering, West K.K. Nagar, Chennai-78
**Dean/Academic, K.S.R. College of Technology, Tiruchengode

ABSTRACT: It is very difficult to find feasible QoS (Quality of Service) routes in mobile ad hoc networks (MANETs) because of their inherent constraints, such as dynamic network topology, wireless communication links, and the limited processing capability of nodes. To reduce the average cost of the flooding-based path discovery used by traditional MANET routing protocols, and to increase the probability of finding QoS-feasible paths, this study proposes a new heuristic, distributed route discovery method that supports QoS requirements for MANETs. The method integrates a distributed route discovery scheme with a Reinforcement Learning (RL) method that uses only local information about the dynamic network environment, together with a route expansion scheme based on a Cluster-based Routing Algorithm (CRA) that finds additional feasible paths and avoids the timing optimization problem of the previous smart-net approach to Quality of Service in MANETs. Compared with the traditional method, the experimental results showed that the proposed method improves network performance in terms of timing, efficiency, and effectiveness.

Index Terms: Quality of Service, Reinforcement Learning, Genetic, Routing Algorithms, and Mobile Ad Hoc Networks.

1. INTRODUCTION

A mobile ad hoc network is a computer network in which the communication links are wireless and the devices on it communicate directly with each other. This allows all wireless devices within range of each other to discover and communicate in peer-to-peer fashion without involving central access points. An ad hoc network tends to feature a small group of devices, all in very close proximity to each other. Performance degrades as the number of devices grows, and a large ad hoc network quickly becomes difficult to manage. An ad hoc network routing protocol is challenging to design, and a secure one is even more difficult.
Much research focuses on how to provide efficient [1, 2] and secure [3, 4, 8] communication in ad hoc networks. It has been proved that if QoS contains at least two additive metrics, then QoS routing is an NP-hard problem. It is therefore acceptable and necessary to develop heuristic algorithms that search for optimized solutions at an acceptable cost [5-6]. Many studies of QoS routing for MANETs have been carried out, for example by the MANET working group of the IETF [8]. A QoS route discovery method based on the ant algorithm and the simulated annealing algorithm has also been proposed to solve the same problem [7].

ISSN: 1790-5117, ISBN: 978-960-474-162-5

In this study, we propose a
novel adaptive QoS route discovery method for MANETs, based on Reinforcement Learning (RL) and a Clustering-based Routing Algorithm (CRA). This method integrates a distributed route discovery scheme with an RL method that uses only local information about the dynamic network environment, together with a route expansion scheme based on the CRA that finds additional feasible paths and avoids the timing optimization problem of the previous smart-net approach to Quality of Service in MANETs. To remedy that problem, we propose a CRA-based path extension algorithm, which complements the RL-based route discovery process and overcomes the problem of locally optimal solutions, thereby avoiding route stagnation. We also demonstrate the route discovery scheme on random networks in a dynamic network environment. The proposed method can improve the efficiency of data transfer along feasible paths by using clustering.

2. QoS ROUTING MODEL ON SEAD FOR MANETS

Secure Efficient Ad hoc Distance Vector (SEAD) [11] is a proactive routing protocol based on the design of the Destination-Sequenced Distance Vector routing protocol (DSDV) [9]. Nodes maintain distances to each destination and keep information about the next hop on the optimal path to that destination. The routing protocols for MANETs may be broadly classified as table-driven protocols and on-demand protocols. Table-driven protocols need to maintain global routing information about the network in every mobile node, for all possible source-destination connections, and must exchange routing information periodically [9, 10]. This kind of protocol has lower latency and higher overhead. An on-demand routing protocol creates routes only when a source node requests them. When a node requires a route to a destination, it initiates a route discovery process within the network. On-demand routing protocols are characterized by higher latency and lower overhead. A majority of existing research on QoS routing in MANETs is based on these two kinds of routing protocols.
However, existing studies show that table-driven QoS protocols require global network state information, and on-demand QoS protocols need to initiate a route discovery based on flooding; neither fits the dynamic nature and capability constraints of MANETs. QoS route discovery can be implemented with distributed routing or source routing. In distributed routing, all the nodes on the QoS path, including the source node, run the route algorithm to select the next-hop node. In source routing, a QoS route is predetermined by the route algorithm at the source node only. The newly proposed method is a kind of heuristic distributed algorithm mixed with some features of source routing. Our QoS route discovery algorithm performs route exploration and discovery from source to destination based on reinforcement learning; all the nodes on the reverse path, as well as the QoS measurement data, are stored and returned to the source node by acknowledgment packets, and the data packets are source routed. The QoS route discovery scheme is illustrated in Fig. 2.1.

A characteristic of SEAD is its use of a one-way hash function. Each node computes a list of hash values $h_1, \ldots, h_n$, where $h_i = H(h_{i-1})$, $0 < i \le n$, given an initial $h_0$. If a node knows H and a
value $h_n$, then it can authenticate any other value $h_i$, $0 < i \le n$. Let a node's hash chain be the sequence of values $h_0, h_1, \ldots, h_n$, where $h_i = H(h_{i-1})$ and n is divisible by m; then, for a sequence number i in some routing update, let $k = n/m - i$. A value from the group of hash values $h_{km}, h_{km+1}, \ldots, h_{km+m-1}$ is used to authenticate the node. In the example in Table 1, m = 5 and n = 20, where i denotes the sequence number, j denotes the metric, m denotes the network diameter, and n denotes the length of the hash chain. In the SEAD-based QoS routing protocol, the receiving node can verify the metric according to the received hash value [10, 11]. A malicious node can increase the metric and compute the corresponding hash value. The method used by SEAD for authenticating an entry in a routing update uses the sequence number in that entry to determine a contiguous group of m elements from the destination node's hash chain, one element of which has to be used to authenticate that routing update [11]. In Figure 2.1, the difference in the table of node C is the column of hash values. The particular hash value from this group that has to be used to authenticate the entry is determined by the metric value being sent in that entry.

Fig 2.1. QoS Route Discovery on SEAD Routing Protocol in MANETs

Destination Node   Short ways   Hop   IP Address
C                  C,A          1     192.168.10.3
A                  D,B,A        2     192.168.10.1
D                  D            0     192.168.10.4
B                  C,D,E        1     192.168.10.2
E                  F,G          2     192.168.10.5
F                  G,H          2     192.168.10.6
G                  E,F          2     192.168.10.7
H                  F            2     192.168.10.8

Table 2.1. QoS route discovery in MANETs

A MANET can be modeled as an undirected weighted graph $G = [V, E]$, where V is the set of mobile nodes and E is the set of bidirectional wireless links. For a link $e \in E$ we have $e = (v_i, v_j)$ with $v_i \in V$, $v_j \in V$, $i \ne j$, where $v_i$ and $v_j$ are neighbor nodes. G and V are both dynamic sets.
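As a minimal sketch of this graph model, the following Python fragment stores the undirected links with a bandwidth and a delay each, and evaluates the two path metrics that the model goes on to define: delay as a sum over the links of a path and bandwidth as the minimum over those links. The names `make_links`, `path_metrics`, and `is_feasible`, the node labels, and the numeric values are illustrative assumptions rather than part of the proposed protocol, and the feasibility test assumes the usual reading of the constraints (delay at most d, bandwidth at least b).

```python
from math import inf
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical link table: an undirected link {u, v} maps to (bandwidth, delay).
# Node names, units, and values are illustrative assumptions.
LinkTable = Dict[FrozenSet[str], Tuple[float, float]]


def make_links(entries: Iterable[Tuple[str, str, float, float]]) -> LinkTable:
    """Build the link table from (u, v, bandwidth, delay) tuples."""
    return {frozenset((u, v)): (bw, delay) for u, v, bw, delay in entries}


def path_metrics(path: List[str], links: LinkTable) -> Tuple[float, float]:
    """Return (D(l), B(l)): summed link delay and minimum link bandwidth."""
    if len(path) < 2:
        raise ValueError("a path needs at least two nodes")
    total_delay, bottleneck = 0.0, inf
    for u, v in zip(path, path[1:]):
        bandwidth, delay = links[frozenset((u, v))]  # KeyError if no such link
        total_delay += delay
        bottleneck = min(bottleneck, bandwidth)
    return total_delay, bottleneck


def is_feasible(path: List[str], links: LinkTable, d: float, b: float) -> bool:
    """Assumed QoS test: path delay at most d and path bandwidth at least b."""
    total_delay, bottleneck = path_metrics(path, links)
    return total_delay <= d and bottleneck >= b


links = make_links([("A", "B", 2.0, 10.0), ("B", "C", 1.0, 5.0)])
print(path_metrics(["A", "B", "C"], links))
print(is_feasible(["A", "B", "C"], links, d=20.0, b=1.0))
```

Keying each link by a `frozenset` of its endpoints is one simple way to reflect that links are bidirectional, so a path can be evaluated in either direction.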
Let $s \in V$ be the source and $p \in \{V - \{s\}\}$ the destination. For $e \in E$, define the metric functions:

Bandwidth function: $B(e): E \to R^{+}$
Delay function: $D(e): E \to R^{+}$

The QoS parameters of a path $l = (v_1, v_2, \ldots, v_k)$ can be represented as:

$$D(l) = \sum_{i=1}^{k-1} D(v_i, v_{i+1}), \qquad B(l) = \min\bigl(B(v_1, v_2),\, B(v_2, v_3),\, \ldots,\, B(v_{k-1}, v_k)\bigr) \qquad (1)$$

The QoS route can be described as an optimization problem:

$$\bigl(D(l) \le d\bigr) \wedge \bigl(B(l) \ge b\bigr) \qquad (2)$$
where d and b are constant parameters.

3. MULTICAST SEAD ROUTING PROTOCOL WITH QoS IN MANETs

We propose a routing protocol that uses a forwarding group to apply multicast routing with QoS from the source(s) to a cluster of destinations. The proposed system tries to take all limitations of MANETs into account and provides a general cluster-based framework for implementing QoS.

3.1. Session Initiation and Destruction

A node that has data to send starts a session by broadcasting a session initiation as a quality of service route request (QRReq) with a Time-To-Live (TTL) greater than zero. Intermediate nodes rebroadcast the QRReq if they have available bandwidth, until it arrives at the destinations or the TTL reaches zero. Destination nodes receive the QRReq and send a route reply (RRep) to the source. Source nodes and destination nodes can leave the session by no longer sending QRReq and RRep packets, respectively.

3.2. Forward group and member management

When an intermediate node receives a QRReq from a source node, it stores the source IP and the sequence number in its message cache to detect any potential duplicates. If the message is not a duplicate, the intermediate node has available bandwidth, and the TTL is greater than zero, then the node rebroadcasts the QRReq and updates its routing table with the ID of the node from which it received the request. The destination node will receive the QRReq over several paths; it selects the one path with the best QoS conditions and sends a route reply (RRep). When an intermediate node receives an RRep from a destination, it checks whether the node ID in the RRep matches its own ID. If it does, the node realizes that it is on the reverse path to the source and is part of the forwarding group, so it sets the forwarding group flag. The next-hop node ID field is filled by extracting information from its routing table. In this way, each intermediate node propagates the RRep until it reaches the multicast source via the selected path. This whole process constructs or updates the routes from sources to receivers. Through this process, all paths to destinations are defined and the source node can start sending data packets.

3.3.
Admission control

We use distributed admission control at every intermediate node: when an intermediate node receives a QRReq packet, it must calculate its available bandwidth, and it rebroadcasts the QRReq packet only if bandwidth is available. The QRReq is forwarded as long as the QoS requirements are met. The packet is dropped if the QoS requirements can no longer be met, which avoids flooding the network unnecessarily. Before rebroadcasting a QRReq packet, each intermediate node temporarily updates its QoS information with the current QoS conditions. With this rule, nodes do not accept more traffic than the available bandwidth allows. Figure 2.1 shows the structure of the route request phase with QoS requirements, the route reply phase, and forward group establishment. In our framework, we propose to compute the available bandwidth from the channel status of the radio, determining the busy and idle periods of the shared wireless medium. By examining the channel usage of a node, we are able to take into account the activities of both the node itself and its surrounding neighbors
and therefore obtain a good approximation of the bandwidth usage; we use the IEEE 802.11 standard.

3.4. Route recovery and congestion prevention

Most multicast applications belong to the category in which the number of senders is smaller than the number of receivers. In this situation, sender advertising is more efficient than receiver advertising [12], so our proposed routing protocol uses sender advertising. Each source periodically sends a QRReq, which performs route recovery by updating the forward group. The problem with the admission control solution in most previous studies is that it is a one-time procedure performed before the flow starts. It does not take into account changes in the wireless network over the duration of the flow's operation. The capacity of the channel may change dramatically, and the available bandwidth estimated by individual nodes will change with the dynamic mobility of nodes (as they move relative to each other) and with fading and outside interference [12]. In our approach, when the source updates the forward group, the paths are also updated and nodes re-estimate the available bandwidth, so all changes that result from node movement are taken into account. Any forward node can detect congestion using periodic traffic measurement. When a node detects such congestion, it starts sending a probe packet to the source or the destination. If it sends to the source to update the forward group, overhead will be high because of the control packets [12, 13]. If it sends to the destination node, destination nodes need to process and save all alternative paths when route requests are received. In addition, destination nodes measure packet delay; if a packet is delayed too long, the destination node drops it and sends an update to the source.

4. REINFORCEMENT LEARNING

Reinforcement learning is a general method in machine learning that deals with the problem of how a system in a dynamic environment can learn to choose optimal actions to achieve its goals; through trial-and-error interactions, the system can then attempt to determine the output from the input data [15].
The idea is to adjust parameters in the direction of the empirically estimated gradient of the reward. The reinforcement learning procedure includes: an environment state set, S; an action set, A; a reinforcement function $\delta: S \to S$; the reward function $R: S \times A \to \mathbb{R}$; and the state transition policy $T: S \times A \to \Pi(S)$. $R(s, a)$ is the expected instantaneous reinforcement from action a ($a \in A$) in state s ($s \in S$). $\Pi(S)$ is the set of functions over the set S, and the learning result is how to make a transition from state s to state $s'$ using action a.

Fig. 2: Reinforcement learning model

Figure 2 illustrates the form of a standard reinforcement model. The action changes the state of the system, and the value of this state transition is represented by the value of the reinforcement. Reinforcement learning problems are usually modeled as Markov decision processes. The model is Markov if the state transitions are independent of the history of the system. In MANETs, each
mobile node acts as a router and a host at the same time, and routing information is exchanged periodically or on demand [14, 15]. Furthermore, route information that depends on the dynamic network state is not very accurate, and it is difficult to collect accurate QoS information; each mobile node can obtain only a local view of the environment, which is neither complete nor accurate enough for QoS route computation. Route algorithms based on RL have received some attention in wireless networks. In the proposed algorithm, which incorporates RL, each mobile node acts as an agent that, according to the QoS metric, must decide how to find a feasible path for each new connection arrival [15]. The outcome of the agent's decision (the route selection) is used to reward or punish the corresponding decision of the routing algorithm, so that good decisions are selected via rewards while bad decisions are eliminated via punishment. The acknowledgement packets that store the reverse route and the QoS measurement data then return to the source node. The value of following a policy $\pi$ is the expected cumulative reward, discounted by a factor $\lambda \in [0, 1]$, which can be written as:

$$V^{\pi}(s_t) = E\left[\sum_{i=0}^{\infty} \lambda^{i} r_{t+i}\right] \qquad (3)$$

$$G(\xi) = \sum_{i=0}^{\infty} \lambda^{i} \xi_{t+i} \qquad (4)$$

The evaluation function is:

$$Q(s_t, a_t) = \xi_t + \lambda \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}) \qquad (5)$$

and the n-step truncated return is:

$$r_t^{(n)} = \sum_{i=0}^{n-1} \lambda^{i} r_{t+i} + \lambda^{n} V(s_{t+n}) \qquad (6)$$

The optimal policy can only be determined by iterative computation with the reinforcement value. The quality of the policy is directly limited by the quality of the model from which it is calculated. The state transitions must be sampled sufficiently often to establish a good model. In our approach, route exploration is done using limited flooding. When a route needs to be established and the destination is not in the source's neighborhood list, the source node first searches its route table and network state table.
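The evaluation function in Equation 5 can be illustrated with a small tabular sketch in Python, in which the state is the current node and the action is the choice of next hop. The class `NextHopLearner`, the learning rate `alpha`, and the reward values are hypothetical additions for illustration; the paper does not specify them. Setting `alpha = 1.0` reduces the update to the form of Equation 5.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class NextHopLearner:
    """Tabular sketch: state = current node, action = chosen next hop."""

    def __init__(self, discount: float = 0.9, alpha: float = 0.5) -> None:
        self.discount = discount  # the discount factor (lambda in the text)
        self.alpha = alpha        # learning rate, an assumption of this sketch
        self.q: Dict[Tuple[str, str], float] = defaultdict(float)

    def best_value(self, node: str, neighbors: Iterable[str]) -> float:
        """Largest Q-value over the neighbors of node (0.0 if it has none)."""
        return max((self.q[(node, n)] for n in neighbors), default=0.0)

    def update(self, node: str, next_hop: str, reward: float,
               next_neighbors: Iterable[str]) -> float:
        """Move Q(node, next_hop) toward reward + discount * best next value."""
        target = reward + self.discount * self.best_value(next_hop, next_neighbors)
        key = (node, next_hop)
        self.q[key] += self.alpha * (target - self.q[key])
        return self.q[key]

    def choose(self, node: str, neighbors: List[str]) -> str:
        """Greedy choice: the neighbor with the highest learned value."""
        return max(neighbors, key=lambda n: self.q[(node, n)])


learner = NextHopLearner(discount=0.5, alpha=1.0)
learner.update("B", "D", reward=1.0, next_neighbors=[])      # B reaches D, QoS met
learner.update("A", "B", reward=0.0, next_neighbors=["D"])   # A forwards via B
learner.update("A", "C", reward=-1.0, next_neighbors=[])     # C gave no reply
print(learner.choose("A", ["B", "C"]))
```

In this toy run the positive reward observed at B propagates back to A through the discounted maximum, so A comes to prefer B over the neighbor that was punished. A practical agent would also need an exploration rule, which is omitted here.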
If there is enough information to execute the RL-based decision algorithm [13, 14, 15], intermediate nodes forward a route request packet. If no reply arrives at the exploring node in time, the route entry is deleted at that node and late-arriving reply packets are ignored.

5. Novel Algorithm Based Routing Protocol Extends

The route extension based on the Clustering Routing Algorithm runs only at each source node, to generate and select paths for data packets based on the QoS requirements. The CRA population consists of individuals that represent paths between the source node and potential destination nodes. It uses a variable-length representation, which is expected to give the CRA more flexibility to evolve in response to changes in the network. The fitness of a path is determined from the QoS measurement data returned by the acknowledgement packets that are received in response to sending a data packet along that path. This is done with the help of topologies, routing algorithms, and the clustering technique. In this paper, the
proposed method combines Quality of Service and Reinforcement Learning on the basis of a clustering technique. The proposed method mainly reduces complexity and avoids the timing optimization problem of smart-net Quality of Service in MANETs. We also demonstrate the route discovery scheme on random networks in a dynamic network environment. The method can improve the efficiency of data transfer along feasible paths by using clustering.

5.1 Structure of framework

Figure 5.1.1 shows the QoS route request phase from source to destinations and describes how an intermediate node behaves when it receives a QoS route request. It shows the route reply phase from destinations to source and describes how an intermediate node behaves when it receives a route reply and when it is set to be a forward node. For the data packet phase from source to destinations, it describes how an intermediate node behaves when it receives a data packet. It gives an overview of the clustered network (the interaction between the network, routing, and MAC layers) and of the actions performed at an intermediate node. It also describes how an intermediate node checks the QoS requirements and estimates the available bandwidth from information provided by the IEEE standard.

Fig 5.1.1 Route Request on QoS Model Based on Clustering Algorithm

CONCLUSION

In this paper, the proposed method combines Quality of Service and Reinforcement Learning on the basis of a clustering technique using the Routing Algorithm. The proposed method mainly reduces complexity and avoids the timing optimization problem of smart-net Quality of Service in MANETs. We also demonstrate the route discovery scheme on random networks in a dynamic network environment. Even with various reinforcement learning algorithms available, finding and maintaining routes in a MANET remains a difficult and important task for efficient routing: it must minimize the exchange of routing information, avoid routing loops, and reduce route discovery overhead. The proposed method can improve the efficiency of data transfer along feasible paths by using clustering in routing.
As a future enhancement, measurements of metrics such as the channel-wise optimized idle time are important to corroborate the capacity and energy-efficiency predictions of the capacity model and to make corrections in situations where the capacity model is inaccurate.
REFERENCES:

[1] A. Perrig, R. Canetti, D. Song, and J. D. Tygar, Efficient and Secure Source Authentication for Multicast, Proceedings of Network and Distributed System Security Symposium, NDSS 2001, February 2001.
[2] Y. C. Hu, A. Perrig, and D. B. Johnson, Efficient Security Mechanisms for Routing Protocols, Proceedings of the Tenth Annual Network and Distributed System Security Symposium, NDSS 2003.
[3] Y. C. Hu, A. Perrig, and D. B. Johnson, ARIADNE: A Secure On Demand Routing Protocol for Ad Hoc Networks, MobiCom 02, Atlanta, Georgia, USA, September 23-26, 2002.
[4] Y. C. Hu, A. Perrig, and D. B. Johnson, Rushing Attacks and Defense in Wireless Ad Hoc Network Routing Protocols, ACM Workshop on Wireless Security (WiSe 2003).
[5] David, R. and G.N. Ignas, 2003. Ad hoc networking in future wireless communications. Computer Communications, 6: 36-402.
[6] Anonymous, 2004. IETF: Mobile ad hoc networks charter, http://www.ietf.org/html.charters/manet-charter.html
[7] Chen, K., S.H. Shah and K. Nahrstedt, 2001. Cross-layer design for data accessibility in mobile ad hoc networks. J. Wireless Personal Communications, 21: 49-76.
[8] Introduction on Ad-hoc Networks, presentation by WAN Chunfeng.
[9] Dynamic Source Routing in Ad Hoc Wireless Networks, David B. Johnson and David A. Maltz, June 1996.
[10] Routing in ad hoc networks of mobile hosts, Johnson, D.B., Mobile Computing Systems and Applications, 1994. Proceedings, Workshop on, 8-9 Dec 1994.
[11] I-SEAD: A Secure Routing Protocol for Mobile Ad Hoc Networks, Chu-Hsing Lin, Wei-Shen Lai, Yen-Lin Huang, Mei-Chun Chou, 2008 International Conference on Multimedia and Ubiquitous Engineering.
[12] Distributed Channel Access Scheduling for Ad Hoc Networks, Lichun Bao, J.J. Garcia-Luna-Aceves.
[13] Distributed Opportunistic Scheduling for Ad Hoc Networks With Random Access: An Optimal Stopping Approach, Dong Zheng, Weiyan Ge, and Junshan Zhang, IEEE Transactions on Information Theory, Vol. 55, No. 1, January 2009.
[14] Multicast Routing with Quality of Service in Mobile Ad hoc Networks, Mohammed Saghir, Tat Chee Wan, Rahmat Budiarto, National Computer Science Postgraduate Colloquium 2005 (NaCSPC 05).
[15] Design and a New Method of Quality of Service in Mobile Ad Hoc Network (MANET), Jitendranath Mungara, S.P. Setti, G. Vasanth, European Journal of Scientific Research, ISSN 1450-216X, Vol. 34, No. 1 (2009), pp. 141-149, EuroJournals Publishing, Inc. 2009.