Peer-to-Peer over Ad-hoc Networks: (Re)Configuration Algorithms

Fernanda P. Franciscani, Marisa A. Vasconcelos, Rainer P. Couto, Antonio A.F. Loureiro
Computer Science Department, Federal University of Minas Gerais
Av Antonio Carlos 6627, Belo Horizonte MG, Brazil 31270-010
{fepaixao, isa, rainerpc, loureiro}@dcc.ufmg.br

Abstract

A Peer-to-Peer network over an ad-hoc infrastructure is a powerful combination that provides users with means to access different kinds of information anytime and anywhere. In this paper we study the (re)configuration issue in this highly dynamic scenario. We propose three (re)configuration algorithms especially concerned with the constraints of the environment presented. The algorithms aim to use the scarce resources of the network efficiently, improving both performance and network lifetime. The algorithms were simulated, with a simple Gnutella-like algorithm used for comparison. The results show that the algorithms achieved their goals, presenting a good cost-benefit relation.

1. Introduction

Mobile computing aims to allow people to access distinct types of information anytime and anywhere. In general, the client-server architecture is not adequate to satisfy this demand, for several reasons: the server can be down (little fault tolerance) or overloaded (scalability problems), or there may be no infrastructure to access the server and other entities.

Some of the problems above, like scalability and fault tolerance, can be addressed by Peer-to-Peer (p2p) applications, which build a virtual network over the physical one. The peers can act as servers or as clients, and are called servents. The servents can exchange data among themselves in a completely decentralized manner or with the support of some central entity, which usually helps servents to get in touch with one another. Thus, the success and popularity of the p2p networks reveal how important it is to satisfy the anytime demand.
[This work is partially supported by CNPq-Brazil.]

Regarding the anywhere issue, however, we can say that it is mainly related to the network itself, that is, it depends on the infrastructure provided. Since the lack of infrastructure is a problem, networks that do not need it turn out to be a solution. This is the case of ad-hoc networks, which, once endowed with mobility, are called MANETs (Mobile Ad-hoc Networks). In these networks, the communication between nodes is based on radio coverage and can be made directly or using other nodes as routers. Thus, MANETs can be easily formed by a group of people who make use of current devices such as cellular phones, PDAs and notebooks.

From the arguments above, it is reasonable to think that p2p networks over ad-hoc networks would be a very good solution to both the anytime and the anywhere problems. However, this combination leads to a highly dynamic scenario in which references between nodes are constantly changing. The frequent reconfiguration may have a great impact on the scarce resources of the network, such as energy and bandwidth. Aiming to control and diminish this impact, we designed algorithms to (re)configure a p2p network over an ad-hoc network and analyzed their behavior.

Although there are some studies on p2p applications over ad-hoc networks, they usually are not concerned with (re)configuration issues. Often the references between nodes simply show up or are created through the indiscriminate use of broadcasts. The relevance of the current work is therefore to study adequate ways of (re)configuring a p2p network over an ad-hoc network, taking into account the serious constraints this scenario presents.

The rest of this paper is organized as follows. Section 2 briefly describes p2p networks, whereas Section 3 discusses p2p over mobile ad-hoc networks. Section 4 describes the main characteristics of a mobile ad-hoc network and the routing protocol used. Section 5 presents the related work.
The four (re)configuration algorithms are described in Section 6 and analyzed through simulations in Section 7. Finally, Section 8 presents our conclusions and future work.

2. Peer-to-Peer Networks

The p2p networks are application-level virtual networks. They have their own routing protocols that permit computing devices to share information and resources directly, without dedicated servers. P2p members act both as clients and as servers, thus being called servents [17]. File sharing applications such as Napster, Gnutella and Morpheus have made p2p networks very popular. The great contribution of this kind of system is its scalability, which allows millions of users to be online concurrently even at peak periods [7]. This is obtained thanks to the hybrid behavior of the users (client + server), which yields greater computing decentralization.

The p2p systems are classified in three main categories: centralized, decentralized and hybrid [4, 9]. In a centralized p2p system, coordination between peers is managed by a central server. However, after receiving the information from the central server, the peers communicate directly. Some advantages of this kind of system are ease of management and security. The disadvantages are the low fault tolerance, since some data are held only by the central server, and the scalability limit imposed by the capacity of the server. However, scalability can still be achieved through the increasing processing power of computers, which makes it possible for one machine to serve a great number of users. Some examples are the search architecture of Napster [11] and the SETI@Home system [20], which has a central job dispatcher.

In decentralized topologies, e.g. Freenet [2] and Gnutella [3], all the peers have equal roles. The communication is made through multiple multicasts, where peers forward messages on behalf of other peers. Some important advantages are extensibility and fault tolerance. Scalability in these systems is difficult to measure: on the one hand, adding more hosts makes the system more capable. On the other hand, the overhead of keeping the data consistent increases with the size of the system.
In hybrid (centralized + decentralized) topologies, peers forward their queries to super-peers, which communicate with each other in a decentralized manner. The advantages of this topology resemble the ones of decentralized systems except for data consistency, which is improved. This happens since part of the data is kept only by the super-peers. Examples of this topology are KaZaA [5] and Morpheus [10].

Table 1 was derived from [9], in which the advantages and disadvantages of distributed topologies were listed.

                 Centralized   Decentralized   Hybrid
Manageable       yes           no              no
Extensible       no            yes             yes
Fault-Tolerant   no            yes             yes
Secure           yes           no              no
Lawsuit-proof    no            yes             yes
Scalable         depends       maybe           apparently

Table 1. Topologies and their characteristics.

In this work, only hybrid and decentralized configurations were adopted, for two reasons. First, we consider these topologies closer to reality. Second, as will be shown, the environment considered is highly dynamic, making extensibility an important issue. The developed algorithms, including the one used as a basis for comparison, were also inspired by the decentralized Gnutella protocol, especially its query mechanism.

Gnutella is a public domain p2p protocol used mainly for sharing, searching and retrieving files and content. To be part of this network, a servent must connect to neighbors that already belong to the network. After that, the servent sends messages through broadcasts to its neighbors and acts as a router for the messages transmitted by others. The types of messages exchanged in this network are: ping, to discover other member nodes; query, with information about the searched content; and the file itself, which is transferred directly between the peers. When a servent wishes to search for some file, it sends a query message to its neighbors, which, if possible, return the desired files besides forwarding the message to their own neighbors.
In order to prevent indefinite propagation, all messages carry a TTL (time-to-live) field and the number of hops already traversed [17].

3. P2p over Mobile Ad-Hoc Networks

Wireless devices have, in general, a restricted transmission range due to their limited power supply. Thus, the search for data should be made within a small distance range. This search can be made in two distinct ways. The first one uses a fixed infrastructure, which generally involves a high cost and provides continuous access to an information network such as the Internet or a private intranet. The second one does not need fixed infrastructure: the set of mobile devices itself, acting as routers and information servers, forms the network. This kind of network is called an ad-hoc network. In this case, the nearest devices become important sources of data for each other, which resembles the p2p paradigm, in which the network elements act, at the same time, as clients, servers and routers.

One of the main advantages of p2p over an ad-hoc network is the ease of forming the network, since it requires neither infrastructure nor a central server. Examples of possible uses of p2p over ad-hoc networks include applications that alert us to the presence of friends in a crowded public space or identify people we want to meet, taking into account our preferences and interests; and systems that spread rumors, facilitate the exchange of personal information, or support us in more complex tasks [6]. On the other hand, p2p over ad-hoc networks is a very dynamic combination that demands, among other things, special attention to (re)configuration issues.
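As an illustration of the Gnutella-style query propagation summarized at the end of Section 2, the following Python fragment is a minimal sketch of a TTL-limited flood over the overlay. It is not the Gnutella wire protocol: the function name `flood_query`, the dictionary-based overlay and the `holdings` map are assumptions made only for this example, and the flood is simulated in breadth-first order so that the TTL bound is exact.

```python
from collections import deque


def flood_query(neighbors, source, wanted, holdings, ttl):
    """Return the peers holding `wanted` within `ttl` overlay hops of `source`.

    neighbors: dict mapping a peer to a list of its overlay neighbors (assumed layout)
    holdings:  dict mapping a peer to the set of files it stores (assumed layout)
    """
    seen = {source}                 # peers that already processed this query
    hits = set()
    queue = deque([(source, ttl)])  # (peer, remaining TTL)
    while queue:
        node, remaining = queue.popleft()
        # A peer answers if it holds the file, and still forwards the query.
        if node != source and wanted in holdings.get(node, set()):
            hits.add(node)
        if remaining == 0:
            continue                # TTL exhausted: do not forward further
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:     # each peer handles a given query only once
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


# Hypothetical four-peer chain a - b - c - d, with the file stored at c and d.
overlay = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
files = {"c": {"song"}, "d": {"song"}}
print(flood_query(overlay, "a", "song", files, ttl=2))  # only c is within 2 hops
```

Raising `ttl` to 3 in the example also reaches peer d, which shows how the TTL field trades search coverage against the number of forwarded messages.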

4. Mobile Ad-Hoc Networks

Mobile ad-hoc networks are composed of wireless mobile devices whose communication is based on radio coverage and can be made directly (point-to-point) or using other nodes as routers. This kind of network is mainly used in scenarios without a fixed network infrastructure. Some examples are conventions or meetings, where people wish to exchange information quickly and conveniently [18], and emergency operations.

Mobile ad-hoc networks present, beyond the common restrictions of wireless networks, the additional challenge of dealing with a very dynamic topology. The limited energy of the devices demands smart routing protocols. In our simulations, the chosen routing protocol was AODV. A comparison of the performance of several routing protocols shows that each of them performed well in some scenarios for some metrics yet had drawbacks in others [13]. We therefore adopted AODV, which exhibited the best performance in high mobility scenarios.

The AODV protocol is an on-demand routing algorithm for mobile ad-hoc networks, in which each node has the information about the next hop of a route. It is an on-demand algorithm since the routes are only maintained while they are being used. When a link is broken, AODV updates the nodes that have this route saved. Advantages of this protocol are quick adaptation to dynamic link conditions, self-starting operation, multi-hop routing and loop freedom [16, 19].

5. Related Work

A comparison between routing in ad-hoc networks and in p2p networks is made in [19]. Besides presenting a taxonomy of both, this work proposes the joint use of these networks, aiming at a synergetic effect. This idea is put into practice in [14, 15], which present the 7DS (Seven Degrees of Separation) application, a network for data dissemination among hosts in an ad-hoc network.
Besides the implementation, these works present the effects (through simulation) of power conservation, radio coverage control and cooperation strategies among hosts on data dissemination. A platform for the development of peer-to-peer applications in small range ad-hoc networks, more precisely PANs (Personal Area Networks), is presented in [6]. A study on sensor networks regarding cooperation and network formation is made in [21]. The performance of several routing protocols in a p2p over ad-hoc network is evaluated in [13].

The works above are related to the world of p2p over ad-hoc networks, and they all lack a concern with p2p (re)configuration issues. Our work focuses exactly on this problem, analyzing its impact on the performance of both p2p and ad-hoc networks.

Some metrics for performance evaluation in a p2p network are described in [4], using Gnutella and Freenet as case studies and based on four criteria: efficiency, speed, worst-case performance and scalability. Gnutella was not considered scalable, since the required bandwidth increases linearly as the network grows. Conversely, Freenet scales logarithmically, using pathlength as the metric.

6. (Re)Configuration Algorithms

For (re)configuration of the network we proposed two types of algorithms: decentralized and hybrid. There are three different decentralized algorithms: a simple one that serves as a basis for comparison, the regular algorithm and the random algorithm. The hybrid type has only one representative, the hybrid algorithm.

In the description of all these algorithms, it will be said that the nodes are connected, trying to connect, maintaining a connection, etc. It is important to notice, however, that we are dealing with wireless networks and thus there are no real connections, e.g. a TCP connection, between nodes. It must be kept in mind that the so-called connections actually are references, that is, they represent the knowledge of the addresses of some reachable nodes.
Thus, a symmetrical connection is one in which a node A keeps a reference to node B while B also references A. Asymmetrical connections also exist and are used in one of the algorithms.

6.1. Decentralized Algorithms

Our system's model is based on the use of messages that are forwarded over many hops from one peer to the next in order to establish connections and to search for data. Despite sharing this basis, the three decentralized algorithms have distinct behaviors, as will be seen below. The simple algorithm is presented first. After that, a very important concept is briefly described: the small-world effect [8], which was the basis for the changes that turned the regular algorithm into the random algorithm. Finally, there is the description of these last two algorithms.

6.1.1 The Simple Algorithm

This algorithm was meant to represent a simple (re)configuration algorithm and therefore to serve as a basis for comparison. Its main characteristic, simplicity, implies easy implementation but partially ignores the dynamic nature of the network. This algorithm, shown in Figure 1, makes use of three constants named MAXNCONN, NHOPS and TIMER. The first represents the maximum number of connections per node. The second is the number of hops a message travels and the third stands for the time interval between two

attempts to establish connections. The algorithm works as described below.

Establishing connections
  while the node belongs to p2p network
    if number of connections < MAXNCONN
      try to establish new connections to nodes within NHOPS away
        up to the limit of MAXNCONN connections;
    wait TIMER before next try;

Maintaining connections
  while this connection exists
    send a ping to the connected node;
    wait some time for the pong;
    if the pong was received then
      wait some time before sending next ping;
    else close this connection;

Figure 1. The simple algorithm.

A node, when starting its participation in the p2p network, broadcasts a message to discover other nodes within NHOPS hops in its neighborhood. Every node that hears this message answers it. As soon as a response arrives, the node establishes a connection to the neighbor that sent it, up to the limit of MAXNCONN connections. In case the number of responses is smaller than MAXNCONN, and whenever it has fewer than MAXNCONN connections, the node keeps trying to create the remaining connections. Between trials, the node waits for a time interval TIMER in order to avoid traffic overload in the network. Once a reference (1) is created, its validity is frequently checked by sending pings. Whenever a node receives a ping it answers with a pong. The reception of a pong thus signals that the connection still exists, while its absence means the neighbor is no longer reachable and the connection is over.

6.1.2 The Small-World Model

In a regular graph, each of the n vertices is connected to its nearest k neighbors. In a random graph, by contrast, the connections are randomly established and k stands for the average number of edges per vertex. Thus, two neighbors of a node have a greater chance of being connected to each other in regular graphs, that is, the average clustering coefficient is much greater in regular graphs.
This coefficient is obtained as follows: let realconn be the number of existing connections between all the neighbors of a node (these neighbors are connected to this given node), and let possibleconn be the number of all connections that could exist between these neighbors. The clustering coefficient is given by realconn / possibleconn.

(1) Remember that the so-called connections actually are references.

Besides the clustering coefficient, regular and random graphs also have very distinct characteristic pathlengths. In large regular graphs, with n much larger than k and k much larger than 1, the pathlength is approximately $n/2k$. In large random graphs this value decreases substantially and is given by $\log n / \log k$ [4].

Interestingly, small changes in the connections of a regular graph are sufficient to achieve short global pathlengths, as in random graphs. Rewiring some connections from neighbors to randomly chosen vertices creates bridges between clusters that are a great distance apart. These bridges diminish the pathlength without any considerable change in the clustering coefficient. Graphs that have high clustering coefficients and, at the same time, short global pathlengths are called small-world graphs. Our (re)configuration algorithm presented next aims to construct the peer-to-peer network as a small-world graph.

Before presenting the regular and random algorithms, we list their variables and constants, most of which are present in both algorithms. There are three variables: nhops, randhops and timer. The first one represents the number of hops a message looking for a regular connection can travel. It is initialized with the value NHOPS_INITIAL, which is greater than 1, and has MAXNHOPS as an upper limit. The second one has a similar meaning but applies only to random connections; it does not need to be initialized. The third variable stands for the time interval a node waits between two attempts to establish connections.
It is initialized with TIMER_INITIAL and can increase up to MAXTIMER. Finally, there are two remaining constants not yet explained: MAXNCONN, which is the maximum number of connections per node, and MAXDIST, which is the maximum distance allowed between two connected neighbors (measured in number of hops).

6.1.3 The Regular Algorithm

Initially there is the ad-hoc network, over which some (or all) of its nodes want to build the p2p network, following the algorithm presented in Figure 2. As can be seen, each of the nodes broadcasts a message announcing that it is looking for connections. The messages carry the specific number of hops (nhops) they are expected to travel. When receiving this message, a node willing to connect starts a three-way handshake with the sender, aiming to establish a symmetrical connection. If, within that radius, fewer than MAXNCONN neighbors could be symmetrically connected to, the node will make another broadcast

with a higher number of hops (nhops + 2). Before the new broadcast, however, it waits for a time interval timer. As in the simple algorithm of Section 6.1.1, this interval is an attempt to avoid traffic overload. This mechanism is repeated until the maximum of MAXNCONN connections or the maximum of MAXNHOPS hops is reached, whichever occurs first. When nhops is set to 0, the node has tried all the possible values of nhops without connecting to MAXNCONN neighbors. In this case, the time interval timer is doubled before the next cycle of trials, in which nhops restarts with the NHOPS_INITIAL value. The variable timer has its upper limit given by MAXTIMER and its lower limit by TIMER_INITIAL.

Regular: Establishing connections
  while the node belongs to p2p network
    if number of connections < MAXNCONN then
      if nhops ≠ 0 then
        try to establish new and symmetrical connections to nodes
          within nhops away up to the limit of MAXNCONN connections;
        wait timer before next trial;
      else
        timer = min(timer × 2, MAXTIMER);
      nhops = (nhops + 2) mod (MAXNHOPS + 2);

Regular and random: Maintaining connections
  while this connection exists
    if it is the node that asked for the connection
      send a ping to the connected node;
      wait some time for the pong;
      if the pong was received then
        if this is a random connection then
          if the node is nearer than 2 × MAXDIST then
            wait some time before sending next ping;
          else close this connection;
        else
          if the node is nearer than MAXDIST then
            wait some time before sending next ping;
          else close this connection;
      else close this connection;
    else
      wait some time for the ping;
      if the ping was received then
        send a pong;
      else close this connection;

Figure 2. Regular and random algorithms.

Once a connection is successfully built, the node starts its maintenance as presented in the algorithm of Figure 2. The connection is frequently checked using pings. As we are dealing with symmetrical connections, only the vertex that started the process of establishing the connection sends pings.
The reception of pings is controlled by the other node with the use of a timer; whenever it receives a ping, it answers with a pong and reschedules the timer. In case a timeout occurs, it closes the connection. When receiving a pong, the pinging node knows its neighbor is still reachable, but this is not enough to maintain the connection. To remain connected, the distance between the nodes must be less than MAXNHOPS hops. In case the distance is bigger than that, the connection is closed. The same occurs in the absence of a pong.

This algorithm has four improvements compared to the simple algorithm of Section 6.1.1. First, the number of hops a message looking for connections may travel is increased gradually. Since this kind of message is sent by broadcast, controlling the number of hops means less traffic in the network. The traffic is also potentially diminished by the control of the distance between connected nodes, as the pings and pongs they exchange span a narrower area. This is the second improvement, which is complemented by the third one: the number of pings and pongs is cut in half because only one vertex checks the connection actively, that is, by sending pings. As we are dealing with wireless networks, which have bandwidth constraints, these three actions together may have a reasonable positive impact.

Last, but not least, there is the fourth improvement, related to timer, which was inspired by the dynamic nature of our network together with the traffic concern. As can be seen, the time interval between two broadcasts does not have a fixed value. Instead, it doubles every time a cycle of attempts to establish connections is over, diminishing the overall traffic. Besides, if it has been difficult to connect to other nodes, the network may change to a more favorable configuration while the node waits for a longer interval. Then, it may be easier to finally establish the desired connections.
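The progressive search radius and the back-off of the waiting interval can be sketched in a few lines of Python. This is a minimal illustration, not the simulated implementation: it assumes that every attempt fails to complete MAXNCONN connections, it uses the update rule nhops = (nhops + 2) mod (MAXNHOPS + 2) as read from the pseudocode, and the timer values are hypothetical, since the text does not state them. The hop limits follow the typical values of the simulation parameters.

```python
def broadcast_schedule(nhops_initial=2, max_nhops=6,
                       timer_initial=1.0, max_timer=8.0, attempts=8):
    """List the (nhops, wait) pairs of successive unsuccessful broadcasts.

    timer_initial and max_timer are hypothetical values chosen for illustration.
    """
    nhops, timer = nhops_initial, timer_initial
    schedule = []
    while len(schedule) < attempts:
        if nhops != 0:
            # Broadcast with radius nhops, then wait `timer` before the next trial.
            schedule.append((nhops, timer))
        else:
            # A whole cycle of radii was tried without success: back off.
            timer = min(timer * 2, max_timer)
        # Grow the radius by 2; wrapping to 0 marks the end of a cycle.
        nhops = (nhops + 2) % (max_nhops + 2)
    return schedule


for nhops, wait in broadcast_schedule():
    print(f"broadcast within {nhops} hops, then wait {wait}")
```

With these values the radius cycles through 2, 4 and 6 hops, and the wait doubles after each complete cycle until it reaches the upper limit. Resetting the timer after a successful connection is not modelled here.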
One detail not presented in the pseudo-code is that, whenever a connection is established, timer is reset to its initial value. This is done because the new connection may signal a better network configuration.

6.1.4 The Random Algorithm

Adopting the regular algorithm, each node would preferentially connect to its nearest neighbors. In a dense peer-to-peer network, the connections would thus be established within a low number of hops. This would probably lead to a network whose characteristics resemble the ones of regular graphs, mainly in the sense of long global pathlengths. Aiming to avoid this and to gain small-world characteristics, the regular algorithm underwent a small change,

leading to the random algorithm. The establishment of the first MAXNCONN − 1 connections follows exactly the same steps described for the regular algorithm. For this reason, they are called regular connections. The difference between the two algorithms lies in the last connection, as can be seen in the algorithm in Figure 3.

Establishing connections
  while the node belongs to p2p network
    if number of connections < MAXNCONN then
      if nhops ≠ 0 then
        try to establish new and symmetrical connections to nodes
          within nhops away up to the limit of MAXNCONN − 1 regular connections;
        wait timer before next trial;
      else
        timer = min(timer × 2, MAXTIMER);
      if a random connection is needed then
        set randhops to a randomly chosen value between nhops and 2 × MAXNHOPS;
        try to establish one new and symmetrical random connection to the
          farthest node possible within randhops away;
      nhops = (nhops + 2) mod (MAXNHOPS + 2);

Figure 3. Random algorithm.

As we have already seen, a few rewirings can turn a regular graph into a small-world graph. To promote this rewiring, the node does not try to establish its last connection within nhops hops. Instead, it chooses a random number randhops between nhops and 2 × MAXNHOPS. Then it broadcasts a message looking for connections to all nodes within randhops hops. It waits some time for responses to arrive, analyzes them, and continues the three-way handshake only with the most distant neighbor. A connection established this way is called a random connection and, whenever it goes down, it must be replaced by another random connection. The maintenance of the existing connections follows the scheme shown in Figure 2.

The final effect expected is that some of the overall connections will link distant peers and therefore act as bridges, making the pathlength shorter while keeping the clustering coefficient high. Then we would have achieved the small-world effect.

6.2. Hybrid Algorithm

Decentralized algorithms were designed to work on a homogeneous network, that is, a network that makes no distinction among its nodes.
In this case, all effort should be distributed evenly among the nodes in order to increase the network lifetime. However, ad-hoc networks will often be formed by different types of devices and, in this case, most of the effort should be made by the most powerful devices. The hybrid algorithm was developed for heterogeneous networks, that is, networks in which all nodes are differentiated by a qualifier. This qualifier can be related to any characteristic of the node, e.g. energy level or processor power.

The principle is to form subnets with one master and a limited number of slaves. The slaves can only communicate with their master, but masters can communicate with each other, resulting in the hybrid network. To achieve this configuration, the hybrid algorithm defines several states that peers can be in at any given time: master, slave, reserved and initial. Each peer starts in the initial state and later becomes a master or a slave. The reserved state is only used in transitions.

The algorithm is described in Figure 4. Initially, each peer tries to contact other peers that are within NHOPS_INITIAL (ad-hoc) hops. If there is no response, then the peer doubles the limit and tries new contacts. Eventually, if this limit exceeds MAXNHOPS, the peer declares itself a master and uses the regular algorithm to contact other masters. The type of message used in this first step is the capture message, with only one argument: the qualifier of the sender. If a peer in the initial state and with a smaller qualifier receives this message, it will try (through a three-way handshake) to become a slave of the sender. If the qualifier of the receiver is bigger and its state is either initial or master, then it responds with a capture message. This step guarantees that new peers will always get some feedback from their neighborhood, either by discovering the masters already there or other peers in the initial state.
The peers in the slave or reserved state do not communicate with anyone else, except their masters or master candidates, respectively.

The reconfiguration is done in two kinds of situation. The first one is when a master owns no slave after a period. This master could, potentially, be the slave of another peer. The second one is when a slave is too far away from its master. This peer should look for another master in its neighborhood. This self-reorganization ensures that the hierarchy remains balanced and that the elements of the clusters are not scattered over a wide area.

The maintenance phase is similar to that of the decentralized algorithms. A ping message is sent to all neighbors and, if a response message is not received after a while or the neighbor is too far, then the peer closes that connection and, if it is a slave, resets its state to initial. It then tries to contact other peers.
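The reaction of a peer to a capture message, as described above, can be summarized by a small decision function. The Python sketch below is only an illustration under stated assumptions: the names `State` and `on_capture` and the returned action labels are invented for this example, and the cases the text leaves open (equal qualifiers, or a master that receives a capture from a peer with a larger qualifier) are treated here as "ignore", which is an assumption rather than the authors' rule.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    RESERVED = "reserved"
    MASTER = "master"
    SLAVE = "slave"


def on_capture(receiver_state, receiver_qualifier, sender_qualifier):
    """Return the action a peer takes when it receives a capture message."""
    # Slaves and reserved peers only talk to their master or master candidate.
    if receiver_state in (State.SLAVE, State.RESERVED):
        return "ignore"
    # An initial peer with a smaller qualifier tries to become the sender's slave.
    if receiver_state is State.INITIAL and receiver_qualifier < sender_qualifier:
        return "start_handshake_as_slave"
    # An initial or master peer with a bigger qualifier answers with its own capture.
    if receiver_qualifier > sender_qualifier:
        return "reply_with_capture"
    # Remaining cases are not specified in the text; ignoring them is an assumption.
    return "ignore"


print(on_capture(State.INITIAL, 3, 7))  # start_handshake_as_slave
print(on_capture(State.MASTER, 9, 7))   # reply_with_capture
```

Keeping the decision in a pure function separates the protocol rule from timers and message transport, which makes the unspecified corner cases easy to spot and to change.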

Establishing connections
  while this node belongs to p2p network
    switch state
      case INITIAL:
        if nhops ≠ 0 then
          try to find a master and become its slave or
          try to find slaves and become their master;
        else
          change its state to MASTER;
        nhops = (nhops + 2) mod (MAXNHOPS + 2);
        break;
      case MASTER:
        use the regular algorithm to contact other masters;
        if nslaves < MAXNSLAVES then
          accept any incoming slave connection;
        if the node hasn't got any slaves during MAXTIMERMASTER then
          change its state to INITIAL;
        break;
      case SLAVE:
        just maintain the connection to the master;
        break;
    endswitch

Figure 4. Hybrid algorithm.

7. Simulation

We used the ns-2 simulator [12], a well known simulator for networking, which supports ad-hoc network models. We chose AODV as our routing protocol, which is already available in the ns-2 v.9 distribution. Its implementation, however, lacks support for broadcast messages. We added this feature to the existing code by including a controlled broadcast function in which each node has a cache to keep track of the broadcast messages received. This mechanism avoids forwarding the same message several times. Such an improvement is relevant since each message transmitted or received consumes energy, which is a scarce resource in a mobile ad-hoc network. The simulations were repeated 33 times.

7.1. Model

In this section, the p2p network model used in the simulations is discussed. First of all, it must be highlighted that our model was not developed for any specific application and is intended to work with generic algorithms for data search and exchange.

In the decentralized algorithms, all nodes are considered identical since they play the same role: they are servents. The hybrid algorithm, however, is slightly different since a node can be either a master or a slave.

It is important to restate that what we are calling connections are actually references. In other words, a node keeps a list of references to other nodes it believes to be its neighbors.
Connections can be either symmetrical or asymmetrical. The connections are created and maintained in order to provide the p2p application with means to perform data search and exchange.

7.2. Scenarios

We simulated ad-hoc networks with 50 and 150 nodes, distributed over an area of 100 m × 100 m and with a wireless range of 10 m. The p2p network is formed by 75% of all nodes, thus some nodes belong to the ad-hoc network only. For mobility, based on human walking, we used the random waypoint model [1] with a maximum speed of 1.0 m/s and a maximum pause time of 100 s. Each node alternates between moving and pause periods. Initially, all nodes were randomly positioned over the area, following a uniform distribution. Unless otherwise mentioned, the default values for the parameters are taken from Table 2. All scenarios were simulated for 3,600 seconds.

Parameter for simulation                 Value
transmission range                       10 m
number of distinct searchable files      20
frequency of the most popular file       40%
NHOPS_INITIAL                            2 ad-hoc hops
MAXNHOPS                                 6 ad-hoc hops
NHOPS (simple algorithm)                 6 ad-hoc hops
MAXDIST                                  6 ad-hoc hops
MAXNCONN                                 3
MAXNSLAVES                               3
TTL for queries                          6 p2p hops

Table 2. Parameters used and their typical values.

The query system used in the simulation is based on Gnutella. A node sends a query to all nodes in its list of neighbors. The query contains the identification of the requested file and of the requirer (the original source of the query). When a node receives a query, it processes and forwards the message even if it has the file. In order to control the message traffic, we stated the following three rules. First, each node forwards a query or responds to a query only once. Second, a node does not forward a query to the neighbor from which it was received. Third, a query is not forwarded to its original source. In case a node has the requested file, it sends a response directly to the requirer. After sending a query, the node waits for a response for 30

seconds. Then, the node waits for a random period between 15 to 45 seconds to send the next query. Different files are distributed in the network following a Zipf law [22] with maximum frequency MAXFREQ of ¼±. This means that the most popular file will be present in 40% of all nodes, the second most popular one in ¼± ¾ ¾¼±, the third in ¼±, and so on. 7.3. Metrics Average minimum distance 1.75 1.7 1.65 1.6 1.55 1.5 1.45 1.4 Average number of answers We have chosen a few metrics considered relevant to the performance of p2p networks [4] and ad-hoc networks. Number of hops: the minimum number of hops (p2p and ad-hoc) from the source to the peer holding the requested information. We evaluated the medium value for all simulated requests. Number of exchanged messages: the number of messages of each type, queries and pings, received by the nodes. We evaluated the medium number of messages received by each node. 7.4 Results This section contains the results considering the metrics described above. These results show the expected behavior of the four proposed algorithms. The graphics in Figures 5 and 6 show the average minimum distance to reach a node that has requested file and the average number of answers per file request. Those graphics were, respectively, for the scenario with 50 and 150 nodes, 75% of them belonging to the p2p network. Clearly, the number of answers decreases as the requested file becomes unpopular, reflecting the Zipf distribution of files. Despite some oscillations, the distance tends to increase. Average minimum distance 1.45 1.4 1.35 1.3 1.25 1.2 1.15 1.1 1 2 3 4 5 6 7 8 9 10 Files Figure 5. Distance to find the file and # of answers per file request (50 nodes, 75% p2p). Average number of answers 1.35 1.3 1 2 3 4 5 6 7 8 9 10 Files Figure 6. Distance to find the file and # of answers per file request (150 nodes, 75% p2p). Figures 7 and 8 show the medium number of connect messages received by each node. 
The nodes were ordered by the number of messages received. The graphs show that the Basic algorithm, which uses broadcasts indiscriminately, presents greater values for all nodes. The progressive connection method, used by the other three algorithms, makes a controlled use of broadcasts, which resulted in fewer connect messages per node. The graphs also show that the curve of the Random algorithm lies above those of the Regular and Hybrid algorithms due to the random connection establishment phase, in which broadcast messages are sent with higher TTL values.

Figure 7. Connect messages (50 nodes, 75% p2p).

Figure 8. Connect messages (150 nodes, 75% p2p).

Figures 9 and 10 show how the p2p configuration impacts the traffic volume each node receives. It is known that the traffic of ping messages in p2p networks is the highest one [17]. The best way to cope with the lack of resources in ad-hoc networks is to distribute the work among all nodes. If the network is assumed to be homogeneous, the more uniform the distribution, the better the performance and the longer the network will last. This is exactly what the Random and Regular algorithms do compared to the Basic algorithm. On the other hand, if the network is heterogeneous, we should assign a higher load to nodes with higher capacity. The Hybrid algorithm accomplishes this by putting a bigger burden on nodes with a high qualifier, which means that masters get more ping and query messages (Figures 11 and 12).

Figure 9. Pings (50 nodes, 75% p2p).

Figure 10. Pings (150 nodes, 75% p2p).

Figure 11. Queries (50 nodes, 75% p2p).

Figure 12. Queries (150 nodes, 75% p2p).

The three improved algorithms (Regular, Random and Hybrid) profited from the symmetrical connections: only one node sends pings to check the connection; the other endpoint answers with pongs and uses a timer to control the receipt of pings. This feature reduces the overall number of messages in the network. From these results we can infer that nodes communicating through the Basic algorithm will have to spend more battery to sustain the network. This is an undesirable situation, since energy is a scarce resource in mobile ad-hoc devices. The excessive consumption of battery may cause many nodes to go down, making it necessary to reorganize the network, which in turn causes the remaining nodes to spend even more energy.

The Random algorithm was designed based on the small-world concept. However, in the results presented it was not possible to detect any manifestation of the small-world characteristic. One possible reason is that the number of nodes should be much larger than the number of connections, as mentioned in Section 6.1.2, which was not the case. We intend to explore such a scenario in the future. Another explanation would be that, due to the dynamics of the network, the random connections go down before the nodes can benefit from them.

8. Conclusions and Future Work

In this paper we studied the (re)configuration of p2p networks over ad-hoc networks. This combination is a natural solution for making data available to users anytime and anywhere. In the p2p paradigm, peers can act both as client and server, which increases fault-tolerance and data availability. In ad-hoc wireless networks, each node is able to establish point-to-point communication with other nodes within its radio signal range, without the need for a fixed infrastructure. To promote an efficient sharing of information, we designed four algorithms that provide configuration, maintenance and reorganization of the p2p network over an ad-hoc network. To analyze the performance of the algorithms we performed simulations using ns-2. The results show that the Regular and Random algorithms had almost identical behavior despite the introduction of new features in the latter. They proved to be adequate for homogeneous networks. Focusing on heterogeneous scenarios, the Hybrid algorithm had a performance similar to that of the Regular and Random algorithms in this different type of network.
Finally, we showed that the Basic algorithm, based on a traditional fixed-network solution, presented the worst overall performance regarding the effort made by each node to sustain its participation in the network.

We will further investigate the performance and behavior of these algorithms under different scenarios. We are most interested in analyzing the effects of wireless coverage, node density, energy, mobility and the death/birth rate of nodes in the ad-hoc and p2p layers. We are also developing a theoretical study on how the connectivity of nodes influences our metrics and how small-world properties could be better used in these systems.

References

[1] T. Camp, J. Boleng, and V. Davies. A Survey of Mobility Models for Ad Hoc Network Research. Wireless Communications and Mobile Computing, 2(5):483-502, 2002.
[2] Freenet. http://freenetproject.org/.
[3] Gnutella. http://www.gnutella.wego.com/.
[4] T. Hong. Peer-to-Peer: Harnessing the Power of Disruptive Technologies, chapter 14 - Performance. O'Reilly and Associates, 2001.
[5] KaZaA. http://www.kazaa.com/.
[6] G. Kortuem, J. Schneider, and D. Preuitt. When Peer-to-Peer comes Face-to-Face: Collaborative Peer-to-Peer Computing in Mobile Ad hoc Networks. In Proceedings 2001 International Conference on Peer-to-Peer Computing, August 2001.
[7] R. Lienhart, M. Holliman, Y. Chen, I. Kozintsev, and M. Yeung. Improving Media Services on P2P Networks. IEEE Internet Computing, pages 73-77, Jan./Feb. 2002.
[8] S. Milgram. The Small-World Problem. Psychology Today, 1(1):60-67, 1967.
[9] N. Minar. Distributed Systems Topologies. In The O'Reilly P2P and Web Services Conf, 2001.
[10] Morpheus. http://www.morpheus.com/.
[11] Napster. http://www.napster.com/.
[12] NS-2. http://www.isi.edu/nsnam/ns.
[13] L. B. Oliveira, I. G. Siqueira, and A. A. Loureiro. Evaluation of Ad-hoc Routing Protocols under a Peer-to-Peer Application. In IEEE Wireless Communication and Networking Conference (to appear), 2003.
[14] M. Papadopouli and H. Schulzrinne. A Performance Analysis of 7DS: A Peer-to-Peer Data Dissemination and Prefetching Tool for Mobile Users. In Advances in Wired and Wireless Communications, March 2001.
[15] M. Papadopouli and H. Schulzrinne. Effects of Power Conservation, Wireless Coverage and Cooperation on Data Dissemination among Mobile Devices. In ACM SIGMOBILE Symposium on Mobile Ad Hoc Networking & Computing (MobiHoc) 2001, October 2001.
[16] C. Perkins, E. Royer, and S. Das. Ad-Hoc On-Demand Distance Vector (AODV) Routing. IETF Internet draft, draft-ietf-manet-aodv-11.txt, June 2002.
[17] M. Ripeanu, I. Foster, and A. Iamnitchi. Mapping the Gnutella Network. IEEE Internet Computing Journal, 6(1), 2002.
[18] E. M. Royer and C. K. Toh. A Review of Current Routing Protocols for Ad Hoc Mobile Wireless Networks. IEEE Personal Communications, 2:46-55, April 1999.
[19] R. Schollmeier and I. Gruber. Routing in Peer-to-peer and Mobile Ad Hoc Networks: A Comparison. In International Workshop on Peer-to-Peer Computing, May 2002.
[20] SETI@Home. http://setiathome.ssl.berkeley.edu/.
[21] L. Subramanian and R. Katz. An Architecture for Building Self-Configurable Systems. In IEEE/ACM Workshop on Mobile Ad Hoc Networking and Computing, August 2000.
[22] G. K. Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.