Placing Files on the Nodes of Peer-to-Peer Systems



Placing Files on the Nodes of Peer-to-Peer Systems

Dissertation submitted in fulfilment of the requirements for the academic degree DOKTOR-INGENIEURIN at the Faculty of Mathematics and Computer Science of the FernUniversität in Hagen

by Sunantha Sodsee, Nonthaburi, Thailand

Hagen, 2011

Supervisor: Prof. Dr.-Ing. habil. Herwig Unger
Reviewers: Prof. Dr.-Ing. habil. Herwig Unger, FernUniversität in Hagen; Asst. Prof. Dr. Phayung Meesad, King Mongkut's University of Technology North Bangkok, Bangkok

Acknowledgements

There are many people without whom I would never have been able to complete this thesis. First of all, I have to say thank you to my father and mother, Phayung and Chalai Sodsee, for their encouragement and support. My work could not have been finished without the inspiration they gave me. I highly appreciate the financial support of a DAAD Matching Funds grant for three years, which gave me the possibility to study and work in Germany. Moreover, I am thankful to both FernUniversität in Hagen and King Mongkut's University of Technology North Bangkok, since their cooperation gave me the chance to write my thesis as the first binational PhD student of both institutions. I am grateful to my supervisors, Prof. Dr.-Ing. habil. Herwig Unger and Prof. Dr. Phayung Meesad, for their permanent support and help throughout the time of my research and during the writing of this thesis. Furthermore, I received many hints, suggestions and much support from Prof. Dr. Dr. Wolfgang A. Halang, which enhanced my academic research work and also my private life. Last but not least, I could count at all times on many friends and colleagues who accompanied me and gave me strength through sometimes difficult times: in particular my landlord and landlady, Hans and Anna Siering, who taught me to speak German and offered me a good second home as well as a pleasant life in Germany.


Contents

Abstract

1 Introduction
    Towards an Internet-based Data Management
    Decentralisation of Services
    Contribution of Thesis
    Outline of Thesis
2 State of the Art
    Content Delivery Networks
    P2P Systems for Content Distribution
    Searching for Information
    Evaluation of Search Results
        Hubs and Authorities
        PageRank
        PageReputation
    Coordination and Self-organisation
    P2PNetSim
3 On Random Walkers
    Random Walkers: Definition and Related Work
    Population of Random Walkers
    Ants
    PageRank Calculation with Random Walks
        General Principles
        Estimation of Network Size
        Convergence in Real Systems
        Simulation Results
4 A Generalised Node Evaluation
    Influence of Network Parameters
    NodeRank: an Extension of PageRank
    Simulation of NodeRank and its Properties

5 Evaluation of User Activities
    Characterisation of User Activities
    Measuring and Propagating Node Activities
        General Remarks
        Representation of Peer Activities
        Identifying the Utilisation of Network Areas
    Activity-based Clustering of Peers
        Planar or Plane-embedded Environments
        Generalisation of Clustering Method
        Experimental Results
6 Application to Video-on-Demand Systems
    Introduction
    Probability Functions for Picking and Depositing Files
    Calculation of Parameters
    Performance Evaluation
7 Conclusion and Future Work
    Contribution and Review of Results
    Conclusion and Future Work

Bibliography

Abstract

Content Delivery Networks (CDNs) are tools to distribute a fast growing amount of information to users. Ensuring compliance with the required Quality of Service (QoS) parameters becomes a key problem for providers. To cope with insufficient bandwidth and capabilities of single-server architectures, more and more distributed and even (hybrid) Peer-to-Peer (P2P) system architectures are used for data servicing. A P2P system consists of a set of fully identical, uniform node computers, called peers. Since a peer can be both client and server, any peer can also act as a server hosting a set of information. The tedious task is now to decide at which places which contents shall be located, to guarantee short response times for all user requests. This multicriterial decision cannot be made by a single human administrator or a group of them, since they are not able to oversee or control complex network systems in their entirety.

In this thesis, a new parameter NodeRank is defined and used as a single value to compare all capabilities of a node relevant in providing service to all peers in a P2P system. Compared to PageRank, known from the Internet search engine Google, this parameter does not only evaluate the topological position of a node, but also its network and hardware resources as well as the impact of user activities on it. To handle user activities, special locally working recognition and propagation algorithms are devised. Random walking of a population of walkers appears to be an appropriate tool to compute NodeRank values in the required, fully decentralised manner. The methods proposed use the effect that all considered factors influence the random walkers' transition probabilities at each node. To adjust their impact, the parameters are combined in a linear or exponential way. After all peers have been assigned to a corresponding service node each, a weighted short-distance tree of all links leaving the service nodes is established as a side effect.
It is shown that such trees are useful to improve message routing, and provide the information needed to move the service nodes' functions step by step to well-balanced, optimised positions. Last but not least, a new algorithm, which controls random walkers to pick up, move and drop files at positions guaranteeing optimal response times, is able to manage all file operations in a decentralised system. The implementation of the results within a Video-on-Demand environment leads to the development of a concept for the first fully adaptive CDN, which can automatically fulfill most administration tasks. In this approach, random walkers constitute the universal administration tool substituting for humans. Simulations of all suggested methods in different

scenarios underline its efficiency as well as the fact that up to 80% of all human management effort for CDNs can be saved by it.

1 Introduction

1.1 Towards an Internet-based Data Management

Before the broad introduction of the Internet it was quite difficult to find out in a short time whether any information about a given problem exists, and/or to obtain all available information in a reasonable time. With the World Wide Web (WWW) this situation changed drastically in the late 1990s. Voluminous libraries with their huge amount of books and catalogues could be stored on hard disks or other storage devices within comparably small spaces, and accessed by users from all over the world through a set of transparent protocols within a few milliseconds. Nowadays, the users face exactly the opposite problem: about any topic a huge amount of information is available, and it becomes hard to evaluate whether this information is from reliable sources, valid and correct, and can be used for further work. In addition, no user is able anymore to oversee the almost infinite number of data sources. Consequently, search engines have been developed to filter out requested information by a set of keywords entered by the user. Naturally, the selection of such search queries is a key issue for the quality of information retrieved and significantly influences the selection of documents found. Another problem is that any new developments or events must be described with well-established keywords, since newly emerging terms are not automatically propagated. Nevertheless, for most topics even extensive experience is not enough to limit the number of search results to a few, manageable ones. Consequently, scientists and practitioners have dealt with the problem of ranking search results, which might today be strongly influenced by political and especially economic interests. Despite all technical and technological progress, the download of documents (as well as the use of any other services in general) may still take significant time. Some servers may even be overloaded and break down during peak access times.
One reason for this effect is the irregular and more or less uncoordinated development of the Internet. In addition, the user behaviour and, therefore, the traffic to be expected is hard or impossible to predict, and the interests of users are changing in a similarly unpredictable manner. Consequently, to the single user the whole network appears as a huge, chaotic, almost incomprehensible system, in which data or services are considered unavailable if
- data or services are not delivered fast enough,

- data are not updated within suitable intervals and, consequently, appear out of date,
- users are frightened by huge numbers of available alternatives,
- servers are not known or only accessible via complex management procedures, or
- users cannot understand or cope with the access methods.

Also, due to technical reasons, there always exists a gap between
- the bandwidth needed and the bandwidth existing to bring contents to the users within a reasonable response time,
- the place where content is stored and the place where it is needed by the users, and
- the resources needed and the available software that allows users to access them in a transparent manner.

Figure 1.1: Mutual influences in a service network

As shown in Figure 1.1, there is a permanent, mutual influence between content, users and user activities, and the network with its parameters and configuration. Modern computers and computer networks shall be able to learn from the behaviour of the users how to adapt their configuration and services to the users' needs and their environments. Content Delivery Networks (CDN) are one approach to deliver information to users with a respected Quality of Service (QoS). In this context, QoS mostly means to limit packet transfer times, latency and jitter as well as error rates to a still tolerable or minimal amount (mostly following some real-time conditions). Video-on-Demand (VoD) systems are just one example, where a huge amount of data must be transferred in real time to a high number of users in front of their monitors.

Fully centralised, client-server-based and mostly manually managed architectures can normally react only slowly to changes in their environments and the behaviour of their users. Any breakdown of routers or computer networks as well as otherwise overloaded network connections may significantly influence the QoS of such networks and, consequently, limit their usability.

1.2 Decentralisation of Services

Distribution of data or data duplication, i.e. the use of multiple servers mirroring the contents of one machine, farm or cloud of servers, is one way to overcome the problems described above. In these cases, information from one or more machines can be replicated and/or moved to any other one. This requires that each machine in a server farm or cloud knows at least one other participating machine and can communicate with it. The user's clients are either connected with one or more of these servers or are assigned to the closest or fastest server by a broker (see Figure 1.2).

Figure 1.2: Typical architecture of a broker (a) and KaZaA (b) network

Systems with such architectures are close to or already are so-called Peer-to-Peer (P2P) systems, which consist of numbers of equal machines, called peers. On each peer a respective software is installed, which enables communication between the participating peers. A shared communication protocol ensures cooperation and coordination of activities among the peers, which are also called servents (derived from the concatenation of parts of the words server

and client). A peer is part of such a community if it knows at least one other peer of the community. It is easy to understand that every (new) peer has to contribute to the correct work of the entire P2P system it is connected to by executing some basic services, and to contribute with its resources to the success of the whole system [55]. Each peer can fulfill client tasks when it requests data, or server tasks in case it offers data like a server. As known from the literature (e.g. [72, 76, 85]), such systems are robust, fault-tolerant, flexible and may easily adapt to changing needs and user requirements. In addition, a good scalability of these systems has been observed [55]. Most industrial IT systems can be characterised as P2P systems [20, 38, 110], since
- due to the globalisation process, companies are either geographically distributed and also have a distributed (and often independent) IT management, or have been built from a set of stakeholders that were separate before,
- changes in system structure are made locally depending on the local administrators' decisions,
- nobody (i.e. no user or administrator) can oversee such systems in their entirety, and
- nobody can control entire systems and force or initiate changes to optimise their (global) behaviour.

P2P systems might become the most advantageous IT systems if automatic reconfigurations can be made by fast and automatically performed content relocation, replication and updates of their logical structures. This requires that a system is able to evaluate its current state by changes in a set of parameters, which can be determined in an automatic and self-adapting manner and which result in respective improvements of the entire system. Therefore, respective parameters and metrics must be found, and it must be defined how they can be determined and applied for system optimisation in a fully decentralised manner.
1.3 Contribution of Thesis Several indicators like Hubs and Authorities [47], PageRank [64] or PageReputation [69] have been developed to evaluate and rank the search results provided by major, centralised search engines. The determination of all these indicators requires global knowledge about the topology of the so-called web graph [16] (or at least bigger parts of it), and powerful computing facilities for their calculation. So far, a big disadvantage of these methods is

that they only consider content-oriented aspects, and do not combine them with capacities to access these data or other user-relevant information. Consequently, the first major goal of this thesis is to show that ranking parameters may be obtained in a fully decentralised manner by algorithms based on so-called random walkers with satisfactory real-time performance. It is clear that in large networks populations of random walkers must be used in order to increase the computing speed by parallelism. Different methods for adequate coordination and synchronisation will be derived and discussed, too.

The second objective is to show that aspects of content, network or users need not only be considered in an isolated manner, but can be computed and combined into a single parameter or evaluation set. It will be shown that in most cases weighted linear combinations of normalised parameters can be used for this purpose. In this manner, simplicity and fast computability can be ensured, and extensions can be added easily. Last but not least, simple, weighted expressions allow fast adaptations by learning algorithms.

To facilitate understanding, the principal mechanisms are first introduced and presented in a rectangular grid environment. These planar graphs make it easy to visualise and understand the functionality of the derived methods. Later, most of these methods will be generalised for so-called small-world communities. For the corresponding simulations the tool P2PNetSim [22] is employed, which has been developed for the local investigation of algorithms working in P2P systems. To validate the concepts developed, the application of the newly developed methods in a Video-on-Demand system shall be discussed.

1.4 Outline of Thesis

In the subsequent Chapter 2, the state of the art will be discussed. First, decentralised (P2P) systems are introduced and, in particular, existing methods to search for data and services in them are outlined.
Then, qualitative and quantitative methods for the evaluation of search results are presented. An overview of Content Delivery Networks (CDN) and the file management in them completes this part of the thesis. In Chapter 3, the introduction of the methods proposed will be prepared by a short review of random walking techniques and populations of random walkers. Also, some new, special adaptations to random walkers are derived and evaluated by simulations. It is shown that the well known PageRank may also be used to determine the importance of a peer for the communication of a whole community/system. For this purpose, the PageRank value may also be calculated by a population of random walkers.
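The idea of estimating PageRank by a population of random walkers, as announced for Chapter 3, can be sketched as follows. The graph, the walker count and the damping factor are illustrative assumptions, not the thesis's exact algorithm:

```python
import random

def walker_pagerank(adj, n_walkers=50, steps=500, damping=0.85, seed=1):
    """Estimate PageRank-like values from the visit counts of a
    population of random walkers: with probability `damping` a walker
    follows a random outgoing link, otherwise it jumps to a random
    node (illustrative sketch only)."""
    rng = random.Random(seed)
    nodes = list(adj)
    visits = dict.fromkeys(nodes, 0)
    for _ in range(n_walkers):
        v = rng.choice(nodes)
        for _ in range(steps):
            if adj[v] and rng.random() < damping:
                v = rng.choice(adj[v])   # follow a link
            else:
                v = rng.choice(nodes)    # random restart
            visits[v] += 1
    total = sum(visits.values())
    return {v: c / total for v, c in visits.items()}

# 'c' is linked to by three of the four nodes and should rank
# clearly above 'b' and 'd'.
adj = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a'], 'd': ['c']}
ranks = walker_pagerank(adj)
```

The visit frequencies approximate the stationary distribution of the walk, which for this transition rule is the PageRank vector; more walkers or steps reduce the estimation variance, and the walkers can run on different peers in parallel.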

Chapters 4 and 5 constitute the main part of the thesis. In these chapters, the main methodology is presented of how a rating of peers depending on content, user activities and network parameters can be obtained. First, the well-known PageRank algorithm is extended by network (exemplarily bandwidth-related) considerations to obtain a more general parameter, called NodeRank, for this purpose. Later, user activities are taken into account and included in the methodology described. All methods are evaluated by sets of simulations. Since all methods work in fully decentralised environments, the required normalisations cannot be carried out with global knowledge; several considerations on these problems will complete these two chapters.

Chapter 6 is dedicated to the application of the newly introduced methods. To this end, CDNs and especially VoD systems are selected. The derived peer comparison methods will be used to place files in a performance-optimised manner by a random walk-based drag-and-drop mechanism, which is then applied to VoD systems.

A summary combined with a description of how to use the results achieved in other research projects is given in Chapter 7. An outlook on future, ongoing research will finish the thesis. Most results presented in this thesis have already been published in scientific journals [83] and proceedings of international conferences [82, 84]. The main contributions of these publications (i.e. ideas and algorithms) have been found and elaborated by the author of this thesis.
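The weighted linear combination of normalised parameters used to rate peers can be sketched minimally as follows. The parameter names and weights are illustrative assumptions:

```python
def normalise(values):
    """Scale raw measurements linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def node_scores(bandwidth, capacity, activity, weights=(0.5, 0.3, 0.2)):
    """Weighted linear combination of normalised per-node parameters,
    yielding one comparable rating value per node."""
    columns = [normalise(bandwidth), normalise(capacity), normalise(activity)]
    return [sum(w * col[i] for w, col in zip(weights, columns))
            for i in range(len(bandwidth))]

# Three nodes with raw bandwidth (Mbit/s), storage capacity (GB) and
# observed user activity (requests/s); all values are made up.
scores = node_scores([10, 100, 55], [2, 4, 8], [5, 1, 3])
best = scores.index(max(scores))   # best == 2
```

Because each factor is normalised before weighting, the weights directly express the relative importance of the factors and can later be adapted by learning algorithms, as stated above.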

2 State of the Art

2.1 Content Delivery Networks

As mentioned above, with the tremendous growth of the Internet, Content Delivery Networks (CDNs) [65, 101], such as Akamai (see [87] and Figure 2.1), have to support high-performance delivery of content to requesting users. A CDN is a technology implemented as a virtual network to distribute contents and resources. It is formed by server groups at different locations, regularly distributes content to the clients and handles the traffic according to the demand for contents. Finally, user requests will be forwarded to servers in such a way that users can access information from the nearest server at the fastest speed. Its contents can be of various kinds of media such as audio, video, documents, images and web pages. One of the most challenging applications are Video-on-Demand services such as YouTube [33] because of the quality of service requirements to be satisfied.

Figure 2.1: Main functionality of the Akamai system

In fact, CDNs are expensive system architectures, because network connections with high bandwidth and a number of surrogate servers must be provided in order to ensure high video quality. Also, bandwidth, storage capacity and surrogate server structure must be adapted

to a growing user population [54]. As most CDNs are (still) mainly based on the client-server model, they can face bottleneck problems, and the servers represent single points of failure and sensitive points for attacks. It has been known for a long time that popular web services often suffer from congestion due to large numbers of requests issued to them. If a particular server is working at full capacity and its traffic is increasing, then the web site holding the content becomes temporarily unavailable. Secondly, nodes of most CDNs are located at fixed locations in the Internet [90]. Thus, they do not address the need of individual users to distribute their own content. In contrast, P2P networks are dynamic networks in which nodes can be added or removed at any time. They also benefit from users that share content among each other (peers) on the Internet by using their own resources.

The reliability and performance of traditional CDNs are affected by the distributed content locations, caching strategies, routing mechanisms and by data replication [79]. In addition, data consistency is also important for their performance. As clients assume that they obtain content directly from the original servers, they expect to receive current content. Consequently, supporting their reliability and performance requires high maintenance costs. Last but not least, uploading contents to and updating them on a centralised server may be exhausting and time-consuming work for all system users. Finally, the centralisation of data may cause tremendous security problems: not only can a central server be attacked as a single point of failure, but the large amount of sensitive data available may also attract hackers. To deal with the problems of fixed-location machines and the bottleneck of CDNs, server farms, server clouds and (cluster-based) hybrid P2P systems have been proposed for use in CDNs.
In essence, they are all combinations of fully centralised and pure P2P systems. Clustering is often used to keep similar data close together, or to store data close to a user group with the respective demands for them. Nevertheless, long-distance access is still possible using respective connections between the servers. The concept allows for fast access to information when searching and avoids the bottleneck problem. Last but not least, it already exhibits basic features of the small-world concept [46, 91] to be discussed later.

The most popular example of hybrid P2P systems is the file sharing system KaZaA [34]. It includes features from both the centralised server model and the P2P model. To cluster nodes, certain criteria are used. Nodes with high storage and computing capacities are selected as supernodes. The normal nodes (clients) are connected to the supernodes, which communicate with each other via inter-cluster networks; clients within the same cluster are thus connected to a central node. The supernodes carry out query routing, indexing and data search on behalf of the less powerful nodes. Hybrid P2P systems provide

better scalability than centralised systems, and show lower transmission latency (i.e. shorter network paths) than unstructured P2P systems. Last but not least, KaZaA provides a new protocol generation for P2P systems, which is able to restart a broken connection at the point of interruption, either from the previously used or from an alternative server.

Figure 2.2: Application of a VoD system in modern airplanes

For a long time, many researchers have been proposing different techniques to increase the QoS in existing content distribution systems. These are:
- the surrogate server concept: providing content stored in several servers located near users [66],
- content replication and distribution: reducing congestion and improving query response times by selecting frequently used machines and routing hubs as well as the computers of frequent requesters as replica nodes, and dynamically adapting to nonuniform and time-varying content popularity and node interest [80],
- partial indexing search: building a partial index of shared content in order to improve the success rate and search speed in locating content by maintaining two types of information: the top interests of nodes and unpopular content [111],
- optimal path selection: a new routing algorithm based on the concurrency in Petri nets and the techniques of ant algorithms; it can achieve high routing efficiency and reliability with hardware costs far less than those of traditional CDNs [52],
- reduction of the server's load: analysing user activities to reduce the server load; when a live video is watched by users, these users can help each other to reduce the load on the server by providing the content themselves; consequently, the QoS is increased [40].

VoD services, which are of increasing importance for certain businesses such as airlines, as shown in Figure 2.2, and in education, are one timely example of CDNs. Their typical structure is shown in Figure 2.3. Later in this thesis, they will be used to present solutions to find suitable servers for content placement. As P2P systems are often employed in CDNs, and because they have to handle extremely large amounts of data, they should make use of these techniques to enhance content distribution.

Figure 2.3: Typical architecture of Video-on-Demand systems as used, e.g., for

This is especially true, since VoD services must additionally fulfill low-latency constraints [54, 107], and allow random frame seeking and access to provide a user experience at the same level of quality as known from local file playback. Due to their inherent scalability, P2P-based approaches can overcome the disadvantages of client-server-based architectures, since each peer can act as streaming client and server at the same time. Cluster-based hybrid P2P systems are considered as solutions which combine the advantages of P2P technologies and client-server models [108]. Video files should be placed on:
1. nodes with a central position,
2. nodes with high-speed and low-latency network connections, which support the above-mentioned QoS requirements,
3. nodes which are close to those users who frequently access files.

In order to solve these placement problems, new and innovative algorithms are needed, preferably running in a decentralised, flexible environment.
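The three placement criteria can be combined into a single score per candidate node, in the spirit of the decentralised rating methods developed later. The field names, weights and the latency model below are assumptions for illustration only:

```python
def placement_score(node, w_central=0.4, w_net=0.4, w_user=0.2):
    """Higher is better: topological centrality (criterion 1),
    fast/low-latency connectivity (criterion 2) and proximity to
    frequent requesters (criterion 3)."""
    net = node["bandwidth"] / (1.0 + node["latency_ms"])
    return (w_central * node["centrality"]
            + w_net * net
            + w_user * node["request_rate"])

# Made-up candidates: n1 is central but slow, n2 is fast and close to users.
candidates = [
    {"id": "n1", "centrality": 0.9, "bandwidth": 0.5,
     "latency_ms": 20, "request_rate": 0.1},
    {"id": "n2", "centrality": 0.4, "bandwidth": 0.9,
     "latency_ms": 2, "request_rate": 0.8},
]
best = max(candidates, key=placement_score)
```

Each node can evaluate such a score from locally available measurements, which is exactly what makes a decentralised placement algorithm feasible.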

2.2 P2P Systems for Content Distribution

Peer-to-peer (P2P) systems, as already mentioned in the introduction, have been successful and popular for large-scale content distribution, especially in the area of music files and videos. Their service costs are lower than those of CDNs and they can overcome some problems of CDNs. To increase Quality of Service (QoS) in content distribution systems, including response time, latency, throughput and accessibility, the advantages of client-server models and pure P2P systems are combined in so-called hybrid P2P systems [108]. They provide better scalability than centralised systems and show lower transmission latency than unstructured P2P systems. Moreover, their infrastructures are also similar to CDNs in that there are network clusters containing supernodes providing services to their clients. Consequently, the supernodes act quite similarly to surrogate servers in CDNs. Owing to the benefits of hybrid P2P networks, we focus here on content distribution services based on hybrid P2P networks. In content placement, three factors, viz. content, network parameters and user activities, influence the performance of P2P networks. Both content and network parameters have been studied to distribute content: [80] and [111] focused on the aspects of distributing content replicas and of content popularity, respectively, whereas [83] paid attention to network parameters such as the bandwidth of communication links to identify suitable locations for storing distributed content.

Napster (see [35] and Figure 2.4) is a pioneer of P2P content distribution systems and is often referred to as a P2P file sharing system. Its goal is to facilitate the exchange of files among a large group of independent users connected through the Internet. There are central servers to maintain an index of the files that are shared by the peers connected to one of the servers.
Peers' queries for files are sent directly to a connected server. That server then returns a list of matching files and locations, and the receiving peers exchange the files directly. A few years later, FreeNet (see [21] and Figure 2.5) was implemented. It is also a well-known distributed content storage and retrieval system based on the P2P paradigm. It conducts searches sequentially along random paths. Contents being forwarded are propagated from node to node and replicated on each node along the path. New links are established between the nodes participating in a request for a content and the chain of nodes visited up to the eventual source of the data. In addition, location-independent keys are used to store and retrieve contents. FreeNet uses neither broadcast searches nor centralised location indices, but searches serially. FreeNet has some significant advantages compared to all other P2P file sharing systems:
1. The participating peers cannot know what kind of information they store, since only hashes are used for file identification. Therefore, FreeNet is legally less problematic than any other P2P system.

Figure 2.4: The classical Napster architecture

2. Content is replicated automatically depending on the demand for it, i.e. content of high interest exists more often than unwanted spam, and freeriders automatically obtain slower access to resources.
3. Also, the system structure will automatically be adapted depending on the user and download activities.

Figure 2.6 summarises the role of P2P systems in a contemporary system hierarchy. All network services start on the hardware and TCP/IP levels, which allow the routing of any data packet from one machine to another. Unfortunately, there are no links between network addresses and hosted content per se. Only on the application level can these links be created, e.g. using P2P communication protocols, to overcome this problem. By using links or neighbourhood information, any two documents or nodes with similar or related content may be connected easily. From the system point of view, they are then considered as close to each other. Even when similar information may now be found in a direct neighbourhood, there is no method to locate arbitrary content in the system within a relatively short time. Therefore, most systems use broadcast-based requests to find suitable information, flooding the network with huge amounts of messages. Researchers have been addressing this problem for several years already and try to improve search in P2P systems. They mostly add an overlay layer on top of the anarchically grown P2P system level. Normally, for this purpose regular structures are built artificially, which allow standard search mechanisms to be applied or support the needed management in another manner.
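FreeNet's serial search with replication along the request path, as described above, can be sketched as follows. The uniform neighbour choice is a simplification (real FreeNet routes by key distance):

```python
import random

def serial_search(adj, store, start, key, ttl=20, rng=None):
    """Follow a single random path until the key is found or the
    time-to-live expires; on success, replicate the content on every
    node of the path (simplified FreeNet-style behaviour)."""
    rng = rng or random.Random(0)
    path, v = [], start
    for _ in range(ttl):
        path.append(v)
        if key in store[v]:
            content = store[v][key]
            for node in path:            # replicate along the request chain
                store[node][key] = content
            return content
        if not adj[v]:
            break                        # dead end: search fails
        v = rng.choice(adj[v])
    return None

# A chain of five nodes; only node 3 holds the file initially.
adj = {0: [1], 1: [2], 2: [3], 3: [4], 4: []}
store = {n: {} for n in adj}
store[3]["movie"] = "payload"
hit = serial_search(adj, store, start=0, key="movie")
```

This reproduces the effect behind FreeNet's topology evolution: content accumulates along frequently used request paths, while nodes never lying on such a path store nothing.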

Figure 2.5: File replication and topology evolution in FreeNet

The main results achieved to improve particularly the search for data shall be shortly reviewed in the next section.

2.3 Searching for Information

In almost all cases, search subjects are described by existing keywords and ontologies [74]. Hereby, the main disadvantage is that new search subjects must be described by known keywords, too. Only a few approaches like the FXResearcher [48] allow analysing an entire document's text in order to find similarities by comparing semantically related items. As discussed in the previous section, flooding [75] a network with messages is a fast and simple, but relatively expensive procedure to spread information or search requests in a system. Thus, usual P2P systems may generate approximately 60% of the traffic in several

Figure 2.6: Structural levels in a computer architecture

network parts, and many system administrators have banned P2P applications from their systems so far. Due to this fact, which is supported by many lawsuits following copyright fraud by file sharing systems using the P2P paradigm, the term P2P system became a synonym for almost criminal operating principles. Differing from the flooding approach, serial search as in FreeNet [21] may be more economical, but eventually requires a lot of time until a satisfying result is found. In [21] the case is considered that a set of serial search chains working in parallel explores a distributed network. Of course, a speed-up may be achieved this way, but it is still limited by the number of serial searches and not big enough for huge networks.

Currently, P2P systems usually employ search techniques based on Distributed Hash Tables (DHT) [8] as a content management tool efficient in terms of network bandwidth. However, DHT causes considerable overhead with respect to indexing files. Although DHT does not adapt to highly dynamic networks and dynamic content stored in nodes, its main working principle is quite interesting. As shown in Figure 2.6, DHT generates other, new structural layers on top of P2P overlay networks in order to manage key-location pairs (see Figure 2.7). For this, mainly ring-like structures as in Chord [85] or more complex tree-like structures as in Pastry [76] or Kademlia [58] are used.

Figure 2.7: Visualisation of typical P2P networks [100]

Table 2.1: Look-up performance and state space sizes of DHT algorithms with N nodes and dimension d

    Algorithm    Look-up        State
    CAN (grid)   O(dN^(1/d))    O(d)
    Chord        O(log N)       O(log N)
    Kademlia     O(log N)       O(log N)
    Pastry       O(log N)       O(log N)
    Tapestry     O(log N)       O(log N)

Content Addressable Networks (CAN) [72] are one of the most advanced DHT systems, since they may use the successively divided, entire two-dimensional x-y coordinate space to manage such key-node pairs. The working principle is always the same: SHA-1 hashes of key-node pair values are located at a well-defined place in an expandable but fixed and sorted structure, which allows the fast application of standardised search functions for their location. As Table 2.1 shows, for each search procedure cost and overhead can be reduced significantly by the application of DHTs. Nevertheless, DHT-based methods show bad performance in case peers leave a system without proper log-out notification. Also, peers in the system are forced to store information on content that they can neither influence nor understand. Consequently, advanced methods have been investigated which are not based on hash tables. Some of these methods are based on so-called random walks and are discussed in more detail in Section 3.1. Most other approaches like [53, 103] try to set up regular structures on top of an anarchically grown, unstructured P2P network. Naturally, as for DHT the respective structure building

is carried out in a fully decentralised manner, i.e. the locally working algorithms only use data which are either available on the nodes considered or which can be obtained directly from their neighbours by simple, direct communication. One of the first of these approaches was described in [99], which showed the possibility to build an n-dimensional hypercube on top of a P2P system by an algorithm working locally on each peer. Delaunay triangles [53], hypercast [103] and Resilient Overlay Networks (RON) [4] are further examples of such approaches. The resulting systems benefit from two facts: the naming of the nodes in these structures allows routing or data location information to be calculated separately on each node from local information, and the regular structure allows for simple embedding into any kind of coordinate system and, therefore, for easy determination of the direction from the source to the destination of a search or routing procedure.

Figure 2.8: An idea for a future Internet operating system using grids as basis for virtual maps

Rectangular, two- and three-dimensional grids play a special role among the network structures (see also Figure 2.8), since they occur quite often in daily human life, e.g. as the basis for maps. A first attempt at a decentralised, map-based navigation approach was made in [24]. The main goal of this work was to develop an interface allowing intuitive user navigation in the Internet by ordinary maps, thereby obtaining the same advantage as the use of desktop symbols in a window environment. As [12, 13, 88] have pointed out, grids benefit from the absolute or relative coordinate system that they always (at least implicitly) define. If coordinates are chosen depending on information and/or content [22], similar information may be located close to each other

(as in standard P2P systems). Given any information or node ID, however, there is always a direction in which to reach this point from the current position. This is a big advantage of grids and can be used for many practical applications [13, 88]. Besides that, the connectivity and, therefore, the number of different paths between any two nodes is rather high in a grid.

Only a few more innovative search methods are addressed in the literature which can avoid the disadvantages cited above without building another layer of overlay structures. These are, for instance:

- Content-based topology evolution [77], whereby the structure of an existing, unstructured and anarchically grown P2P network/community is permanently changed and updated such that, for a given percentage p, peers try to keep the peers with the most similar contents in their neighbourhoods;

- Thermofield-based search [51, 97], whereby the analogon of a hot temperature spot is used to model peers with a huge amount of changes or new data, and a (scalar) thermofield is used to guide a searcher to the respective local maxima;

- Collaborative filtering [94, 96-98], whereby the search activities of a group or community of users are coordinated by the use of buffers, labels or structures in order to
  1. prevent some locations from being searched several times,
  2. keep frequently used search results accessible for several users, and
  3. establish and store user evaluations, and thus build a group consensus / memory / conscience and establish work division and cooperation.

Although structures, cooperation and coordination can speed up search procedures, they cannot prevent a huge number of suitable results from being found.
Therefore, it is necessary to rank the search results by a metric combining

- a measure of how well the results found match the given keywords and/or the wanted contents,
- an availability estimation related to the expected availability and network speed of the location found, and
- an evaluation by other users and their frequency of access to this information, which also gives hints about the trustworthiness and correctness of the search results.

The ranking algorithms known so far will be discussed in the next section of this chapter.

2.4 Evaluation of Search Results

Hubs and Authorities

One link-based algorithm for web document retrieval is called Hyperlink-Induced Topic Search (HITS) [47]. It maintains a hub and an authority score for each page, which are computed from the linkage relationship of pages. Hereby, authority pages often focus on one topic only and have many incoming links, while hubs point to many authorities¹. It is clear that search engines shall return the most authoritative pages to their users. While searching, a non-negative authority weight a(x) and a hub weight h(x) are assigned to each node x ∈ V of a considered web graph or a subgraph of it (which can also be expressed in a vector-like manner). In the HITS algorithm, the hub and authority values are calculated in an iterative replacement process. The authority value is calculated as the sum of the hub values of all predecessor nodes v linking to x,

    a(x) = Σ_{v→x} h(v),

while the hub value is determined by the sum of the authority weights of all successors w of x, i.e.

    h(x) = Σ_{x→w} a(w).

In order to make these values comparable, a normalisation is needed, such that

    Σ_{x∈V} (a(x))² = 1   and   Σ_{x∈V} (h(x))² = 1.

These normalisation processes require global knowledge of the entire graph or of all results, respectively, and, therefore, centralised servers are needed for their computation². Consequently, HITS cannot easily be applied in decentralised systems like P2P networks to rate content found on a given peer by determining the peer's hub or authority value.

¹ Note that it will become clear later that considering the in-degree of a given node alone is not sufficient to calculate these values.
² Later, it will be shown that both values are strongly connected with the weblink structure and converge to the principal eigenvector of M^T M, where M is the adjacency matrix of the webgraph.
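The iterative replacement process with its L2 normalisation can be sketched as follows. This is a minimal illustration of the update rules above, not the implementation from [47]; the edge-list representation and the name `hits` are chosen freely here.

```python
import math

def hits(edges, nodes, iterations=50):
    """Iterative hub/authority computation (HITS sketch).

    edges: list of (u, v) pairs meaning page u links to page v.
    Scores are L2-normalised after every round, matching the
    conditions sum a(x)^2 = 1 and sum h(x)^2 = 1.
    """
    a = {x: 1.0 for x in nodes}   # authority weights
    h = {x: 1.0 for x in nodes}   # hub weights
    for _ in range(iterations):
        # authority of x = sum of hub values of all predecessors v -> x
        a_new = {x: 0.0 for x in nodes}
        for u, v in edges:
            a_new[v] += h[u]
        # hub of x = sum of authority values of all successors x -> w
        h_new = {x: 0.0 for x in nodes}
        for u, v in edges:
            h_new[u] += a_new[v]
        # L2 normalisation of both score vectors
        na = math.sqrt(sum(s * s for s in a_new.values())) or 1.0
        nh = math.sqrt(sum(s * s for s in h_new.values())) or 1.0
        a = {x: s / na for x, s in a_new.items()}
        h = {x: s / nh for x, s in h_new.items()}
    return a, h
```

On a tiny graph where pages 1 and 2 both link to page 3, page 3 emerges as the sole authority while 1 and 2 become equally strong hubs, illustrating why the full link structure (not just in-degrees) enters the computation.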

PageRank

To ease the search for information, several web search engines were designed, which determine the relevance of keywords characterising the content of web pages and return all search results to querying users (or nodes), similar to an ordinary index-based keyword search method. Usually, more results are returned than users expect. As a consequence, a ranking of query results according to keyword relevance is needed to help searchers access the lists of search results.

In particular, the search engine Google processes queries consisting of one or more terms. To be able to return a ranked result list, a link analysis algorithm called PageRank [64] is used to define a rank for any page by considering the page's linkage. The importance of a web page is assumed to correlate with the importance of the pages pointing to it.

In detail, PageRank is based on user behaviour: a user visits a web page following a hyperlink with a certain probability η, or jumps randomly to a page with probability 1 − η. The rank of a page correlates with the number of visiting users. Classically, for the PageRank calculation the whole network graph needs to be considered. Let i represent a web page v_i in the graph G = (V, E), where V is the set of web pages and E is the set of links, and let J be the set of pages pointing to page i. Further, let the users follow links with a certain probability η (often called damping factor) and jump to random pages with probability 1 − η. Then, with the out-degree N_j = |{v_i ∈ V | (v_j, v_i) ∈ E}| of page j, the PageRank PR_i of page i is defined as

    PR_i = (1 − η) + η Σ_{j∈J} PR_j / N_j.   (2.1)

The damping factor η is empirically determined to be 90%.
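Equation (2.1) can be evaluated by simple power iteration. The following sketch only illustrates the formula under the assumption of a dict-of-lists graph representation; it is not Google's implementation, and the function name and defaults are chosen freely.

```python
def pagerank(links, eta=0.9, iterations=100):
    """Power-iteration sketch of Eq. (2.1):
        PR_i = (1 - eta) + eta * sum_{j in J} PR_j / N_j

    links: dict mapping page j to the list of pages j points to.
    eta:   damping factor (the text cites 0.9).
    """
    pages = list(links)
    pr = {i: 1.0 for i in pages}          # initial ranks
    for _ in range(iterations):
        new = {i: 1.0 - eta for i in pages}
        for j, outs in links.items():
            if outs:                      # N_j = out-degree of page j
                share = eta * pr[j] / len(outs)
                for i in outs:            # j is in J for every i it links to
                    new[i] += share
        pr = new
    return pr
```

For a two-page cycle (1 → 2 → 1), both ranks converge to 1, the fixed point of Eq. (2.1) in this unnormalised form; a page without incoming links keeps the baseline rank 1 − η.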
Due to its efficiency in the most widely used search engine, the link analysis algorithm PageRank for determining the importance of nodes has become a significant technique integrated into distributed search systems; consequently, distributed PageRank computations were proposed in [112], [78] and [41]. The approach of [112] is based on iterative aggregation-disaggregation methods: each node calculates a PageRank vector for its neighbouring nodes by using links within sites, and the local PageRank is updated by communicating with a given coordinator. In [112] and [78], nodes compute their respective PageRank locally by communicating with their linked nodes. Moreover, [41] argues that each node may only exchange its PageRank with nodes to which there are links and, thus, pays attention to them only.

PageReputation

The quantity PageReputation, defined by Rafiei and Mendelzon [69], is quite similar to PageRank and relies on the model of a random web surfer browsing the web. Different from PageRank, the surfer is looking only for pages with a topic related to a term τ, or follows a random outgoing link from the current page. The reputation value corresponds to the number of visits by a random surfer to a given webpage w containing the term τ. Therefore, a high PageReputation value can also be understood as an estimation of how authoritative a webpage w is with respect to the term τ.

Let N_τ be the total number of pages on the web containing the term τ, and d_x the number of outgoing links from page x. If p is the probability that a random surfer selects, uniformly at random, a page from the set of pages containing the term τ, then 1 − p describes the probability that a random surfer follows any other outgoing link from the current page. Furthermore, let R be a matrix whose rows correspond to web pages and whose columns correspond to the terms occurring on those pages. Then, PageReputation is iteratively calculated as the probability that a random walker visits a page w in k steps after having visited x by

    R^k(x, τ) = ((1 − p) / d_x) · R^{k−1}(x, τ).

Due to the computation of R, this process also needs some global webgraph knowledge and, therefore, requires a centralised server. Consequently, its use cannot easily be adapted to fully decentralised systems like peer-to-peer (P2P) networks, either. Later considerations will show how new, random-walker-based techniques may help to enable fully decentralised computations of ranks and evaluations for nodes and search results.

2.5 Coordination and Self-organisation

As contemporary systems are growing bigger and more complex, system configuration and optimisation become more and more tedious tasks.
System administrators need to synchronise their work more intensely than ever before, and need more efficient tools to support their duties. Coordination is a means for computers, programs, people etc. to work together towards a goal or effect [15, 57] and clarifies the use of shared resources. Collaboration and competition are just the two extreme but, in general, interleaving aspects of one and the same phenomenon [104]. It is clear that coordination of activities

- needs communication and processing overhead for its organisation,
- comes along with a cooperative division of work,
- requires group building, arguing for the establishment of common goals and the solution of conflicts in different manners [62], and
- can be based on support specialisation.

Therefore, research needs to deal with the question under which circumstances there can or must be cooperation. In most cases, the emergence of cooperative processes depends on the goals, characters and behaviour of individuals. Game theory models strategic situations, or games, in which an individual's success in making choices depends on the choices of others [31]. Nash equilibria [6, 31] describe a balance in such systems with a current set of strategy choices (and, of course, the corresponding rewards), such that each player has chosen a strategy and no player can benefit by changing his or her strategy while the other players keep theirs unchanged. Normally, even for simple game set-ups it is a non-trivial task to decide whether such a Nash equilibrium exists and for which configuration it will be obtained. That is why theoretical foundations exist, and can be applied, for only a few technical systems.

Consequently, the consideration of self-organisation and of processes resulting in self-organisation, self-configuration or self-maintenance gains more and more importance. Self-organisation is a process in which

1. an optimal (or at least suboptimal) behaviour or state of a system is reached³,
2. a structure or pattern is present in the system,
3. no central authority or external element is involved in planning or control,
4. the local interactions of the system's elements exhibit a globally coherent pattern, and
5. organisation is achieved in a way that is highly parallel (all elements act at the same time) and distributed (no element is a central coordinator).

Many examples of self-organisation are known from nature (see Figure 2.9), with their processes being the results of long-lasting evolution and selection.
Consequently, employing (with adaptations) processes from nature in computer systems may often result in successful solutions [9, 29, 39, 73]. In technical systems, agents replace biological individuals, whereby an agent is an actor and decision maker, normally possessing some artificial intelligence [59]. It is an autonomous entity which observes its environment using sensors,

³ Note that this does not necessarily mean that an optimum is reached for each participant.

Figure 2.9: Self-organisation in nature [92, 105]

acts upon its environment by actuators and directs its activity towards achieving goals (i.e. it is rational). Intelligent agents [106] may also learn or use knowledge to achieve their goals. They may be very simple or very complex; rule-based, goal-oriented and utility-based agents [106] can be distinguished. Mobile agents [19] may move in a given computer environment similar to their natural counterparts. Nevertheless, Figure 2.10 shows that stationary agents may fulfill the same tasks and replace mobility by a set of communications. Hereby, the messages themselves can sometimes be considered as agents or so-called random walkers, as will be shown later in this thesis. As a consequence, an agent population and its network environment must be considered together in a simulative investigation.

2.6 P2PNetSim

To assess the performance of all algorithms developed, a powerful simulation tool is needed. A whole set of requirements must be fulfilled by such a tool in order to allow repeatable experiments in a well-defined environment:

1. The tool must be able to set up a realistic IP-based network with adjustable transmission parameters like bandwidth and latency. Therefore, a larger number of machines must be usable, organised in a set of hierarchically ordered subnets.

2. P2P overlay networks shall be easy to set up. This concerns

Figure 2.10: Comparison of stationary and mobile agents [49]

- the connections between peers, which should be generated by one of the accepted small-world models, i.e. either the Watts-Strogatz model [86] or the preferential attachment model [10];
- the peers' services, which should be easy to program within one interface class and should be able to use a set of standard P2P communication and management routines like broadcast, PING and PONG services [35].

Networks shall be generated quickly, and the parameters for all peers set automatically.

3. As usual for simulations, activities shall be logged; repeated runs with various parameters and influences on each single peer shall be possible, as well as statistical evaluations of each run in a graphical user interface.

The P2PNetSim tool [22] fulfills all the requirements above. Figure 2.11 presents a screenshot made during the use of this tool. In addition, it allows the fast simulation of up to 2 million peers on a distributed architecture. It runs on up to 256 machines and provides various libraries. A peer can store arbitrary information as key-value pairs to model its state. This state can be changed during simulation and, if wanted, written back to the simulation set-up XML file, to be recalled if a simulation is to be resumed at a certain point. Naturally, all events occurring during a simulation can be logged in a log-file. Due to intense use of this tool, many network configurations and comparison methods have been implemented and are available now.

Figure 2.11: Screenshot of a simulation using P2PNetSim

For simulation, usually a standard network configuration with equal communication times among all peers is assumed in order to avoid influences of the network architecture on the algorithms, and to support a clear performance analysis of the methods developed. A P2P network can be initialised with a given number of peers in a configuration file, using random graphs, the Watts-Strogatz small-world model or the preferential attachment model (cp. [2]) to establish the neighbourhood relations between nodes. For the simulations in this work, normally the Watts-Strogatz model has been used. It generates an initial ring of N nodes, with chords to the d nearest neighbours; then all N·d/2 edges are re-wired with a probability of p. Normally, for these simulations, N ranges from 500 to 10,000, constituting a good compromise between fast simulation and networks big enough to properly show all relevant effects. Normally, d is chosen to be 4, 6 or 8 for network sizes up to 500, up to 5,000, and above, respectively.

All simulations in the later chapters are usually conducted … times in order to ensure that the measurement results lie in the respective confidence intervals. If not indicated otherwise, random network structures are re-generated for every experiment with the same parameters in order to exclude dependencies on any special network topology. Partially, rectangular grids are used for the experiments, since their symmetry supports a good visualisation of the results achieved. First, amorphous P2P systems are generated, and after that grid overlay structures are built. These settings are the basis for all developed algorithms and simulation results to be discussed in the subsequent chapters.
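The Watts-Strogatz construction just described (ring of N nodes, chords to the d nearest neighbours, re-wiring with probability p) might be sketched as follows. This is a hypothetical helper for illustration, not the P2PNetSim implementation; all names are chosen freely.

```python
import random

def watts_strogatz(n, d, p, rng=random):
    """Build a Watts-Strogatz small-world graph sketch.

    Starts from a ring of n nodes, each linked to its d nearest
    neighbours (d/2 on each side, giving n*d/2 edges), then re-wires
    every edge with probability p to a random non-neighbour.
    Returns an adjacency set per node.
    """
    adj = {i: set() for i in range(n)}
    edges = []
    for i in range(n):                    # ring lattice
        for k in range(1, d // 2 + 1):
            edges.append((i, (i + k) % n))
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    for u, v in edges:                    # re-wiring pass
        if rng.random() < p:
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if candidates:
                w = rng.choice(candidates)
                adj[u].discard(v); adj[v].discard(u)
                adj[u].add(w); adj[w].add(u)
    return adj
```

With p = 0 the result is the pure ring lattice (every node has degree d); increasing p introduces the shortcuts that give the graph its small-world character.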

3 On Random Walkers

3.1 Random Walkers

Definition and Related Work

Many routing, search and management approaches in distributed search systems seek to optimise search performance. The objective of a search mechanism is to successfully return the desired information to a querying user while preventing too much overhead. In order to meet these goals, several approaches, e.g. [14, 81], were proposed. Random walks are a popular alternative [14].

Let G = (V, E) be an undirected graph representing a network topology, where V is a set of nodes v_i, i ∈ {1, 2, ..., n}, E ⊆ V × V is a set of links e_ij, and n is the number of nodes in the network. In addition, the neighbourhood of node i is defined as N_i = {v_j ∈ V | e_ij ∈ E}. Typically, a random walker on G starts at some node v_i at time step t = 0. In each step t + 1, it moves to a node v_j ∈ N_i selected randomly with the uniform transition probability

    p(v_i, v_j) = 1 / |N_i|

of moving from v_i to v_j in one step.

Many standard problems and questions have been considered in the literature. The most important ones related to this work are:

1. How long will a single random walker need to see all nodes of a graph [3, 7]?
2. How long does it take on average until two (or more) random walkers meet [3, 23]?
3. Do the answers to the two questions above depend on the graph topology [25]?
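The uniform transition rule defined above takes only a few lines to express. The following minimal sketch assumes an adjacency-list representation; names are illustrative.

```python
import random

def random_walk(neighbours, start, steps, rng=random):
    """Uniform random walk on an undirected graph.

    From the current node v_i, the walker moves to a neighbour v_j
    chosen with probability p(v_i, v_j) = 1/|N_i|.
    neighbours: dict mapping node -> list of adjacent nodes.
    Returns the sequence of visited nodes, including the start.
    """
    path = [start]
    current = start
    for _ in range(steps):
        current = rng.choice(neighbours[current])  # uniform over N_i
        path.append(current)
    return path
```

Every consecutive pair in the returned path is an edge of the graph, so the path is a valid walk by construction.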

Population of Random Walkers

Many methods employ random walkers as described, for instance, in [17, 60]. A random walker is a data structure or message generated by one node. After that, it is successively forwarded to one randomly chosen neighbour at a time, depending on its current location. Thus, it can be expected that all nodes of a (connected) network will be visited periodically. Since a round trip through a network may last long, in the literature mainly populations of (more or less synchronised) random walkers are used.

The size of such a population can easily be controlled in a decentralised manner [94]. For this purpose, each node has to define two times t_min and t_max, describing the length of the shortest and the longest interval within which the node shall be visited by any two subsequent random walkers of the population. Every node then has to execute the following algorithm:

1. Set the mean visiting time of that node to t_avg = ∞.
2. If t_avg > t_max: Generate a new random walker.
3. If t_avg ≥ t_min: Send all random walkers (i.e. an arrived and/or any newly generated random walker) to a neighbour chosen with equal probability, and start measuring the time t until the next random walker's arrival.
4. If t_avg < t_min: Cancel the current random walker.
5. Wait until a new random walker arrives at the measured time t and calculate t_avg from a fixed number of last visits.
6. Go to 2.

The algorithm above has been simulated in a small-world P2P community using P2PNetSim. The typical results obtained are shown in Figure 3.1. It can be seen that the population size is almost constant; only a few random walkers are generated or canceled in each time step. The size of the oscillations around the mean population size mostly depends on the difference t_max − t_min: bigger intervals result in lower numbers of newly generated or canceled random walkers and higher oscillations of the population size, and vice versa.
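Under the assumption of a synchronous, discrete-time network model, the decentralised population control above might be approximated as in the following toy sketch. It simplifies the rules: t_avg is taken over the last few observed visit intervals, and a node generates a new walker once the time since its last visit exceeds t_max. All names and defaults are illustrative, not taken from [94].

```python
import random

def simulate_population(neighbours, t_min, t_max, steps, rng=random, history=5):
    """Toy synchronous simulation of decentralised walker population control.

    Each node tracks the average interval t_avg between walker visits
    (over the last `history` visits). An arriving walker is forwarded
    while t_avg >= t_min and canceled when t_avg < t_min; a node that
    has not been visited for more than t_max spawns a new walker.
    Returns the population size observed at every step.
    """
    last_visit = {v: 0 for v in neighbours}
    intervals = {v: [] for v in neighbours}
    walkers = [rng.choice(list(neighbours))]      # initial population of one
    sizes = []
    for t in range(1, steps + 1):
        survivors = []
        for node in walkers:                       # process arrivals
            intervals[node].append(t - last_visit[node])
            intervals[node] = intervals[node][-history:]
            last_visit[node] = t
            t_avg = sum(intervals[node]) / len(intervals[node])
            if t_avg >= t_min:                     # forward the walker
                survivors.append(rng.choice(neighbours[node]))
            # else: walker is canceled (not forwarded)
        for node in neighbours:                    # starvation -> generate
            if t - last_visit[node] > t_max:
                survivors.append(rng.choice(neighbours[node]))
                last_visit[node] = t
        walkers = survivors
        sizes.append(len(walkers))
    return sizes
```

On a small ring graph, the recorded sizes oscillate around a modest mean rather than growing without bound, qualitatively matching the behaviour reported for Figure 3.1.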
For most applications, however, this kind of fully decentralised population control will yield suitable results. If random walkers meet on one network node, they may execute several operations on that node and with each other, like merging data in order to combine them and reduce network traffic, exchanging or sorting the data attached to them, and generating more random walkers to carry new data.

Figure 3.1: Population size of random walkers controlled by a decentralised algorithm [94]

There are some approaches like [95, 98] that use populations of random walkers to build tree-like structures, where a node is represented by one random walker having a set of nodes or another group of random walkers as its children. These trees may be sorted according to some key information [95, 98] and/or tuned depending on the circulation times needed or other processing parameters. In [94], an approach using random walkers for a stochastic group authentication in a P2P system is shown.

Ants

Natural ants are social insects. They use stigmergy [30] as an indirect way of coordinating themselves and their actions in order to organise themselves and to produce intelligent structures without any plans, control or direct communication. A single ant in an environment behaves like a random walker: it moves in a randomly selected direction and may change its direction at any moment. The behaviour of ants becomes more interesting if a group, a colony or a pile of ants is considered. Naturally, no single ant may oversee or control the whole pile, but an interesting work division, cooperation or coordination will arise. The most interesting cause of many coordination processes is found in the so-called pheromones, which are used by ants in the following ways:

1. each ant may generate pheromone trails (see Figure 3.2), usually using different pheromones for different tasks and situations; each ant moving on a trail may reinforce its pheromone concentration;
2. in the environment, the pheromones may decay, following a decay rate λ;

3. these pheromones can be recognised by other ants of the same pile and used for navigation;
4. normally, with a very high probability, ants follow the trail labeled with the highest pheromone concentration, or split up when they encounter differently labeled alternative paths; if ψ(v_i, v_j) denotes the pheromone concentration of a path from v_i to v_j, where v_i and v_j are nodes of a graph G = (V, E), then the probability p(v_i, v_j) that an ant chooses exactly this way is given by

    p(v_i, v_j) = ψ(v_i, v_j) / Σ_{(v_i, v_k)∈E} ψ(v_i, v_k)

(not considering the very small probability that this ant moves randomly);
5. with a very low probability, an ant moves randomly even if there is a labeled pheromone trail; this property is the key to exploring the environment exhaustively.

Figure 3.2: Ant street in a natural environment [44]

Dorigo was the first to find that this behaviour may automatically find shortest ways between any two points in a given environment. Moreover, imitating the behaviour of ant societies can be used to solve optimisation problems like the traveling salesman problem [29] or scheduling problems [39].

Another phenomenon of ant societies is also interesting for use in computer science and engineering: ants can help each other and coordinate their activities to form piles of items such as corpses, larvae or grains of sand by using stigmergy. The algorithm for doing so follows quite simple instructions:

1. Initially, items are deposited at random locations.
2. An ant population moves randomly through the network.

3. For every place x there exists a probability p_pick(x) that an ant collects an item from that place when passing by. This probability is high if there are only a few or no items on x and in its direct neighbourhood: the lower the number of items, the higher p_pick(x) will be.
4. In the same manner, a probability p_drop(x) is defined for every place. It determines the probability that an ant drops a carried item on that place when passing by. This probability is high if there are already many items on x and/or in its direct neighbourhood: the more items can be found, the higher p_drop(x) will be.

If this algorithm is executed several times, one big pile of corpses will be built, as shown in Figure 3.3. Obviously, this example corresponds to cluster building, e.g. in distributed computer networks.

Figure 3.3: Simulation from [56] for building piles of dead ant corpses

Deneubourg et al. [27] first proposed a clustering and sorting algorithm mimicking ant behaviour. It is implemented based on the corpse clustering and larval sorting of ants. In this context, clusters are collections of items piled by ants, and sorting is performed by ants distinguishing items and placing them at certain locations according to item attributes. According to [27], isolated items should be placed at locations of similar items of matching type, or taken away otherwise. Thus, ants can pick up, carry and deposit items depending on associated probabilities. Moreover, ants may have the ability to remember the types of items seen within particular durations, and they move randomly on spatial grids.

Later, Lumer and Faieta [56] as well as others proposed several modifications of the work above for application in data analysis: One of their ideas concerns a similarity definition. They use a distance such as the Euclidean one to identify similarities or dissimilarities between items. An area of local neighbourhood, on which ants are usually centered, is defined.

Another idea suggested for ant behaviour is to assume short-term memory: an ant can remember the last m items picked up and the locations where they have been placed. A last idea deals with pheromone labeling during the collection process. Two possibilities have been considered: in the first case, only the dropping places are labeled, and in the second one the whole paths to them. In both cases, a sufficiently fast convergence of the method has been observed [56, 70].

The above-mentioned contributions are pioneering ones in the area of ant-based clustering. At present, the well-known ant-based algorithms are being generalised, e.g. in [70], for different purposes and applications. Also, externally enforced and problem-driven modifications of ant algorithms may become interesting. One example of such a modification from the literature shall be discussed here due to its importance for the later contributions in this thesis.

This modification concerns navigation using pheromones. As mentioned, ants normally follow the strongest concentration of pheromones and only sometimes try new ways. For many search problems this is just the wrong behaviour: agents shall be motivated to take paths which have never been used before, or only less frequently. This idea resulted in the definition of so-called minority ants [96]. A minority ant moves from any node v_i to a node v_j ∈ N_i selected on the basis of the lowest pheromone concentration. The pheromone of v_j is represented by ψ_j, which is, as usual, increased whenever any ant visits the node v_j. Now, however, the transition probability is defined as

    p(v_i, v_j) = 1 − ψ(v_i, v_j) / Σ_{(v_i, v_k)∈E} ψ(v_i, v_k),

which changes the behaviour of these ants drastically. The tendency to follow the lowest pheromone concentration results in an almost equal search over the whole environment. Figure 3.4 shows simulation results.
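A minority ant's next-hop selection might be sketched as follows, with the selection weight of each neighbour taken as the complement of its share of the total pheromone concentration, so that low-pheromone edges are preferred. Pheromone reinforcement and decay are omitted; all names are illustrative.

```python
import random

def minority_step(node, neighbours, pheromone, rng=random):
    """One move of a minority ant: prefer the LOWEST pheromone.

    Neighbour v_j is chosen with weight proportional to
    1 - psi(v_i, v_j) / sum_k psi(v_i, v_k).
    pheromone: dict mapping directed edges (v_i, v_j) -> psi value.
    """
    nbrs = neighbours[node]
    total = sum(pheromone[(node, v)] for v in nbrs)
    if total == 0:                       # no trail yet: pure random walk
        return rng.choice(nbrs)
    weights = [1 - pheromone[(node, v)] / total for v in nbrs]
    # roulette-wheel selection over the complement weights
    r = rng.random() * sum(weights)
    acc = 0.0
    for v, w in zip(nbrs, weights):
        acc += w
        if r <= acc:
            return v
    return nbrs[-1]
```

With two neighbours carrying pheromone 9 and 1, the weakly marked neighbour is selected about nine times out of ten, which is exactly the exploration bias that spreads minority ants evenly over the network.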
The trail of a single ant out of a population of 20 ants moving on a small-world graph of 4,096 nodes was observed. A cross denotes the node number (y-axis) on which the ant was at a given time t (x-axis). In Figure 3.5 it can be seen that the ant (and likewise all other members of the ant population) visits all nodes with almost the same probability, and only very rarely do any two of these ants meet at one and the same place. The behaviour of a minority ant was simulated, and the respective results were also confirmed by a mathematical analysis. Figure 3.6 illustrates the advantage of a minority ant group in a search process as compared with random walkers: in a small-world network environment of 10,000 nodes, a group of random walkers and minority ants shall find a fixed number of target nodes, which were distributed all over the network. It can be seen that

Figure 3.4: Trail of a single minority ant of a population of N ants on a small-world graph [96]

pheromone labeling of the minority ants significantly helps in the search process, especially in the beginning. It prevents nodes from being visited several times and, therefore, improves the behaviour of minority ants compared to random walkers, which are incapable of navigation or path control.

3.2 PageRank Calculation with Random Walks

General Principles

In the publication by Page and Brin [64] about a node's PageRank in a network, the hint is given that it also corresponds to, and can be represented as, the node's probability of being visited in the course of a random walk through the network. If a node is visited many times by random walkers, then it is assumed to be more important than the ones visited less often. To the best of our knowledge, however, this possibility has never been seriously used or investigated by other authors in the past, since PageRank was mostly used to evaluate the position of a webpage in a webgraph, which is available on a central server or a set of server machines. Therefore, this use is to be extended here. NodeRank NR¹ is introduced as a more general, more broadly usable parameter. It shall be applied to P2P systems in order

¹ The name is chosen to show on the one hand its origin in the PageRank idea, and on the other to distinguish it by its different application in decentralised systems, where not webpages but nodes or peers shall be compared/rated.

Figure 3.5: Frequency of node visits using a minority ant coordination mechanism

to determine the role and importance of each peer in a community. It is argued that any peer can be assigned functionality to support the community, depending on its position in the community graph (which could be determined with the classical PageRank). Nevertheless, many more parameters influence the importance of a peer. Peers exposed in this way are called service nodes, with the intention of avoiding the misleading expression server, since the special role in and tasks for the community are given to those nodes only temporarily. If the state of the system changes, a service node may lose its specially assigned role and tasks and become a normal peer, while other peers will then fulfill these tasks. Consequently, one problem is to identify such nodes in a fully decentralised system. For this purpose, the use of random walkers is interesting, since they do not require any global knowledge about the network structure and are attractive for application in large-scale dynamic P2P networks, because they use local, up-to-date information only. Moreover, they can easily deal with connections and disconnections occurring in networks. Their shortcoming, however, is time consumption, especially in the case of large networks [3]. To address this problem, it is proposed to utilise a set of random walks carried out in parallel. The first objective here is to define this approach and to prove that its performance in determining PageRanks is equivalent to that of PageRank [64]. After that, PageRank is generalised in Chapter 4. Applying the above ideas, the importance of a node i ∈ V in a graph G = (V, E), given by its PageRank at time t > 0, is defined as the relative number of times that random walkers have visited the node so far, i.e.

PR_i(t) = f_i(t) / step(t).
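This visit-frequency definition can be sketched in a few lines of Python (a minimal sketch; the toy graph, random seed and step budget are illustrative and not taken from the thesis):

```python
import random

def walk_pagerank(adj, steps, seed=42):
    """Estimate PR_i(t) = f_i(t) / step(t) by counting the visits of a
    single random walker; adj maps each node to its neighbour list."""
    rng = random.Random(seed)
    node = next(iter(adj))                # arbitrary start node
    visits = {v: 0 for v in adj}
    for _ in range(steps):
        node = rng.choice(adj[node])      # uniform transition, p = 1/|N_i|
        visits[node] += 1
    return {v: f / steps for v, f in visits.items()}

# Toy 4-node graph: nodes 0 and 2 have degree 3, nodes 1 and 3 degree 2.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
pr = walk_pagerank(graph, 200_000)
# The frequencies sum to 1; better-connected nodes are visited more often.
```

On an undirected, non-bipartite graph the visit frequencies approach a distribution proportional to the node degrees, which is exactly the topological importance referred to in the text.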

Figure 3.6: Time for finding a fixed number of nodes in a small-world network by a random walker and a minority ant

Note that Σ_i PR_i(t) = 1 when t → ∞, where f_i(t) is the number of visits to v_i and step(t) is the number of steps up to time t, respectively. If the number of random walkers is increased to k ∈ N, then the PageRank can be calculated by

PR_i(t) = f_ik(t) / (k · step_k(t)),   (3.1)

where f_ik(t) is the number of all visits by the k random walkers that have taken place in the step_k(t) steps until time t. The most imminent problem is now to find answers to the following questions:

How many random walkers are needed for this purpose in a community?
What are the tasks for these random walkers?
How is the work of the random walkers coordinated?

The algorithm for population control introduced in Section controls a set of random walkers in a fully decentralised manner. It also has the advantage that the number of random walkers is not constant, but varies with network size due to the limits on the required visiting time. Hereby, random walkers are permanently generated as well as cancelled. Thus, it may happen that a node never again sees a random walker that has visited it before and has been used for PageRank calculation. The following rules may prevent this problem from occurring:

Due to the network size, the random walkers cannot carry a list of all nodes visited during their lifetimes. So a random walker only carries a counter of the number of steps it has performed and a unique identification string derived from the IP address of the generating node and the generation time.

Each node chooses a subset of random walkers (typically the first one seen), identified by this string, for the decentralised calculation of its PageRank. The number of random walkers used depends only on the node's capacities.

If a random walker does not return to the node within a given timeout, obtained from a constant visiting interval, it is assumed to have been removed from the system and is replaced by any other, typically the next arriving, yet unknown walker.

Each node executes the following algorithm:
1. execute the algorithm for random walker population control in the background;
2. wait until a new random walker is received;
3. increase the internal step counter of that random walker;
4. process the random walker's step counter information as described above;
5. send the random walker out to a randomly chosen neighbour;
6. go to step 2.

The last problem concerns the normalisation of the PageRank values, in order to make them comparable and independent of the network size. This topic is handled in the following subsection.

Estimation of Network Size

If the PageRank is defined as the average visiting probability of a node, it is clear that

Σ_{i∈V} PR(i) = 1.

Consequently, the PageRank of each node depends not only on its importance in the network, but also on the size of the network. Therefore, a normalisation is needed, such that 1 characterises average nodes, values > 1 characterise nodes with a better connectivity and position in the graph, and values < 1 are assigned to badly positioned nodes.
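The per-node loop of steps 1–6 above can be modelled in strongly simplified form (a sketch, not the thesis implementation; the walker structure and identifier format are illustrative):

```python
import random

def node_step(walker, neighbours, visit_counts, rng=random):
    """One pass of the per-node loop: update the walker's step counter,
    record its visit, and forward it (steps 3-5 of the algorithm)."""
    walker["steps"] += 1                                  # step 3
    wid = walker["id"]
    visit_counts[wid] = visit_counts.get(wid, 0) + 1      # step 4 (simplified)
    return rng.choice(neighbours)                         # step 5

visit_counts = {}
walker = {"id": "192.0.2.1-1700000000", "steps": 0}  # IP + creation time
next_hop = node_step(walker, ["peer-a", "peer-b"], visit_counts)
```

A real node would run this inside the background population-control loop and use `visit_counts` together with the walker's step counter to form the ratio of Eq. (3.1).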

This normalisation is also needed for highly dynamic networks, whose nodes may come and go at any time without further notification. In order to normalise, the size of the network must be known, which usually requires global knowledge of the network, at least for this parameter. A small trick using a property of mean values helps to cope with this situation, viz. that the mean value of a small number of samples normally already approximates the real mean value quite well. Based on the knowledge above and some basic mathematics, the average PageRank PR of all nodes in a community is given by

PR = (Σ_i PR_i) / n = 1/n.   (3.2)

Hence, n can be estimated from a smaller number K of samples than the entire number of nodes by

n = K / Σ_{i=1}^{K} PR_i = 1 / PR,   (3.3)

with K < |V|. In other words, the network size is estimated from a sample of PR values whose mean value converges to the overall average 1/n. Now, only a good estimate for K is needed. It can be obtained by considering the deviation of the calculated mean value: the calculation can be stopped when the deviation is small enough and/or the mean value is stable enough.

Theoretical Considerations on Random Walks

Since any random walk is a Markovian process, some basic definitions and simple facts about Markov chains are collected in this section. These definitions will mostly be used to explain the convergence behaviour of the PageRank calculation under the consideration of bandwidth. Let M be a finite set. A probability distribution on M is given by a function p that associates a non-negative number p(m) with any element m ∈ M such that

Σ_{m∈M} p(m) = 1.

Then, the probability P(S) of an event, i.e. a subset S of M, is defined by

P(S) = Σ_{m∈S} p(m).

Thus, P is a function from the powerset of M into [0, 1] and is called a probability measure. The pair (M, P) is called a finite probability space with ground set M.
(This is mentioned as there is no need to introduce sigma-algebras because of the finiteness of M

and, consequently, probability spaces can be defined as pairs instead of triples.) Since all probability spaces occurring in this thesis are finite, the adjective finite shall be dropped henceforth. Let A, B ⊆ M be two events such that P(B) > 0. Then, the conditional probability P(A | B) of A given B is defined by

P(A | B) = P(A ∩ B) / P(B).

A random variable on a probability space (M, P) is a function X from M into some set B. If X is a random variable on a probability space (M, P) that takes values in a set B and U ⊆ B, then P(X ∈ U) is the probability of the event X(m) ∈ U, i.e.

P(X ∈ U) = P({m ∈ M | X(m) ∈ U}).

A sequence (X_i)_{i=0,1,2,...} of random variables on a probability space (M, P), all taking values in some set B, is a Markov chain if for any positive integer n and any finite sequence (b_i)_{i=0,1,...,n} of elements of B

P(X_n = b_n | X_{n-1} = b_{n-1}, ..., X_0 = b_0) = P(X_n = b_n | X_{n-1} = b_{n-1}).

Let (X_i)_{i=0,1,2,...} be a Markov chain, a, b ∈ B and t a positive integer. Then, the transition probability π_t(a, b) from a to b at time t is defined by

π_t(a, b) = P(X_t = b | X_{t-1} = a).

A Markov chain (X_i)_{i=0,1,2,...} is said to be stationary (or time-homogeneous) if for all a, b ∈ B and all positive integers s, t

π_t(a, b) = π_s(a, b).

The function π is called the transition function of the Markov chain (X_i)_{i=0,1,2,...}. Since all Markov chains considered in this thesis are stationary, the adjective stationary and the subscript t will also be dropped henceforth. Let (X_i)_{i=0,1,2,...} be a Markov chain with transition function π, and assume for the sake of simplicity that B = {1, 2, ..., n}. Then the n × n matrix A = (π(a, b)) is called the transition matrix of the Markov chain. Let a, b ∈ B. It is not hard to see that the probability P(X_{t+2} = b | X_t = a) to go from a to b in exactly two steps can be computed by

P(X_{t+2} = b | X_t = a) = Σ_{c∈B} P(X_{t+2} = b | X_{t+1} = c) P(X_{t+1} = c | X_t = a).

Using the transition function π, this formula can be written as

P(X_{t+2} = b | X_t = a) = Σ_{c∈B} π(a, c) π(c, b).

For a positive integer k, let π^k(a, b) denote the probability to go from a to b in precisely k steps. An easy computation shows that π^k(a, b) is the b-th element of the a-th row of the k-th power A^k of the transition matrix A. A Markov chain is ergodic if for any two elements a, b ∈ B there is an integer k such that π^l(a, b) > 0 for all integers l ≥ k. A probability distribution µ on B is called a limit distribution of the Markov chain (X_i)_{i=0,1,2,...} with transition matrix A if µ = Aµ (in this equation, µ is considered as a column vector with index set B = {1, 2, ..., n}). The following theorem shows that any ergodic Markov chain has a unique limit distribution.

Theorem. Let (X_i)_{i=0,1,2,...} be a Markov chain with transition matrix A. Then the following two assertions hold:
1. The Markov chain (X_i)_{i=0,1,2,...} has a unique limit distribution µ.
2. For any probability distribution p on B, lim_{t→∞} A^t p = µ.

For random walks determining the PageRank under consideration of hardware parameters like bandwidth, this theorem is important, since it shows that there is always a limit value for the PageRank and that convergence to it is assured.

Convergence in Real Systems

When random walkers start to circulate in a network, the PageRank values calculated by the methods described in the last subsection will change quite often. Consequently, the convergence behaviour must be studied, and methods shall be considered to quickly stabilise the PageRank values, at least with an estimation. These two topics shall be studied theoretically in this section and later be confirmed by simulation experiments. The convergence time t_conv is defined as the duration until a given value is stable within a certain margin ǫ. This usually small margin [78] is defined as the maximally allowed difference between PageRank values in two subsequent time steps. Convergence is reached when

|PR_i(t) − PR_i(t−1)| ≤ ǫ

is fulfilled for all nodes.
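The limit-distribution theorem can be illustrated numerically: repeated application of an ergodic transition matrix drives any start distribution to the same limit. A minimal sketch using the row-vector convention p ↦ pA (the two-state chain is purely illustrative):

```python
def evolve(A, p, t):
    """Apply the Markov update t times: p'_b = sum_a p_a * pi(a, b),
    i.e. multiply the row vector p repeatedly by the transition matrix A."""
    n = len(A)
    for _ in range(t):
        p = [sum(p[a] * A[a][b] for a in range(n)) for b in range(n)]
    return p

# Ergodic two-state chain; its unique limit distribution is (5/6, 1/6).
A = [[0.9, 0.1],
     [0.5, 0.5]]
mu = evolve(A, [1.0, 0.0], 100)        # start in state 0 ...
mu2 = evolve(A, [0.0, 1.0], 100)       # ... or in state 1: same limit
```

After 100 steps both start vectors agree to many decimal places, mirroring assertion 2 of the theorem.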
In order to prevent chaotic changes of the PageRank values, a mean value (rather than 0) is estimated and used as the initial PageRank value of all nodes. The final PageRank values can be higher or lower than the initial one, and they will change smoothly. The PageRank is then calculated as

PR_i(t) = (1/n) e^{−ct} + (f_ik(t) / (k · step_k(t))) (1 − e^{−ct}),   (3.4)

where n is the estimated number of nodes in the network, c is a damping factor and f_ik(t) is the number of visits by the random walkers to v_i after step_k(t) steps until time t. The term (1/n) e^{−ct} represents the initially assigned PageRank value. For t = 0, the term e^{−ct} assumes the value 1, 1 − e^{−ct} vanishes and, thus, the initial PageRank of all nodes becomes PR_i(0) = 1/n. On the other hand, for t → ∞, e^{−ct} vanishes, 1 − e^{−ct} approaches 1 and the PageRank attains the same value as in Eq. (3.1), viz. PR_i(t) = f_ik(t) / (k · step_k(t)). In this case, the PageRank calculations of all nodes start with the same initial value; the parameter c may range within 0 < c < 1, and its value also affects the convergence time. The decentralised calculation of PageRanks by a population of random walkers is considered in the following simulations.

Simulation Results

The objective pursued in this subsection is an empirical proof of concept. The following issues are addressed:
1. Is the PageRank generated by sets of random walks equivalent to the one rendered by the algorithm of Page and Brin?
2. Can the average PageRank of a network be estimated by considering only a part of the network and, if so, which size does this part need to have?
3. What is the convergence time, and does it depend on network size, network structure and the number of random walks?

Due to its frequent use and simple visualisation possibilities, the simulations here are carried out on regular, two-dimensional overlay network structures, i.e. a grid and a torus. For the grid structure, the maximum degree of a node is four and the minimum is two. In contrast, the degree of all nodes is four for the toroidal grid structure. To conduct comparative simulations, a rectangular grid of size 20 × 20 was used and a small margin ǫ selected. First, considering the PageRank algorithm, Eq. (2.1) was applied.
At time t = 0, the PageRank of all nodes was set to an initial value. Each node calculated its PageRank and then distributed its updated PageRank to its set of neighbours N_i. At every time step, the updated PageRank was compared with the previous one. If the difference turned out to be below the margin ǫ, the obtained value was regarded as the node's PageRank. To investigate the calculation of PageRanks based on k random walkers, on the other hand, Eq. (3.1) was considered and k selected to be 50. The random walkers visited nodes until t = 120,946, at which point convergence of the PageRank

values was reached. The results obtained from both approaches are shown in Figure 3.7. Due to the structure of the grid, the PageRank of a node depended on its number of links; the node with the lowest number of links also had the lowest PageRank. Consequently, the results reveal that a set of random walks produces the same PageRank as the PageRank algorithm of Page and Brin.

Figure 3.7: Comparison of ranking on a grid of size 20 × 20

To show that calculating an average PageRank makes it possible to estimate the generally unknown size of P2P networks, simulations were conducted on grids of two sizes, among them 50 × 50, using k = 50 random walkers and yielding the corresponding exact average PageRanks of 1/n. For both simulations, only fractions of the networks were queried, with the fraction sizes ranging from just a small number of nodes to around 80% of the overall network size. The mean PageRanks calculated from these data were close to the exact average PageRank values, which could be shown for fractions of a tenth of the network size or larger. Figure 3.8 presents the simulation results obtained for the network of 2,500 nodes. The simulation was started by sampling the PageRank values of 50 nodes (2% of the network size) and went on until taking 2,000 nodes (80% of the network size) into consideration. The approximate average PageRank reached the exact value with only a marginal deviation already for 250 or more nodes. To conclude, if the sample of nodes is large enough to calculate an approximate PR, then this value can be used to estimate the network size n = 1/PR. Convergence behaviour was studied in three experiments. In the first one, the convergence time for a single walker was compared for different network sizes. Here, simulations

Figure 3.8: Approximate average PageRank with different sample sizes for a network of 2,500 peers

in both grid and toroidal grid structures were conducted with a small margin ǫ. The number of nodes n was increased from small to large network sizes, set to 100, 400, 900, 1,600, 2,500 and 10,000, respectively. In the simulations, n represented the network size, while in real networks one has to settle for an estimated value. The comparison results are shown in Figure 3.9. For the toroidal grid, random walks led to faster convergence than for the grid structure, especially when the number of nodes exceeded 1,600. In addition, for both grid and toroidal grid, random walks in small networks converged earlier than in bigger ones.

Table 3.1: Convergence times for different numbers of random walkers (n = 400)

Figure 3.9: Convergence times for different network sizes

In the second experiment, the number of random walkers was increased to k = 50 in order to save time by parallel processing. The convergence time was compared to the one obtained for a single random walker. Here, both a grid and a toroidal grid were used with a very small margin ǫ. The results show that convergence was reached considerably more slowly for a single random walker than for the 50 walkers working in parallel, for both network structures considered. From this simulation it could be concluded that the number of random walkers affects the convergence time. If ǫ was very small, it turned out that random walks in the grid reached convergence more slowly than in the torus. In the third experiment, the influence of the damping factor c was studied. Again, a grid and a toroidal grid with 400 nodes were considered; the margin ǫ was kept small and the number of random walkers was set to k = {1, 10, 20, 50}, respectively. The simulation results for both network structures revealed that a suitable value for c is important according to Eq. (3.4): if c is chosen too small, Eq. (3.4) does not support the PageRank calculation, and suitable values for c lie slightly above this lower bound, determined by the value of ǫ and the number of nodes. The convergence times for both grid and torus are given in Table 3.1. They show that c and the number of random walkers affect the convergence time for both structures.
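The role of the damping factor c in Eq. (3.4) can be made concrete with a small sketch (n, c, the visit counts and step numbers are illustrative, not values from the experiments):

```python
import math

def smoothed_pr(n, c, t, visits, k, steps):
    """Eq. (3.4): blend the initial estimate 1/n with the measured visit
    frequency f_ik(t) / (k * step_k(t)), weighted by exp(-c*t)."""
    measured = visits / (k * steps) if steps > 0 else 0.0
    return (1.0 / n) * math.exp(-c * t) + measured * (1.0 - math.exp(-c * t))

start = smoothed_pr(400, 0.01, 0, 0, 50, 0)          # t = 0  -> exactly 1/n
late = smoothed_pr(400, 0.01, 10_000, 30, 50, 1000)  # large t -> measured value
```

A larger c shifts the weight to the measured frequency earlier, while a smaller c keeps the smooth initial estimate longer, matching the observation above that c must not be chosen too small.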

It can be concluded that populations of random walkers are able to calculate the PageRank values of the nodes of a graph in a fully decentralised manner, without any global knowledge of the graph. The calculation process may take a while, even if populations of random walkers are used. Compared to the expected slow changes in commercial IT systems, however, the achieved speed seems reasonable and sufficient.

4 A Generalised Node Evaluation

4.1 Influence of Network Parameters

So far, an iterative and a random-walker-based calculation of the PageRank have been considered, and it was found that both methods generate the same results. In all considered cases, PageRank reflects, as intended in the original publication of Page and Brin [64], only topological aspects of a node's embedding into (web-)graphs. Nevertheless, other factors may also influence the role which a node and the contents stored on it may play in a system:

Bandwidth and latency mostly determine how fast the data on a node can be accessed. A high bandwidth allows fast data transfer to a large number of users. The bigger the transferred files are, the lower the influence of the latency will be, while latency plays an important role for the access to relatively small files by a large number of users.

Processor speed complements the bandwidth requirement. Without high computational speed, a high bandwidth cannot be fully utilised, since data cannot be made available for transmission as fast as necessary.

Harddisk space describes the ability of a node to play the role of a (central) server in a system. Nodes which are to be accessed and known by a large number of clients shall also have the ability to store a huge number of files in order to avoid too wide a spread of information over the system.

The bandwidth and latency of a connection to any destination may be tested by a node itself using available standard protocols (e.g. PING). For the processor speed and other hardware resources of a destination node, the problem becomes more complex. In this case, those parameters must either be propagated frequently by the destination node to its neighbour nodes (sender-initiated approach), or a communication protocol must be established by which any node may obtain the needed information on request (receiver-driven approach). However, manipulating these data may cause serious problems for a node.
Therefore, the respective decision shall not be made by the user (as, for instance, the decision to be a supernode in KaZaA), but is to be an implemented property of the system.

4.2 NodeRank: an Extension of PageRank

The goal of this section is to show how additional parameters may be included in the PageRank calculation. A new parameter called NodeRank is introduced, which
1. assigns the highest ranking to nodes having a central place in the graph topology (as the original PageRank) and being well connected in the network, i.e. they may offer high-bandwidth connections (and possibly other advanced hardware properties),
2. assigns the lowest values to (almost) isolated nodes or nodes with low connection or hardware attributes (i.e. low bandwidth and low storage capacity) and
3. generates average values for the remaining nodes.

Note that a less central position in the graph may be compensated by, e.g., a high bandwidth. Of course, the new NodeRank shall be calculated in a fully decentralised manner for P2P networks. Consequently, the random-walker-based approach shall be used. So far, the transition probability p(v_i, v_j) = 1/|N_i| of a random walker from a node i to a node j is the only parameter influencing the PageRank besides the topological properties of the underlying graph. Following the idea of the path distribution of ants (which depends on the distribution of the pheromone concentrations, see [28]), this equal probability shall now be changed. At the beginning, only bandwidth influences are applied here to obtain generally non-uniform transition probabilities of random walkers, i.e. if a node is connected by a low-bandwidth link, then the probability of reaching it will be lower than via a high-bandwidth one. Let B(e_ij) be the bandwidth of the link connecting nodes v_i and v_j. Then, the transition probability of random walkers to move from v_i to v_j is defined as

p(v_i, v_j) = B(e_ij) / Σ_{j∈N_i} B(e_ij),   (4.1)

where Σ_{j∈N_i} p(v_i, v_j) = 1. The number of times f_ik(t) that random walkers have visited a node determines its visiting probability, and the NodeRank is calculated by Eq. (3.1).
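Eq. (4.1) amounts to weighted neighbour selection; a minimal sketch (node names and bandwidth values are illustrative):

```python
import random

def biased_next_hop(node, neighbours, bandwidth, rng=random):
    """Choose the next hop with probability B(e_ij) / sum_j B(e_ij)."""
    weights = [bandwidth[(node, j)] for j in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]

# A 100 Mbps link should attract roughly ten times more transitions
# than a 10 Mbps link from the same node.
bw = {("a", "b"): 100, ("a", "c"): 10}
hops = [biased_next_hop("a", ["b", "c"], bw) for _ in range(5000)]
```

Counting the resulting hops shows the 10:1 preference for the high-bandwidth neighbour, which is what skews the visit counts and hence the NodeRank.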
It is easy to see that now the random walker will prefer links with a higher bandwidth. The equation can also be applied when further network parameters are taken into consideration, by replacing B(e_ij) with other quantities or by combining it with other parameters in a weighted expression. For instance, B(e_ij) could be replaced by a general parameter P(e_ij) with

P(e_ij) = Σ_{parameters k} α_k · H_k(j) / H_{k,standard},   (4.2)

where the α_k are weighting factors for the respective parameters, H_k(j) is the actual parameter value of (or for the connection to) node j, and H_{k,standard} is the average or minimum required value for that parameter (i.e. a reference value). Alternatively, it might make sense to use a non-linear combination of the parameters, as suggested in [29, 32], viz.

P(e_ij) = Σ_{parameters k} α_k (H_k(j) / H_{k,standard})^β,   (4.3)

where β is the exponential weight. Last but not least, more complex parameter combinations using fuzzy logic or neural networks are also possible. Since they may take much more time and effort for tuning and/or calculation, they have not been considered in this work. In the following section, the suitability and practicability of the approach described above is investigated in simulations.

4.3 Simulation of NodeRank and its Properties

Grids have repeatedly been proven to be efficient overlay structures for P2P systems. Details about their construction and use are presented in [12, 13]. In addition, compared to amorphous P2P structures they are easy to establish and to visualise, which makes grids a perfect simulation environment. The following structural properties of grids are included in the simulation:

the maximum out-degree of a node is four, in the middle of the grid,
the minimum out-degree of the nodes is two at the corners and three along the grid borders,
in contrast, the out-degree of all nodes is four for the toroidal grid structure,
the sizes of networks are given as the product of the numbers of x-columns and y-rows, and
a node is located at the crossing of an x-column and a y-row.

The grids used need not necessarily be quadratic. Additionally, the influence of the communication links' bandwidth shall be taken into account.
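The linear and non-linear blends of Eqs. (4.2) and (4.3) can be sketched as follows (the parameter names, weights and reference values are illustrative):

```python
def link_score_linear(h, alpha, ref):
    """Eq. (4.2): P = sum_k alpha_k * H_k(j) / H_k_standard."""
    return sum(alpha[k] * h[k] / ref[k] for k in h)

def link_score_nonlinear(h, alpha, ref, beta):
    """Eq. (4.3): the same blend with exponential weight beta."""
    return sum(alpha[k] * (h[k] / ref[k]) ** beta for k in h)

# Illustrative node: bandwidth at twice the reference, disk at the reference.
h = {"bandwidth": 200.0, "disk": 500.0}
ref = {"bandwidth": 100.0, "disk": 500.0}
alpha = {"bandwidth": 0.7, "disk": 0.3}
linear = link_score_linear(h, alpha, ref)          # 0.7*2 + 0.3*1 = 1.7
nonlin = link_score_nonlinear(h, alpha, ref, 2.0)  # 0.7*4 + 0.3*1 = 3.1
```

With β > 1, above-reference parameters are rewarded over-proportionally, which is the point of the non-linear variant.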
The primary goal hereby is to demonstrate that a low bandwidth correlates with a comparably low NodeRank, while a high bandwidth corresponds to significantly higher NodeRank values. The use of symmetrically, almost equally connected nodes supports the demonstration of the intended main effects. The following set-up is employed. Users of P2P networks may be confronted with various available link bandwidths; consequently, node accessibility also differs. In order to keep the experiments as simple and clear as possible, only two different bandwidths are used:

Figure 4.1: NodeRanks determined by 50 random walkers within substructures of different bandwidths

For the high bandwidth, a data transfer rate of 100 Mbps is assumed; in contrast, 30 Mbps is supposed to be the low rate, around three times slower than the high-bandwidth one.

The simulations considering the link bandwidths were carried out with the same settings as above in both a grid and a torus, with 50 random walkers and a small margin ǫ. The effect of varying link bandwidths is shown in Figure 4.1. The results clearly show that NodeRanks are influenced by the bandwidth of communication links in such a way that the probability of a node to be visited by random walkers correlates with the bandwidth of the links leading to it. In addition, the influence of the network topology can be seen at the borders of the grid (in comparison with the torus): the lower connection bandwidth of the border nodes is reflected by a significantly lower NodeRank value, while the nodes close to the border area have a minimally increased NodeRank due to the higher number of (reflected and returning) random walkers. Hence, the link bandwidth influences the transition probability of the random walkers and, consequently, also the NodeRank values. The following chapter addresses the last influence factor not yet considered, viz. user activities.

5 Evaluation of User Activities

5.1 Characterisation of User Activities

So far, the aspects of network topology and network parameters have been considered, and a concept to evaluate them together in a new NodeRank value has been introduced. The characteristics of these parameters offer some great advantages which make their evaluation quite easy:

They normally do not change at all or vary only over very long intervals; therefore, none of the algorithms introduced so far needs to deal with very short time constraints or hard real-time requirements.

Properties of one node do not influence neighbours or groups of neighbours.

This situation changes drastically when user activities are taken into account. Some of them have already been considered in the literature on network utilisation, e.g. [40]. There, mostly the following effects play an important role:

Users may join and leave a system without notification [67].

User activities are not equally distributed over the whole day. There exist peak hours of activity, which are also influenced by time zones and cultural aspects [67].

The user activity itself (depending not on time but on download volume) follows a power-law distribution [18, 67].

User interests are not equally distributed, but follow Zipf's distribution [18, 102].

It is clear that system design, analysis or simulation cannot handle these manifold parameters for all users in an adequate manner. But this is also not necessary, since normal (low-traffic) utilisation does not generate too many problems for network administration. The most critical situations occur in extreme cases, i.e. when most users are active and trying to download quite similar contents at the same time with a high frequency.
Such situations are known, for instance, from holiday traffic in road or railway systems, server access during special sales activities, or news consumption after certain events (natural catastrophes, terror attacks etc.). In these cases, the activities of all users overlap and interfere with each other [68]. The above-mentioned random-walker-based methods cannot handle such situations,

because random walkers do not carry any history and are not able to predict or consider mutual influences. Consequently, new methods must be developed to deal with the peculiarities of user activities. They shall always be able to handle two cases: the average (standard) use in normal situations and the investigation of extreme use cases as discussed above. In a P2P system, the users and the number of available peers offering contents are normally parameters that cannot be changed easily. The behaviour of the system may be influenced and changed, however, after an evaluation of the peers' capabilities, by

relocating contents to more powerful or less accessed peers,
suitably replicating contents in correspondence with mechanisms maintaining data consistency, as in [80],
using a broker which assigns peers to the servers with the fastest response times and/or
changing the peers' logical link structure (neighbourhood relation) [45].

In the next section, mostly methods to relocate contents will be considered. The average and the extreme cases of user activities will be discussed in the same manner. In both cases the following subtasks must be solved (which determines the structure of the subsequent section):
1. identify and propagate activity information about active nodes,
2. find suitable service nodes 1,
3. define their service ranges and
4. assign the respective peers to them.

Here, we concentrate on the relocation of contents to find (optimal) service nodes and on re-assignment strategies of peers to powerful service nodes.

1 Note that the term service node is used instead of server. Although these nodes carry out server tasks, they must be distinguished from classical servers, since the server functionality may be assigned to them only temporarily and may change with time and varying network conditions.

Figure 5.1: Clustering users around a system's main servers

5.2 Measuring and Propagating Node Activities

General Remarks

CDNs are designed to meet the users' growing need for high data volumes, e.g. for audio or video streaming. Corresponding to active users, i.e. users with high upload or download rates, active nodes may be understood as nodes responsible for high network traffic caused by the respective users (consequently, the two terms active user and active node are used synonymously). Needless to say, active users or nodes do not generate constant flows of data. Users log into a system and leave it (normally without any further notification), and show periodic behaviour depending on the time of day, the respective weekday and cultural particularities. Any administration, user management or self-organising behaviour of a system shall increase the system's ability to serve a user with a high QoS, which mainly means ensuring short response times and high bandwidth guarantees during data transfer. The main possibilities for improvement are increasing the number of servers or service nodes, replicating files and flexibly assigning clients to the servers or service nodes. It is easy to understand that servers or service nodes should be placed at central points of a network, which are usually machines minimising the (weighted) sum of distances to their assigned client peers 2.

2 Note that this is not necessarily a machine with the shortest distance to a certain peer.
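The placement criterion above, minimising the weighted sum of distances to the assigned client peers, can be sketched as follows (all node names, distances and volumes are hypothetical):

```python
def best_service_node(candidates, peers, dist, volume):
    """Return the candidate minimising the weighted distance sum
    sum_p volume[p] * dist[c][p] to its prospective client peers."""
    return min(candidates,
               key=lambda c: sum(volume[p] * dist[c][p] for p in peers))

# Two candidate machines, two peers; p2 transfers ten times more data,
# so the candidate close to p2 wins despite being farther from p1.
dist = {"n1": {"p1": 1, "p2": 4}, "n2": {"p1": 3, "p2": 1}}
volume = {"p1": 10, "p2": 100}
choice = best_service_node(["n1", "n2"], ["p1", "p2"], dist, volume)
```

This also illustrates the footnote above: the chosen machine need not have the shortest distance to every single peer, only the best weighted total.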

Normally, the communication time between a server or service node and a peer is used as distance measure, possibly weighted by the data volume uploaded or downloaded. QoS, response time, latency, throughput and accessibility also depend on the communication time: if the communication time is high, then QoS, response time, throughput and accessibility will be low. Clustering can be applied as one possibility to find a central point of active users and to optimally assign active users to content servers. Figure 5.1 shows the general situation.

Representation of Peer Activities

The first step towards a dynamic, performance-optimised CDN is to determine how node activities are defined and propagated in order to adequately represent mutual interferences in a neighbourhood. Different parameters can be used to determine either maximal or average node activities:

- The upload or download activities, expressed in kB or MB, in a given time interval will be used in most cases. Hereby, it plays a role that these activities show periodic changes depending on the time of day, different weekdays and the cultural user background. A mean value might, therefore, only be interesting in periods of activity and not over whole day or week intervals.
- On-line times are also interesting to determine user activities, especially if a permanent, fixed data transfer volume per time unit is assumed. This can be the case for VoD systems, since on-line times are mainly equal to the download times of exactly one video stream per user. The distribution of on-line times may allow one to classify users and to estimate the requested network utilisation more precisely.
- As a combination of the two cases above, only intervals with data volumes exceeding a given limit may be stored. Users can then only be roughly classified using a fixed set of classes for their time and/or data volume.
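The first parameter above, a mean transfer volume restricted to active intervals, can be sketched as follows. This is an illustrative reconstruction only; the class name, the window size and the kB unit are assumptions, not prescribed by the thesis.

```python
from collections import deque

class ActivityMeter:
    """Sketch of a per-peer activity measure: the mean transfer volume
    over a sliding window of recent intervals (illustrative names and
    window size; not taken from the thesis)."""

    def __init__(self, window=24):
        self.samples = deque(maxlen=window)  # kB per interval

    def record(self, kilobytes):
        self.samples.append(kilobytes)

    def mean_activity(self):
        # Average over active intervals only, since averaging over whole
        # days would dilute the periodic peaks described above.
        active = [s for s in self.samples if s > 0]
        return sum(active) / len(active) if active else 0.0

meter = ActivityMeter(window=4)
for kb in (0, 500, 300, 0):
    meter.record(kb)
print(meter.mean_activity())  # mean over the two non-zero intervals: 400.0
```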
The character of a network and the experience derived from its use are the main factors determining which traffic characteristics are significant and in which manner they shall be processed. The respective choice must, however, be made for a given network or CDN in a consistent manner. Figure 5.2 shows a respective example for the measurement of server utilisation. Regardless of the traffic measurement chosen, it is easily understandable that the activities on a set of nodes also influence the neighbourhood of these nodes. The reasons for this are commonly used network sections and the needed forwarding and routing of data through intermediate network nodes or peers. Consequently, a high utilisation of two or three nodes

may result in a reduced quality of service for a whole part of the considered network. That is why, in the subsequent subsection, methods for the consideration of such interferences in a peer neighbourhood shall be considered.

Figure 5.2: A typical server utilisation over a longer time interval [93]

Identifying the Utilisation of Network Areas

As mentioned in a previous section, the user activities of several nodes influence each other through commonly used network resources. Therefore, it is not enough to consider the activities of a single node only. Instead, the utilisation parameters of several nodes in a given neighbourhood must be combined appropriately. This requires investigating methods for the proper

1. communication as well as
2. combination of parameter values.

For the propagation of parameters in a neighbourhood it is important to know that the impact of the high activities of one node normally decreases with the distance to its source. In the following considerations, several methods will be investigated.

Figure 5.3: Physical analogue to the temperature field [26, 71]

One approach follows the analogy to temperature fields in thermal physics. It was introduced in [97] to locate nodes managing contents of common interest in P2P networks. Each node features a temperature indicating its activity. The heat of each node radiates toward its direct neighbours and, thus, influences their temperature as well. Whenever a node's content is accessed or updated, its temperature is increased, whereas during periods of inactivity the temperature drops exponentially to align with the temperatures of the surrounding neighbours. Figure 5.3 shows the physical analogue.

Baumann et al. introduced the HEAT routing algorithm [11] for large multi-hop wireless mesh networks to increase routing performance. By design, HEAT uses anycasts instead of unicasts to make better use of the underlying wireless networks. HEAT relies on a temperature-field approach to route data packets toward Internet gateways. Every node is assigned a temperature value, and packets are routed along increasing temperature values until they reach any of the Internet gateways modelled as heat sources. The distributed protocol establishing such temperature fields does not require flooding of control messages. Normally, every network node determines its temperature considering its own temperature and that of its direct neighbours only. Thus, neighbourhood temperature values are either propagated with the usual data traffic, or a special protocol is established, which is scalable, i.e. works independently of the network size.

In the presentation of the algorithms, the temperature θ indicates the activity level of a peer. The temperature of a node c is referred to as θ_c. The possible values of θ_c range from 0 to 1, where 0 denotes no activity at all and 1 indicates maximum activity of

the user utilising all resources. The temperature is calculated as shown in Eq. (5.1):

    θ_c = activity / maximum of activity,  0 ≤ θ_c ≤ 1    (5.1)

This way, the temperature may represent any kind of node activity. Any decision making strongly depends on θ_c being up to date. The temperature is recalculated with every time step or with any information that enters or leaves a node/peer. If the messages themselves act as temperature carriers, a node's temperature depends on real network activities. It is automatically prevented that any too high activity values are reported in order to fraudulently gain advantages from the automatic network management. This also underlines the analogy to convectional processes in thermal physics, where temperature is conveyed by rapidly moving particles.

Each node keeps information on the temperature of its neighbours. Let N(c) be the set of neighbours of c and k the number of neighbours, with 1 ≤ k ≤ 4 if the node degree, e.g. in a mesh structure, is 4. Let i be the index of each neighbour N_i in N(c), where 1 ≤ i ≤ k. In addition, let φ_i be the number of messages sent from N_i to c. Now, there are two cases to update a neighbour's temperature θ(N_i) in the dataset of c:

1. Whenever c receives a message from neighbour N_i containing that node's temperature θ_i, the previously stored value is overwritten:

    θ(N_i) = θ_i,  if φ_i > 0    (5.2)

2. If no message is sent from N_i to c, θ(N_i) is decreased exponentially over time with a configurable time constant α:

    θ(N_i) = θ(N_i) · e^(−α·t),  if φ_i = 0    (5.3)

Figure 5.4 shows the idea described and its effect in a computer network environment. An advantage of the thermofield approach is that an additional (discrete) scalar information field is built, which can be used in later search procedures for the navigation of search agents or walkers. Therefore, random walkers shall work as follows:

1. Randomly select a start position.
2. Consider the temperature of all neighbourhood places.
   a) If there are neighbours having a higher temperature, move in the next step to the one with the highest temperature (i.e. follow the gradient). If several neighbourhood places share the same highest temperature, choose one target randomly and GoTo 2.

   b) If all places around the random walker have lower temperatures than the current position, then keep the current position as a local maximum, choose a random number R of steps, proceed R random steps away from the current position and GoTo 2.

Figure 5.4: Temperature distribution applied to a computer network

With the thermofield approach, a first method to propagate user activities has been discussed. Interferences cannot easily be observed using this approach, however, since the local maxima always remain at the heat's source nodes. A second approach, again employing populations of random walkers, may overcome this disadvantage. The use of random walkers is appropriate, since in complex networks any path planning to visit a given node's neighbourhood may be too difficult to establish.

To prevent influences of complex network topologies, one-dimensional networks are used here in simulations. The first example (see Figure 5.5) studies the impact of the methods in a clear form using only one active node, while the other one allows demonstrating the interferences of two active nodes. To represent different grades of user activities, values between 1 and 10 are used.
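The temperature rules of Eqs. (5.1) to (5.3) and the gradient-following walker above can be sketched as follows. This is a minimal illustration, not the thesis implementation; all function names are assumptions, and the random restart of step 2b is omitted for brevity (the sketch simply stops at the local maximum it finds).

```python
import math
import random

def normalised_temperature(activity, max_activity):
    # Eq. (5.1): theta_c = activity / maximum of activity, clipped to [0, 1]
    return min(max(activity / max_activity, 0.0), 1.0)

def update_neighbour_temperature(stored, received, msg_count, alpha, dt):
    """Eqs. (5.2)/(5.3): overwrite on message arrival, otherwise decay."""
    if msg_count > 0:                         # phi_i > 0: message carried theta_i
        return received
    return stored * math.exp(-alpha * dt)     # phi_i = 0: exponential decay

def gradient_walk(temps, neighbours, start, rng=random.Random(0)):
    """Follow the temperature gradient to a local maximum (step 2a)."""
    pos = start
    while True:
        hotter = [n for n in neighbours[pos] if temps[n] > temps[pos]]
        if not hotter:
            return pos                        # step 2b: local maximum reached
        best = max(temps[n] for n in hotter)
        pos = rng.choice([n for n in hotter if temps[n] == best])  # step 2a

# 1D chain of five nodes: temperatures rise towards node 3
temps = [0.1, 0.3, 0.6, 0.9, 0.4]
neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
print(gradient_walk(temps, neighbours, start=0))  # reaches node 3
```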

Figure 5.5: Node activities in a 1D environment to evaluate the impact of different propagation and interference methods

We assume that each random walker is able to carry a set of K values h_i, i = 0, 1, …, K−1, which it may have collected from the current node and the K−1 nodes visited before. These data can be used to calculate new scalar values, comparable to the node temperature θ discussed above in the thermofield approach. Several methods for this kind of processing shall be introduced, together with their impact, in the following explanations. It is assumed that the level of real user activities of each node is stored in its parameter A. The calculated neighbourhood activity level of a node is denoted by A_N and computed from the node's current, real activities A and the activity values A_i of the K−1 nodes the random walker has visited before (i = 1, …, K−1).

Simple neighbourhood gossiping

In this case a node's activity A is simply communicated with the same value to all neighbours up to a given distance d. Gossiping is carried out by a population of random walkers, which periodically visit all nodes and propagate the respective values in all neighbourhood directions. Note that each of the random walkers may transport non-zero activity values from several nodes through the network, i.e. it can carry more than one value to each neighbourhood node. Over time, each network node thus receives several non-zero activity messages m_i = A_i from its neighbourhood. In order to reflect changes over time in the right way, only the T last values and A are processed, i.e. the mean neighbourhood activity value A_N is calculated by

    A_N = (A + Σ_{i=1}^{T} m_i) / (T + 1)
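The gossiping rule above, averaging the node's own activity A with the T last received messages, can be sketched as follows. The class and method names are illustrative assumptions.

```python
from collections import deque

class GossipNode:
    """Simple neighbourhood gossiping: a node keeps the T last activity
    messages delivered by random walkers and averages them with its own
    activity A, following the formula for A_N above."""

    def __init__(self, own_activity, T=8):
        self.A = own_activity
        self.messages = deque(maxlen=T)   # the T last m_i values

    def receive(self, m_i):
        self.messages.append(m_i)

    def neighbourhood_activity(self):
        # A_N = (A + sum of the T last m_i) / (T + 1), with T the number
        # of messages actually stored so far
        T = len(self.messages)
        return (self.A + sum(self.messages)) / (T + 1)

node = GossipNode(own_activity=0, T=4)
for m in (10, 10, 10):
    node.receive(m)
print(node.neighbourhood_activity())  # (0 + 30) / 4 = 7.5
```

An inactive node thus still obtains a high A_N when its neighbourhood is busy, which is exactly the interference effect the method is meant to expose.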

Figure 5.6: Activity propagation by gossiping in a fixed neighbourhood

Figure 5.6 shows how this method works for a one-dimensional field with one active node at position 8, and then two active nodes at positions 5 and 10. In the latter case the interference of both user activities in the middle between the two active nodes is clearly visible. The parameters are selected as A = 10 and d = 4.

Distance-based gossiping

This case is quite similar to the previous one. Again, the activity A of a node is simply propagated to all neighbours up to a given distance d. Differing from the first gossiping method, the content of the messages m_i is decreased during each step by a fixed δ (i.e. m_i = A_i − d·δ) until m_i < 0; then it is removed from the random walker's data³. On each node, A and the T last values are processed again, i.e. the mean neighbourhood activity value A_N is calculated by

    A_N = (A + Σ_{i=1}^{T} m_i) / (T + 1)

Figure 5.7 shows how this method works for the one-dimensional field with one and two active nodes and, again for the purpose of direct comparison with the other approaches, with the parameters A = 10, d = 4 and δ = 1.

Sliding window

Here, each random walker carries the activity information of the current and of all the last K−1 nodes visited, i.e. A or A_0 for the current node and A_i with i = 1, …, K−1 for the K−1 nodes before. From this information the mean value is calculated by

    A_M = (Σ_{i=0}^{K−1} A_i) / K

³ Note that also a decrease with a quadratic influence of the distance may be possible.

and again the last T values carried by the random walkers are processed, i.e. the mean neighbourhood activity value A_N kept on this node is calculated by

    A_N = (Σ_{i=1}^{T} A_M,i) / T

Again for a one-dimensional field with one and two active nodes (A = 10 and d = 4), Figure 5.8 shows how this method works.

Figure 5.7: Activity propagation using distance-based gossiping

Modified sliding window

In this approach, each random walker carries the activity information of the current and of the last K−1 nodes visited, i.e. A or A_0 for the current node and A_i with i = 1, …, K−1 for the K−1 nodes before. Differing from the simple sliding window approach, here A_M is built by

    A_M = (Σ_{i=0}^{K−1} X_i) / K

where X_i is defined by

    X_i = max{A_i, A_N}

whereby A_N is again the average value built from the T last values carried by the random walkers to the node, i.e.

    A_N = (Σ_{i=1}^{T} A_M,i) / T

Figure 5.9 shows the impact of this propagation method. As before, a one-dimensional field with one and two active nodes and with A = 10, d = 4 is used. The figure also reveals that a smoother propagation is attained compared to the standard sliding window approach.
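The two window means can be contrasted in a few lines. This sketch follows the formulas as reconstructed above (X_i = max{A_i, A_N} is my reading of the garbled source formula); the function names are assumptions.

```python
def window_mean(history):
    # Simple sliding window: A_M = (sum of the K carried values) / K
    return sum(history) / len(history)

def modified_window_mean(history, A_N):
    # Modified variant: each carried value is raised to at least the
    # node's current neighbourhood level A_N before averaging,
    # X_i = max(A_i, A_N), which smooths the propagated field.
    return sum(max(a, A_N) for a in history) / len(history)

carried = [0, 0, 10, 0]                        # K = 4 values carried by a walker
print(window_mean(carried))                    # 2.5
print(modified_window_mean(carried, A_N=2.5))  # (2.5 + 2.5 + 10 + 2.5) / 4 = 4.375
```

The zero entries pull the simple mean down sharply, while the modified variant floors them at A_N, which is the smoothing effect visible in Figure 5.9.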

Figure 5.8: Activity propagation through a sliding window realised by random walkers

Figure 5.9: Activity propagation using the modified sliding window approach
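The decay rule of distance-based gossiping (m_i = A_i − d·δ, with the value dropped once it falls below zero) can be sketched as follows; the function name and return shape are illustrative assumptions.

```python
def propagate_distance_based(A, delta, max_distance):
    """Distance-based gossiping: the carried value shrinks by a fixed
    delta per hop and is dropped once it falls below zero."""
    values = {}
    m, d = A, 0
    while m >= 0 and d <= max_distance:
        values[d] = m      # value delivered to nodes at hop distance d
        m -= delta         # m_i = A_i - d * delta after each further step
        d += 1
    return values

# A = 10, delta = 1: the activity fades linearly with hop distance
print(propagate_distance_based(A=10, delta=1, max_distance=4))
# {0: 10, 1: 9, 2: 8, 3: 7, 4: 6}
```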

Figure 5.10: Impact of the computation of zero activity values

Zero-value modifications

All methods discussed may be modified depending on whether zero activities are taken into account or not. In Figure 5.10 the results of computing or ignoring zero activity values are compared using the simple, one-dimensional example.

Employing one of the above propagation methods, each peer in a network obtains a scalar value indicating the load of the respective node and its neighbourhood. It is important that a node may obtain a high utilisation value even if it does not show any activities of its own but is influenced by a number of active neighbourhood peers. Furthermore, the activity parameter A_N can easily be used in Eqs. (4.2) and (4.3) to compute a NodeRank considering user activities, too. This simple inclusion of a new factor into the calculation of NodeRank is another advantage of this parameter.

Another big advantage of the described methods is that a single technique, random walking, is used to fulfil several tasks in a network. In the previous chapters, its use for PageRank and NodeRank calculation has been discussed; now it has been used to consider user activities. It appears important to point out that a single population of random walkers may be used to work on several tasks in a network. In the following section, their use to evaluate user activities in two different environments shall be explained.

5.3 Activity-based Clustering of Peers

Planar or Plane-embedded Environments

The developed methods can be used to evaluate the position or number of servers in a decentralised network or to help establish and manage replications of data. This task is more difficult to solve in (generalised) small-world networks. Therefore, the explanations commence with so-called planar graphs or plane-embedded environments. In graph theory, a planar graph is one that can be embedded in a plane, i.e. it can be drawn in the plane in such a way that its edges intersect only at their endpoints. In other words, it can be drawn with no edges crossing each other. These planar graphs have some advantages:

- They can easily be embedded in a two-dimensional plane and, therefore, a corresponding x-y coordinate system. Thus, a coordinate pair may be assigned to each node.
- With the embedding in the plane, distances can clearly be defined as Euclidean distances. Also, the triangle inequality is valid, i.e. a network connection over an intermediate node will always be more distant (and, therefore, longer or slower) than a direct connection. In most computer networks, this assumption is not true.
- Often, planar graphs can be embedded in the geographic coordinate system of the earth and, therefore, reflect geographic relations and distances well, which are often related to QoS parameters in a network, too.

Rectangular grids are simple examples of well-usable planar graphs. Moreover, in [12, 13] it has been shown that several kinds of rectangular grids can easily be built on top of any anarchically grown small-world P2P network structure. In addition, routing can easily be performed in grid structures, since the coordinates always indicate a direction and, therefore, the shortest path to the required target [51].
For simple environments such as grids or planar graphs, the proper placement of servers, service nodes or replicated data is equivalent to the standard clustering task. Normally, a set of initial assumptions can be made for the solution of this task:

1. Normal peers work as clients, while a few supernodes offer the respective contents. This corresponds to the observation that even in unstructured P2P networks 98% of the content is hosted on just 5% of the machines.
2. The intensity of user activities follows a Zipf distribution [102].

3. Also, the requested content is power-law distributed, i.e. few files are searched for quite often, while only few users are asking for all other files [61, 67].
4. In a grid, several paths exist between any source and destination.
5. For routing, the coordinates always indicate a direction to the target. Coordinates may be derived from IP addresses or other content-related information as, for instance, in [22].

Furthermore, it can be assumed that the number of superpeers (or, in the context of classical client-server systems, servers or managed data copies) is known. The reason is that a server or the management of a document's copies normally causes costs in a network and, thus, limits their number. Thus, the placement of these servers or content copies can simply be derived by the solution of a classical clustering problem. This correspondence is easy to see with the following assumptions:

- The number of servers (or superpeers or manageable copies) corresponds to the number of clusters or groups to be obtained. Note that most cluster algorithms require an explicit definition of this number as input information.
- The number of objects to combine into clusters or groups refers to the number of active nodes. Active nodes are normally those nodes/peers whose activities exceed an initially defined (and maybe later updated) threshold.
- The grid or the planar graph defines the distance between the nodes. The communication time needed corresponds to it.

The goal of the algorithms is to cluster in such a way that each server is established close to a group of active users, and each user can be assigned to a server in its neighbourhood such that a good quality of service may be guaranteed. Later, placements can also take content aspects and other parameters into account.
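Assumption 2 above, Zipf-distributed activity intensities, can be reproduced in a simulation with a few lines. This sampler is a sketch of my own; the thesis does not prescribe how the distribution is generated.

```python
def zipf_activities(n_users, s=1.0):
    """Assign user activity intensities following a Zipf law: the i-th
    ranked user gets weight proportional to 1 / i**s (illustrative
    sketch; the exponent s = 1.0 is an assumption)."""
    weights = [1.0 / (i ** s) for i in range(1, n_users + 1)]
    total = sum(weights)
    return [w / total for w in weights]   # normalised intensities by rank

acts = zipf_activities(5)
print([round(a, 3) for a in acts])  # a few very active users dominate
```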
Here, clustering algorithms are applied to estimate the central points in order to locate servers or superpeers [43], because in real networks central points are complicated to identify by other techniques due to network size and dynamism. At present, there are several clustering approaches, such as neural networks, K-nearest neighbours, hierarchical clustering and K-means clustering [42], which are categorised depending on various aspects as follows [36]:

- Hard clustering, e.g. K-means, assigns each object to exactly one cluster.
- Hierarchical clustering, e.g. hierarchical agglomerative clustering [42], splits clusters into sub-clusters by using dendrogram creation.
- Density-based clustering forms clusters by finding density-connected regions in feature space.

- Neural network-based clustering [42] clusters objects based on the tuned weights of a neural network.

Now, each active user must exclusively be assigned to one server, i.e. to the centroid node of exactly one cluster. The number of clusters is fixed to the number of available or deployable servers, i.e. equal to a given K. Therefore, the K-means clustering algorithm is used to solve the assignment task. This algorithm builds K clusters and finds positions for the clusters' centroid nodes. Each node is assigned to the nearest centroid node, as measured by the Euclidean distance, such that the value of the squared error function given by

    J = Σ_{j=1}^{K} Σ_{i=1}^{n} ||x_i^(j) − c_j||²    (5.4)

is minimised, where K is the number of clusters, n the number of active users and ||x_i^(j) − c_j||² the used distance measure between the active user x_i^(j) and the centroid c_j of cluster j.

In our model, each pair of nodes is associated with a Euclidean distance representing a communication time or a response time between them. The Euclidean distance between a specific pair of nodes is measured directly via the virtual coordinates of each node, denoted by (x_i, y_i). On the other hand, a node representing the centroid of a cluster has the coordinates (x_ck, y_ck), where k = {1, 2, …, K} denotes the number of the cluster and v_i^k its members. Finally, the response time R between a member of cluster k and its centroid c_k is given by

    R = sqrt((x_i^k − x_ck)² + (y_i^k − y_ck)²)    (5.5)

It is clear that the cluster centroid represents the location of a server or service node. Weights w_i may be used to consider the activities of the nodes: high weights are assigned to nodes with high activities and, thus, these nodes are placed closer to the centroids by the clustering algorithm (resulting in shorter communication times).
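The weighted K-means step, nearest-centroid assignment per Eq. (5.5) and weight-averaged centroid update, can be sketched as follows. This is an illustrative reconstruction, not the thesis implementation; function names, the iteration cap and the convergence tolerance are assumptions.

```python
import math
import random

def weighted_kmeans(points, weights, K, iters=50, rng=random.Random(0)):
    """Sketch of weighted K-means: members join the nearest centroid by
    Euclidean distance, and centroids move to the weight-averaged
    position of their members, so highly active nodes pull them closer."""
    centroids = rng.sample(points, K)
    for _ in range(iters):
        clusters = [[] for _ in range(K)]
        for (x, y), w in zip(points, weights):
            j = min(range(K), key=lambda k: math.dist((x, y), centroids[k]))
            clusters[j].append(((x, y), w))
        new = []
        for j, members in enumerate(clusters):
            if not members:
                new.append(centroids[j])   # keep an empty cluster in place
                continue
            ws = sum(w for _, w in members)
            new.append((sum(w * x for (x, _), w in members) / ws,
                        sum(w * y for (_, y), w in members) / ws))
        if all(math.dist(a, b) < 1e-9 for a, b in zip(centroids, new)):
            break                          # centroids no longer move
        centroids = new
    return centroids

pts = [(0, 0), (1, 0), (9, 9), (10, 10)]
# Node (10, 10) has weight 3, so the second centroid is pulled towards it
print(weighted_kmeans(pts, weights=[1, 1, 1, 3], K=2))
```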
Consequently,

    (x_ck, y_ck)_new = (Σ_{i=1}^{s} w_i · (x_i, y_i)) / (Σ_{i=1}^{s} w_i)    (5.6)

where (x_ck, y_ck)_new is the position of the centroid computed considering the weights and s is the total number of active nodes in cluster k.

Steps of the Algorithm

To identify the locations of servers or service nodes using the K-means algorithm, the following steps of the Planar Node Clustering Algorithm must be executed:

1. Determine the number K of clusters.

Figure 5.11: Considering locations of servers with and without weight association

2. Start network exploration by agents to look for active nodes, using random walkers or minority ants and the thermofield method.
3. Find suitable initial positions for the K servers, superpeers or data copies (i.e. the centroids for clustering).
4. Calculate the distances from the cluster centroids to each active node.
5. Determine new centroids by weighted means.
6. Compute the distances of all active nodes to the new centroids.
7. Assign all active nodes to the clusters based on the minimum distances.
8. Represent the locations of the supernodes as the centroids of each cluster.
9. Calculate average response times by distance measurement between each centroid and its members.
10. If convergence is reached (i.e. the distances between the old and new positions of all centroids do not differ by more than a given small value ε), then terminate; otherwise GoTo 5.

Finally, the positions of the servers and of the nodes assigned to them will be computed. The servers will be located at the positions of each cluster's centroid such that the calculated response times for each member are minimised. To summarise, each active node becomes a member of a cluster whose centroid represents its nearest server, assuring the highest possible QoS, fast response as well as a high

accessibility. Figure 5.11 shows the respective process as an example for a network of 400 nodes.

Generalisation of the Clustering Method

The application of the method described above is quite simple but also very limited. Planarity is a really strong limitation for most graph structures, and for some applications the generation of grid overlay structures may cause too high a time consumption or overhead. This is especially true if networks are highly dynamic, i.e. user nodes frequently join and leave without further notice. Then the following algorithm may be more efficient:

1. Start with an initial supernode or server carrying all contents or, respectively, with a K-member group of them, where K is the number of supernodes, servers or copies of documents which can be used.
2. Determine any of the peers from step 1 to be the controller.
3. The controller has to initiate and control a permanently working population of random walkers evaluating the user activities following one of the propagation methods mentioned previously.
4. Let the controller find K initial serving nodes by the use of random walkers or minority ants.
5. Place the data on them.
6. Assign clients to the serving nodes (see the next algorithm below for cluster assignment) and, if need be, decide on a service node migration (following the last algorithm in this section).
7. Wait for a suitable fixed time in which no serious changes in the system are expected.
8. GoTo 6.

The method above shows that the tasks are generally more complex. Mainly, it is hard to find an optimal server position in one step, because the graph cannot be embedded in a two- or three-dimensional space, scalar distances are difficult to define, geometric distances and directions do not fit those in a computer network (e.g. response times) and, consequently, (geographic) centre points are difficult to determine.

Therefore, it is intended to perform the task in a successive manner. K service nodes are heuristically placed and serve the clients they can find. After some time they may be moved to better positions until more or less optimal ones are reached. Eventually, clients are re-assigned to new service nodes. For this, the following settings are made:

1. each initially set server or supernode s has a counter size(s) containing the number of allocated peers,
2. each server determines a constant size_max(s) of peers it may serve and dynamically adapts this number in case a disproportionate load is measured,
3. each client peer v knows the IP address SIP(v) of the server it is assigned to,
4. each client peer v and each server s have a distance value d indicating the (known) distance to the server; d(v) is initialised with ∞ and d(s) = 0, respectively.

Now the peer allocation, i.e. the assignment of a peer to its corresponding server or service node, can be carried out using the following Peer Allocation Algorithm:

1. Let a server or supernode s generate a random walker (or a population thereof with fixed size)⁴. This can be done by the very first or any newly installed server or supernode as well as after the random walkers have been cancelled, if size(s) < size_max(s). Set v_i = s.
2. The random walker is sent from its current position v_i to one of the neighbouring nodes v_j, where v_j is selected depending on priorities as follows:
   a) nodes of the own cluster,
   b) non-allocated neighbours,
   c) neighbours allocated to a server r with size(r) > size(s),
   d) neighbours allocated to clusters having a server r with size(r) > size_max(r),
   e) nodes allocated to a server r with d(v_j(r)) > d(v_i(s))⁵.
3. If the node belongs to the own cluster, GoTo 7.
4. If the current node v_j is not allocated, start an allocation procedure with an update of size(s) = size(s) + 1, set the client peer's SIP(v_j) = s and broadcast the new size(s) value to all nodes allocated to s.
5. In all remaining cases,

⁴ For simplicity, the algorithm is described with one random walker per service node only.
⁵ Alternatively, it might be required that the response time of r is lower than that of s in order to prevent the building of an asymmetric cluster around the server with respect to the realisable response times.

   a) contact the node given in SIP(v_j) and request the secession of v_j,
   b) include the node in the own cluster by setting SIP(v_j) = s,
   c) set d(v_j) = d(v_i) + 1,
   d) force s and r to update size(s) = size(s) + 1 and size(r) = size(r) − 1, respectively, and
   e) force s and r to send status update broadcasts to all nodes of their clusters.
6. If size(s) > size_max(s), then cancel the random walker and GoTo 1.
7. Set v_i = v_j and GoTo 2.

Using this algorithm after the recognition of activities in a network, almost equally sized clusters shall occur. Of course, enough servers (i.e. a high enough K in the previous algorithm) are required, otherwise some peers may remain unallocated. This problem may be solved, however, either by increasing size_max(s) and/or by removing step 6 in the algorithm above.

Nevertheless, the load in the network is still not well balanced. Therefore, another algorithm is needed. To correct the positions of the service nodes and to optimise their assigned clusters, the following computations must be carried out.

1. All nodes v_i in the cluster periodically check the distance values of all neighbours v_j to themselves. If d(v_j) − d(v_i) > 1, then a message is sent to v_j to update d(v_j) = d(v_i) + 1. As is easy to see, shortest-path trees T_i to all nodes from the service node are obtained (for each edge leaving the service node, one tree is constructed).
2. The service node has to trigger the computation of a weighted activity sum W_i for each of the subtrees T_i by

    W_i = Σ_{v_k ∈ T_i} d(v_k) · A_N(v_k).

This can be calculated in a fully decentralised manner if the service node sends out a special message to all its neighbours. Each peer v_i answers the message with the value d(v_i) · A_N(v_i) if it is a leaf in the respective shortest-path tree, or forwards the message to all its successors (sons) in its shortest-path tree and answers with the sum of its own m = d(v_i) · A_N(v_i) and all obtained answers from the successors (sons).

3. If there is a significantly higher W_i value for one of the shortest-path trees T_i of the service node, the service node is moved by one step to the respective next neighbour in T_i (greedy approach).
4. After the service node movement, the distances in all trees T_i must be corrected. Usually, all distances d(v_i) must be increased by 1. There is an exception for the nodes in the particular tree to which the service node has been moved: here the distance values need to be decreased by 1. Other changes, caused by the new relations, will automatically be obtained due to activity (1) above.

Now, this complex suite of methods shall be investigated by several simulations.

5.4 Experimental Results

In several simulations, the advantages of the newly derived methods shall be established. The following questions will be answered:

- How many servers shall be deployed in a given network? While a single server may not be able to handle all requests, any new server will cause additional costs to maintain the system as well as to keep possibly replicated data consistent.
- How well and with which additional overhead (costs) can the servers be placed in a network?
- How does the system react to dynamic changes of traffic, configuration and/or user requirements? The answer to these questions mainly decides whether the suggested approach of self-organisation can really manage a system and whether it may replace a human administrator.
- How is the average response time changed? This is the most important question, since the users will evaluate a system mainly by its performance. Users are not willing to wait, and most on-line streaming applications require fast and timely delivery of data packets.

The figures, diagrams and discussions below present the experimental set-up used and the results obtained.
Although most of the approaches may work in generalised small-world environments (for which the Watts-Strogatz small-world model may be used), most of the experiments were conducted in a rectangular grid environment with 10,000 nodes for the purpose of easy and clear visualisation. Figure 5.12 shows the distribution of user activities. Areas with high activities as well as regions with low network load can be identified. The introduced activity propagation methods (gossiping and sliding windows) result in a clear marking of traffic areas. Now, those areas can easily be identified by random walkers

77 5.4 Experimental Results 69 Figure 5.12: Distribution of activity areas (100 active users) in a network with 10, 000 nodes Table 5.1: Comparison of convergence time and average response time of active users to a maximum-rank node based on Figure 5.13 Propagation approach Random walker (time) Minority ant (time) Average response time Sliding window 4, 465 1, Neighbourhood gossiping 4, 378 1, or minority ants (see the discussion at the end of Section 3.1.3), and the K service nodes may be placed at the points with the K highest activity values (see Figure 5.13). It must be mentioned that this service node distribution strictly follows the order of the maxima. If the fixed number of service nodes or servers is lower than the number of identifiable maxima, some areas remain without their own service node or server and will be assigned to the closest available service node or server. In Table 5.1 the convergence times between a population of random walkers and minority ants to search for a suitable location of a server(i.e. a maximum-ranked node) are compared. Hereby, the population size of random walkers or ants is 100 and the highest ranked server shall be found. To conclude, a location is assigned to a service node within a group of active users so that the server can provide a high QoS to most active users in a network. Figure 5.14 presents the client assignment after the execution of the described neighbourhood allocation algorithm. It can clearly be seen that clients are assigned to the closest server, while the development of the borders between the different service areas is a random process.
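Such a closest-server assignment can be sketched as a multi-source breadth-first expansion from the servers. This is an illustrative sketch, not the thesis' neighbourhood allocation algorithm itself; the adjacency structure is assumed, and the shuffling of the expansion frontier mimics the random development of the borders between service areas.

```python
import random

def allocate_clients(adj, servers, seed=None):
    """Multi-source BFS: every client ends up with its closest server.

    Where two service areas meet at equal distance, the winner depends on
    the (shuffled) expansion order, so the borders form randomly."""
    rng = random.Random(seed)
    owner = {s: s for s in servers}    # servers own themselves
    frontier = list(servers)
    while frontier:
        rng.shuffle(frontier)          # random border development at ties
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in owner:
                    owner[v] = owner[u]
                    nxt.append(v)
        frontier = nxt
    return owner
```

On a path 0 - 1 - 2 - 3 - 4 with servers at nodes 0 and 4, node 1 always joins server 0 and node 3 always joins server 4, while the middle node 2 falls to either side depending on the expansion order.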

Figure 5.13: Suitable server locations obtained corresponding to the activity distribution of the nodes in Figure 5.12 (d=32)

Another question is how long the allocation process takes until all clients are assigned to a service node/server. The duration of this process depends, of course, on the number of service nodes: a higher number increases the parallelism of the process. Figure 5.15 shows the simulation results obtained. It can easily be seen that there is a fast increase in the number of allocated clients, mainly limited by the use of random walkers; from a certain number of service nodes on (2 or 3 in our example of 10,000 nodes in the grid), additional servers do not result in a significant speed-up anymore, due to coordination processes and a higher number of inter-cluster exchanges of nodes.

Figure 5.16 shows a grid of nodes with the corresponding user activities. Only one service node is placed in the grid. The upper right part of the figure shows the initial computation of the shortest distances around the server, and the activity- and distance-dependent weights W_i for the respective service node. These weights show the (expected) imbalance due to the initial random position of the service node. The lower left diagram presents the movement of the service node to a final position where the subtree weights W_i are balanced. Also, the shift of the shortest distances can be seen, resulting in a location of the server close to the zone of highest activities.

Last but not least, the main question is whether QoS improvements (i.e. better average response times) can be obtained by the described methods. Figure 5.17 presents the influence of the number of servers on the average response time in a network with 2,500 nodes. It can easily be seen that already a low number of service nodes is enough to ensure the QoS, and any higher number of service nodes does not improve the response times significantly.
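As an illustration of how such a response-time metric might be computed, the following sketch uses the activity-weighted mean hop distance of the active users to their closest server as a simple proxy. This is an assumption for illustration only; the exact metric used in the simulations is not reproduced here, and the graph and activity values in the example are made up.

```python
from collections import deque

def average_response_time(adj, activity, servers):
    """Activity-weighted mean hop distance of active users to their
    closest server, used as a simple proxy for average response time."""
    dist = {s: 0 for s in servers}       # multi-source BFS from all servers
    queue = deque(servers)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(activity.values())
    return sum(a * dist[u] for u, a in activity.items()) / total
```

On a path of five nodes with active users at both ends, moving from one central server to two servers at the ends drives the metric to zero, mirroring the saturation effect seen when more servers are added.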

Figure 5.14: Client allocation around the servers using the neighbourhood allocation for 2 and 5 assigned service node(s)/server(s)

Figure 5.15: Obtained number of clients assigned to their service nodes

Figure 5.16: Optimising a service node's position in a cluster

Figure 5.17: Influence of the server number K on the response times

In the following chapter, the described algorithms shall be used in VoD systems to show their practicability and advantages in an application area with a growing number of users and permanently increasing network utilisation.
