Transparent Componentisation: A hybrid approach to support the development of contemporary distributed systems

Shen Lin
School of Computing and Communications
Lancaster University

A thesis submitted for the degree of Doctor of Philosophy
September 2010

Acknowledgements

I wish to thank the many people who have helped and encouraged me to reach the completion of my PhD. First of all, my supervisors Dr. François Taïani and Professor Gordon Blair, who have provided advice, encouragement, insightful criticism, and proofreading throughout my PhD and in the preparation of this thesis. Secondly, the members of the Middleware Research Group, who offered me invaluable support and a friendly work environment during my time at Lancaster. I am grateful to Geoff Coulson, Paul Grace, Bholanathsingh Surajbali, Carlos Flores Cortes, Barry Porter, Rajiv Ramdhany, Vatsala Nundloll, and Nelly Bencomo for their helpful advice and guidance. Finally, I would like to thank my parents Jianyong and Hua, my uncle Qiang, and my fiancée Xiaolu, for the love and support that I received from them.

Abstract

Distributed computing systems are increasingly pervading all aspects of daily life. This rapid growth is characterised by the growing complexity of these systems, which unfolds in three dimensions. First, contemporary distributed systems must often cater for computation nodes with heterogeneous computing and networking capacities; second, they must deal with dynamic changes such as network churn and mobile nodes; and finally, they are often large scale and must be able to grow elastically to meet evolving expectations.

This thesis investigates how the above complexity dimensions can be made easier to control by using novel software development approaches and frameworks. In particular, the proposed work seeks to develop approaches that promote three key properties in contemporary distributed systems: 1) configurability to construct customised systems that target heterogeneous operating environments; 2) dynamic adaptability to adapt to dynamic changes; and 3) understandability and simplicity to facilitate software reuse and to hide low-level programming details.

To address these issues, this thesis proposes a hybrid software development approach that combines the advantages of component frameworks with those of high-level protocol specification languages. This hybrid approach, termed Transparent Componentisation, automatically maps a high-level protocol specification onto an underlying component framework. It thus allows developers to focus on the programmatic description of a distributed system's behaviour in simple and high-level terms. Meanwhile, it transparently retains the benefits of a component architecture such as component reuse, configurability, and runtime adaptability.

As a proof of concept, this thesis presents the Whispers/GossipKit framework for gossip-based distributed systems, a representative subclass of contemporary distributed systems. Whispers/GossipKit is evaluated to demonstrate that it successfully retains the simplicity and understandability of a high-level protocol specification language while encouraging component reuse and supporting transparent (re)configuration thanks to its component underpinnings.

Contents

1 Introduction: Gossip-based Systems; Software Development Approaches; Hypothesis and Approach; Thesis Outline; Associated Publications

2 Gossip Protocols: Introduction to Gossip Protocols; Gossip for Disseminating Information; Peer Sampling Service; Convergence of Properties; Gossip on Mobile Adhoc Networks; Gossip Patterns; Underlying Elements; Key Patterns; Other Contextual Issues; Coexisting Gossip Protocols; Types of Coexistence; Example of Complex Gossip Composite; Discussion; Summary

3 Software Development Approaches: Component-based Technology; Overview of Component Approaches; Lightweight Component Framework; Component-based Middleware Platform; Event-driven Systems; Domain Specific Languages; Domain Specific Languages for Components; High-level Protocol Specification Languages; Discussion; Summary

4 The Whispers/GossipKit Programming Framework: General Principle; Overview of The Whispers/GossipKit Programming Framework; The GossipKit Component Framework; Overview of the GossipKit Framework; Common Architecture of Gossip Protocols; Architecture Generalisation; GossipKit's Event Engine; Event Sources; Event Data; The Event Engine; Configuration; Runtime Reconfiguration; The Whispers Protocol Specification Language; Macro-level Programming; Language Primitives; Key Features; Mapping Between Whispers Expressions and GossipKit Componentisation; Mapping Decisions; Case Study; 4.6 Implementation Concerns; Summary

5 Evaluation: Experimental Approach; Primary Criteria; Simplicity; Reusability; Configurability; Reconfigurability; Discussion of the Primary Evaluation; Performance and Overhead; Component Invocation Overhead; Reconfiguration Process Overhead; Memory Usage; Discussion of the Performance Evaluation; Summary

6 Conclusion: Summary; Major Results; Future Work; Concluding Remarks

A The Structure of GossipKit's XML Configuration
B GossipKit's XML Configuration File for the RPS Protocol
C Whispers Grammar
D Whispers Programs
E Java Implementation of the Random Peer Sampling Protocol

References

List of Figures

2.1 Compositional gossip example
OpenCom Component [Grace (2004)]
Control-Forward-State Architectural Pattern [Grace et al. (2004)]
Event-driven Architecture [Bhatti et al. (1998)]
Development process
Overview of the GossipKit framework
GossipKit Common Architectural Pattern
Compose two different components to realise a more complex Peer Selection service
Use nested events to realise push-pull gossip in GossipKit
GossipKit's Event Engine
Use case study: configuring the RPS protocol
Per-node program of RPS
Component realisation of RPS
A node running the Averaging application initially holds value
The node successfully estimated the system size is
SCAMP uses the reactive push pattern
Configuration of RPS that uses periodic push-pull
Customised configurations based on periodic push-pull to realise T-Man, Averaging, and Ordered Slicing
5.4 Anti Entropy uses the lazy push gossip pattern for data consistency
Probabilistic broadcasting protocols used in mobile adhoc networks
Initial random graph maintained by RPS
th rounds since 1st reconfiguration
Ring constructed at the 11th round
Topology at the 20th round
Grid constructed at the 23rd round
Number of gossip rounds used to achieve ring topology. Gossip period is every 5 seconds
Time length (measured in seconds) used to achieve ring topology
Number of gossip rounds used to achieve grid topology. Gossip period is every 5 seconds
Time length (measured in seconds) used to achieve grid topology
Local process time of 8 protocols, measured in microseconds (µs)

List of Tables

2.1 Reactive Push Pattern
Periodic Push Pattern
Periodic Pull Pattern
Periodic Push-pull Pattern
Lazy Push Pattern
Decision Based Broadcast Pattern
Sleep Based Broadcast Pattern
A categorisation of gossip protocols
Three main types of interactions
The Control-Forward-State Pattern captures distributed algorithms in three different application domains
Fields in a GossipKit event
The structure of the Data Content field in GossipKit events
Summary of Whispers language primitives
The implemented gossip overlays cover all the gossip patterns
The numbers of lines of code to implement the eight gossip protocols by using transparent componentisation, Java, and component configuration
The complexity to implement the eight gossip protocols by using transparent componentisation, Java, and component configuration
Components reused in the development of 8 gossip systems
The number of lines of code in the reusable components and the specific components of each gossip system
5.6 The time used for reconfiguration
Byte code size of GossipKit components
Byte code size of the eight gossip protocols. The byte code size of the four composite protocols includes the size of RPS
GossipKit component size measured as the byte code size of the compiled Java class files
Dynamic memory usage of the eight gossip protocols. The measurements of the two gossip protocols that run on wireless adhoc networks, Gossip1 and Gossip2, do not include the Jist/SWANS simulator

Chapter 1 Introduction

Distributed computing is increasingly pervading all aspects of daily life, significantly affecting human communication, business processes, and the way scientific experiments are carried out. For instance, Skype allows millions of users to make video calls and share files with other Skype users anywhere in the world; Amazon and Google infrastructures provide remote computing services and capacities (e.g. virtual servers, data processing, storage) to individuals and businesses; and scientific testbeds such as PlanetLab [Peterson et Roscoe (2006)] and Weevil [Wang et al. (2005)] support research experiments that require large or distributed computation resources.

As a result of this continuous increase in distributed applications, the complexity of contemporary distributed systems is rapidly growing in three dimensions [Feiler et al. (2006)]. These systems are often heterogeneous, involving computation nodes with varying capacities and operating over an increasing range of networking technologies (e.g. fixed networks, mobile ad hoc networks, satellite links, etc.) that differ in their bandwidth and latency. This complexity is further compounded by highly dynamic network environments: nodes may be mobile; new nodes may join the network at any time; and existing nodes may leave, either voluntarily or through failure. Finally, contemporary distributed systems often grow dynamically to a large scale, involving very large numbers of participating nodes that are deployed over large areas (e.g. Internet-based applications such as Skype and BitTorrent, large sensor systems for environmental monitoring [Sheldon et al. (2005); Popa et al. (2005); Howard et Flikkema (2008)]).

Gossip-based systems have been proposed as one of a number of key techniques for handling this complexity of contemporary distributed systems. These gossip-based systems are adaptive to node and network heterogeneity, possess self-organising properties to cope with dynamic environments, and offer scalable communication in potentially large-scale networks. However, in spite of these advantages, this thesis argues that the lack of development support is one of the main reasons that prevent gossip-based systems from being widely adopted in practice. As section 1.1 will argue, existing gossip-based systems lack development support in four main aspects: 1) the development process of these systems can involve complex issues such as design choices, networking idiosyncrasies, and protocol composition; 2) most gossip-based protocols/systems have been developed in a one-off and per-protocol fashion, making them hard to reuse; 3) they

cannot be flexibly configured to provide customised systems for specific requirements or operating environments; and finally, 4) they provide limited support for runtime reconfiguration of system behaviours, which is an essential property for long-running or adaptive systems.

To facilitate the development of gossip-based systems, this thesis investigates new design patterns and architectural principles that can be integrated into a common platform to simplify the programming of gossip systems, promote code reuse, and ease system configuration and reconfiguration. To this end, this thesis describes the design, implementation, and evaluation of a novel programming framework that combines the strengths of a high-level protocol specification language and a component framework, based on a survey of a variety of gossip-based protocols and software mechanisms that are potentially beneficial for developing gossip-based systems. In addition to its main objective, this thesis hopes that the lessons learnt from building a programming framework for gossip-based systems will shed light on development support for a wider range of contemporary distributed systems in general.

1.1 Gossip-based Systems

Gossip protocols (also known as epidemic protocols) [Kermarrec et van Steen (2007); Friedman et al. (2007)] have emerged as a promising approach to address some of the problems of contemporary distributed systems. In a typical gossip

protocol, each node of a network randomly communicates with a small number of its peers, causing information to spread rapidly over the network in the way a rumour is gossiped amongst a group of people or a disease epidemic spreads over a population.

Compared with more traditional systems, gossip-based approaches offer several advantages: i) many gossip algorithms [Frey et al. (2009); Haas et al. (2002); Nedos et al. (2007)] are aware of node and network heterogeneities, and are able to adapt the behaviours of individual nodes to cope with these heterogeneities; ii) gossip protocols are self-organising in dynamic networks since their randomised communication allows multiple routes to be explored in case of failures and they do not require centralised coordination; and finally, iii) because each node performs a limited set of operations at a fixed rate, they provide scalable communication in large distributed systems. Because of these benefits, gossip protocols have been considered a promising approach for a wide range of services such as ad hoc routing [Haas et al. (2002)], multimedia streaming [Liu et Zhou (2006)], replicated database consistency [Demers et al. (1987); Holliday et al. (2003)], information dissemination [Birman et al. (1999); Chandra et al. (2001); Luo et al. (2003)], data aggregation [Kempe et al. (2003); Jelasity et al. (2005); Gupta et al. (2001)], topology construction [Jelasity et Babaoglu (2005)], and peer sampling [Voulgaris et al. (2005); Jelasity et al. (2007)].

This thesis argues that developing gossip-based applications can be a complex task because of insufficient software development support and, as a result, most

implementations so far have focused on prototypes and preliminary deployments. The following discusses four main aspects where software development support is required to facilitate the development of gossip-based systems.

Simplicity The existing development process should be simplified, providing a better understanding of the design space of gossip protocols and hiding low-level programming details from developers. The need for software development support arises from the wide range of gossip protocols, which target a diverse set of services and operate on networking environments ranging from fixed networks to mobile ad hoc networks (see examples above). These protocols differ in many aspects (communication patterns, state, network requirements and usage), creating a large design space that developers need to explore. Furthermore, individual gossip protocols must offer distinct APIs to the variety of applications that use them, and must grapple with the idiosyncrasies of the underlying network they rely on (APIs, churn, resilience, costs, performance, meta-information). Advanced gossip protocols are also often composite, and rely on simpler gossip protocols for part of their functionality.

Reusability Most existing gossip protocols/systems have been developed in an ad hoc and per-protocol fashion that does not help to capture their similarities for software reuse. As the number of gossip protocols and the community that researches them have grown considerably, the lack of a unifying framework means much development effort is wasted in the implementation of gossip-based systems

that involve duplicated communication patterns and local processing algorithms. This repeated effort also hampers the use of standardised implementations, and increases the risk of introducing programming errors into systems.

Configurability Gossip-based systems provide many different services in a diverse range of environments. Thus they often need to be customised for their target applications and operating environments. For instance, gossip protocols that run on mobile devices often require a minimal configuration to fit memory constraints, while those running on fixed networks can include rich non-functional elements (e.g. quality of service, fault tolerance, real-time requirements). This calls for flexible configuration support that allows developers of gossip-based systems to easily compose different customising features into some base gossip algorithms. However, the research community working on gossip-based mechanisms has focused on prototyping novel gossip-based algorithms and exploring a wider range of possible application domains. These prototype systems are often implemented in a single programming language and then compiled and linked into a static application, resulting in monolithic code that is not configurable. The lack of configurability means developers have to code different customising features in low-level programming languages for individual gossip systems, which wastes implementation effort.

Reconfigurability Most existing gossip-based systems are statically encapsulated in a self-contained environment, which does not allow system behaviours

to be flexibly reconfigured at runtime apart from a fixed number of parametric changes. This means that these systems are unable to modify their behaviours [Frey et al. (2009); Haas et al. (2002); Nedos et al. (2007)], for instance by actions such as adding new protocols, discarding existing protocols, or changing part of a protocol's elements. Thus, existing gossip-based systems provide limited capacities to adapt to changing conditions such as dynamic networking environments and new application requirements. In practice, this poor support for reconfigurability prevents gossip-based systems from being adopted for certain types of services such as long-running systems with evolving requirements and infrastructures on which new programs are difficult or expensive to install (e.g. a sensor network with nodes scattered in a wild area).

1.2 Software Development Approaches

Several existing software mechanisms are potentially exploitable to support the development of gossip-based systems and distributed systems in general (e.g. component-based technology [Heineman et Councill (2001)], model-driven architecture [Voelter et Schmidt (2006)], aspect-oriented development [Kiczales et al. (1997)], domain-specific languages [Mernik et al. (2005)]). These mechanisms are often integrated into a middleware framework [Emmerich et al. (2007)], providing general guidelines and tools for building specific middleware and distributed applications. Amongst these mechanisms, component-based technology and domain-specific languages are particularly relevant in the context of this thesis.

A component-based technology specifies a system as a set of components and their interactions. It allows these components to be developed independently and then assembled together to form a concrete software system according to some domain-specific rules. A component architecture is often described in a general-purpose or a domain-specific architectural description language, which provides declarative expressions to formally represent component-based systems in terms of components, connectors, and composition rules.

Component-based technologies provide several benefits for developing distributed systems, including those based on gossip: i) components are reusable in the composition of different systems; ii) component mechanisms ease system (re)configuration, because components can be flexibly composed to provide customised systems and advanced component frameworks often allow their architectures to be modified at runtime to adapt to environment/requirement changes [Coulson et al. (2004); Bruneton et al. (2006)]; and iii) component configurations provide high-level abstractions of software system architectures in terms of components and connections, hence facilitating exploratory design and architectural analysis. Because of these advantages, components have been successfully applied in industry with Enterprise JavaBeans (EJB), the CORBA Component Model (CCM), and Microsoft DCOM, and have found a strong following in the research community, giving rise to a number of

lightweight component frameworks (e.g. OpenCom [Coulson et al. (2004)], Fractal [Bruneton et al. (2006)]) and their associated middleware frameworks (e.g. GridKit [Grace et al. (2004)], RAPIDWare [McKinley et al. (2001)]).

In practice, coarse-grained components are less flexible and reusable than fine-grained components in the construction of component-based systems [Mehta et Heineman (2002)]. On the other hand, fine-grained component systems often involve a large number of components and connectors [Steppe et al. (2004); Edwards et al. (2004)], making the architectural description of component systems a tedious and error-prone task. Furthermore, because component architectural descriptions are by nature declarative and focus on structures rather than algorithms [Clements (1996)], the low-level implementation details of individual components still need to be described in a general-purpose programming language such as Java or C. This in turn hampers the understandability of component systems: developers who are unfamiliar with a particular framework cannot understand component configurations unless they dive into the specifics of each component.

In contrast to component frameworks, protocol specification languages are a specific class of domain-specific languages that focus explicitly on the algorithmic logic of distributed systems (e.g. Lotos [van Eijk et Diaz (1989)], Estelle [Amer et Çeçeli (1990)], PLAN-P [Thibault et al. (1998)], Promela++ [Basu et al. (1997)], Mace [Killian et al. (2007)]), rather than on their compositional structures. These

protocol specification languages provide simple and high-level expressions to capture the logical behaviours of distributed and parallel algorithms. Compared to components, protocol specification languages are less able to support dynamically adaptive systems. Because they do not explicitly expose fine-grained dependencies between different program parts, they make it more difficult to reason about and implement dynamic changes. As a result, they often require adaptive behaviours to be statically hard-wired prior to system deployment, or a complete new version to be installed to replace an existing one.

1.3 Hypothesis and Approach

Contemporary distributed systems such as those based on gossiping mechanisms involve an increasing complexity in terms of heterogeneity, dynamism, and large scale. Following the rapid emergence of these complex systems, it is increasingly important to provide new software mechanisms to support their development. Based on this observation, this thesis aims to offer new design patterns and architectural principles to facilitate the development of contemporary distributed systems. To narrow down this broad research problem, this work focuses on development support for gossip-based systems, a representative subclass of contemporary distributed systems. More specifically, it aims to address the four existing problems that are associated with developing gossip-based

systems (see section 1.1): simplifying the programming effort, promoting code reuse, supporting configurable system development, and supporting runtime reconfiguration.

Considering these four objectives and the potential software development approaches to achieve them, component frameworks and high-level protocol specification languages seem to be highly complementary for developing contemporary distributed systems such as gossip-based systems. Component-based technology seems a promising solution to improve reusability and (re)configurability, while high-level protocol specification languages provide abstractions of system behaviours that simplify the programming and the understanding of distributed algorithms (see section 1.2 above). However, these two approaches are difficult to combine: bringing components to high-level protocol specification languages tends to undermine their programmatic simplicity by forcing developers to navigate back and forth between structural and behavioural concerns. This tension worsens with finer structural decomposition, which jeopardises the approach for fine-grained adaptation. Symmetrically, adding a behavioural dimension to component assembly makes composition harder for developers to grasp and manipulate, cancelling the very benefits it should bring. As a result, existing systems often choose only one of these technologies for their development, preventing themselves from offering reusability, (re)configurability, and simplicity at the same time.

This thesis describes the design and implementation of a hybrid programming framework that combines the strengths of both techniques. More precisely, this work advocates a hybrid framework that consists of two layers: 1) a high-level protocol specification language that allows developers to focus solely on describing the logical behaviours of gossip algorithms, and supports automatic transformation of these algorithmic descriptions into software components and their composition rules, and 2) an underlying component framework that promotes component reuse and supports (re)configuration based on the components and composition rules generated by the higher-level language layer.

The main hypothesis of this thesis is that the application of this hybrid approach can facilitate the development of gossip-based systems in terms of simplicity, reusability, configurability, and reconfigurability. More specifically, and in order to test this hypothesis, this thesis aims to achieve the following goals.

Identify an overall architectural pattern that captures the key behaviours of gossip protocols as components and their interactions. This process is based on the study of a variety of existing gossip algorithms in the literature.

Capture the commonalities and variabilities of gossip protocols in a component framework. Implement these common and variable parts as software components that can be reused in the development of different gossip protocols/systems.

Seek a configuration mechanism that allows gossip-based systems to be assembled from individual software components according to the component interactions identified in the overall architectural pattern. Furthermore, this thesis looks for a reconfiguration mechanism that can dynamically modify the behaviours of gossiping nodes across a network. This reconfiguration mechanism should be scalable and tolerant of dynamic networking environments that involve transient node and network failures.

Investigate the relationship between component configuration and high-level protocol specification language. More precisely, the components and interactions in the common architectural pattern of gossip protocols need to be mapped onto the language constructs of a high-level protocol specification language. In addition, this language should offer simple abstractions to describe programming details such as message transmissions, network interfaces, protocol composites, data structures, and processes.

This thesis does not address a number of important problems that are also related to gossip-based technology. First, this thesis does not aim to provide any new gossip algorithm or to discover new application domains in which gossip systems can be used. Second, this thesis adopts an experimental approach to system research; the evaluation is therefore based on experiments and quantitative measurements rather than formal proofs. Finally, the security issues of gossip-based systems are not investigated in this thesis.
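The idea of assembling a gossip protocol from interchangeable parts, as targeted by the goals of this section, can be illustrated with a minimal Python sketch. It simulates the basic push behaviour described in section 1.1 (each node forwards its state to a few random peers per round) and passes peer selection and state merging in as separate functions. All names here (`select_random_peers`, `merge_rumour`, `push_round`, `disseminate`) are hypothetical and chosen for illustration only; they are not part of Whispers/GossipKit, and the simulation runs in a single process with synchronous rounds rather than over a real network.

```python
import random

def select_random_peers(node, nodes, fanout, rng):
    """Hypothetical peer selection: `fanout` distinct peers chosen uniformly."""
    others = [n for n in nodes if n != node]
    return rng.sample(others, min(fanout, len(others)))

def merge_rumour(local, remote):
    """Hypothetical state merge: a node is informed once any peer informs it."""
    return local or remote

def push_round(states, fanout, select, merge, rng):
    """One synchronous push round: every node sends its state to its peers."""
    nodes = list(states)
    new_states = dict(states)
    for node in nodes:
        for peer in select(node, nodes, fanout, rng):
            # The receiver combines its own state with the pushed state.
            new_states[peer] = merge(new_states[peer], states[node])
    return new_states

def disseminate(num_nodes, fanout, seed=0, max_rounds=1000):
    """Spread a rumour from node 0; return (rounds used, final states)."""
    rng = random.Random(seed)
    states = {i: (i == 0) for i in range(num_nodes)}
    rounds = 0
    while not all(states.values()) and rounds < max_rounds:
        states = push_round(states, fanout, select_random_peers,
                            merge_rumour, rng)
        rounds += 1
    return rounds, states
```

Calling `disseminate(50, 3)` returns the number of rounds used together with the final per-node state. Because `push_round` receives the selection and merge functions as arguments, a different peer-selection policy or merge rule can be substituted without touching the round skeleton, which is the kind of reuse and configurability the goals above aim for.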

1.4 Thesis Outline

The following two chapters investigate the state of the art in the research areas related to this thesis. The aim of chapter 2 is two-fold: first, it surveys a wide range of gossip protocols, looking at their fundamental communication patterns, their key behaviours, and how they interface with their upper-layer applications; and second, based on this detailed survey, it analyses the problem space related to software support for gossip-based systems and explains why this thesis focuses on reusability, (re)configurability, and simplicity. Chapter 3 discusses the software engineering approaches and architectures that are potentially applicable to gossip programming, examining closely their strengths as well as their weaknesses.

Chapter 4 first discusses the general principle that is applicable to the broad research question of this work: how best to provide software development support for a wide variety of contemporary distributed systems. It then presents the solution proposed by this work to achieve the specific research goals: a hybrid programming framework that combines the strengths of both high-level protocol specification languages and componentisation to support the development of gossip-based systems.

Chapter 5 focuses on the evaluation of the implemented framework. It describes the experimental setup and results, and analyses in detail how each of the

research problems identified in chapter 2 is addressed with these results. Finally, chapter 6 summarises the major results and contributions of this thesis, and suggests areas with potential for future work.

1.5 Associated Publications

Subsets of my work towards this thesis have been published in international conferences and workshops as follows:

Shen Lin, François Taïani, Marin Bertier, Gordon S. Blair, Anne-Marie Kermarrec, Transparent Componentisation: High-level (Re)configurable Programming for Evolving Distributed Systems, accepted by the Dependable and Adaptive Distributed Systems track of the 26th Symposium on Applied Computing, March 2011, Taichung, Taiwan.

Shen Lin, François Taïani, Gordon S. Blair, Facilitating Gossip Programming with the GossipKit Framework, In Proceedings of the 8th IFIP International Conference on Distributed Applications and Interoperable Systems, pp , June 2008, Oslo, Norway.

Shen Lin, François Taïani, Gordon S. Blair, GossipKit: A Framework of Gossip Protocol Family, In Proceedings of the 5th International Workshop on Middleware for Network Eccentric and Mobile Applications,

September 2007, Magdeburg, Germany.

Finally, although not central to this thesis, the following publication demonstrates an interesting application of the middleware framework reported in the thesis. The complete publication is included in Appendix A.

Shen Lin, François Taïani, Gordon S. Blair, Exploiting Synergies Between Coexisting Overlays, In Proceedings of the 9th IFIP International Conference on Distributed Applications and Interoperable Systems, pp. 1-15, 9-11 June 2009, Lisbon, Portugal.

Chapter 2 Gossip Protocols

Domain analysis is often the first phase in the software development process of a specific domain [Hjørland et Albrechtsen (1995)]. A domain analysis provides several benefits for the subsequent design and implementation of a software development framework: i) it helps to better understand and model the domain; ii) it encourages systematic software reuse by identifying the recurring elements in a software domain; and iii) it captures the key elements of a software system and identifies the interactions between these elements, hence facilitating the development of (re)configuration mechanisms.

In this thesis, domain analysis is applied to gossip protocols to support the systematic design and implementation of a programming framework for gossip-based systems. More specifically, this chapter explores the design space of gossip-based protocols, identifying the common and the variable parts of a typical system in the gossip family. It starts with an introduction to gossip

protocols with reference to typical examples (section 2.1). This introduction is followed by a detailed survey of gossip protocols, capturing a finite set of recurring elements, architectural patterns, and application contexts that are involved in the design of a gossip protocol (section 2.2). This survey covers over 30 gossip protocols from the literature, which form a representative sample of the gossip family. The survey also shows that, in practice, multiple gossip protocols can coexist on a single node, either running independently to provide distinct services or collaborating with each other. These coexisting gossip protocols are presented in section 2.3, together with a discussion of the challenges associated with developing gossip systems that comprise multiple coexisting gossip protocols. Finally, section 2.4 discusses the particular issues of gossip-based development that this thesis focuses on.

2.1 Introduction to Gossip Protocols

Gossip-based algorithms have been applied to develop distributed systems for more than 20 years. In particular, these gossip approaches have become extremely popular in the past decade and have given rise to a number of gossip protocols. In order to explore the design space of gossip protocols, the work described in this thesis has studied over 30 gossip protocols from the literature. The following discusses these gossip protocols from a historical perspective with reference to some

representative examples. First, it discusses the early gossip protocols that were applied to propagate information, and relates the scalability of gossip communication to the underlying epidemic theory (section 2.1.1). Because these early gossip protocols lacked a scalable membership service from which to select random peers in the network group, gossip-based peer sampling protocols were developed (section 2.1.2). These peer sampling protocols illustrated the convergence property of gossiping, which opened a broader scope of distributed services that gossip protocols can be applied to (section 2.1.3). More recently, gossip-based approaches have been used to operate on mobile adhoc networks (section 2.1.4).

2.1.1 Gossip for Disseminating Information

The underlying idea of gossip dates back to the work at Xerox Research Center, which applied gossip-based communication to maintain wide-area database systems in the Clearinghouse project [Demers et al. (1987)]. In this project, a database was replicated at several hundred or thousand nodes in a fully-connected, heterogeneous, slightly unreliable, and slowly changing network. This large and complex environment made it difficult to maintain consistent data amongst all the database replicas. To address this issue, this project required algorithms that can propagate database updates at some nodes to all the other

nodes in an efficient and robust way, and that scale gracefully as the number of nodes increases.

The algorithm proposed by Xerox Research Center was a two-phase protocol: an efficient unreliable broadcast (e.g. UDP broadcast) is used to propagate the updates to as many nodes in the network as possible; and, running in the background, an anti-entropy algorithm is used to repair the nodes that failed to receive the unreliable broadcast messages. In this anti-entropy algorithm, every node randomly chooses another node at each fixed time interval t, and then sends its database contents to this random node to resolve the differences between the two nodes.

Demers et al. (1987) showed that this anti-entropy algorithm is closely related to the mathematical theory that studies the propagation of epidemics [Pittel (1987)]. The nodes that hold new database contents are similar to the members of a population who are infected by an epidemic disease, and anti-entropy is one of the simple epidemic behaviours that allows the epidemic to eventually spread over the entire population. The epidemic theory [Pittel (1987)] also provides the worst case scenario of anti-entropy's infection speed in the case of a fully-connected topology: when starting with a single infected node, the average number of time intervals (N(t)) that are required to infect the entire population is:

\[ N(t) = \log_2(n) + \ln(n) + O(1) \]

for large node number n. This logarithmic infection speed makes the anti-entropy algorithm a scalable approach to disseminating information over large networks.

Apart from this scalable propagation speed, the anti-entropy protocol illustrates two key benefits of gossip protocols. First, it involves repeated communication with nodes that are randomly selected with a uniform probability, and hence allows multiple routes to be explored to avoid network failures. Second, because it incurs a fixed amount of local and network overhead on each node, this algorithm is scalable to large networks.

However, Xerox's anti-entropy protocol tolerates only network failures, not node failures. Furthermore, because the Xerox work assumes database contents are updated at a low frequency (at most a few per second), it did not consider real-time applications that require continuous delivery of multimedia updates to all nodes (e.g. Internet radio, TV or conference). In spite of these limitations, Xerox's gossip-based approach opened new possibilities for implementing scalable applications, and it motivated several subsequent research works that improved on Xerox's original anti-entropy algorithm for database consistency (e.g. [Holliday et al. (2003)]) and reliable multicast (e.g. pbcast [Birman et al. (1999)], lpcast [Eugster et al. (2003)]). Amongst these works, Holliday et al. (2003) used gossip to propagate transaction records to prevent concurrent database transactions from accessing uncommitted data; pbcast [Birman et al. (1999)] proposed additional controls to tolerate node failures and higher throughput of updates

than Xerox considered, making gossip a possible approach for multicast services that have real-time requirements; and lpcast [Eugster et al. (2003)] suggested scalable message buffering for gossip-based multicast.

2.1.2 Peer Sampling Service

The above-mentioned gossip algorithms assume that each node knows all other participating nodes in order to select a random subset of nodes to gossip with. This model requires a synchronised view of membership on all nodes and a memory capacity that increases as the system grows, which is inherently unscalable. This problem motivated the development of scalable membership services such as SCAMP [Ganesh et al. (2001)] and RPS (random peer sampling) [Jelasity et al. (2007)] that allow gossip-based systems to operate on dynamic networks that involve churn.

SCAMP allows each newly joined node to propagate a gossip message that contains its contact information to existing members. On receipt of the gossip message, every node either adds the contact information of the new node to its view or forwards the message to members in its view, based on a probabilistic decision. As a result, SCAMP maintains a static graph where each node has a view that contains the contact information of $\log_2(n)$ random nodes for group size n, and it reactively re-balances the graph every time a new node joins the group.

However, SCAMP is less robust in dynamic systems where nodes join and leave frequently [Jelasity et al. (2007)]. To address this issue, RPS uses periodic gossip to ensure that each node maintains a fresh view of random members. In RPS, each node maintains a random sample of the node population. This sample contains the contact information of C random peers in the system, where C is a small constant. At each gossip round, each node n selects one random peer i from its sample and sends a copy of its sample to i. On receipt of the sample from node n, i immediately replies with a copy of its sample and carries out an update process on its local sample. In this process, node i first merges n's sample with its own to form a sample of size 2C, and then discards C random peers from the sample to obtain an updated sample of size C. Finally, on receipt of i's sample, node n carries out the same update process on its own sample. By executing the RPS algorithm, each node obtains a fresh sample that provides the contact information of C different random nodes for selection at each gossip round. To join the network, a node adds the contact information of itself and of an existing random node in the system to start the periodic execution of the main RPS algorithm. This periodic behaviour makes RPS more robust to churn, as the experimental results in [Jelasity et al. (2007)] demonstrate, but it repeatedly generates a potentially large number of network messages.
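The RPS exchange described above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions, not the implementation from [Jelasity et al. (2007)]: the names `rps_update` and `gossip_round` are hypothetical, all nodes live in one in-memory dictionary instead of exchanging network messages, and the sketch additionally drops duplicate entries and a node's own identifier before trimming the merged sample back to C peers.

```python
import random


def rps_update(own, received, c, self_id, rng):
    """Merge two samples and keep at most c randomly chosen peers."""
    # Assumption: duplicates and the node's own id are removed before trimming.
    merged = [p for p in dict.fromkeys(own + received) if p != self_id]
    return rng.sample(merged, min(c, len(merged)))


def gossip_round(samples, c, rng):
    """Run one round: every node exchanges samples with one random peer."""
    for n in list(samples):
        if not samples[n]:
            continue  # an isolated node has nobody to gossip with
        i = rng.choice(samples[n])   # node n picks a random peer i
        from_n = list(samples[n])    # copy of n's sample sent to i
        from_i = list(samples[i])    # copy of i's sample sent back in reply
        samples[i] = rps_update(samples[i], from_n, c, i, rng)
        samples[n] = rps_update(samples[n], from_i, c, n, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    c = 4
    # Hypothetical starting topology: each node knows its next c ring successors.
    samples = {k: [(k + d) % 20 for d in range(1, c + 1)] for k in range(20)}
    for _ in range(10):
        gossip_round(samples, c, rng)
    print(samples[0])
```

Starting from a highly structured ring, repeated rounds mix the samples so that each node ends up holding C peers drawn from across the group, which is the behaviour the text attributes to the periodic exchange.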

2.1.3 Convergence of Properties

The RPS algorithm illustrates the convergence feature of gossip-based approaches: unlike gossip-based dissemination algorithms (e.g. Xerox's work, pbcast) that use gossip to spread the same information to all nodes, RPS requires individual nodes to periodically exchange their local properties (in this case local views of the global membership) with some random peers, and as a result eventually brings these local views to converge to a particular global property (in this case a balanced random graph where each node contains a uniformly random sample of the membership). Jelasity et al. (2007) also provided experimental results to demonstrate that this convergence feature scales to distributed applications that involve very large numbers of nodes, demonstrating an average speed of $O(\log_2(n))$ gossip rounds to reach global convergence for large node number n.

This scalable convergence feature, together with the scalable membership service provided by RPS, significantly broadens the scope of application domains in which gossip-based mechanisms can be employed. To name a few examples, T-Man [Jelasity et Babaoglu (2005)] creates and maintains various structured overlay topologies (e.g. ring, grid, cluster) from an initially random graph. The convergence feature of gossip-based communication has also been exploited to aggregate global knowledge such as the maximum, minimum or average of local data [Kempe et al. (2003); Jelasity et al. (2005); Gupta et al. (2001)]. Finally,

ordered slicing [Jelasity et Kermarrec (2006)] allows nodes in a distributed system to converge to partitioned groups that are ordered with respect to a particular measurable property (e.g. bandwidth, workload).

2.1.4 Gossip on Mobile Adhoc Networks

More recently, gossip-based approaches have also been adopted to provide scalable and efficient communication in mobile adhoc networks [Haas et al. (2002); Luo et al. (2003); Sasson et al. (2003); Hou et al. (2006, 2005)]. Mobile adhoc networks differ significantly from IP-based networks in several respects: 1) communication in mobile adhoc networks often relies on radio broadcast, which allows a broadcast message to be received by all the nodes within the radio broadcast range (i.e. physical neighbours); 2) nodes are less robust, and often have limited resources such as memory, processing capacity, and energy; 3) links are less reliable because of dynamic factors such as mobility, lack of collision detection, and faulty nodes; and 4) it is more expensive to establish point-to-point routes with distant nodes because of dynamic networking environments and unreliable links. Because of these differences, it is not practical to use gossip algorithms that rely on random peer selection in mobile adhoc networks [Friedman et al. (2007)].

Instead, most gossip algorithms in mobile adhoc networks focus on the efficient dissemination of broadcast messages to all or almost all nodes. Traditionally,

message dissemination in mobile adhoc networks is based on certain forms of flooding, whereby every node forwards the received broadcast message to its physical neighbours. Because a single radio broadcast can be heard by all the physical neighbours, flooding often involves many unnecessary broadcasts, and hence wastes energy and generates heavy network overhead. Instead of forwarding all the received messages, gossip-based algorithms in mobile adhoc networks forward broadcast messages only with a certain probability. For instance, one of the gossip protocols proposed in [Haas et al. (2002)] suggested that the gossip probability of each node should be proportional to the number of physical neighbours; Smart Gossip [Kyasanur et al. (2006)] detects nodes that are on the critical paths of the network, and then informs these nodes to broadcast with higher probability; and the gossip-based sleep protocol (GSP) [Hou et al. (2006)] instructs individual nodes to enter sleep mode with a certain probability so that they do not respond to incoming messages for a short period. Experiments with these probabilistic gossip algorithms have shown that they effectively reduce the number of broadcasts in a dissemination while achieving almost the same message delivery as flooding. This is because probabilistic gossip exhibits a bimodal behaviour as predicted by percolation theory [Meester et Roy (1996)]: as soon as the gossip probability p meets some minimum threshold, a broadcast message will be received by all or almost all nodes with a very high probability.
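The contrast between flooding and probabilistic forwarding can be illustrated with a small simulation. The sketch below is a simplification under stated assumptions and does not reproduce any of the cited protocols: the function names `grid` and `gossip_broadcast` are hypothetical, the network is a static grid in which each node hears only its four direct neighbours, the source always broadcasts, and every other node decides once, with a fixed probability p, whether to rebroadcast the first copy it hears.

```python
import random
from collections import deque


def grid(width, height):
    """Build a grid topology: each node maps to its physical neighbours."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (x, y): [(x + dx, y + dy) for dx, dy in steps
                 if 0 <= x + dx < width and 0 <= y + dy < height]
        for x in range(width) for y in range(height)
    }


def gossip_broadcast(neighbours, source, p, rng):
    """Return (nodes that received the message, number of radio broadcasts)."""
    received = {source}
    pending = deque([source])  # nodes that have decided to (re)broadcast
    broadcasts = 0
    while pending:
        node = pending.popleft()
        broadcasts += 1  # one radio broadcast reaches all physical neighbours
        for nb in neighbours[node]:
            if nb not in received:
                received.add(nb)
                # Assumption: a node forwards its first copy with probability p.
                if rng.random() < p:
                    pending.append(nb)
    return received, broadcasts


if __name__ == "__main__":
    topology = grid(20, 20)
    for p in (0.4, 0.6, 0.8, 1.0):
        got, sent = gossip_broadcast(topology, (0, 0), p, random.Random(7))
        print(f"p={p}: reached {len(got)} of {len(topology)} nodes "
              f"with {sent} broadcasts")
```

Setting p to 1.0 reduces the sketch to plain flooding, in which every reachable node broadcasts exactly once. Running it for several values of p on a larger grid gives a rough feel for the threshold behaviour mentioned in the text, although the exact numbers depend on the topology and the random seed.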

2.2 Gossip Patterns

This section explores the design space of gossip protocols. First, it analyses the recurring elements that are involved in gossip protocols, discussing the possible design choices for each element and their potential benefits and limitations (section 2.2.1). Based on this analysis, the section then presents five recurring architectural patterns that are formed by these elements. Finally, in addition to the analysis of gossip protocols' internal behaviours such as their key elements and architectural patterns, the section examines the contextual issues that might affect the gossip design.

2.2.1 Underlying Elements

Based on the study of over 30 gossip protocols/systems, this work observes several recurring elements that are used in all the gossip algorithms: 1) a gossip message is triggered either reactively or periodically; 2) gossip communication involves three basic styles of data flow (i.e. pull, push, push-pull); and 3) all the gossip algorithms use certain forms of randomised communication, as the following discusses.

Gossip Trigger

Gossip protocols can execute periodically (e.g. RPS) or react to external events (e.g. SCAMP). Periodic gossip effectively avoids possible traffic congestion. This is because gossip rounds are not synchronised and many periodic