Adaptive Hybrid Multicast with Partial Network Support

Huan Luo and Khaled Harfoush
Department of Computer Science, North Carolina State University, Raleigh, NC, USA

Abstract
In this paper, we propose a new multicast scheme, PAM, which, as opposed to native IP multicast, does not require all routers to be IP multicast-enabled and, as opposed to existing application-level multicast, does not exclude network support. Instead, PAM relies on partial network support, selects a small subset of routers as PAM-enabled multicast routers that are strategically located to serve group communication, and adapts its selection based on group dynamics. As a result, PAM (1) is suitable for both sparse and dense communication groups, (2) can reduce the network overhead inherent in native IP multicast, and (3) does not suffer the delay stretch and the high stress inherent in application-level multicast. Experimental results on both synthetic and realistic Internet topologies, for both sparse and dense groups, reveal that PAM can achieve efficient group communication with no delay stretch and an average stress of merely 1.25, while using less than 15% of the multicast routers that are needed in native IP multicast.

Index Terms
IP Multicast, Application Layer Multicast, stress, delay stretch.

I. INTRODUCTION

Group communication is required to support popular distributed applications such as IPTV and distributed gaming. Efficient group communication results in a pleasant user experience by avoiding unnecessary stretch in communication delays, and in a better use of the network by avoiding unnecessary stress on network resources. IP multicast was proposed to support efficient group communication. However, IP multicast is not widely deployed due to its overhead. Specifically, native IP multicast requires (1) all routers to be IP multicast enabled, and (2) each router to maintain a multicast routing table per communicating group.
The multicast routing table is updated as group members join and leave, which consumes network resources, especially when the number of co-existing groups is large.

To counter the limitations of IP multicast, several group communication approaches have been proposed. (1) Small Group Multicast (SGM) tries to remove from the network the burden of running multicast routing protocols and maintaining multicast state for small multicast groups. In SGM, the multicast state is carried in each data packet and is processed in real time along the forwarding path. Similarly, Reunite employs recursive unicast to provide the multicast functionality. (2) Application-level multicast (ALM) does not assume network support. Instead, it relies on group members self-organizing into an overlay, and communication between nodes passes through other overlay members using unicast packets. As a result, ALM communication incurs extra communication delay and induces stress on network resources. It has been shown that ALM with strategically located proxies may work better than pure ALM. Neither SGM nor ALM scales to large groups. (3) Hybrid multicast schemes connect IP multicast islands through unicast tunnels in order to limit the IP multicast overhead in the network. For example, one such scheme relies on connecting branching routers in a multicast delivery tree through unicast tunnels.

In this paper, we propose a new hybrid multicast scheme, PAM, which, as opposed to native IP multicast, does not require all routers to be IP multicast-enabled and, as opposed to ALM, does not exclude network support. Instead, PAM relies on partial network support, selects a small subset of routers as PAM-enabled multicast routers that are strategically located to serve group communication, and adapts its selection based on group dynamics. As a result, PAM, as opposed to existing hybrid multicast schemes, is suitable for both sparse and dense communication groups.
PAM also reduces the network overhead inherent in native IP multicast, and does not suffer the delay stretch and the high stress inherent in application-level multicast. Experimental results on both synthetic and realistic Internet topologies, for both sparse and dense groups, reveal that PAM can achieve efficient group communication with no delay stretch and an average stress of merely 1.25, while using less than 15% of the multicast routers that are needed in native IP multicast.

The rest of the paper is organized as follows. In Section II we give an overview of the PAM architecture. In Section III we provide the details of the PAM protocol. In Section IV we present and discuss the experimental results. We conclude in Section V.

II. ARCHITECTURE OVERVIEW

A node acting as the source of the PAM multicast transmission is expected to advertise the group ID, defined as <source IP address, port number>, through a well-known web page. The multicast group may also be assigned a class D multicast
address, and the mapping between the class D address and the group ID can be done through session directory services. A content delivery tree rooted at the source node, with group members at its leaves, is constructed as group members join the system; refer to Section III for the tree construction details. As shown in Figure 1, the delivery tree may have some routers acting as PAM multicast routers, which we refer to as m-routers, with the other routers being unicast routers, which we refer to as u-routers.

Fig. 1. A content delivery tree.

Each router, R, whether it is an m-router or a u-router, maintains IP address information about its closest m-router for this group along each downstream interface. If no m-router exists along a downstream interface, then R also maintains information about all group members that are reachable through this downstream interface over the delivery tree. We refer to the nodes for which a router, R, maintains IP address information as R's children, and refer to R as their parent. For example, in Figure 1, routers M0, M1 and M2 are m-routers. Router M0's children are group members G0 and G3 and m-routers M1 and M2. Router M1's children are group members G1 and G2. Router M2's children are group members G4, G5, G6 and G7. Routers U0, U1, U2, U3, U4 and U5 are u-routers. G0 is a child of U0 and of U1. M1 is a child of U2 and of U3. G4 and G5 are the children of U5. G4, G5, and G6 are the children of U4. Furthermore, each router maintains information about its parent in the tree, which is typically an m-router, and the number of hops towards the parent.

Each m-router creates an IP tunnel with each of its children. Upon receiving content from the source, whether directly or through intermediate routers, an m-router creates a copy for each of its children and sends it through the corresponding tunnel (IP-in-IP encapsulation). The arrows in Figure 1 show the tunnels used by each m-router.
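To make the per-router state and the tunnel replication step concrete, the following is a minimal Python sketch. It is an illustration, not the authors' implementation: the names `PamRouterState` and `replicate`, the dictionary used to stand in for an IP-in-IP packet, and the group ID value are all hypothetical. Only the fields (children, parent, hops to the parent) and the rule that m-routers replicate while u-routers do not come from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PamRouterState:
    """Per-group state kept by a PAM router (field names are illustrative)."""
    group_id: Tuple[str, int]            # <source IP address, port number>
    is_m_router: bool
    children: List[str] = field(default_factory=list)
    parent: Optional[str] = None         # typically an m-router
    hops_to_parent: int = 0


def replicate(state: PamRouterState, payload: bytes) -> List[dict]:
    """Return one encapsulated copy per child for an m-router.

    A u-router returns an empty list: it keeps children information only
    to decide whether to convert, and forwards tunneled packets as plain
    unicast without replicating them.
    """
    if not state.is_m_router:
        return []
    return [
        {"outer_dst": child, "inner_group": state.group_id, "payload": payload}
        for child in state.children
    ]


# M0 from Figure 1, with a made-up group ID (documentation address range).
m0 = PamRouterState(
    group_id=("192.0.2.1", 5000),
    is_m_router=True,
    children=["G0", "G3", "M1", "M2"],
)
copies = replicate(m0, b"content")
```

With the Figure 1 children of M0, `copies` holds four entries, one per tunnel, in the order the children are listed.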
Note that if an m-router, M, has a directly connected child that is also an m-router, then M can simply send an IP multicast packet, using the group's class D address, to this child instead of doing IP-in-IP encapsulation. For example, M0 in Figure 1 can send IP multicast packets to M2. On the other hand, children information at u-routers is not used to route or tunnel PAM multicast packets. Instead, it is used merely to decide whether a u-router should switch to an m-router. Thus u-routers forward tunneled packets from their upstream parents in the tree just like typical unicast routers do. That is, u-routers, as opposed to m-routers, forward one packet at a time without the overhead of replicating and sending content to multiple destinations.

The question now is: how are m-routers selected? Or, more specifically, what triggers a u-router's decision to convert to an m-router? To answer this question, notice that if all routers are m-routers, then there will be no extra stress at any link (the delivered content will traverse each link only once), but all routers will have to participate in the multicast content replication and routing. On the other hand, if all routers are u-routers, then the stress on many links will be high (content copies are likely to traverse the same links many times), but routers will merely forward packets with no multicast overhead. Notice that in either case there is no extra delay stretch, as is the case in ALM, and the tradeoff is between network stress and multicast overhead. PAM exploits this tradeoff. In other words, PAM introduces m-routers only to reduce high stress. High stress will happen in the following two scenarios:

A u-router, R, having a large number of children (high degree). The reason for the high stress in this case is that an upstream m-router, M, will have to duplicate packets for all of R's children, and these copies will all travel over the links connecting M to R, leading to high stress over these links.
The solution to the high stress in this case is to convert R into an m-router. Figure 2(A) shows this case when R is a u-router compared to when it is an m-router.

A u-router, R, having (1) at least two children, and (2) a large number of upstream u-router hops towards its parent (an m-router) in the tree. The reason for the high stress in this case is that R's parent will have to duplicate and send packets for all of R's children over the large number of hops connecting R's parent to R. The solution to the high stress in this case is to convert R into an m-router. Figure 2(B) shows this case when R is a u-router compared to when it is an m-router.

Fig. 2. Illustration of high stress due to (A) high degree and (B) high number of upstream u-router hops.

PAM resolves these high stress cases by converting problematic u-routers to m-routers through two thresholds: (1) a degree threshold, and (2) a hop threshold. If the number of children of a u-router exceeds the degree threshold, or if the number of upstream hops from a u-router to its m-router parent in the tree exceeds the hop threshold, then the router is converted into an m-router. Note that the information maintained at each router, namely the children and the number of hops towards the parent, is used to test whether the degree and hop thresholds are violated. The tree in Figure 1 is constructed with a degree threshold equal to four and a hop threshold equal to two. By varying these thresholds, PAM varies its aggressiveness in assigning multicast routers and managing the stress on the network resources, and adapts the locations of these multicast routers to the locations of the group members in the delivery tree.

III. PROTOCOL DETAILS

PAM mainly relies on the following messages: (1) join, (2) graft, and (3) prune. All these messages are sent upstream in the delivery tree. Routers that do not run the PAM protocol pose no problem to PAM routers, as all messages are tunneled through them. PAM routers (u-routers and m-routers) maintain their children and parent information as soft state. Join messages are sent periodically to update the routers' state. This information expires after some time period and is purged from routers if not updated. Leave, graft, and prune messages are triggered by a change of the state maintained at PAM routers, as explained below.

Join messages from nodes joining the system are unicast towards the source node in order to construct or update the delivery tree. Join messages carry a protocol ID field in the IP header that is representative of the PAM protocol, and some additional PAM fields to identify the message type, the multicast address, etc. A join message destined to the source of a multicast group G is eventually intercepted by an m-router for group G, a u-router, or the source itself, which also acts as either a u-router or an m-router.
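The threshold test that a u-router applies before deciding how to handle a message, as introduced in Section II, can be sketched in Python as follows. This is a hedged illustration: the function names and the `min_children_for_hop_rule` parameter are hypothetical. The parameter encodes the "at least two children" condition from the second high-stress scenario, which the threshold rule itself does not restate, so treating it as part of the test is an assumption.

```python
def should_convert(num_children: int,
                   hops_to_parent: int,
                   degree_threshold: int,
                   hop_threshold: int,
                   min_children_for_hop_rule: int = 2) -> bool:
    """Decide whether a u-router should convert into an m-router.

    High degree: more children than the degree threshold D.
    Long path: more upstream hops to the parent m-router than the hop
    threshold H, with at least `min_children_for_hop_rule` children
    (assumption taken from the scenario description).
    """
    high_degree = num_children > degree_threshold
    long_path = (num_children >= min_children_for_hop_rule
                 and hops_to_parent > hop_threshold)
    return high_degree or long_path


def copies_per_upstream_link(num_children: int, is_m_router: bool) -> int:
    """Copies carried by each link between router R and its parent m-router.

    If R is a u-router, the parent sends one copy per child of R over every
    hop leading to R. If R is an m-router, a single copy suffices and R
    replicates locally. Assumes R has at least one child.
    """
    return 1 if is_m_router else num_children


# Thresholds used for the tree in Figure 1: D = 4, H = 2.
D, H = 4, 2
convert_high_degree = should_convert(5, 1, D, H)
convert_long_path = should_convert(2, 3, D, H)
stay_u_router = should_convert(4, 2, D, H)
```

Under these assumptions, a u-router with five children converts on degree, one with two children and three upstream hops converts on path length, and one with four children at two hops stays a u-router.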
Upon receiving a join message, a u-router, U, adds the IP address information of the joining node to its own list of children. If the number of children of U does not exceed the degree threshold, D, and if the number of upstream u-routers from U to its parent does not exceed the hop threshold, H, then U forwards the join message upstream. Otherwise, if the degree threshold or the hop threshold is violated, then U converts into an m-router and sends a prune message upstream. The prune message contains U's list of children so that upstream routers remove U's children from their own list of children and append U itself to their list.

Upon receiving a join message, an m-router, M, adds the IP address information of the joining node to its own list of children, but does not send the join message further upstream in the tree. When the degree threshold and/or the hop threshold at an m-router are no longer violated, the router reverts back to a u-router and sends a graft message upstream to implant its children in the upstream routers.

Upon receiving a prune message at a u-router, R, the router removes the list of nodes provided in the prune message from its own list of children and appends the IP address of the node originating the message to its list of children. The prune message is then forwarded to upstream routers.

Upon receiving a prune message at an m-router, M, the router removes the list of nodes provided in the prune message from its own list of children and appends the IP address of the node originating the message to its list of children. The prune message is not forwarded any further.

Upon receiving a graft message at a u-router, R, the router appends the list of nodes provided in the graft message to its own list of children and removes the IP address of the node originating the message from its list of children. The graft message is then forwarded to upstream routers only if R will not convert to an m-router.
Upon receiving a graft message at an m-router, M, the router appends the list of nodes provided in the graft message to its own list of children and removes the IP address of the node originating the message from its list of children. The graft message is not forwarded any further.
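The join, prune, and graft handling described in this section can be sketched as a small Python state machine. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class `PamRouter`, its method names, and the tuple message format are hypothetical, and soft-state timers are omitted. Two points are assumptions where the text is open: `maybe_revert` treats "no longer violated" as neither threshold being exceeded, and when a graft pushes a u-router over a threshold the sketch only flips its role, since the text does not say which message it sends instead.

```python
from dataclasses import dataclass, field


@dataclass
class PamRouter:
    name: str
    degree_threshold: int            # D
    hop_threshold: int               # H
    hops_to_parent: int = 1          # upstream hops to the parent m-router
    is_m_router: bool = False
    children: set = field(default_factory=set)

    def violates(self) -> bool:
        """True if the degree or the hop threshold is exceeded."""
        return (len(self.children) > self.degree_threshold
                or self.hops_to_parent > self.hop_threshold)

    def on_join(self, node):
        """Record the joining node; return the message to send upstream."""
        self.children.add(node)
        if self.is_m_router:
            return None                        # m-routers absorb joins
        if not self.violates():
            return ("join", node)              # u-router forwards the join
        self.is_m_router = True                # threshold violated: convert
        return ("prune", self.name, frozenset(self.children))

    def on_prune(self, origin, nodes):
        """Replace the listed nodes by the router that originated the prune."""
        self.children -= set(nodes)
        self.children.add(origin)
        # u-routers forward the prune upstream; m-routers stop it.
        return None if self.is_m_router else ("prune", origin, frozenset(nodes))

    def on_graft(self, origin, nodes):
        """Re-adopt the listed nodes and drop the reverting router."""
        self.children |= set(nodes)
        self.children.discard(origin)
        if self.is_m_router:
            return None                        # m-routers stop the graft
        if self.violates():
            self.is_m_router = True            # follow-up message unspecified
            return None
        return ("graft", origin, frozenset(nodes))

    def maybe_revert(self):
        """m-router with no violated threshold reverts and emits a graft."""
        if self.is_m_router and not self.violates():
            self.is_m_router = False
            return ("graft", self.name, frozenset(self.children))
        return None
```

As a usage example, take a u-router `U` with D = 2 below an m-router `M`: the first two joins are forwarded to `M`, the third makes `U` convert and send a prune, after which `M` holds only `U` as a child. If one member later disappears from `U`, `maybe_revert` returns a graft that restores the remaining members at `M`.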
Fig. 3. Performance of the multicast strategies on the GTITM topology as the group size increases. Panels: fraction of multicast routers, avg. stress, max. stress, max. fanout.

IV. PERFORMANCE EVALUATION

A. Simulation Setup

We test the performance of PAM on (1) a synthetic topology using the GTITM Internet topology generator and (2) a real snapshot of the Internet topology collected on PlanetLab. GTITM employs a transit-stub model to generate router-level topologies. In our experiment, the GTITM topology has one transit domain with 5 transit nodes, 5 stub domains per transit node, and nodes in each stub domain. The total number of nodes in the graph is 5. In our study of the GTITM topology, we generate 25 different instances, with different random seeds on 5 topologies, each with random group members. Each result is an average of the 25 runs. Group members only belong to stub domains. We vary the group size between and 0 to represent various group densities.

The PlanetLab topology was obtained by collecting path information between 598 PlanetLab nodes using the traceroute tool. However, blocked ICMP messages and unresponsive routers led to incomplete traceroute path information, and we were able to get complete path information between only 364 nodes. The traceroute probing between these 364 nodes revealed 8530 distinct IP addresses. We then performed alias resolution on these IP addresses, through reverse DNS queries, in order to reveal the IP addresses that belong to different interfaces of the same router. As a result, we only had 75 routers connecting the 364 PlanetLab nodes. In our simulations, we attach group members to these routers uniformly at random. In our study of the PlanetLab topology, as in the GTITM case, we generate 25 different instances, and each result is an average of 25 runs on these instances.
In our experiments on the GTITM and PlanetLab topologies, we capture the following performance metrics: (1) the fraction of multicast routers, (2) the maximum fanout, (3) the maximum stress, and (4) the average stress. The maximum fanout is the maximum node degree. The stress of a link is the number of copies of the same multicast content transmitted over that link; the maximum stress and the average stress are taken over the network links. Given a fraction of multicast routers, p, and an average stress, s, we estimate a protocol's effectiveness as the product s · p. The smaller the value of s · p, the better. For example, if the number of multicast routers is reduced to half (p = 0.5) and the average stress is doubled (s = 2), then the effectiveness becomes 1. PAM can obtain an average stress of s = 1.25 using only 6% of the existing routers as multicast routers (p = 0.06), leading to an effectiveness of 1.25 × 0.06 = 0.075.

We compare PAM with (1) DVMRP, (2) , and (3) STAR. DVMRP is a network-level multicast protocol in which all routers are multicast enabled. Scheme (2) creates tunnels
only between branching routers and thus only assumes that these branching routers are multicast capable. STAR represents the other end of the spectrum, in which no routers are multicast enabled and the source sends content directly to all group members through unicast transmission. PAM, DVMRP, scheme (2), and STAR all avoid delay distortion; DVMRP and scheme (2) also avoid extra network stress; STAR results in more stress than the other three protocols; and PAM adapts the number and the location of multicast-capable routers to attain minimal stress.

Fig. 4. Performance of PAM on the GTITM topology as we vary the threshold values, D and H. Panels: fraction of multicast routers, avg. stress, max. stress, max. fanout.

B. Performance Results

Recall that PAM relies on two thresholds: the degree threshold, D, and the hop threshold, H. We refer to PAM(D,H) as the PAM protocol using the D and H threshold values. PAM(∞,H) refers to the PAM protocol when D = ∞, that is, when the degree threshold is not considered in the clustering decision at PAM routers. Similarly, PAM(D,∞) refers to the case when the hop threshold is not considered.

In Figure 3 we compare the performance of the different multicast strategies as we vary the multicast group size on the GTITM topology. For each measured performance metric, we plot three curves for the PAM protocol to highlight the impact of the degree and hop thresholds. Specifically, we plot the results for , , and PAM(5,∞). The curves in Figure 3 show that (1) Combining the degree and hop thresholds leads to better performance with relatively fewer multicast routers than using only one of the thresholds. (2) PAM is able to reduce the number of multicast routers dramatically with minimal impact on performance. For example, PAM uses less than 50% of the multicast routers used by scheme (2), and only % to 15% of the multicast routers in DVMRP, with roughly similar stress and fanout. The average stress for PAM was only 1.25, while STAR has a stress as high as 6.
(3) As the group size increases, the benefits of using PAM become more pronounced.

In Figure 4 we plot the performance of the PAM protocol on the GTITM topology as we vary the degree threshold, D, when the group size is 500 and the hop threshold, H, is equal to 2 and to 4. The results clearly show PAM's ability to trade multicast state and routers for performance. Specifically, when the threshold values increase, the number of multicast routers decreases, while the stress and fanout values increase. We have also tested a skewed distribution of the multicast group members in the GTITM topology, following a Zipf distribution, instead of placing them uniformly at random. The results match our intuition that performance is better than in the uniform distribution case.

In Figure 5 we compare the performance of the different
multicast strategies as we vary the multicast group size on the PlanetLab topology. The results have similar trends to those observed in Figure 3. The number of PAM routers remains about half the number of multicast routers used by scheme (2) and is reduced to around 6% of the DVMRP multicast routers, with an average stress of 1.1. By comparing the results on the GTITM and PlanetLab topologies, it is clear that PAM's advantage is more pronounced on the PlanetLab topology. This is due to the difference between the GTITM and PlanetLab topology structures. GTITM generates a fat tree with a low diameter (around 9), while PlanetLab has a diameter above 30. So on PlanetLab there is more chance for clustering nodes by multicast routers. The results obtained by varying the PAM threshold values are similar to those obtained on the GTITM topology.

Fig. 5. Performance of the multicast strategies on the PlanetLab topology as the group size increases. Panels: fraction of multicast routers, avg. stress, max. stress, max. fanout.

V. CONCLUSION AND FUTURE WORK

In this paper, we introduce PAM, a hybrid network multicast scheme that adapts the configuration of the multicast delivery tree, thus delivering a performance close to IP multicast, with no delay stretch and minimal network stress, using only a small fraction of the multicast-enabled routers required for native IP multicast. PAM excels in realistic Internet topologies. There are numerous possible extensions to PAM. Adapting the threshold values to the underlying topology and group size is one direction. Implementing and testing PAM in a realistic Internet setup is another.

REFERENCES

Suman Banerjee, Bobby Bhattacharjee, Christopher Kommareddy, "Scalable Application Layer Multicast," Proceedings of ACM SIGCOMM 2002, Pittsburgh, Pennsylvania, August 2002.
Rick Boivie, Nancy Feldman, Christopher Metz, "Small Group Multicast: A New Solution for Multicasting on the Internet," IEEE Internet Computing, vol. 4, no. 3, pp. 75-79, May 2000.
K. Calvert, M. Doar, and E. W. Zegura, "Modeling Internet Topology," IEEE Communications Magazine, June.
Y. Chu, S. Rao, and H. Zhang, "A Case for End System Multicast," in ACM SIGMETRICS 2000, Santa Clara, CA, USA, June 2000.
A. El-Sayed, V. Roca, "A survey of proposals for an alternative group communication service," IEEE Network, 17(1), 2003.
A. Garyfalos, K. Almeroth, and J. Finney, "A Comparison of Network and Application Layer Multicast for Mobile IPv6 Networks," in Proceedings of the 6th ACM International Workshop on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM 2003), San Diego, CA, September 2003.
Dongkyun Kim, Ki-Sung Yu, "A Scalable Hybrid Overlay Multicast Adopting Host Group Model for Subnet-Dense Receivers," International Journal of Computer Science and Network Security (IJCSNS), 2007.
Li Lao, Jun-Hong Cui, Mario Gerla, and Dario Maggiorini, "A Comparative Study of Multicast Protocols: Top, Bottom, or In the Middle?," in Proceedings of the Eighth IEEE Global Internet Symposium (GI 2005, in conjunction with INFOCOM 2005), Miami, Florida, March 2005.
Shaofei Lu, Jianxin Wang, Guanzhong Yang, Chao Guo, "SHM: Scalable and Backbone Topology-Aware Hybrid Multicast," ICCCN 2007, Honolulu, Hawaii, USA, August 2007.
M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, A. Singh, "SplitStream: high-bandwidth multicast in cooperative environments," Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, October 19-22, 2003, Bolton Landing, NY, USA.
Ion Stoica, T. S. Eugene Ng, Hui Zhang, "REUNITE: A Recursive Unicast Approach to Multicast," INFOCOM 2000, Tel-Aviv, Israel, March 2000.
Su-Wei Tan, Gill Waters, John Crawford, "MeshTree: A Delay-optimised Overlay Multicast Tree Building Protocol," INFOCOM 2005, Miami, Florida, March 2005.
Jining Tian, Gerald Neufeld, "Forwarding State Reduction for Sparse Mode Multicast Communication," INFOCOM 1998, San Francisco, CA, USA, March 1998.
Duc A. Tran, Kien Hua, Tai Do, "ZIGZAG: An Efficient Peer-to-Peer Scheme for Media Streaming," INFOCOM 2003, San Francisco, CA, USA, April 2003.
D. Kostic, A. Rodriguez, J. Albrecht, A. Vahdat, "Bullet: high bandwidth data dissemination using an overlay mesh," Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, October 19-22, 2003, Bolton Landing, NY, USA.
D. Pendarakis, S. Shi, D. Verma, M. Waldvogel, "ALMI: an application level multicast infrastructure," Proceedings of the 3rd USENIX Symposium on Internet Technologies and Systems, p. 5-5, March 26-28, 2001, San Francisco, California.
Y. D. Chawathe, E. A. Brewer, "Scattercast: an architecture for internet broadcast distribution as an infrastructure service," University of California, Berkeley, 2000.
Beichuan Zhang, Sugih Jamin, Lixia Zhang, "Universal IP Multicast Delivery," Computer Networks: The International Journal of Computer and Telecommunications Networking, Volume 50, Issue 6, April 2006.
Beichuan Zhang, Sugih Jamin, Lixia Zhang, "Host Multicast: A Framework for Delivering Multicast To End Users," INFOCOM 2002, New York, NY, June 2002.
M. Handley, "Session directories and scalable Internet multicast address allocation," ACM Computer Communication Review, 28(4), 1998.