SemCast: Semantic Multicast for Content-based Data Dissemination



Similar documents
The Advantages Of A Network Based Data Routing System

Dynamic Load Balancing for Cluster-based Publish/Subscribe System

JoramMQ, a distributed MQTT broker for the Internet of Things

Consecutive Geographic Multicasting Protocol in Large-Scale Wireless Sensor Networks

IP Multicasting. Applications with multiple receivers

Service and Resource Discovery in Smart Spaces Composed of Low Capacity Devices

CHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS

Definition. A Historical Example

Graph Theory Algorithms for Mobile Ad Hoc Networks

BUILDING MPLS-BASED MULTICAST VPN SOLUTION. DENOG3 Meeting, /Frankfurt Carsten Michel

PEER TO PEER FILE SHARING USING NETWORK CODING

Design and Evaluation of a Wide-Area Event Notification Service

diversifeye Application Note

Ad hoc and Sensor Networks Chapter 13: Transport Layer and Quality of Service

PERFORMANCE OF MOBILE AD HOC NETWORKING ROUTING PROTOCOLS IN REALISTIC SCENARIOS

Future Internet Technologies

A Survey Study on Monitoring Service for Grid

QUALITY OF SERVICE METRICS FOR DATA TRANSMISSION IN MESH TOPOLOGIES

How To Provide Qos Based Routing In The Internet

Airlift: Video Conferencing as a Cloud Service using Inter- Datacenter Networks

Traceroute-Based Topology Inference without Network Coordinate Estimation

Lecture 14: Data transfer in multihop wireless networks. Mythili Vutukuru CS 653 Spring 2014 March 6, Thursday

Chapter 3. Enterprise Campus Network Design

A P2PSIP event notification architecture

Benchmark Study on Distributed XML Filtering Using Hadoop Distribution Environment. Sanjay Kulhari, Jian Wen UC Riverside

Graph Database Proof of Concept Report

Faculty of Engineering Computer Engineering Department Islamic University of Gaza Network Chapter# 19 INTERNETWORK OPERATION

The necessity of multicast for IPTV streaming

An Improved ACO Algorithm for Multicast Routing

A Comparison Study of Qos Using Different Routing Algorithms In Mobile Ad Hoc Networks

Introduction to LAN/WAN. Network Layer

Classic Grid Architecture

SplitStream: High-Bandwidth Multicast in Cooperative Environments

TORA : Temporally Ordered Routing Algorithm

An Efficient Primary-Segmented Backup Scheme for Dependable Real-Time Communication in Multihop Networks

Routing Protocols. Interconnected ASes. Hierarchical Routing. Hierarchical Routing

A Digital Fountain Approach to Reliable Distribution of Bulk Data

PEER-TO-PEER (P2P) systems have emerged as an appealing

Network Level Multihoming and BGP Challenges

Lecture 2.1 : The Distributed Bellman-Ford Algorithm. Lecture 2.2 : The Destination Sequenced Distance Vector (DSDV) protocol

CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING

Kevin Webb, Alex Snoeren, Ken Yocum UC San Diego Computer Science March 29, 2011 Hot-ICE 2011

International journal of Engineering Research-Online A Peer Reviewed International Journal Articles available online

Scaling 10Gb/s Clustering at Wire-Speed

Oracle Spatial and Graph. Jayant Sharma Director, Product Management

other. A B AP wired network

Concept of Cache in web proxies

Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)

IaaS Federation. Contrail project. IaaS Federation! Objectives and Challenges! & SLA management in Federations 5/23/11

Event-based middleware services

HPAM: Hybrid Protocol for Application Level Multicast. Yeo Chai Kiat

DESIGN AND DEVELOPMENT OF LOAD SHARING MULTIPATH ROUTING PROTCOL FOR MOBILE AD HOC NETWORKS

Title: Scalable, Extensible, and Safe Monitoring of GENI Clusters Project Number: 1723

Tranzeo s EnRoute500 Performance Analysis and Prediction

Performance Evaluation of Multicast Transmission on MPLS Network Using PIM SM

Using Peer to Peer Dynamic Querying in Grid Information Services

DDS-Enabled Cloud Management Support for Fast Task Offloading

Exterior Gateway Protocols (BGP)

Bandwidth Management Framework for Multicasting in Wireless Mesh Networks

CS555: Distributed Systems [Fall 2015] Dept. Of Computer Science, Colorado State University

Proxy-Assisted Periodic Broadcast for Video Streaming with Multiple Servers

Introduction, Rate and Latency

Performance Analysis of Optimal Path Finding Algorithm In Wireless Mesh Network

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

A Review on Quality of Service Architectures for Internet Network Service Provider (INSP)

EVOLVING ENTERPRISE NETWORKS WITH SPB-M APPLICATION NOTE

White Paper November Technical Comparison of Perspectium Replicator vs Traditional Enterprise Service Buses

Service Definition. Internet Service. Introduction. Product Overview. Service Specification

VMDC 3.0 Design Overview

Quality of Service Routing in Ad-Hoc Networks Using OLSR

Transcription:

SemCast: Semantic Multicast for Content-based Data Dissemination Olga Papaemmanouil Brown University Uğur Çetintemel Brown University

Wide Area Stream Dissemination Clients Data Sources Applications Network monitoring Real-time financial services/enterprise News services (RSS feeds) Characteristics High data volume Highly dispersed sources & destinations Low latency delivery 2

Content-based Dissemination High level of expressiveness Profiles are query predicates Sources/destinations decoupling Destinations depend on message content 3

Content-based Pub/Sub Content-based routing Predefined acyclic overlay network SIENA [Carzaniga et al., 2001], Gryphon [Banavar et al., 1999] Predicate-based filtering network Upstream profile aggregation Limitations Processing cost at each hop Bandwidth consumption S 1 4

Processing cost overhead At each level: Content-based filtering at each level Maintain data structures at every broker Compression/decompression Usually with XML streams Encryption/decryption Integrity-critical applications (e.g., financial feeds, distributed games) Forwarding costs could dominate dissemination costs S 1 5

Bandwidth overhead Missed tree optimization opportunities Clients profiles could overlap One spanning tree is suboptimal Intermediate nodes may receive irrelevant data S 1 S 1 S 1 Cost = 13 msg/time unit Max Latency= 3 hops Cost = 10 msg/time unit Max Latency =3 hops 6

Our approach: Semantic Multicast Constructs multiple content-based (a.k.a. semantic) dissemination channels Semantic split of incoming streams Channels characterized by their content Independent overlay dissemination trees Input Streams 7

SemCast s advantages Low processing cost Eliminates local filtering at interior brokers Low processing requirements for intermediate nodes Low overall bandwidth requirements Low-cost dissemination trees Enables latency-cost trade off Delay-bounded trees for latency-sensitive applications 8

Content-based Channelization SemCast decides: Number of channels Content of channels Clients subscriptions to channels Channel implementation Operational goals No false exclusion Low run-time cost : Overall bandwidth consumption Minimize redundancy among channels contents Create efficient dissemination trees Optimal solution for channelization: NP-complete [Adler et al., 2001] 9

System Model Publishers Source brokers (S) Receive streams from publishers Gateway brokers (GB) Receive profiles from subscribers Rendezvous points (RP) Roots of channels Interior brokers (I) Forward incoming messages Coordinators Identify content of channels Coordinator GB S S S RP RP RP I GB GB GB I GB Subscribers I 10

SemCast overview Subscription management Join existing channels or create new channels Periodically reorganize channels Exploits overlap among profiles Content of channel defined by profiles assigned to it Assign similar profiles to same channel Identify overlap between profiles Statistical & syntactical information Discover containment relations Merge channels with high partial overlap Use cost-based model to assign profiles to channels 11

Containment relations SemCast discovers containment hierarchies P j contains P i Messages matching P i are subset of those matching P j Profile syntax reveals containment relations Statistics approximate containment relations Hierarchies are possible semantic channels P j P i P i 12

Cost-based Channelization Containment hierarchies might cause high data redundancy High message rate for non-overlapping part Small message rate for overlapping part Partial overlapping profiles may increase cost Assign similar profiles to different channels Forward matching messages to more than one channel Use cost-based model for channelization 13

Cost-based containment relations Place profiles in the same hierarchy (channel) if it improves bandwidth consumption If P j covers P i, compare cost P i and P j in same channel P i on different channel Apply lowest-cost scenario Cost estimation Statistics on message rate Approximate number of edges Calculate edges for the Steiner tree connecting brokers 14

Hierarchy Merging Merging hierarchies with partial overlap reduces cost Use a cost-based model Send non overlapping part through noise channels E 1 : x:[25,..,55] E 2 : x:[15,..,50] Destinations D 1 Destinations D 2 15 25 50 55 15

Hierarchy Merging Merging hierarchies with partial overlap reduces cost Use a cost-based model Send non overlapping part through noise channels E 1 E 2: x:[50,..,55] E1 E2: x:[25,..,50] Destinations D 1 Destinations D 1+ D 2 E1 E2 x:[15,..,25] 15 25 50 55 Destinations D 2 16

Incremental tree construction Base low cost heuristic Request channel s gateway broker from RP Find min-cost path to all destinations in the channel Connect to closest one Incremental Steiner tree Delay-bounded trees Find min-cost path to one broker in the tree that covers delay bound. C 2 A B RP C 1 C 3 S C 17

Experimental study Metrics Processing cost Bandwidth efficiency Approaches Unicast approach SPT: Shortest Path Tree approach Distributed pub-sub system [Carzaniga et al., 2001] SemCast: Distributed content-based channelization SemCast-O: Centralized Steiner tree construction Network environment Graphs generated by GT-ITM 18

Bandwidth efficiency- Disjoint profiles higher bandwidth consumption % cost degradation over SemCast-O 100 90 80 70 60 50 40 30 20 10 0 Unicast SPT SemCast 0.02 0.06 0.18 0.5 profile selectivity Network size = 600 nodes Profiles= 7000 less distinct profiles SemCast s trees are close to Steiner trees SemCast performs better than SPT, Unicast with no overlapping profiles 19

Bandwidth efficiency- Overlapping profiles higher bandwidth improvement % cost improvement over SPT 70 40 10-20 -50-80 -110-140 -170 0.05 0.25 0.45 0.65 0.85 1 Unicast SemCast aggregated profile overlap Network size =600 Profiles =7000 Profile selectivity =2% higher partial overlap SemCast performs better than SPT with overlapping profiles 20

Scalability higher bandwidth improvement % cost improvement over SPT 60 50 40 30 20 10 0 SemCast (N=300) SemCast (N=400) SemCast (N=600) 0.05 0.25 0.45 0.65 0.85 1 aggregated profile overlap N = Network size Profiles =5 N Selectivity factor =2 higher partial overlap SemCast s improvement increases with the network size 21

Related Work Publish-Subscribe Systems Distributed approaches: Content- based Routing Gryphon [Opyrchal et al., 2000] SIENA [Carzaniga et al., 2001] ONYX [Diao et al, 2004] Centralized approaches: XML Filtering XFilter [Altinel et al., 2000], YFilter [Diao et al., 2002] XTrie [Chan et al., 2002] Application-Level Multicast SplitStream [Castro et al., 2003] SCRIBE [Castro et al., 2002] CAN-based Multicast [Ratnasamy et al., 2001] 22

Conclusions & Ongoing Work SemCast Performs semantic split of incoming streams Eliminates local filtering in interior brokers Improves bandwidth consumption Ongoing work SemCast prototype More expressive profiles Statefull subscriptions Wireless environment 23