Load balancing mechanism in range query enabled P2P networks
2009. 08. 27
Park, Byunggyu

Contents
- Background
  - DHT (Distributed Hash Table)
- Motivation
- Proposed scheme
  - Compression based Hashing
  - Load balancing technique using FRI (Fixed Routing Identifier)
- Summary

Background: What is DHT?
- A DHT provides the object lookup service for P2P applications
- Provides two primitives: put(k,v), get(k)
- Scalable: O(logN) routing cost
- Provides load balancing
  - Consistent hash function: O(logN) imbalance
- Supports only point queries
- E.g. Chord, CAN, Pastry, Tapestry, etc.
[Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which stores the data across the nodes.]
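The two primitives can be illustrated with a minimal single-process sketch. It assumes SHA-1 as the consistent hash and a 16-bit identifier space. The class name `ToyDHT` and the node names are hypothetical, and a local sorted-list search stands in for the O(logN) routing a real DHT performs over the network.

```python
import hashlib
from bisect import bisect_left


class ToyDHT:
    """Toy consistent-hashing table: each key lives on its successor node."""

    def __init__(self, node_names, bits=16):
        self.bits = bits
        # Place every node on the identifier ring by hashing its name.
        self.ring = sorted((self._hash(name), name) for name in node_names)
        self.store = {name: {} for name in node_names}

    def _hash(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % (1 << self.bits)

    def _owner(self, key):
        # First node whose identifier is >= hash(key), wrapping around the ring.
        ids = [node_id for node_id, _ in self.ring]
        pos = bisect_left(ids, self._hash(key)) % len(self.ring)
        return self.ring[pos][1]

    def put(self, key, value):
        self.store[self._owner(key)][key] = value

    def get(self, key):
        return self.store[self._owner(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("printer", "laser")
print(dht.get("printer"))
```

Because the key is hashed before placement, neighbouring keys land on unrelated nodes, which is why this interface answers point queries only.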

Background
- Because a DHT offers only inflexible query support, applications can be restricted
- Many current applications require complex query support
  - Multiple keyword
  - Range
  - Semantic
[Figure: an example query (Device: printer, Type: laser, PPM: a range with bounds 4 and 10) and a semantic query for "Printing device".]

Background: How to enable flexible query support in DHT?
- DHT: object namespace -> node ID space
  - Loses order semantics
  - Supports only point queries
  - O(logN) load imbalance
- Order-preserving mapping: object namespace -> node ID space
  - Supports complex queries
  - Serious load imbalance

Motivation
1) Order-preserving mapping can provide flexible queries
2) It causes a load imbalance problem because the data are clustered

Proposed Scheme
- Compression based Hashing
  - Based on arithmetic coding
  - Order-preserving mapping that balances load
- Load balancing technique using FRI (Fixed Routing Identifier)
  - Based on virtual server

Compression based hashing
- Object namespace -> (compression based hashing) -> node ID space
- Order-preserving mapping
- Supports complex queries
- Relaxed load imbalance

Symbol  Probability  Range
a       0.80         [0.00, 0.80)
b       0.02         [0.80, 0.82)
c       0.18         [0.82, 1.00)

[Figure: arithmetic coding of "acb". The symbol a narrows [0.00, 1.00) to [0.00, 0.80), c narrows it to [0.656, 0.80), and b narrows it to [0.7712, 0.77408). The compressed value shown for "acb" is 0.773504.]
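The interval narrowing in the arithmetic coding figure can be reproduced with a short sketch. It uses the symbol table from the slide; the function name `encode_interval` and the use of exact fractions are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

# Symbol ranges from the slide: a [0.00, 0.80), b [0.80, 0.82), c [0.82, 1.00).
MODEL = {
    "a": (Fraction("0.00"), Fraction("0.80")),
    "b": (Fraction("0.80"), Fraction("0.82")),
    "c": (Fraction("0.82"), Fraction("1.00")),
}


def encode_interval(text, model=MODEL):
    """Return the [low, high) interval that arithmetic coding assigns to text."""
    low, width = Fraction(0), Fraction(1)
    for symbol in text:
        sym_low, sym_high = model[symbol]
        # Narrow the current interval to the sub-range of this symbol.
        low = low + width * sym_low
        width = width * (sym_high - sym_low)
    return low, low + width


low, high = encode_interval("acb")
print(float(low), float(high))  # 0.7712 0.77408
```

The value 0.773504 shown on the slide lies inside this interval. Because the symbol ranges are laid out in alphabetical order, the lower bounds keep the order of the strings ("ab" before "ac" before "b"), which is what lets a range of keys map to a contiguous range of identifiers.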

Compression based Hashing
1. Sample data (e.g. categories such as Politics, Sports, Society, Finance, Culture, Books, Music, Law, Education, ...)
2. Construct a trie and calculate the frequency of each symbol
3. Get the compressed value of the corpus using arithmetic coding (a value between 0.00 and 1.00)
4. Binary representation of the compressed value
[Figure: the steps are grouped into two phases, system pre-processing and peer processing; a trie built from sample strings illustrates step 2.]

Compression based Hashing: Lookup process
1. Metadata (K_D): Lookup(K_D)
2. Compression(K_D) yields the compressed value K_D(C)
3. Translate(K_D(C)) yields the binary representation K_D(2)
4. Get(K_D(2)) is routed on the ring
[Figure: Chord ring with nodes N1, N8, N14, N21, N32, N38, N42, N48.]
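The translate and get steps can be sketched as follows, using the node identifiers from the ring in the figure. The 6-bit identifier space, the truncation rule in `translate`, and the helper `nodes_for_range` are assumptions made for illustration, and a sorted-list search stands in for Chord's hop-by-hop routing.

```python
from bisect import bisect_left

NODE_IDS = [1, 8, 14, 21, 32, 38, 42, 48]  # ring from the slide
BITS = 6  # assumed identifier length (identifiers 0..63)


def translate(compressed, bits=BITS):
    """Map a compressed value in [0, 1) to an integer identifier (assumed rule)."""
    return min(int(compressed * (1 << bits)), (1 << bits) - 1)


def successor(identifier, node_ids=NODE_IDS):
    """First node at or after the identifier, wrapping around the ring."""
    pos = bisect_left(node_ids, identifier)
    return node_ids[pos % len(node_ids)]


def nodes_for_range(low, high):
    """Nodes responsible for every identifier between two compressed values."""
    owners = []
    for identifier in range(translate(low), translate(high) + 1):
        owner = successor(identifier)
        if owner not in owners:
            owners.append(owner)
    return owners


print(successor(translate(0.773504)))  # 1 (identifier 49 wraps past N48 to N1)
print(nodes_for_range(0.20, 0.35))     # [14, 21, 32]
```

Since the mapping preserves order, a range query touches a run of neighbouring nodes rather than nodes scattered over the whole ring.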

Evaluation: Data uniformity
- Training data set: ACM keywords

Evaluation: Unbalance factor {(L_i - E)^2 / E} of the Brown corpus
- Training data set: ACM keywords
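A sketch of how such an unbalance factor could be computed. The slide does not define the symbols, so this assumes L_i is the load on node i, E is the expected (mean) load, and the per-node terms are summed.

```python
def unbalance_factor(loads):
    """Sum of (L_i - E)^2 / E over all nodes, with E assumed to be the mean load."""
    expected = sum(loads) / len(loads)
    return sum((load - expected) ** 2 / expected for load in loads)


print(unbalance_factor([10, 10, 10, 10]))  # 0.0 for a perfectly even load
print(unbalance_factor([40, 0, 0, 0]))     # 120.0 when one node holds everything
```

Under these assumptions a value of zero means every node carries the expected load, and the value grows as the data cluster on fewer nodes.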

Compression based Hashing: Advantages of CBH
- Obtains a uniform data distribution
- Order-preserving mapping
  - Supports complex queries
- Flexible
  - Can be applied to different types of data model
Disadvantages
- Requires a training data set
  - Performance depends on the accuracy of the sample data

Load balancing techniques in DHT
- Selective node join
- Node migration
- Support dynamic load balancing
- Virtual server
  - Provides fine-grained load balancing
  - No change in the underlying DHT
  - High maintenance cost
    - O(logN) virtual servers per physical node
    - O(log^2 N) routing entries per node
  - Long query routing length
  - Unstable routing
    - Dynamic node leave/join
- Simple
- Hard to provide fine-grained load balancing
- Replication/Cache

Load balancing techniques in DHT: Virtual server
- Logical node in the DHT
- Transfer unit used to balance load
[Figure: logical view of virtual servers (Node 1 to Node 7), physical view (physical nodes A, B, and C hosting them across the network), and Chord with virtual servers.]

Load balancing technique using FRI
- Nodes are classified into two types
  - One physical node has one routing node and several storage nodes
- Routing node
  - Has a fixed routing identifier
  - Maintains O(logN) routing entries
- Storage node (virtual server)
  - Has a storage identifier and shares the routing identifier
  - Maintains a constant number of routing entries
    - Predecessor of routing node + successor of routing node
  - Can be migrated to another node

Load balancing technique using FRI
- Load balancing based on virtual servers
[Figure: physical nodes on the network, each with one routing node (R.Node) and several storage nodes (S.Node); one physical node is marked "Overloaded!".]
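The idea of relieving an overloaded physical node by migrating a storage node can be sketched as a single rebalancing step. The slides do not give a selection policy, so the greedy rule here (move the storage node that best halves the gap between the heaviest and lightest physical node) is an assumption, as are the function name and the example loads.

```python
def rebalance_once(nodes):
    """Move one storage node from the heaviest to the lightest physical node.

    nodes maps a physical node name to {storage_node_id: load}.
    Returns (storage_node_id, source, target), or None if no move helps.
    """
    def total(name):
        return sum(nodes[name].values())

    heavy = max(nodes, key=total)
    light = min(nodes, key=total)
    gap = total(heavy) - total(light)
    # Only a storage node lighter than the gap lowers the heavier side's load
    # without making the target heavier than the source used to be.
    candidates = [(load, sid) for sid, load in nodes[heavy].items() if 0 < load < gap]
    if not candidates:
        return None
    load, sid = min(candidates, key=lambda item: abs(item[0] - gap / 2))
    nodes[light][sid] = nodes[heavy].pop(sid)
    return sid, heavy, light


cluster = {
    "A": {"s1": 50, "s2": 30, "s3": 20},  # overloaded physical node
    "B": {"s4": 10},
    "C": {"s5": 20, "s6": 20},
}
print(rebalance_once(cluster))  # ('s1', 'A', 'B')
print(rebalance_once(cluster))  # ('s4', 'B', 'C')
```

In the FRI scheme only storage nodes move, so the routing node of each physical node keeps its fixed identifier while the load shifts.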

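Routing in Chord using FRI
- Routing table structure
[Figure: Chord ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, marking routing nodes and storage nodes.
Routing table of routing node N1: a finger table with columns Interval and F.R.I (the entries shown point to N8), plus a list of its storage nodes (Snode 0, Snode 1) with their identifiers (S.R.I 0, S.R.I 1).
Routing table of a storage node: columns Type, Shared F.R.I, and Successor; the row shown is R, N1, N8.]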

Routing in Chord using FRI
- Routing example
[Figure: routing example on the Chord ring N1, N8, N14, N21, N32, N38, N42, N48, distinguishing routing nodes and storage nodes.]

Load balancing in Chord using FRI
- Load information gathering
  - Log(N) pieces of load information from the finger table
  - Used to reassign storage nodes
[Figure: Chord ring N1, N8, N14, N21, N32, N38, N42, N48 with routing and storage nodes; the finger table of N1 has columns Interval, F.R.I, and L (load).]

Load balancing in Chord using FRI
- Clustered routing nodes vs. uniform routing nodes
- Skewed finger pointers vs. highly popular region
- Clustered routing nodes
  - Cannot guarantee logN routing hops
  - Hard to get overall load information
- Uniform routing nodes
  - Guarantee logN routing hops
  - Easy to get overall load information

Load balancing in Chord using FRI
- Node join
  - Random join (h_c(IP) = F.R.I)
  - Sequential join
- Routing node join
  - Uniform routing node distribution
  - Efficient load sampling
  - Optimal routing

Load balancing technique using FRI: Advantages of FRI
- Includes all advantages of the virtual server scheme
  - Fine-grained load balancing
  - General and flexible
- Solves inherent problems of virtual servers
  - Still provides O(logN) maintenance cost per physical node
  - Stable routing
  - Shorter query routing path
Disadvantages
- Change in the underlying DHT
  - Routing algorithm
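The difference in maintenance cost can be made concrete with a back-of-the-envelope sketch. It assumes every hidden constant is 1, a finger table of log2(ring size) entries, and two entries per storage node (predecessor and successor of the routing node, as listed on the earlier slide). The function names and the example sizes are hypothetical.

```python
import math


def entries_virtual_server(physical_nodes, servers_per_node):
    """Each virtual server is a full ring member with its own finger table."""
    logical_nodes = physical_nodes * servers_per_node
    return servers_per_node * math.ceil(math.log2(logical_nodes))


def entries_fri(routing_nodes, storage_nodes_per_node, entries_per_storage_node=2):
    """One finger table per physical node plus a constant per storage node."""
    finger_entries = math.ceil(math.log2(routing_nodes))
    return finger_entries + storage_nodes_per_node * entries_per_storage_node


# 512 physical nodes, 8 virtual servers or storage nodes on each.
print(entries_virtual_server(512, 8))  # 96
print(entries_fri(512, 8))             # 25
```

With FRI the ring contains only the routing nodes, so the finger table covers fewer members, which is consistent with the shorter routing path claimed above.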

Evaluation: Load distribution as the number of storage nodes increases
[Figure: number of data items (y-axis, 0 to 100) against node distribution (x-axis, 0 to 500). Legend entries: FRI with 500 Rnodes 3 Snodes; FRI with 500 Rnodes 2 Snodes; FRI with 500 Rnodes 3 Snodes; Original Chord with 500 nodes.]

Evaluation: Average routing path length of FRI vs. virtual server
[Figure: average routing path length (y-axis, 0 to 9) against number of V.S (x-axis: 2, 4, 8). Legend entries: Original Chord; Chord with FRI; Chord with V.S.]

Summary
- To satisfy various application requirements, complex queries should be supported in the DHT layer
  - Order-preserving mapping
  - Load balancing problem due to skewed data distribution
- Compression based hashing
  - Provides order-preserving mapping
  - Provides uniform data distribution
- Load balancing technique using FRI
  - Dynamic load balancing based on virtual servers
  - Reduces maintenance overhead
    - O(logN) + C routing entries
  - Fine-grained balancing
  - Shorter routing path

Future Work
- C.B.H
  - How to guarantee the accuracy of the sample data?
  - Comparison with other order-preserving hashing functions
- Load balancing using F.R.I
  - Comparison with the V.S scheme
    - Balancing ratio (number of nodes, storage nodes, data)
    - Balancing overhead (dynamicity of P2P...)
    - Maintenance cost

Thank you