Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling
|
|
- Winfred Park
- 7 years ago
- Views:
Transcription
1 2013/06/26 SIGMOD 13 Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling Takuya Akiba (U Tokyo) Yoichi Iwata (U Tokyo) Yuichi Yoshida (NII&PFI)
2 Problem Definition Given a graph G = V, E 1. construct an index to 2. answer distance d G s, t s t Goal: Good trade-off Scalability Indexing time Index size Query Performance Query time Precision 2
3 Distance on Graphs Context-Aware Web Search Socially-Sensitive Search Social Network Analysis Biological Analysis Route Navigation 3
4 Real-world Networks Road Networks Complex Networks Social Networks Web Graphs Computer Networks Biological Networks 4
5 Real-world Networks Road Networks Complex Networks Social Networks Web Graphs Computer Networks Biological Networks Structural differences Different methods 5
6 Real-world Networks Road Networks Complex Networks SIGMOD Research 24 A. D. Zhu, H. Ma, X. Xiao, et al. (tomorrow) Today s topic: methods for complex networks 6
7 Previous Methods Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] 7
8 Previous Methods: Problems Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] 2-Hop Cover [Cheng+,EDBT 09] Landmark-based methods [Potamias+, CIKM 09] Bad precision for close pairs Distance Sketch [Sarma+, WSDM 10] TD-based Cannot handle networks Highway 2-Hop with 10M edges [Wei, SIGMOD 10] [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Slow querying (ms) 8
9 Proposed Method: Pruned Landmark Labeling Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) 9
10 Proposed Method: Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based Cannot handle networks Highway 2-Hop with >10M edges [Wei, SIGMOD 10] [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, Can handle networks with >100M edges [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) 10
11 Proposed Method: Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] 2-Hop Cover [Cheng+,EDBT 09] Landmark-based methods [Potamias+, CIKM 09] Bad precision for close pairs Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] No error (exact method) Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) 11
12 Proposed Method: Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop Fast [Jin+, querying SIGMOD 12] (μs) Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) Slow querying (ms) 12
13 Proposed Method: Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, Simple and easy to implement [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) 13
14 Preliminaries 14
15 2-Hop Labeling: Index Data Structure Commonly used framework [Cohen+ 02], [Cheng+ 09], [Jin+ 12], [Abraham+ 12] and ours Index: label L v = l 1, δ 1, l 2, δ 2, v δ 1 δ 2 l 1 l 2 l i V, δ i = d G v, l i δ 3 l 3 Example L(1): L(2): L 3 : Vertex Distance Vertex Distance Vertex Distance d G 1,10 = 5 15
16 2-Hop Labeling: Query Algorithm Query: d G (s, t) = min l L s L(t) d G s, l + d G l, t Paths through common vertices L(1): Example Vertex Distance s t L 3 : Vertex Distance Distance between vertex 1 and 3: : = 7 Answer min{6, 7} = : = 6 16
17 2-Hop Labeling: Challenge Challenge: computing labels Correctness (Exactness) Sizes of labels (Index Size & Query Time) Efficiency (Scalability) Previous approach [Cohen+ 02], [Cheng+ 09], [Jin+ 12], [Abraham+ 12] Reduce to optimization problems Our approach Directly assign label entries by graph searches 17
18 Pruned Landmark Labeling 18
19 Our Approach 1. Naïve Landmark Labeling Conduct a BFS from every vertex 2. Pruned Landmark Labeling (proposed method) Pruning during BFSs 19
20 Naïve Landmark Labeling (w/o pruning) 1. L 0 an empty index 2. For each vertex v 1, v 2,, v n Conduct a BFS from v i Label all the visited vertices L i u = L i 1 u v i, d G u, v i O(nm) indexing & space, O(n) query 20
21 Naïve Landmark Labeling (w/o pruning) After a BFS from 1 L 1 (1): L 1 (2): L 1 (3): Vertex 1 Distance 0 Vertex 1 Distance 2 Vertex 1 Distance After a BFS from 2 L 2 (1): L 2 (2): L 2 (3): Vertex 1 2 Distance 0 2 Vertex 1 2 Distance 2 0 Vertex 1 2 Distance
22 Pruned Landmark Labeling 1. L 0 an empty index 2. For each vertex v 1, v 2,, v n Conduct a pruned BFS from v i Label all the visited vertices L i u = L i 1 u v i, d G u, v i 22
23 Pruned BFS Pruned BFS from v i v i Distance δ u If QUERY v i, u, L i 1 We do not label u this time Current incomplete index We do not traverse edges from u δ Prune u 23
24 Example First BFS from vertex 1 L 1 (1): L 1 (2): Vertex 1 Distance 0 Vertex 1 Distance 2 L 1 (6): Vertex 1 Distance 1 Second BFS from vertex 2 QUERY 2, 6, L 1 = = 3 = d(2,6) Vertex 6 is pruned. 24
25 Example The search space gets smaller and smaller 25
26 Vertex Ordering Strategies Choosing the order of vertices to conduct BFSs from. To prune later BFSs as much as possible, Central vertices should come first Selection of landmarks for landmark-based approx. methods [Potamias+,CIKM 09][Gubichev+,CIKM 10][Sarma+,WSDM 10][Qiao+,ICDE 12][Tretyakov+,CIKM 12] Central vertices should be selected Well studied in [Potamias+ 09] Similar We conduct BFSs from vertices with higher degree 26
27 Pruning on Real-world Networks Number of vertices labeled in each pruned BFS. 27
28 Pruning on Real-world Networks 1% 99% Number of vertices labeled in each pruned BFS. 28
29 Theorems: Correctness Theorem 4.1 QUERY s, t, L i for any s, t and i = QUERY s, t, L i Corollary 4.1 (Correctness) QUERY s, t, L n = d G s, t for any s, t i.e., our method is exact. 29
30 Theorems: Minimality Theorem 4.2 (Minimality) L n (the constructed index) is minimal. i.e., we cannot remove any label entry from the index. 30
31 Theorems: Relations between other methods Landmark-based approx. methods [Potamias+ 09][Gubichev+ 10][Sarma+ 10][Qiao+ 12][Tretyakov+ 12] Attain high average precision by exploiting hubs Tree-decomposition-based methods [Wei 10], [Akiba+ 12] Attain good performance by exploiting tree-like fringes Theorem 4.3 If landmark-based methods attain good precision on a graph then PLL will have small label sizes PLL can also exploit hubs (highly central vertices) Fringe Core Theorem 4.4 PLL perform well on graphs with small tree-width PLL can also exploit tree-like fringes 31
32 Combination of Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) 32
33 Combination of Advantages Exact Methods Approx. Methods 2-Hop Cover [Cohen+,SODA 02] Landmark-based methods [Potamias+, CIKM 09] 2-Hop Cover [Cheng+,EDBT 09] Distance Sketch [Sarma+, WSDM 10] TD-based [Wei, SIGMOD 10] Highway 2-Hop [Jin+, SIGMOD 12] Path Sketch [Gubichev+, CIKM 10] TD-based [Akiba+, EDBT 12] Hierarchical HL [Abraham+,ESA 12] Low tree-width of tree-like fringes Landmark-based method + LCA, Shortcuts, [Qiao+, ICDE 12] [Tretyakov+, CIKM 12] Pruned Landmark Labeling (this work) Existence of highly central nodes 33
34 Bit-parallel Labeling 34
35 Two Labeling Schemes Further improve the performance of pruned labeling In the beginning, pruning does not work much. Therefore, we ignore pruning in the beginning. To skip the overhead of vain pruning testes We apply another labeling scheme here 1. Bit-parallel Labeling t times (typically <100) 2. Pruned Labeling For the rest (works only on unweighted graphs) 35
36 Naïve Landmark Labeling 1. L 0 an empty index 2. For each vertex v 1, v 2,, v n Conduct a BFS from v i Label all the visited vertices L i u = L i 1 u v i, d G u, v i 36
37 Key Insight: Distances of neighbors d v, u = δ n 1 ~n 8 are neighbors of v v δ + 1 δ n 1 n 2 n 3 v n 4 n 5 δ 1 n 6 n 7 n 8 u d n i, u = δ 1, δ, or δ + 1 u 37
38 Bit-parallel Labeling: 65 BFSs at once v Vertex v + at most 64 neighbors n 1 n 2 n64 v δ u u For each u, we compute: Distance δ u (BFS) 64bit bitset 2 (Dynamic Programming) Neighbors with distance δ u 1 Neighbors with distance δ u (Neighbors with distance δ u + 1) 38
39 Experiments 39
40 Indexing Time Dataset Network V E PLL (this work) HHL [Abraham+ 12] TD [Akiba+ 12] Gnutella Computer 63 K 148K 54 s 245 s 209 s Slashdot Social 82 K 948K 6.0 s 670 s 343 s Notredame Web 326 K 1.5M 4.5 s 10,256 s 243 s MetroSec Computer 2.3 M 22M 108 s DNF DNF Hollywood Social 1.1 M 114 M 15,164 s DNF DNF Indochina Web 7.4 M 194 M 6,068 s DNF DNF DNF: Did not finish in one day or ran out of memory Much faster & scalable! PLL, TD: Xeon X5670, 48GB, Linux, C++, g++, sequential HHL: Xeon X5680 2, 96GB, Windows, C++, VC++, 12 cores PLL: degree strategy, 16 BP-BFS for first three datasets, 64 BP-BFS for the others 40
41 10 6 Edges Scalability Largest networks used in experiments (indexing time: <1day) Two orders of magnitude larger networks TEDI HCL TD HHL PLL [SIGMOD 10] [SIGMOD 12] [EDBT 12] [ESA 12] (this work) 41
42 Index Size Dataset Network V E PLL (this work) HHL [Abraham+ 12] TD [Akiba+ 12] Gnutella Computer 63 K 148K 209 MB 380 MB 68 MB Slashdot Social 82 K 948K 48 MB 182 MB 83 MB Notredame Web 326 K 1.5M 138 MB 64 MB 120 MB MetroSec Computer 2.3 M 22M 2.5 GB - - Hollywood Social 1.1 M 114 M 12 GB - - Indochina Web 7.4 M 194 M 22 GB - - Comparable PLL, TD: Xeon X5670, 48GB, Linux, C++, g++ HHL: Xeon X5680 2, 96GB, Windows, C++, VC++ PLL: degree strategy, 16 BP-BFS for first three datasets, 64 BP-BFS for the others Distance: 8bit, Vertex ID: 32 bit 42
43 Query Time Dataset Network V E PLL (this work) HHL [Abraham+ 12] TD [Akiba+ 12] Gnutella Computer 63 K 148K 5.2 μs 11 μs 19 μs Slashdot Social 82 K 948K 0.8 μs 3.9 μs 12 μs Notredame Web 326 K 1.5M 0.5 μs 0.4 μs 39 μs MetroSec Computer 2.3 M 22M 0.7 μs - - Hollywood Social 1.1 M 114 M 15.6 μs - - Indochina Web 7.4 M 194 M 4.1 μs - - Also faster PLL, TD: Xeon X5670, 48GB, Linux, C++, g++, sequential HHL: Xeon X5680 2, 96GB, Windows, C++, VC++, sequential PLL: degree strategy, 16 BP-BFS for first three datasets, 64 BP-BFS for the others Average for 1,000,000 random queries 43
44 Conclusion Distance querying on large complex networks is important and challenging Proposed method: Pruned Landmark Labeling Based on 2-hop cover Conduct pruned BFSs from every vertex Further improved by bit-parallel BFSs Experiments 100x larger networks with comparable indexing time, index size and query time Software available: 44
SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs
SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs Fabian Hueske, TU Berlin June 26, 21 1 Review This document is a review report on the paper Towards Proximity Pattern Mining in Large
More informationCpt S 223. School of EECS, WSU
The Shortest Path Problem 1 Shortest-Path Algorithms Find the shortest path from point A to point B Shortest in time, distance, cost, Numerous applications Map navigation Flight itineraries Circuit wiring
More informationGraph Database Proof of Concept Report
Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment
More informationLecture 8: Routing I Distance-vector Algorithms. CSE 123: Computer Networks Stefan Savage
Lecture 8: Routing I Distance-vector Algorithms CSE 3: Computer Networks Stefan Savage This class New topic: routing How do I get there from here? Overview Routing overview Intra vs. Inter-domain routing
More informationAsking Hard Graph Questions. Paul Burkhardt. February 3, 2014
Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)
More informationDistributed Computing over Communication Networks: Topology. (with an excursion to P2P)
Distributed Computing over Communication Networks: Topology (with an excursion to P2P) Some administrative comments... There will be a Skript for this part of the lecture. (Same as slides, except for today...
More informationInfiniteGraph: The Distributed Graph Database
A Performance and Distributed Performance Benchmark of InfiniteGraph and a Leading Open Source Graph Database Using Synthetic Data Objectivity, Inc. 640 West California Ave. Suite 240 Sunnyvale, CA 94086
More informationSCAN: A Structural Clustering Algorithm for Networks
SCAN: A Structural Clustering Algorithm for Networks Xiaowei Xu, Nurcan Yuruk, Zhidan Feng (University of Arkansas at Little Rock) Thomas A. J. Schweiger (Acxiom Corporation) Networks scaling: #edges connected
More informationMapReduce Algorithms. Sergei Vassilvitskii. Saturday, August 25, 12
MapReduce Algorithms A Sense of Scale At web scales... Mail: Billions of messages per day Search: Billions of searches per day Social: Billions of relationships 2 A Sense of Scale At web scales... Mail:
More informationFast Matching of Binary Features
Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been
More informationBig Graph Processing: Some Background
Big Graph Processing: Some Background Bo Wu Colorado School of Mines Part of slides from: Paul Burkhardt (National Security Agency) and Carlos Guestrin (Washington University) Mines CSCI-580, Bo Wu Graphs
More informationMapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research
MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With
More informationA1 and FARM scalable graph database on top of a transactional memory layer
A1 and FARM scalable graph database on top of a transactional memory layer Miguel Castro, Aleksandar Dragojević, Dushyanth Narayanan, Ed Nightingale, Alex Shamis Richie Khanna, Matt Renzelmann Chiranjeeb
More informationDistance Degree Sequences for Network Analysis
Universität Konstanz Computer & Information Science Algorithmics Group 15 Mar 2005 based on Palmer, Gibbons, and Faloutsos: ANF A Fast and Scalable Tool for Data Mining in Massive Graphs, SIGKDD 02. Motivation
More informationFault-Tolerant Routing Algorithm for BSN-Hypercube Using Unsafety Vectors
Journal of omputational Information Systems 7:2 (2011) 623-630 Available at http://www.jofcis.com Fault-Tolerant Routing Algorithm for BSN-Hypercube Using Unsafety Vectors Wenhong WEI 1,, Yong LI 2 1 School
More informationMicroblogging Queries on Graph Databases: An Introspection
Microblogging Queries on Graph Databases: An Introspection ABSTRACT Oshini Goonetilleke RMIT University, Australia oshini.goonetilleke@rmit.edu.au Timos Sellis RMIT University, Australia timos.sellis@rmit.edu.au
More informationPart 2: Community Detection
Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -
More informationDistributed Dynamic Load Balancing for Iterative-Stencil Applications
Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,
More informationGraph Theory and Complex Networks: An Introduction. Chapter 08: Computer networks
Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, steen@cs.vu.nl Chapter 08: Computer networks Version: March 3, 2011 2 / 53 Contents
More informationFast Iterative Graph Computation with Resource Aware Graph Parallel Abstraction
Human connectome. Gerhard et al., Frontiers in Neuroinformatics 5(3), 2011 2 NA = 6.022 1023 mol 1 Paul Burkhardt, Chris Waring An NSA Big Graph experiment Fast Iterative Graph Computation with Resource
More informationNetwork Design with Coverage Costs
Network Design with Coverage Costs Siddharth Barman 1 Shuchi Chawla 2 Seeun William Umboh 2 1 Caltech 2 University of Wisconsin-Madison APPROX-RANDOM 2014 Motivation Physical Flow vs Data Flow vs. Commodity
More informationAn Energy Efficient Location Service for Mobile Ad Hoc Networks
An Energ Efficient Location Service for Mobile Ad Hoc Networks Zijian Wang 1, Euphan Bulut 1 and Boleslaw K. Szmanski 1, 1 Department of Computer Science, Rensselaer Poltechnic Institute, Tro, NY 12180
More informationMulticast Group Management for Interactive Distributed Applications
Multicast Group Management for Interactive Distributed Applications Carsten Griwodz griff@simula.no September 25, 2008 based on the thesis work of Knut-Helge Vik, knuthelv@simula.no Group communication
More informationEngineering Route Planning Algorithms. Online Topological Ordering for Dense DAGs
Algorithm Engineering for Large Graphs Engineering Route Planning Algorithms Peter Sanders Dominik Schultes Universität Karlsruhe (TH) Online Topological Ordering for Dense DAGs Deepak Ajwani Tobias Friedrich
More informationVirtual Landmarks for the Internet
Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science Internet Distance Matters! Useful for configuring Content delivery networks Peer to peer applications Multiuser
More informationMax Flow, Min Cut, and Matchings (Solution)
Max Flow, Min Cut, and Matchings (Solution) 1. The figure below shows a flow network on which an s-t flow is shown. The capacity of each edge appears as a label next to the edge, and the numbers in boxes
More informationA Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader
A Performance Evaluation of Open Source Graph Databases Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader Overview Motivation Options Evaluation Results Lessons Learned Moving Forward
More informationThe Goldberg Rao Algorithm for the Maximum Flow Problem
The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }
More informationOverview on Graph Datastores and Graph Computing Systems. -- Litao Deng (Cloud Computing Group) 06-08-2012
Overview on Graph Datastores and Graph Computing Systems -- Litao Deng (Cloud Computing Group) 06-08-2012 Graph - Everywhere 1: Friendship Graph 2: Food Graph 3: Internet Graph Most of the relationships
More information1. Write the number of the left-hand item next to the item on the right that corresponds to it.
1. Write the number of the left-hand item next to the item on the right that corresponds to it. 1. Stanford prison experiment 2. Friendster 3. neuron 4. router 5. tipping 6. small worlds 7. job-hunting
More informationClass One: Degree Sequences
Class One: Degree Sequences For our purposes a graph is a just a bunch of points, called vertices, together with lines or curves, called edges, joining certain pairs of vertices. Three small examples of
More informationLink Prediction in Social Networks
CS378 Data Mining Final Project Report Dustin Ho : dsh544 Eric Shrewsberry : eas2389 Link Prediction in Social Networks 1. Introduction Social networks are becoming increasingly more prevalent in the daily
More informationOutline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits
Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique
More informationSocial Media Mining. Graph Essentials
Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures
More informationIncSpan: Incremental Mining of Sequential Patterns in Large Database
IncSpan: Incremental Mining of Sequential Patterns in Large Database Hong Cheng Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 hcheng3@uiuc.edu Xifeng
More informationMS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL
MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL Dr. Allon Cohen Eli Ben Namer info@sanrad.com 1 EXECUTIVE SUMMARY SANRAD VXL provides enterprise class acceleration for virtualized
More informationPHAST: Hardware-Accelerated Shortest Path Trees
PHAST: Hardware-Accelerated Shortest Path Trees Daniel Delling Andrew V. Goldberg Andreas Nowatzyk Renato F. Werneck Microsoft Research Silicon Valley, 1065 La Avenida, Mountain View, CA 94043, USA September
More informationFuzzy Duplicate Detection on XML Data
Fuzzy Duplicate Detection on XML Data Melanie Weis Humboldt-Universität zu Berlin Unter den Linden 6, Berlin, Germany mweis@informatik.hu-berlin.de Abstract XML is popular for data exchange and data publishing
More informationPeer-to-Peer Networks 02: Napster & Gnutella. Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg
Peer-to-Peer Networks 02: Napster & Gnutella Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg Napster Shawn (Napster) Fanning - published 1999 his beta
More informationLecture 13: Distance-vector Routing. Lecture 13 Overview. Bellman-Ford Algorithm. d u (z) = min{c(u,v) + d v (z), c(u,w) + d w (z)}
Lecture : istance-vector Routing S : omputer Networks hris Kanich Quiz TOMORROW Lecture Overview istance vector ssume each router knows its own address and cost to reach each of its directly connected
More informationLecture 12: Link-state Routing"
Lecture 2: Link-state Routing" CSE 23: Computer Networks Alex C. Snoeren HW 3 due next Tuesday! Lecture 2 Overview" Routing overview Intra vs. Inter-domain routing Link-state routing protocols CSE 23 Lecture
More informationRouting in packet-switching networks
Routing in packet-switching networks Circuit switching vs. Packet switching Most of WANs based on circuit or packet switching Circuit switching designed for voice Resources dedicated to a particular call
More informationMizan: A System for Dynamic Load Balancing in Large-scale Graph Processing
/35 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of
More informationHOW CLOUD DATABASE ENABLES EFFICIENT REAL-TIME ANALYTICS?
HOW CLOUD DATABASE ENABLES EFFICIENT REAL-TIME ANALYTICS? DATA MANAGEMENT MATTERS Worldwide data volumes keep growing Real time management of big data Return result in milliseconds Deals with TBs to PBs
More information8.1 Min Degree Spanning Tree
CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree
More informationRethinking SIMD Vectorization for In-Memory Databases
SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest
More informationAn Introduction to The A* Algorithm
An Introduction to The A* Algorithm Introduction The A* (A-Star) algorithm depicts one of the most popular AI methods used to identify the shortest path between 2 locations in a mapped area. The A* algorithm
More informationBinary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
More informationUSING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu
More informationStatic Load Balancing
Load Balancing Load Balancing Load balancing: distributing data and/or computations across multiple processes to maximize efficiency for a parallel program. Static load-balancing: the algorithm decides
More informationFast Contextual Preference Scoring of Database Tuples
Fast Contextual Preference Scoring of Database Tuples Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece Joint work with Evaggelia Pitoura http://dmod.cs.uoi.gr 2 Motivation
More informationA Fast Algorithm For Finding Hamilton Cycles
A Fast Algorithm For Finding Hamilton Cycles by Andrew Chalaturnyk A thesis presented to the University of Manitoba in partial fulfillment of the requirements for the degree of Masters of Science in Computer
More informationGraph Mining and Social Network Analysis
Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationGraphalytics: A Big Data Benchmark for Graph-Processing Platforms
Graphalytics: A Big Data Benchmark for Graph-Processing Platforms Mihai Capotă, Tim Hegeman, Alexandru Iosup, Arnau Prat-Pérez, Orri Erling, Peter Boncz Delft University of Technology Universitat Politècnica
More informationYarrp ing the Internet
Yarrp ing the Internet Robert Beverly Naval Postgraduate School February 12, 2016 Active Internet Measurements (AIMS) Workshop R. Beverly (NPS) Yarrp AIMS 2016 1 / 17 Motivation Active Topology Probing
More informationRandom Map Generator v1.0 User s Guide
Random Map Generator v1.0 User s Guide Jonathan Teutenberg 2003 1 Map Generation Overview...4 1.1 Command Line...4 1.2 Operation Flow...4 2 Map Initialisation...5 2.1 Initialisation Parameters...5 -w xxxxxxx...5
More informationAnalysis of Algorithms, I
Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, February 26, 2015 Outline 1 Recap 2 Representing graphs 3 Breadth-first search (BFS) 4 Applications
More informationDynamic Graph Query Primitives for SDN-based Cloud Network Management
Dynamic Graph Query Primitives for SDN-based Cloud Network Management Ramya Raghavendra, Jorge Lobo, Kang-Won Lee IBM T. J. Watson Research Center {rraghav,jlobo,kangwon}@us.ibm.com ABSTRACT The need to
More informationGraph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis
Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.0, steen@cs.vu.nl Chapter 06: Network analysis Version: April 8, 04 / 3 Contents Chapter
More informationGreedy Routing on Hidden Metric Spaces as a Foundation of Scalable Routing Architectures
Greedy Routing on Hidden Metric Spaces as a Foundation of Scalable Routing Architectures Dmitri Krioukov, kc claffy, and Kevin Fall CAIDA/UCSD, and Intel Research, Berkeley Problem High-level Routing is
More informationCS 598CSC: Combinatorial Optimization Lecture date: 2/4/2010
CS 598CSC: Combinatorial Optimization Lecture date: /4/010 Instructor: Chandra Chekuri Scribe: David Morrison Gomory-Hu Trees (The work in this section closely follows [3]) Let G = (V, E) be an undirected
More informationThe Role and uses of Peer-to-Peer in file-sharing. Computer Communication & Distributed Systems EDA 390
The Role and uses of Peer-to-Peer in file-sharing Computer Communication & Distributed Systems EDA 390 Jenny Bengtsson Prarthanaa Khokar jenben@dtek.chalmers.se prarthan@dtek.chalmers.se Gothenburg, May
More informationWhite Paper November 2015. Technical Comparison of Perspectium Replicator vs Traditional Enterprise Service Buses
White Paper November 2015 Technical Comparison of Perspectium Replicator vs Traditional Enterprise Service Buses Our Evolutionary Approach to Integration With the proliferation of SaaS adoption, a gap
More informationGraph Databases. Prad Nelluru, Bharat Naik, Evan Liu, Bon Koo
Graph Databases Prad Nelluru, Bharat Naik, Evan Liu, Bon Koo 1 Why are graphs important? Modeling chemical and biological data Social networks The web Hierarchical data 2 What is a graph database? A database
More informationSuperViz: An Interactive Visualization of Super-Peer P2P Network
SuperViz: An Interactive Visualization of Super-Peer P2P Network Anthony (Peiqun) Yu pqyu@cs.ubc.ca Abstract: The Efficient Clustered Super-Peer P2P network is a novel P2P architecture, which overcomes
More informationChapter 10 Link-State Routing Protocols
Chapter 10 Link-State Routing Protocols CCNA2-1 Chapter 10 Note for Instructors These presentations are the result of a collaboration among the instructors at St. Clair College in Windsor, Ontario. Thanks
More informationSubgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro
Subgraph Patterns: Network Motifs and Graphlets Pedro Ribeiro Analyzing Complex Networks We have been talking about extracting information from networks Some possible tasks: General Patterns Ex: scale-free,
More informationEfficient Search in Gnutella-like Small-World Peerto-Peer
Efficient Search in Gnutella-like Small-World Peerto-Peer Systems * Dongsheng Li, Xicheng Lu, Yijie Wang, Nong Xiao School of Computer, National University of Defense Technology, 410073 Changsha, China
More informationCacti with minimum, second-minimum, and third-minimum Kirchhoff indices
MATHEMATICAL COMMUNICATIONS 47 Math. Commun., Vol. 15, No. 2, pp. 47-58 (2010) Cacti with minimum, second-minimum, and third-minimum Kirchhoff indices Hongzhuan Wang 1, Hongbo Hua 1, and Dongdong Wang
More informationP2P: centralized directory (Napster s Approach)
P2P File Sharing P2P file sharing Example Alice runs P2P client application on her notebook computer Intermittently connects to Internet; gets new IP address for each connection Asks for Hey Jude Application
More informationSeminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina stefano.martina@stud.unifi.it
Seminar Path planning using Voronoi diagrams and B-Splines Stefano Martina stefano.martina@stud.unifi.it 23 may 2016 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International
More informationData Structures. Chapter 8
Chapter 8 Data Structures Computer has to process lots and lots of data. To systematically process those data efficiently, those data are organized as a whole, appropriate for the application, called a
More information1 One Dimensional Horizontal Motion Position vs. time Velocity vs. time
PHY132 Experiment 1 One Dimensional Horizontal Motion Position vs. time Velocity vs. time One of the most effective methods of describing motion is to plot graphs of distance, velocity, and acceleration
More informationA Review And Evaluations Of Shortest Path Algorithms
A Review And Evaluations Of Shortest Path Algorithms Kairanbay Magzhan, Hajar Mat Jani Abstract: Nowadays, in computer networks, the routing is based on the shortest path problem. This will help in minimizing
More informationFast Shortest Path Distance Estimation in Large Networks
Fast Shortest Path Distance Estimation in Large Networks Michalis Potamias 1 Francesco Bonchi 2 Carlos Castillo 2 Aristides Gionis 2 1 Computer Science Department 2 Yahoo! Research Boston University, USA
More informationEfficient Storage and Temporal Query Evaluation of Hierarchical Data Archiving Systems
Efficient Storage and Temporal Query Evaluation of Hierarchical Data Archiving Systems Hui (Wendy) Wang, Ruilin Liu Stevens Institute of Technology, New Jersey, USA Dimitri Theodoratos, Xiaoying Wu New
More informationHow to Design the Shortest Path Between Two Keyword Algorithms
Multi-Approximate-Keyword Routing in GIS Data Bin Yao Mingwang Tang Feifei Li yaobin@cs.sjtu.edu.cn, {tang,lifeifei}@cs.utah.edu Department of Computer Science and Engineering, Shanghai JiaoTong University
More informationSimilarity Search in a Very Large Scale Using Hadoop and HBase
Similarity Search in a Very Large Scale Using Hadoop and HBase Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France
More informationSome questions... Graphs
Uni Innsbruck Informatik - 1 Uni Innsbruck Informatik - 2 Some questions... Peer-to to-peer Systems Analysis of unstructured P2P systems How scalable is Gnutella? How robust is Gnutella? Why does FreeNet
More informationArchitectural Framework for Large- Scale Multicast in Mobile Ad Hoc Networks
Architectural Framework for Large- Scale Multicast in Mobile Ad Hoc Networks Ahmed Helmy Electrical Engineering Department University of Southern California (USC) helmy@usc.edu http://ceng.usc.edu/~helmy
More informationCOMP 631: COMPUTER NETWORKS. Internet Routing. Jasleen Kaur. Fall 2014. Forwarding vs. Routing: Local vs. Distributed
OMP 3: OMPUTER NETWORKS // OMP 3: OMPUTER NETWORKS Internet Routing Jasleen Kaur Fall 0 Forwarding vs. Routing: Local vs. istributed oth datagram and virtual-circuit based networks need to know how to
More informationEfficiently Identifying Inclusion Dependencies in RDBMS
Efficiently Identifying Inclusion Dependencies in RDBMS Jana Bauckmann Department for Computer Science, Humboldt-Universität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany bauckmann@informatik.hu-berlin.de
More informationIn-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps. Yu Su, Yi Wang, Gagan Agrawal The Ohio State University
In-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps Yu Su, Yi Wang, Gagan Agrawal The Ohio State University Motivation HPC Trends Huge performance gap CPU: extremely fast for generating
More information5. Binary objects labeling
Image Processing - Laboratory 5: Binary objects labeling 1 5. Binary objects labeling 5.1. Introduction In this laboratory an object labeling algorithm which allows you to label distinct objects from a
More informationFPGA-based Multithreading for In-Memory Hash Joins
FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded
More informationOn Mining Group Patterns of Mobile Users
On Mining Group Patterns of Mobile Users Yida Wang 1, Ee-Peng Lim 1, and San-Yih Hwang 2 1 Centre for Advanced Information Systems, School of Computer Engineering Nanyang Technological University, Singapore
More informationEfficient Iceberg Query Evaluation for Structured Data using Bitmap Indices
Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil
More informationInternet Firewall CSIS 4222. Packet Filtering. Internet Firewall. Examples. Spring 2011 CSIS 4222. net15 1. Routers can implement packet filtering
Internet Firewall CSIS 4222 A combination of hardware and software that isolates an organization s internal network from the Internet at large Ch 27: Internet Routing Ch 30: Packet filtering & firewalls
More informationCSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.
Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure
More informationNetwork (Tree) Topology Inference Based on Prüfer Sequence
Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,
More informationScalable Source Routing
Scalable Source Routing January 2010 Thomas Fuhrmann Department of Informatics, Self-Organizing Systems Group, Technical University Munich, Germany Routing in Networks You re there. I m here. Scalable
More informationSociaLite: Datalog Extensions for Efficient Social Network Analysis
SociaLite: Datalog Extensions for Efficient Social Network Analysis Jiwon Seo Stephen Guo Monica S. Lam Computer Systems Laboratory Stanford University Stanford, CA, USA 94305 {jiwon, sdguo, lam}@stanford.edu
More informationEfficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems
Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems Kunwadee Sripanidkulchai Bruce Maggs Hui Zhang Carnegie Mellon University, Pittsburgh, PA 15213 {kunwadee,bmm,hzhang}@cs.cmu.edu
More informationProtein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
More informationSoftware tools for Complex Networks Analysis. Fabrice Huet, University of Nice Sophia- Antipolis SCALE (ex-oasis) Team
Software tools for Complex Networks Analysis Fabrice Huet, University of Nice Sophia- Antipolis SCALE (ex-oasis) Team MOTIVATION Why do we need tools? Source : nature.com Visualization Properties extraction
More informationWAN Wide Area Networks. Packet Switch Operation. Packet Switches. COMP476 Networked Computer Systems. WANs are made of store and forward switches.
Routing WAN Wide Area Networks WANs are made of store and forward switches. To there and back again COMP476 Networked Computer Systems A packet switch with two types of I/O connectors: one type is used
More informationHigh-performance metadata indexing and search in petascale data storage systems
High-performance metadata indexing and search in petascale data storage systems A W Leung, M Shao, T Bisson, S Pasupathy and E L Miller Storage Systems Research Center, University of California, Santa
More informationComputer Algorithms. NP-Complete Problems. CISC 4080 Yanjun Li
Computer Algorithms NP-Complete Problems NP-completeness The quest for efficient algorithms is about finding clever ways to bypass the process of exhaustive search, using clues from the input in order
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationLoad balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
More information