Big Data Graph Algorithms

Size: px
Start display at page:

Download "Big Data Graph Algorithms"

Transcription

1 Christian Schulz CompSE seminar, RWTH Aachen, Karlsruhe 1 Christian Schulz: Institute for Theoretical Informatics

2 Algorithm Engineering design analyze Algorithms implement experiment 1 Christian Schulz:

3 Research Areas Scalable (parallel) sorting Scalable (parallel/external) graph partitioning Scalable (parallel) graph generation Scalable (parallel/external) matchings Scalable (shared-mem parallel) graph drawing Independent sets on large inputs 2 Christian Schulz:

4 Research Areas Scalable (parallel) sorting Scalable (parallel/external) graph partitioning Scalable (parallel) graph generation Scalable (parallel/external) matchings Scalable (shared-mem parallel) graph drawing Independent sets on large inputs 2 Christian Schulz:

5 Huge Complex Networks Rn n 3 Ax = b Rn 3 Christian Schulz:

6 Scalable Graph Partitioning joint work with: H. Meyerhenke and P. Sanders 3 Christian Schulz:

7 The Common Parallel Approach Mesh partitioned via dual graph 1. Each volume (data, calculation) represented by a vertex (+edges) 2. Interdependencies represented by edges All PE s get same amount of work Communication is expensive Graph Partitioning Problem: Partition a graph into (almost) equally sized blocks, such that the number of edges connecting vertices from different blocks is minimal. 4 Christian Schulz:

8 ɛ-balanced Graph Partitioning Partition graph G = (V, E, c : V R >0, ω : E R >0 ) into k disjoint blocks s.t. total node weight of each block 1 + ɛ total node weight k total weight of cut edges as small as possible Relevant Applications: graph processing frameworks, parallel sparse matrix vector mult Christian Schulz:

9 Multilevel Graph Partitioning input graph output partition... local improvement... match contract initial partitioning uncontract Successful in many systems (using matchings or AMG for coarsening) 6 Christian Schulz:

10 Matching-based Coarsening A B a b a+b A+B 1. compute matching 2. perform contraction 7 Christian Schulz:

11 Matching-based Coarsening A B a b a+b A+B 1. compute matching 2. perform contraction 7 Christian Schulz:

12 Local Search compute gain: g(v) = d ext (v) d int (v) select blocks alternately move nodes greedy edge cut: 7 8 Christian Schulz:

13 Local Search compute gain: g(v) = d ext (v) d int (v) select blocks alternately move nodes greedy edge cut: 7 8 Christian Schulz:

14 Local Search compute gain: g(v) = d ext (v) d int (v) select blocks alternately move nodes greedy edge cut: 7 8 Christian Schulz:

15 Local Search update gain g(v) of neighbors move a node at most once edge cut: 7, 6 8 Christian Schulz:

16 Local Search until stop criteria reached increase in cut possible edge cut: 7, 6, 5 8 Christian Schulz:

17 Local Search until stop criteria reached increase in cut possible edge cut: 7, 6, 5, 5 8 Christian Schulz:

18 Local Search until stop criteria reached increase in cut possible edge cut: 7, 6, 5, 5, 6 8 Christian Schulz:

19 Local Search cut #steps increase in cut possible undo changes until best solution reached 9 Christian Schulz:

20 Local Search final partition with edge cut 5 linear time 10 Christian Schulz:

21 KaHIP Karlsruhe High Quality Partitioning buffoon [ALENEX12] social [SEA14,IPDPS15] separators distr. evol. alg. [ALENEX12] [DIMACS12] highly balanced: [SEA13] A 0 B A 0 B A C C C B V F W [ESA11] cycles a la multigrid input graph Output Partition [IPDPS10]... flows etc. [ESA11]... edge local improvement ratings match + [SEA12/14] contract uncontract initial partitioning parallel [IPDPS10] Multilevel Graph Partitioning 11 Christian Schulz:

22 KaHIP Benchmarks 1. Walschaw Benchmark: runtime neglected 816 instances (ɛ {0%, 1%, 3%, 5%}) focus on solution quality almost all instances improved or reproduced 2. 10th DIMACS Implementation Challenge best scores in categories: solution quality and runtime vs. solution quality 12 Christian Schulz:

23 Partition Complex Networks 12 Christian Schulz:

24 Matching-based Coarsening Problem bad for networks that are highly irregular substantial reduction is hard using matchings may contract wrong edges! 13 Christian Schulz:

25 Matching-based Coarsening Problem bad for networks that are highly irregular substantial reduction is hard using matchings may contract wrong edges! 13 Christian Schulz:

26 Basic Idea aggressive contraction / simple and fast local search main idea: contract clusterings clustering paradigm: internally dense and externally sparse 14 Christian Schulz:

27 Basic Idea Contraction of Clusterings a b c A C B a+b+c A+B+C contraction: respect balance and cut avoid large blocks: size constraint U recurse until graph is small 15 Christian Schulz:

28 Basic Idea Contraction of Clusterings a b c A C B a+b+c A+B+C contraction: respect balance and cut avoid large blocks: size constraint U recurse until graph is small 15 Christian Schulz:

29 Label Propagation Cut-based, Linear Time Clustering Algorithm [Raghavan et. al] cut-based clustering using size-constraint label propagation start with singletons traverse nodes in random order or smallest degree first move node to cluster having strongest eligible connection modification eligible: w.r.t size constraint U Scan 16 Christian Schulz:

30 Label Propagation Iteration Cut [%] Christian Schulz:

31 Label Propagation Iteration Cut [%] Christian Schulz:

32 Label Propagation Iteration Cut [%] Christian Schulz:

33 Label Propagation Iteration Cut [%] Christian Schulz:

34 Label Propagation Iteration Cut [%] Christian Schulz:

35 Label Propagation Iteration Cut [%] Christian Schulz:

36 Label Propagation Iteration Cut [%] Christian Schulz:

37 Label Propagation Iteration Cut [%] Christian Schulz:

38 Label Propagation Iteration Cut [%] Christian Schulz:

39 Label Propagation Iteration Cut [%] Christian Schulz:

40 Label Propagation Simple Local Search Greedy Local Search: start with partition from coarser level traverse nodes in random order move node to cluster having strongest eligible connection eligible: w.r.t size constraint U := (1 + ɛ) V k Scan 18 Christian Schulz:

41 Parallelization 18 Christian Schulz:

42 Graph Distribution over PEs Graph Distribution: a PE receives n/p vertices and their edges Processor I Processor II Communication ghost nodes: adjacent nodes on other processor (communication!) interface nodes: nodes adjacent to ghost nodes 19 Christian Schulz:

43 Label Propagation Distributed Memory each PE has a static part of the graph, only block IDs can change Overlap Computation and Communication (PE centric view): V Phase i 1 Phase i Phase i+1 Scan At the end of phase i: send block ID updates of phase i to neighboring PEs receive block ID updates from neighboring PEs from phase i 1 * while scanning in phase i, messages are routed through the network * algorithm converged nothing will be communicated 20 Christian Schulz:

44 Contraction of Clusterings The Parallel Case High Level a b c A C B a+b+c A+B+C parallel find mapping C : n.. 1 n.. 1 exchange subgraphs, compute contracted graph locally when graph small parallel initial partitioning 21 Christian Schulz:

45 Experiments 21 Christian Schulz:

46 Parallel Solution Quality Performance k = 2 blocks, 32PEs instances: meshes and social networks/web graphs ParMetis has ineffective coarsening (matching-based) due to memory consumption of coarsest graph (dist. among PEs) could not solve arabic-2005, sk-2005 and uk-2007 solved instances (ParMetis): fast and eco yield 19.2% and 27.4% improvement fast and eco slower on average social networks/web graphs: fast: 38% less cut edges and > 2 faster eco: 45% less cut edges and slower best instance: 18 faster and 61.6% less cut edges improvement over Facebook [Ugander and Backstrom]: 45% less cut edges on LiveJournal and much faster (k = 100) 22 Christian Schulz:

47 Strong Scaling Social Networks 1000 Fast sk-2007 Fast arabic-2005 Fast uk-2002 Fast uk-2007 Minimal uk-2007 total time [s] K 2K number of PEs p uk-2007 can be partitioned in 15.2 seconds (seq. 10.5min) 72 seconds for random geometric graph with 22G edges more scaling results in the paper 23 Christian Schulz:

48 Graph Drawing joint work with: H. Meyerhenke and M. Nöllenburg 23 Christian Schulz:

49 Problem x : V R 2 G = (V, E, d) d : E R 24 Christian Schulz:

50 Problem x : V R 2 G = (V, E, d) d : E R d 1 24 Christian Schulz:

51 Maximal Entropy Stress Model [Gansner et al. 13] Entropy H(x): physics: nodes evenly dispersed nodes as far away as possible some nodes have predefined distance! Maximal Entropy Stress Model: max H(x) := ln x u x v {u,v} E subject to x u x v = d uv, {u, v} E 25 Christian Schulz:

52 Maximal Entropy Stress Model [Gansner et al. 13] Entropy H(x): physics: nodes evenly dispersed nodes as far away as possible some nodes have predefined distance! Maximal Entropy Stress Model: max H(x) := ln x u x v {u,v} E subject to x u x v = d uv, {u, v} E not possible to satisfy all constraints! model may be infeasible 25 Christian Schulz:

53 Maximal Entropy Stress Model [Gansner et al. 13] Compromise: min error, max entropy min w uv ( x u x v d uv ) 2 αh(x) u,v E α trade-off parameter Solve optimization problem by repeatedly solving Laplacian systems or iterative scheme Christian Schulz:

54 Maximal Entropy Stress Model [Gansner et al. 13] Compromise: min error, max entropy min w uv ( x u x v d uv ) 2 αh(x) u,v E α trade-off parameter Solve optimization problem by repeatedly solving Laplacian systems or iterative scheme Christian Schulz:

55 Maximal Entropy Stress Model [Gansner et al. 13]... or iterative scheme: x u 1 ( x u x v ρ u w uv x v + d uv x {u,v} E u x v overall update costs O(n 2 ) per iteration ) + α ρ u {u,v}/ E x u x v x u x v 2 Our contributions: make this usable and fast in practice multilevel integration approximate long-range forces employ parallelism 27 Christian Schulz:

56 Multilevel Graph Drawing [Hadany, Harel 99] input graph output drawing... improve drawing... contract initial drawing uncontract as before: contract using size-constrained clusterings 28 Christian Schulz:

57 Multilevel Graph Drawing Initial Drawing coarsen until only two nodes left place them at optimal distance define distances on coarse graphs, stay tuned x u = (0, 0) x v = (0, d uv ) 29 Christian Schulz:

58 Uncoarsening Local Improvement minimize maxent-stress on each level of hierarchy assume disk with radius c(u) to draw c(u) vertices c(u)+ c(v) define distance d uv := 2 on current level Iterative Scheme x u 1 ( x u x v ρ u w uv x v + d uv x {u,v} E u x v ) + α ρ u {u,v}/ E x u x v x u x v 2 30 Christian Schulz:

59 Uncoarsening Local Improvement minimize maxent-stress on each level of hierarchy assume disk with radius c(u) to draw c(u) vertices c(u)+ c(v) define distance d uv := 2 on current level Iterative Scheme x u 1 ( x u x v ρ u w uv x v + d uv x {u,v} E u x v ) + α ρ u {u,v}/ E x u x v x u x v 2 30 Christian Schulz:

60 Local Improvement Iterative Scheme Approximation x u... {u,v}/ E x u x v x u x v } {{ 2 } =:r(u,v) x u r(u, v) + u =v M(u)=M(v) v V v =M(u) ν(v x u x v ) x u x v 2 {u,v} E r(u, v) M(u) cluster of u ν(v ) number of finer vertices of v on current level V vertex set of next coarser level x v coordinate of v 31 Christian Schulz:

61 Local Improvement Approximation x u r(u, v) + u =v M(u)=M(v) v V v =M(u) ν(v x u x v ) x u x v 2 M(u) cluster of u ν(v ) number of finer vertices of v on current level V vertex set of next coarser level x v coordinate of v {u,v} E r(u, v) 32 Christian Schulz:

62 Local Improvement Additional Enhancements after each iteration update barycenter of coarse nodes vertex computations independent add parallelism use approximation multiple h levels beneath in hierarchy input graph output drawing... improve drawing... contract initial drawing uncontract Proposition: Assume equal cluster sizes. The running time of one iteration of MulMent h, h 0, is O(m + n h+2 h+1 ) 33 Christian Schulz:

63 Experiments 33 Christian Schulz:

64 Scalability Running Time [Delaunay n = 2 20 ] 10 6 MulMent 0 MulMent 1 MulMent 2 MulMent 4 MulMent 6 MulMent 8 MulMent 9 MulMent total time [s] number of PEs p 34 Christian Schulz:

65 Running Times graph PMDS MaxEnt MulMent 0 MulMent 1 MulMent 10 btree bus USpowerG elt commanche bcsstk fe pwt del luxembourg nyc auto del Table : Running times in seconds per graph. Smaller is better. PivotMDS and MaxEnt use one thread (sequential codes), the MulMent algorithms use 32 cores (64 threads). Running times of MaxEnt are without the time of PMDS (which yields input coordinates to MaxEnt) 35 Christian Schulz:

66 Experimental Results Summary Influence of h / Comparision increasing h not a large impact on solution quality maxent-stress remains comparable maxent-stress of MaxEnt and MulMent more or less similar Dynamic Networks model: remove x% random edges, insert x% edges (distance D) 4 faster (h = 0), save 50% time (h = 7) 9% worse full-stress, 1% better maxent-stress 36 Christian Schulz:

67 Example Drawings fe pwt PMDS MaxEnt MulMent 37 Christian Schulz:

68 Example Drawings bcsstk31 PMDS 37 Christian Schulz: MaxEnt MulMent

69 Example Drawings commanche PMDS MaxEnt MulMent 37 Christian Schulz:

70 Independent Sets joint work with: S. Lamm, P. Sanders, D. Strash and R. Werneck 37 Christian Schulz:

71 Definitions Independent Set: subset S V such that there are no adjacent nodes in S Maximum Independent Set: maximum cardinality set S related to maximum clique and minimum vertex cover finding a MIS is NP-hard and hard to approximate 38 Christian Schulz:

72 Common Approaches use heuristic algorithms gradually improve a single solution node deletions, insertions and swaps plateau search diversification & restart rules Andrade et al. (2012): iterated local search using (j, k)-swaps 39 Christian Schulz:

73 ARW Local Search Node Swaps: (j, k)-swaps: remove j solution nodes and insert k new ones (1, 2)-swaps: remove single solution node and insert two new ones Local Search: search for (1, 2)-swaps in time O(m) use data structure that supports fast insertion and removal Solution nodes Free nodes Non-free non-solution nodes 40 Christian Schulz:

74 ɛ-balanced Graph Partitioning Node Separators Partition graph G = (V, E, c : V R >0, ω : E R >0 ) into k disjoint blocks + node separator s.t. total node weight of each block 1 + ɛ total node weight k total size of node separator as small as possible 41 Christian Schulz:

75 Evolutionary Algorithms General Structure procedure steady-state-ea create initial population P while stopping criterion not fulfilled select parents P 1, P 2 from P combine P 1 with P 2 to create offspring o mutate offspring o evict individual in population using o return the fittest individual that occurred 42 Christian Schulz:

76 Combine Operations Graph Partitioning: exchange whole blocks of solutions operators for edge separators and node separators small cut-size vital for efficiency + = Local Search: resulting independent sets may not be maximal use maximization step and ARW to reach local optimum 43 Christian Schulz:

77 Node Separator Combine V S V 2 S 1 V1 V 2 S V V 2 1 build node separator V = V 1 V 2 S use node separator as crossover point combination takes linear time O(n) maximize with greedy + ARW local search 44 Christian Schulz:

78 Node Separator Combine V S V 2 S 1 V1 V 2 S V V 2 1 build node separator V = V 1 V 2 S use node separator as crossover point combination takes linear time O(n) maximize with Greedy + ARW local search 44 Christian Schulz:

79 Kernelization [Akiba,Iwata 15] Reductions: rules to decrease graph size, while maintaining optimality solve problem on problem kernel (using EA) obtain solution on input graph 45 Christian Schulz:

80 Kernelization [Akiba,Iwata 15] Reductions: rules to decrease graph size, while maintaining optimality Example: remove degree 0 or 1 vertices v neighbor of v in MIS choose v instead, else add v 45 Christian Schulz:

81 Kernelization [Akiba,Iwata 15] Reductions: rules to decrease graph size, while maintaining optimality Example: remove degree 0 or 1 vertices v neighbor of v in MIS choose v instead, else add v much more reductions used in practice [Lamm et al. 16] 45 Christian Schulz:

82 Guess likely candidates can we guess vertices that are in MIS? idea: select small-degree vertices from fittest independent set apply more reductions and recurse! 46 Christian Schulz:

83 Experiments 46 Christian Schulz:

84 Near-optimal on difficult networks finds exact MIS faster, when exact algorithm is slow: Skitter 48 min 21 min Stanford 13 hours 5 min bcsstk hours 2.4 sec Skitter 2 hours 28 sec,... finds exact MIS, for large networks with known MIS size consistently finds larger solutions on social and road networks Solution Size Time [s] ARW EvoMIS ReduMIS Solution Size Time [s] ARW EvoMIS ReduMIS consistent, even as we scale to graphs on 10M to 100M nodes 47 Christian Schulz:

85 Conclusions 47 Christian Schulz:

86 Conclusion apply algorithm engineering to optimization problems obtain algorithms that scale to large inputs and machines outperform state-of-the-art open source implementations KaHIP: algo2.iti.kit.edu/kahip KaDraw: algo2.iti.kit.edu/kadraw KaMIS: algo2.iti.kit.edu/kamis realistic algorithm models 3 engineering perf. guarantees 48 Christian Schulz: experiments appl. engineering implementation algorithm libraries 9 10 applications analysis deduction real Inputs design 4 falsifiable 5 hypotheses 7 induction 6 8

Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

More information

FAST LOCAL SEARCH FOR THE MAXIMUM INDEPENDENT SET PROBLEM

FAST LOCAL SEARCH FOR THE MAXIMUM INDEPENDENT SET PROBLEM FAST LOCAL SEARCH FOR THE MAXIMUM INDEPENDENT SET PROBLEM DIOGO V. ANDRADE, MAURICIO G.C. RESENDE, AND RENATO F. WERNECK Abstract. Given a graph G = (V, E), the independent set problem is that of finding

More information

A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS

A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS SIAM J. SCI. COMPUT. Vol. 20, No., pp. 359 392 c 998 Society for Industrial and Applied Mathematics A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS GEORGE KARYPIS AND VIPIN

More information

SCAN: A Structural Clustering Algorithm for Networks

SCAN: A Structural Clustering Algorithm for Networks SCAN: A Structural Clustering Algorithm for Networks Xiaowei Xu, Nurcan Yuruk, Zhidan Feng (University of Arkansas at Little Rock) Thomas A. J. Schweiger (Acxiom Corporation) Networks scaling: #edges connected

More information

Engineering Route Planning Algorithms. Online Topological Ordering for Dense DAGs

Engineering Route Planning Algorithms. Online Topological Ordering for Dense DAGs Algorithm Engineering for Large Graphs Engineering Route Planning Algorithms Peter Sanders Dominik Schultes Universität Karlsruhe (TH) Online Topological Ordering for Dense DAGs Deepak Ajwani Tobias Friedrich

More information

Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations

Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Roy D. Williams, 1990 Presented by Chris Eldred Outline Summary Finite Element Solver Load Balancing Results Types Conclusions

More information

How To Cluster Of Complex Systems

How To Cluster Of Complex Systems Entropy based Graph Clustering: Application to Biological and Social Networks Edward C Kenley Young-Rae Cho Department of Computer Science Baylor University Complex Systems Definition Dynamically evolving

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

More information

Network Algorithms for Homeland Security

Network Algorithms for Homeland Security Network Algorithms for Homeland Security Mark Goldberg and Malik Magdon-Ismail Rensselaer Polytechnic Institute September 27, 2004. Collaborators J. Baumes, M. Krishmamoorthy, N. Preston, W. Wallace. Partially

More information

Small Maximal Independent Sets and Faster Exact Graph Coloring

Small Maximal Independent Sets and Faster Exact Graph Coloring Small Maximal Independent Sets and Faster Exact Graph Coloring David Eppstein Univ. of California, Irvine Dept. of Information and Computer Science The Exact Graph Coloring Problem: Given an undirected

More information

Lecture 12: Partitioning and Load Balancing

Lecture 12: Partitioning and Load Balancing Lecture 12: Partitioning and Load Balancing G63.2011.002/G22.2945.001 November 16, 2010 thanks to Schloegel,Karypis and Kumar survey paper and Zoltan website for many of today s slides and pictures Partitioning

More information

Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014 Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

More information

An Empirical Study of Two MIS Algorithms

An Empirical Study of Two MIS Algorithms An Empirical Study of Two MIS Algorithms Email: Tushar Bisht and Kishore Kothapalli International Institute of Information Technology, Hyderabad Hyderabad, Andhra Pradesh, India 32. tushar.bisht@research.iiit.ac.in,

More information

Guessing Game: NP-Complete?

Guessing Game: NP-Complete? Guessing Game: NP-Complete? 1. LONGEST-PATH: Given a graph G = (V, E), does there exists a simple path of length at least k edges? YES 2. SHORTEST-PATH: Given a graph G = (V, E), does there exists a simple

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm. Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three

More information

Distributed Dynamic Load Balancing for Iterative-Stencil Applications

Distributed Dynamic Load Balancing for Iterative-Stencil Applications Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,

More information

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based

More information

FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG

FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG INSTITUT FÜR INFORMATIK (MATHEMATISCHE MASCHINEN UND DATENVERARBEITUNG) Lehrstuhl für Informatik 10 (Systemsimulation) Massively Parallel Multilevel Finite

More information

Part 2: Community Detection

Part 2: Community Detection Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

More information

Big Graph Processing: Some Background

Big Graph Processing: Some Background Big Graph Processing: Some Background Bo Wu Colorado School of Mines Part of slides from: Paul Burkhardt (National Security Agency) and Carlos Guestrin (Washington University) Mines CSCI-580, Bo Wu Graphs

More information

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

More information

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs Fabian Hueske, TU Berlin June 26, 21 1 Review This document is a review report on the paper Towards Proximity Pattern Mining in Large

More information

A scalable multilevel algorithm for graph clustering and community structure detection

A scalable multilevel algorithm for graph clustering and community structure detection A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures

More information

Big graphs for big data: parallel matching and clustering on billion-vertex graphs

Big graphs for big data: parallel matching and clustering on billion-vertex graphs 1 Big graphs for big data: parallel matching and clustering on billion-vertex graphs Rob H. Bisseling Mathematical Institute, Utrecht University Collaborators: Bas Fagginger Auer, Fredrik Manne, Albert-Jan

More information

MuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines

MuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines MuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines Daniil Chivilikhin and Vladimir Ulyantsev National Research University of IT, Mechanics and Optics St.

More information

Computer Algorithms. NP-Complete Problems. CISC 4080 Yanjun Li

Computer Algorithms. NP-Complete Problems. CISC 4080 Yanjun Li Computer Algorithms NP-Complete Problems NP-completeness The quest for efficient algorithms is about finding clever ways to bypass the process of exhaustive search, using clues from the input in order

More information

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory Graph Analytics in Big Data John Feo Pacific Northwest National Laboratory 1 A changing World The breadth of problems requiring graph analytics is growing rapidly Large Network Systems Social Networks

More information

MapReduce Algorithms. Sergei Vassilvitskii. Saturday, August 25, 12

MapReduce Algorithms. Sergei Vassilvitskii. Saturday, August 25, 12 MapReduce Algorithms A Sense of Scale At web scales... Mail: Billions of messages per day Search: Billions of searches per day Social: Billions of relationships 2 A Sense of Scale At web scales... Mail:

More information

Towards real-time image processing with Hierarchical Hybrid Grids

Towards real-time image processing with Hierarchical Hybrid Grids Towards real-time image processing with Hierarchical Hybrid Grids International Doctorate Program - Summer School Björn Gmeiner Joint work with: Harald Köstler, Ulrich Rüde August, 2011 Contents The HHG

More information

The Minimum Consistent Subset Cover Problem and its Applications in Data Mining

The Minimum Consistent Subset Cover Problem and its Applications in Data Mining The Minimum Consistent Subset Cover Problem and its Applications in Data Mining Byron J Gao 1,2, Martin Ester 1, Jin-Yi Cai 2, Oliver Schulte 1, and Hui Xiong 3 1 School of Computing Science, Simon Fraser

More information

5.1 Bipartite Matching

5.1 Bipartite Matching CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson

More information

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

More information

Attacking Anonymized Social Network

Attacking Anonymized Social Network Attacking Anonymized Social Network From: Wherefore Art Thou RX3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography Presented By: Machigar Ongtang (Ongtang@cse.psu.edu ) Social

More information

recursion, O(n), linked lists 6/14

recursion, O(n), linked lists 6/14 recursion, O(n), linked lists 6/14 recursion reducing the amount of data to process and processing a smaller amount of data example: process one item in a list, recursively process the rest of the list

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

Computing Many-to-Many Shortest Paths Using Highway Hierarchies

Computing Many-to-Many Shortest Paths Using Highway Hierarchies Computing Many-to-Many Shortest Paths Using Highway Hierarchies Sebastian Knopp Peter Sanders Dominik Schultes Frank Schulz Dorothea Wagner Abstract We present a fast algorithm for computing all shortest

More information

Dynamic Load Balancing for Parallel Numerical Simulations based on Repartitioning with Disturbed Diffusion

Dynamic Load Balancing for Parallel Numerical Simulations based on Repartitioning with Disturbed Diffusion Dynamic Load Balancing for Parallel Numerical Simulations based on Repartitioning with Disturbed Diffusion Henning Meyerhenke University of Paderborn Department of Computer Science Fürstenallee 11, 33102

More information

Optimized Software Component Allocation On Clustered Application Servers. by Hsiauh-Tsyr Clara Chang, B.B., M.S., M.S.

Optimized Software Component Allocation On Clustered Application Servers. by Hsiauh-Tsyr Clara Chang, B.B., M.S., M.S. Optimized Software Component Allocation On Clustered Application Servers by Hsiauh-Tsyr Clara Chang, B.B., M.S., M.S. Submitted in partial fulfillment of the requirements for the degree of Doctor of Professional

More information

CIS 700: algorithms for Big Data

CIS 700: algorithms for Big Data CIS 700: algorithms for Big Data Lecture 6: Graph Sketching Slides at http://grigory.us/big-data-class.html Grigory Yaroslavtsev http://grigory.us Sketching Graphs? We know how to sketch vectors: v Mv

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm. Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

More information

Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries

Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries Shin Morishima 1 and Hiroki Matsutani 1,2,3 1Keio University, 3 14 1 Hiyoshi, Kohoku ku, Yokohama, Japan 2National Institute

More information

Parallel Algorithms for Small-world Network. David A. Bader and Kamesh Madduri

Parallel Algorithms for Small-world Network. David A. Bader and Kamesh Madduri Parallel Algorithms for Small-world Network Analysis ayssand Partitioning atto g(s (SNAP) David A. Bader and Kamesh Madduri Overview Informatics networks, small-world topology Community Identification/Graph

More information

A Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader

A Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader A Performance Evaluation of Open Source Graph Databases Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader Overview Motivation Options Evaluation Results Lessons Learned Moving Forward

More information

A Constraint Programming based Column Generation Approach to Nurse Rostering Problems

A Constraint Programming based Column Generation Approach to Nurse Rostering Problems Abstract A Constraint Programming based Column Generation Approach to Nurse Rostering Problems Fang He and Rong Qu The Automated Scheduling, Optimisation and Planning (ASAP) Group School of Computer Science,

More information

Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI

Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI Henning Meyerhenke Institute of Theoretical Informatics Karlsruhe Institute of Technology Am Fasanengarten 5, 76131

More information

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014 Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about

More information

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13 External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 27 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/6/2011 S. Raskhodnikova;

More information

Parallel Simulated Annealing Algorithm for Graph Coloring Problem

Parallel Simulated Annealing Algorithm for Graph Coloring Problem Parallel Simulated Annealing Algorithm for Graph Coloring Problem Szymon Łukasik 1, Zbigniew Kokosiński 2, and Grzegorz Świętoń 2 1 Systems Research Institute, Polish Academy of Sciences, ul. Newelska

More information

Exponential time algorithms for graph coloring

Exponential time algorithms for graph coloring Exponential time algorithms for graph coloring Uriel Feige Lecture notes, March 14, 2011 1 Introduction Let [n] denote the set {1,..., k}. A k-labeling of vertices of a graph G(V, E) is a function V [k].

More information

Chapter 6: Episode discovery process

Chapter 6: Episode discovery process Chapter 6: Episode discovery process Algorithmic Methods of Data Mining, Fall 2005, Chapter 6: Episode discovery process 1 6. Episode discovery process The knowledge discovery process KDD process of analyzing

More information

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

More information

Fast Multipole Method for particle interactions: an open source parallel library component

Fast Multipole Method for particle interactions: an open source parallel library component Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,

More information

SIMS 255 Foundations of Software Design. Complexity and NP-completeness

SIMS 255 Foundations of Software Design. Complexity and NP-completeness SIMS 255 Foundations of Software Design Complexity and NP-completeness Matt Welsh November 29, 2001 mdw@cs.berkeley.edu 1 Outline Complexity of algorithms Space and time complexity ``Big O'' notation Complexity

More information

Distributed communication-aware load balancing with TreeMatch in Charm++

Distributed communication-aware load balancing with TreeMatch in Charm++ Distributed communication-aware load balancing with TreeMatch in Charm++ The 9th Scheduling for Large Scale Systems Workshop, Lyon, France Emmanuel Jeannot Guillaume Mercier Francois Tessier In collaboration

More information

CSC2420 Spring 2015: Lecture 3

CSC2420 Spring 2015: Lecture 3 CSC2420 Spring 2015: Lecture 3 Allan Borodin January 22, 2015 1 / 1 Announcements and todays agenda Assignment 1 due next Thursday. I may add one or two additional questions today or tomorrow. Todays agenda

More information

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With

More information

Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling. Tim Nieberg Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

More information

Load balancing in a heterogeneous computer system by self-organizing Kohonen network

Load balancing in a heterogeneous computer system by self-organizing Kohonen network Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.

More information

Machine Learning over Big Data

Machine Learning over Big Data Machine Learning over Big Presented by Fuhao Zou fuhao@hust.edu.cn Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed

More information

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011 Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis

More information

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes. 1. The advantage of.. is that they solve the problem if sequential storage representation. But disadvantage in that is they are sequential lists. [A] Lists [B] Linked Lists [A] Trees [A] Queues 2. The

More information

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Interconnection Networks 2 SIMD systems

More information

Introduction to Scheduling Theory

Introduction to Scheduling Theory Introduction to Scheduling Theory Arnaud Legrand Laboratoire Informatique et Distribution IMAG CNRS, France arnaud.legrand@imag.fr November 8, 2004 1/ 26 Outline 1 Task graphs from outer space 2 Scheduling

More information

Similarity Search in a Very Large Scale Using Hadoop and HBase

Similarity Search in a Very Large Scale Using Hadoop and HBase Similarity Search in a Very Large Scale Using Hadoop and HBase Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France

More information

Reductions & NP-completeness as part of Foundations of Computer Science undergraduate course

Reductions & NP-completeness as part of Foundations of Computer Science undergraduate course Reductions & NP-completeness as part of Foundations of Computer Science undergraduate course Alex Angelopoulos, NTUA January 22, 2015 Outline Alex Angelopoulos (NTUA) FoCS: Reductions & NP-completeness-

More information

Data Warehousing und Data Mining

Data Warehousing und Data Mining Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

More information

ParFUM: A Parallel Framework for Unstructured Meshes. Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008

ParFUM: A Parallel Framework for Unstructured Meshes. Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008 ParFUM: A Parallel Framework for Unstructured Meshes Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008 What is ParFUM? A framework for writing parallel finite element

More information

5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1

5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 General Integer Linear Program: (ILP) min c T x Ax b x 0 integer Assumption: A, b integer The integrality condition

More information

Design and Analysis of ACO algorithms for edge matching problems

Design and Analysis of ACO algorithms for edge matching problems Design and Analysis of ACO algorithms for edge matching problems Carl Martin Dissing Söderlind Kgs. Lyngby 2010 DTU Informatics Department of Informatics and Mathematical Modelling Technical University

More information

A Comparison of General Approaches to Multiprocessor Scheduling

A Comparison of General Approaches to Multiprocessor Scheduling A Comparison of General Approaches to Multiprocessor Scheduling Jing-Chiou Liou AT&T Laboratories Middletown, NJ 0778, USA jing@jolt.mt.att.com Michael A. Palis Department of Computer Science Rutgers University

More information

Distributed Computing over Communication Networks: Topology. (with an excursion to P2P)

Distributed Computing over Communication Networks: Topology. (with an excursion to P2P) Distributed Computing over Communication Networks: Topology (with an excursion to P2P) Some administrative comments... There will be a Skript for this part of the lecture. (Same as slides, except for today...

More information

Model-based Parameter Optimization of an Engine Control Unit using Genetic Algorithms

Model-based Parameter Optimization of an Engine Control Unit using Genetic Algorithms Symposium on Automotive/Avionics Avionics Systems Engineering (SAASE) 2009, UC San Diego Model-based Parameter Optimization of an Engine Control Unit using Genetic Algorithms Dipl.-Inform. Malte Lochau

More information

Unsupervised Learning and Data Mining. Unsupervised Learning and Data Mining. Clustering. Supervised Learning. Supervised Learning

Unsupervised Learning and Data Mining. Unsupervised Learning and Data Mining. Clustering. Supervised Learning. Supervised Learning Unsupervised Learning and Data Mining Unsupervised Learning and Data Mining Clustering Decision trees Artificial neural nets K-nearest neighbor Support vectors Linear regression Logistic regression...

More information

walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation

walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation SIAM Parallel Processing for Scientific Computing 2012 February 16, 2012 Florian Schornbaum,

More information

Energy Efficient MapReduce

Energy Efficient MapReduce Energy Efficient MapReduce Motivation: Energy consumption is an important aspect of datacenters efficiency, the total power consumption in the united states has doubled from 2000 to 2005, representing

More information

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing /35 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of

More information

Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016

Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with

More information

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Complexity Theory IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Outline Goals Computation of Problems Concepts and Definitions Complexity Classes and Problems Polynomial Time Reductions Examples

More information

Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer

Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer 1 Content What is Community Detection? Motivation Defining a community Methods to find communities Overlapping communities

More information

A number of tasks executing serially or in parallel. Distribute tasks on processors so that minimal execution time is achieved. Optimal distribution

A number of tasks executing serially or in parallel. Distribute tasks on processors so that minimal execution time is achieved. Optimal distribution Scheduling MIMD parallel program A number of tasks executing serially or in parallel Lecture : Load Balancing The scheduling problem NP-complete problem (in general) Distribute tasks on processors so that

More information

arxiv:1412.2333v1 [cs.dc] 7 Dec 2014

arxiv:1412.2333v1 [cs.dc] 7 Dec 2014 Minimum-weight Spanning Tree Construction in O(log log log n) Rounds on the Congested Clique Sriram V. Pemmaraju Vivek B. Sardeshmukh Department of Computer Science, The University of Iowa, Iowa City,

More information

Perron vector Optimization applied to search engines

Perron vector Optimization applied to search engines Perron vector Optimization applied to search engines Olivier Fercoq INRIA Saclay and CMAP Ecole Polytechnique May 18, 2011 Web page ranking The core of search engines Semantic rankings (keywords) Hyperlink

More information

Dynamic Load Balancing in Charm++ Abhinav S Bhatele Parallel Programming Lab, UIUC

Dynamic Load Balancing in Charm++ Abhinav S Bhatele Parallel Programming Lab, UIUC Dynamic Load Balancing in Charm++ Abhinav S Bhatele Parallel Programming Lab, UIUC Outline Dynamic Load Balancing framework in Charm++ Measurement Based Load Balancing Examples: Hybrid Load Balancers Topology-aware

More information

Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

More information

Forschungskolleg Data Analytics Methods and Techniques

Forschungskolleg Data Analytics Methods and Techniques Forschungskolleg Data Analytics Methods and Techniques Martin Hahmann, Gunnar Schröder, Phillip Grosse Prof. Dr.-Ing. Wolfgang Lehner Why do we need it? We are drowning in data, but starving for knowledge!

More information

Graph Database Proof of Concept Report

Graph Database Proof of Concept Report Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment

More information

Algorithmic Methods for Complex Network Analysis: Graph Clustering

Algorithmic Methods for Complex Network Analysis: Graph Clustering Algorithmic Methods for Complex Network Analysis: Graph Clustering Summer School on Algorithm Engineering Dorothea Wagner September 9, 204 K ARLSRUHE I NSTITUTE OF T ECHNOLOGY I NSTITUTE OF T HEORETICAL

More information

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering Advances in Intelligent Systems and Technologies Proceedings ECIT2004 - Third European Conference on Intelligent Systems and Technologies Iasi, Romania, July 21-23, 2004 Evolutionary Detection of Rules

More information

Social Media Mining. Graph Essentials

Social Media Mining. Graph Essentials Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

More information

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Amara Keller, Martin Kelly, Aaron Todd 4 June 2010 Abstract This research has two components, both involving the

More information

Distance Degree Sequences for Network Analysis

Distance Degree Sequences for Network Analysis Universität Konstanz Computer & Information Science Algorithmics Group 15 Mar 2005 based on Palmer, Gibbons, and Faloutsos: ANF A Fast and Scalable Tool for Data Mining in Massive Graphs, SIGKDD 02. Motivation

More information

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.0, steen@cs.vu.nl Chapter 06: Network analysis Version: April 8, 04 / 3 Contents Chapter

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Definition 11.1. Given a graph G on n vertices, we define the following quantities:

Definition 11.1. Given a graph G on n vertices, we define the following quantities: Lecture 11 The Lovász ϑ Function 11.1 Perfect graphs We begin with some background on perfect graphs. graphs. First, we define some quantities on Definition 11.1. Given a graph G on n vertices, we define

More information

Introduction to Graph Mining

Introduction to Graph Mining Introduction to Graph Mining What is a graph? A graph G = (V,E) is a set of vertices V and a set (possibly empty) E of pairs of vertices e 1 = (v 1, v 2 ), where e 1 E and v 1, v 2 V. Edges may contain

More information

Introduction To Genetic Algorithms

Introduction To Genetic Algorithms 1 Introduction To Genetic Algorithms Dr. Rajib Kumar Bhattacharjya Department of Civil Engineering IIT Guwahati Email: rkbc@iitg.ernet.in References 2 D. E. Goldberg, Genetic Algorithm In Search, Optimization

More information

R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants

R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions

More information