Systems and Algorithms for Big Data Analytics
YAN, Da
Email: yanda@cse.cuhk.edu.hk

My Research
» Graph Data: Distributed Graph Processing
» Spatial Data: Spatial Query Processing
» Uncertain Data: Querying & Mining Uncertain Data

My Research
Graph Data: Distributed Graph Processing
» Algorithm Design & Analysis
» Computation Model
» Communication Mechanism
» Fault Tolerance
» Out-of-core Support

My Research
Spatial Data: Spatial Query Processing
» Spatial Settings
    Road Networks
    Terrain Meshes
    Euclidean Space (Trajectories)
» Spatial Queries
    Optimal Meeting Point
    Distance-Preserving Subgraph
    Facility Location Problem
    Reverse Nearest Neighbors

My Research
Uncertain Data: Querying & Mining Uncertain Data
» Top-k Queries (DASFAA 2011 Best Paper)
» Sequential Pattern Mining
» Spatial Queries

My Research
» Graph Data: Distributed Graph Processing (focus of this presentation)
» Spatial Data: Spatial Query Processing
» Uncertain Data: Querying & Mining Uncertain Data

Google's Pregel
Distributed Framework for Graph Processing
» User-friendly: think like a vertex
» Message passing
» Iterative
    Bulk synchronous parallel
    Superstep

Google's Pregel
Vertex Partitioning
Adjacency lists of a graph with vertices 0 to 8, distributed across machines M0, M1, and M2:
    0: 1, 3
    1: 0, 2, 3
    2: 1, 3, 4, 7
    3: 0, 1, 2, 7
    4: 2, 5, 7
    5: 4, 6
    6: 5, 8
    7: 2, 3, 4, 8
    8: 6, 7

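A minimal sketch of how the adjacency lists on this slide could be spread over three machines. The slide does not state the partitioning function, so hashing a vertex to machine `v % num_machines` is an assumption, and `partition_vertices` and `count_cross_machine_edges` are hypothetical helper names.

```python
from collections import defaultdict

# Adjacency lists of the 9-vertex example graph from the slide.
ADJ = {
    0: [1, 3], 1: [0, 2, 3], 2: [1, 3, 4, 7], 3: [0, 1, 2, 7],
    4: [2, 5, 7], 5: [4, 6], 6: [5, 8], 7: [2, 3, 4, 8], 8: [6, 7],
}


def partition_vertices(adj, num_machines):
    """Place each vertex, together with its adjacency list, on one machine.

    The modulo hash is an assumption; the slide does not name the function.
    """
    machines = defaultdict(dict)
    for v, neighbors in adj.items():
        machines[v % num_machines][v] = neighbors
    return dict(machines)


def count_cross_machine_edges(adj, num_machines):
    """Count undirected edges whose endpoints land on different machines."""
    return sum(
        1
        for u, neighbors in adj.items()
        for v in neighbors
        if u < v and u % num_machines != v % num_machines
    )


parts = partition_vertices(ADJ, 3)
cross_edges = count_cross_machine_edges(ADJ, 3)
```

Under this assumed hash, most edges cross machine boundaries, which is why message volume matters so much in the rest of the talk.
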
Google's Pregel
Programming Interfaces
» u.compute(msgs)
» Called inside u.compute(msgs):
    u.send_msg(v, msg)
    get_superstep_number()
    u.vote_to_halt()

Google's Pregel
Vertex state
» Active / inactive
» Reactivated by messages
Stop condition
» All vertices are halted, and
» No pending messages for the next superstep

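The programming interface and stop condition can be illustrated with a single-process simulation. This is a sketch under stated assumptions, not Pregel's actual implementation: `Vertex`, `run_pregel`, and `max_value_compute` are hypothetical names, and `send_msg` and the superstep number are passed to the compute function as arguments for simplicity.

```python
class Vertex:
    """A vertex with an ID, a value, an adjacency list, and an active flag."""

    def __init__(self, vid, value, neighbors):
        self.id, self.value, self.neighbors = vid, value, neighbors
        self.active = True

    def vote_to_halt(self):
        self.active = False


def run_pregel(vertices, compute, max_supersteps=100):
    """Run supersteps until the stop condition holds; return their count."""
    inbox = {vid: [] for vid in vertices}
    superstep = 0
    while superstep < max_supersteps:
        # Stop condition: all vertices halted and no pending messages.
        all_halted = not any(v.active for v in vertices.values())
        if all_halted and not any(inbox.values()):
            break
        superstep += 1
        outbox = {vid: [] for vid in vertices}

        def send_msg(target, msg):
            outbox[target].append(msg)

        for vid, u in vertices.items():
            msgs = inbox[vid]
            if msgs:
                u.active = True  # a halted vertex is reactivated by messages
            if u.active:
                compute(u, msgs, send_msg, superstep)
        inbox = outbox  # barrier: messages arrive in the next superstep
    return superstep


def max_value_compute(u, msgs, send_msg, superstep):
    """Example vertex program: spread the largest value to every vertex."""
    new_value = max([u.value] + msgs)
    if superstep == 1 or new_value > u.value:
        u.value = new_value
        for v in u.neighbors:
            send_msg(v, u.value)
    u.vote_to_halt()


# Path graph 0 - 1 - 2 - 3 with hypothetical initial values.
values = [3, 6, 2, 1]
vertices = {
    i: Vertex(i, values[i], [j for j in (i - 1, i + 1) if 0 <= j < 4])
    for i in range(4)
}
num_supersteps = run_pregel(vertices, max_value_compute)
```

Every vertex votes to halt at the end of each call, so the computation keeps going only where messages still arrive.
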
Google's Pregel
Hash-Min: Connected Components (Illustration of Hash-Min)
min(v) held by each vertex v at each superstep:
    Vertex v       0  1  2  3  4  5  6  7  8
    Superstep 1    0  1  2  3  4  5  6  7  8
    Superstep 2    0  0  0  1  2  0  0  5  6
    Superstep 3    0  0  0  0  0  0  0  0  0

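A small simulation of Hash-Min that reproduces the per-superstep values in the illustration. The edges of the illustrated graph are not recoverable from the slide text, so the tree used below is an assumption chosen to be consistent with those values; `hash_min` is a hypothetical name.

```python
def hash_min(adj):
    """Simulate Hash-Min; return (final labels, labels after each superstep)."""
    min_id = {v: v for v in adj}  # superstep 1: min(v) starts as v's own ID
    # In superstep 1 every vertex broadcasts its own ID to all neighbors.
    inbox = {v: [] for v in adj}
    for v, neighbors in adj.items():
        for w in neighbors:
            inbox[w].append(min_id[v])
    history = [dict(min_id)]
    while any(inbox.values()):
        outbox = {v: [] for v in adj}
        for v, msgs in inbox.items():
            # A vertex updates and re-broadcasts only when it sees a smaller ID.
            if msgs and min(msgs) < min_id[v]:
                min_id[v] = min(msgs)
                for w in adj[v]:
                    outbox[w].append(min_id[v])
        inbox = outbox
        history.append(dict(min_id))
    return min_id, history


# Assumed edges: 0-1, 0-2, 0-5, 0-6, 1-3, 2-4, 5-7, 6-8.
TREE = {
    0: [1, 2, 5, 6], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2],
    5: [0, 7], 6: [0, 8], 7: [5], 8: [6],
}
labels, history = hash_min(TREE)
```

The number of supersteps grows with the graph diameter, which motivates the block-centric model later in the talk.
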
Outline
» Practical Pregel Algorithms
» Blogel: Block-Centric Computation
» Pregel+: Message Reduction
» Other Improvements to Pregel
» Future Directions

Practical Pregel Algorithms
Practical Pregel Algorithms (PPAs) [PVLDB 14]
» The first cost model for Pregel algorithm design
» PPAs for fundamental graph problems
    Breadth-first search, list ranking, spanning tree, Euler tour, pre/post-order traversal, connected components, biconnected components, strongly connected components, etc.

Practical Pregel Algorithms
Practical Pregel Algorithms (PPAs) [PVLDB 14]
» Linear cost per superstep
    O(|V| + |E|) messages
    O(|V| + |E|) computation time
    O(|V| + |E|) RAM space
» Logarithmic number of supersteps
    O(log |V|) supersteps
    O(log |V|) = O(log |E|)
How about load balancing?

Practical Pregel Algorithms
Balanced Practical Pregel Algorithms (BPPAs)
» d_in(v): in-degree of v
» d_out(v): out-degree of v
» Linear cost per superstep, for each vertex v
    O(d_in(v) + d_out(v)) messages
    O(d_in(v) + d_out(v)) computation time
    O(d_in(v) + d_out(v)) RAM space
» Logarithmic number of supersteps

Practical Pregel Algorithms
Example: List Ranking
» A procedure in computing bi-connected components
» Linked list where each element v has
    Value val(v)
    Predecessor pred(v)
» Element at the head has pred(v) = NULL
Toy example (val(v) = 1 for all v):
    List:     NULL ← v1 ← v2 ← v3 ← v4 ← v5
    val(v):          1    1    1    1    1

Practical Pregel Algorithms
Example: List Ranking
» Compute sum(v) for each element v
    sum(v) adds up val(v) and the values of all predecessors of v
» Why can't TeraSort work?
Toy example:
    List:     NULL ← v1 ← v2 ← v3 ← v4 ← v5
    sum(v):          1    2    3    4    5

Practical Pregel Algorithms
Example: List Ranking
» Pointer jumping / path doubling
    As long as pred(v) ≠ NULL:
        sum(v) ← sum(v) + sum(pred(v))
        pred(v) ← pred(pred(v))
    O(log |V|) supersteps
sum(v) after each round on the toy example:
    Element      v1   v2   v3   v4   v5
    Initially     1    1    1    1    1
    Round 1       1    2    2    2    2
    Round 2       1    2    3    4    4
    Round 3       1    2    3    4    5

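The pointer-jumping update can be sketched as a synchronous simulation on the five-element toy list. This is a sequential illustration, not the Pregel implementation; `list_ranking` is a hypothetical name, and each loop iteration stands for one round in which all elements update from the previous round's values.

```python
def list_ranking(val, pred):
    """Return (sum(v) for every element v, number of pointer-jumping rounds)."""
    total = dict(val)
    pred = dict(pred)
    rounds = 0
    while any(p is not None for p in pred.values()):
        rounds += 1
        # Snapshot the previous round so that all updates are synchronous.
        old_total, old_pred = dict(total), dict(pred)
        for v, p in old_pred.items():
            if p is not None:
                total[v] = old_total[v] + old_total[p]  # sum(v) += sum(pred(v))
                pred[v] = old_pred[p]                   # pred(v) = pred(pred(v))
    return total, rounds


# Toy example from the slides: val(v) = 1 for all v, v1 is the head.
val = {"v1": 1, "v2": 1, "v3": 1, "v4": 1, "v5": 1}
pred = {"v1": None, "v2": "v1", "v3": "v2", "v4": "v3", "v5": "v4"}
sums, rounds = list_ranking(val, pred)
```

Each round doubles the distance that a pointer spans, which is where the logarithmic number of rounds comes from.
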
Practical Pregel Algorithms
Example: Connected Components
» Pointer jumping / path doubling
» Each vertex u maintains a pointer D[u]
    Vertices are organized as a pseudo-forest
    D[u] is the parent link

Practical Pregel Algorithms
Example: Connected Components
» Repeating two steps: O(log |V|) rounds
» Step 1: Tree hooking
    [Figure: tree hooking on vertices w, x, u, v, with condition D[v] < D[u]]

Practical Pregel Algorithms
Example: Connected Components
» Repeating two steps: O(log |V|) rounds
» Step 2: Shortcutting
    Pointing v to the parent of v's parent
    [Figure: shortcutting on vertices u, w, x, y]

Practical Pregel Algorithms
Example: Connected Components
» Repeating two steps: O(log |V|) rounds
» Stop condition: D[u] converges for every vertex u
    Every vertex belongs to a star
    Every star corresponds to a connected component (CC)

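The two repeated steps can be sketched sequentially as follows. This is a simplified illustration, not the PPA from the paper: it uses only the hooking condition D[v] < D[u] and shortcutting, so it reaches the same stop condition but the O(log |V|) round bound is not claimed for it. The function name and the example graph are hypothetical.

```python
def connected_components(n, edges):
    """Label vertices 0..n-1 with the smallest vertex ID of their component."""
    D = list(range(n))  # every vertex starts as the root of its own tree
    rounds = 0
    while True:
        rounds += 1
        old = D[:]
        # Step 1: tree hooking. For an edge (u, v) with D[v] < D[u], hook the
        # root D[u] under D[v]; only roots are allowed to hook.
        hooked = old[:]
        for a, b in edges:
            for u, v in ((a, b), (b, a)):
                root = old[u]
                if old[root] == root and old[v] < old[u]:
                    hooked[root] = min(hooked[root], old[v])
        # Step 2: shortcutting. Point every vertex to its parent's parent.
        D = [hooked[hooked[u]] for u in range(n)]
        if D == old:  # D[u] has converged: every vertex belongs to a star
            return D, rounds


# Hypothetical graph: a path 0-1-2-3, an isolated vertex 4, and a path 5-6-7.
component, rounds = connected_components(8, [(0, 1), (1, 2), (2, 3), (5, 6), (6, 7)])
```

At convergence, the root of each star serves as the ID of its connected component.
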
Block-Centric Computation
Blogel: Block-Centric Model [PVLDB 14]
» Orders of magnitude performance improvement
    e.g., from one hour to 10 seconds

Block-Centric Computation
Motivation
» Graph characteristics adverse to Pregel
    Large graph diameter
    Skewed vertex degree distribution
    High average vertex degree

    Data         Type        |V|          |E|            Avg Deg  Max Deg
    WebUK        directed    133,633,040  5,507,679,822  41.21    22,429
    LiveJournal  directed    10,690,276   224,614,770    21.01    1,053,676
    Twitter      directed    52,579,682   1,963,263,821  37.34    779,958
    BTC          undirected  164,732,473  772,822,094    4.69     1,637,619

Block-Centric Computation
Idea of Block-Centric Computation
» A block refers to a connected subgraph of the graph
» Message exchanges occur only among blocks
» A serial in-memory algorithm is run within a block

Block-Centric Computation
Benefits of Block-Centric Computation
» High-degree vertices inside a block send no msgs
» Far fewer supersteps
» Far fewer blocks than vertices

Block-Centric Computation
Example: Hash-Min
» Condense each block into a supervertex, to get the block-level graph
    i.e., to construct an adjacency list for each block
» Run Hash-Min over the block-level graph
    To propagate the min block ID instead of the min vertex ID

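The two steps above can be sketched as follows. This is a single-process illustration, not Blogel's API: `block_hash_min` is a hypothetical name, the block IDs in the example are made up, and each block is assumed to be a connected subgraph, as the block-centric model requires.

```python
def block_hash_min(adj, block_of):
    """Label every vertex with the smallest block ID in its component."""
    # Step 1: condense each block into a supervertex by building an
    # adjacency list for each block.
    block_adj = {b: set() for b in set(block_of.values())}
    for u, neighbors in adj.items():
        for v in neighbors:
            bu, bv = block_of[u], block_of[v]
            if bu != bv:
                block_adj[bu].add(bv)
                block_adj[bv].add(bu)
    # Step 2: run Hash-Min over the block-level graph, propagating the
    # minimum block ID instead of the minimum vertex ID.
    min_block = {b: b for b in block_adj}
    changed = set(block_adj)  # blocks that broadcast in the next superstep
    supersteps = 0
    while changed:
        supersteps += 1
        inbox = {b: [] for b in block_adj}
        for b in changed:
            for nb in block_adj[b]:
                inbox[nb].append(min_block[b])
        changed = set()
        for b, msgs in inbox.items():
            if msgs and min(msgs) < min_block[b]:
                min_block[b] = min(msgs)
                changed.add(b)
    return {v: min_block[block_of[v]] for v in adj}, supersteps


# Hypothetical input: a path 0-1-2-3-4-5 and a separate edge 6-7,
# pre-partitioned into four connected blocks with IDs 10 to 13.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4], 6: [7], 7: [6]}
block_of = {0: 10, 1: 10, 2: 11, 3: 11, 4: 12, 5: 12, 6: 13, 7: 13}
labels, supersteps = block_hash_min(adj, block_of)
```

Messages now travel between blocks only, so both the message count and the number of supersteps depend on the block-level graph rather than on the original graph.
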
Block-Centric Computation
Effectiveness

    Graph       Model      Computing Time  Total Msg #    Superstep #
    BTC         V-Centric  28.48 s         1,188,832,712  30
    BTC         B-Centric  0.94 s          1,747,653      6
    Friendster  V-Centric  120.24 s        7,226,963,186  22
    Friendster  B-Centric  2.52 s          19,410,865     5
    USA Road    V-Centric  510.98 s        8,353,044,435  6,262
    USA Road    B-Centric  1.94 s          270,257        26

Block-Centric Computation
Example: Single-Source Shortest Paths
» Source s ∈ V
» Each edge has a length
» Goal: to compute the distance from s to each v ∈ V

Block-Centric Computation
Example: Single-Source Shortest Paths
» Vertices receive msgs from remote neighbors to update their distances
» A block runs Dijkstra's algorithm from updated vertices
» Remote neighbors are sent msgs, rather than enqueued

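A single-process sketch of these three points, assuming positive edge lengths. It is not Blogel's implementation: `block_centric_sssp`, the graph, and the block assignment are hypothetical, and blocks are processed one after another inside each simulated superstep.

```python
import heapq
from collections import defaultdict


def block_centric_sssp(adj, block_of, source):
    """adj[u] is a list of (neighbor, edge length); returns (dist, supersteps)."""
    dist = {v: float("inf") for v in adj}
    inbox = defaultdict(list)  # block -> [(vertex, candidate distance)]
    inbox[block_of[source]].append((source, 0))
    supersteps = 0
    while inbox:
        supersteps += 1
        outbox = defaultdict(list)
        for b, msgs in inbox.items():
            heap = []
            # Vertices improved by incoming messages seed this block's run.
            for v, d in msgs:
                if d < dist[v]:
                    dist[v] = d
                    heapq.heappush(heap, (d, v))
            # Dijkstra's algorithm restricted to the vertices of block b.
            while heap:
                d, u = heapq.heappop(heap)
                if d > dist[u]:
                    continue  # stale queue entry
                for w, length in adj[u]:
                    nd = d + length
                    if block_of[w] != b:
                        # Remote neighbor: send a message instead of enqueuing.
                        outbox[block_of[w]].append((w, nd))
                    elif nd < dist[w]:
                        dist[w] = nd
                        heapq.heappush(heap, (nd, w))
        inbox = outbox
    return dist, supersteps


# Hypothetical weighted graph split into two blocks.
adj = {
    0: [(1, 1), (2, 4)],
    1: [(0, 1), (2, 2), (3, 5)],
    2: [(0, 4), (1, 2), (3, 1)],
    3: [(1, 5), (2, 1), (4, 3)],
    4: [(3, 3)],
}
block_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}
dist, supersteps = block_centric_sssp(adj, block_of, source=0)
```

A distance can cross many vertices inside a block within one superstep, so the superstep count follows the number of block boundaries crossed rather than the number of hops.
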
Block-Centric Computation
Effectiveness

    Graph      Model      Time       Step #
    Euro Road  V-Centric  1767.69 s  6210
    Euro Road  B-Centric  11.10 s    60
    USA Road   V-Centric  9788.08 s  10789
    USA Road   B-Centric  12.48 s    58

Block-Centric Computation
Graph Partitioning
» Graph Voronoi Diagram (GVD) partitioning
    Example with three seeds and a vertex v:
        v is 2 hops from the red seed
        v is 3 hops from the green seed
        v is 5 hops from the blue seed
    v is therefore grouped with the red seed, its nearest seed

Block-Centric Computation
GVD Partitioning
» Sample seed vertices with probability p
» Compute GVD grouping
    Vertex-centric multi-source BFS

Block-Centric Computation
Vertex-Centric Multi-Source BFS
[Figure sequence: state after seed sampling, then Superstep 1, Superstep 2, and Superstep 3]

Block-Centric Computation
GVD Partitioning
» Sample seed vertices with probability p
» Compute GVD grouping
» Repeat GVD Computation:
    Erase colors of large blocks
    Increase p and resample seeds
    Compute GVD over unassigned vertices

Block-Centric Computation
GVD Partitioning
» Sample seed vertices with probability p
» Compute GVD grouping
» Repeat GVD Computation
» Run Hash-Min over unassigned vertices
    Why is this step necessary?
    Consider a graph with many small components

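A sketch of one sampling round followed by the GVD grouping. It replaces the vertex-centric multi-source BFS with a sequential queue-based BFS, which is an assumption made for brevity; `sample_seeds` and `gvd_assign` are hypothetical names, and the resampling loop over large blocks is left out.

```python
import random
from collections import deque


def sample_seeds(vertices, p, rng):
    """Pick each vertex as a seed independently with probability p."""
    return [v for v in vertices if rng.random() < p]


def gvd_assign(adj, seeds):
    """Assign every reachable vertex to a nearest seed in hop distance.

    Vertices that no seed can reach stay unassigned (None). Ties between
    equally near seeds are broken by queue order in this sketch.
    """
    block = {v: None for v in adj}
    queue = deque()
    for s in seeds:
        block[s] = s
        queue.append(s)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if block[w] is None:
                block[w] = block[u]
                queue.append(w)
    return block


# Hypothetical graph: a path 0-1-2-3-4-5 plus a tiny component {6, 7}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4], 6: [7], 7: [6]}
block = gvd_assign(adj, seeds=[0, 5])
random_seeds = sample_seeds(list(adj), p=0.25, rng=random.Random(0))
```

In the example, vertices 6 and 7 remain unassigned because no seed fell into their small component, which illustrates why the last step runs Hash-Min over the unassigned vertices.
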
Block-Centric Computation
GVD Partitioning Performance
[Figure: bar chart for WebUK, Friendster, BTC, LiveJournal, USA Road, and Euro Road, with bars split into Loading, Partitioning, and Dumping; axis from 0 to 3000; bar labels 2026.65, 505.85, 186.89, 105.48, 75.88, 70.68]

Message Reduction
Message Reduction in Pregel+ [WWW 15]
» Two techniques to reduce # of messages transmitted
    Vertex Mirroring
    Request-Respond Paradigm

Message Reduction
Vertex Mirroring
» Motivation:
    High-degree vertices send a lot of messages
    A vertex sends the same message to all of its neighbors
        Hash-Min: min(v)
        PageRank: PageRank(v) / out-degree(v)

Message Reduction
Vertex Mirroring
[Figure: vertices u1, u2, ..., ui on machine M1, with neighbors v1, v2, ..., vj on machine M2 and w1, w2, ..., wk on machine M3; with mirroring, copies of ui also appear on M2 and M3]

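A back-of-the-envelope sketch of why mirroring helps a high-degree vertex that sends the same message to all neighbors. The helper names and the degree numbers are hypothetical, and message combining is ignored here.

```python
def messages_without_mirroring(neighbors, machine_of, home):
    """One network message per neighbor that lives on a remote machine."""
    return sum(1 for v in neighbors if machine_of[v] != home)


def messages_with_mirroring(neighbors, machine_of, home):
    """One network message per remote machine hosting at least one neighbor.

    The mirror on that machine then forwards the value locally.
    """
    return len({machine_of[v] for v in neighbors} - {home})


# Hypothetical vertex u on M1 with 1000 neighbors on M2 and 500 on M3.
machine_of = {f"v{i}": "M2" for i in range(1000)}
machine_of.update({f"w{i}": "M3" for i in range(500)})
neighbors = list(machine_of)
plain = messages_without_mirroring(neighbors, machine_of, home="M1")
mirrored = messages_with_mirroring(neighbors, machine_of, home="M1")
```

With mirroring, the cost for this vertex is bounded by the number of machines instead of by its degree.
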
Message Reduction
Vertex Mirroring vs. Message Combining
» Create a mirror for u4? Consider the messages to v2
    u1, u2, u3, u4 are on machine M1; v1, v2, v3, v4 are on machine M2
    Adjacency lists: u1: v1, v2; u2: v1, v2; u3: v1, v2; u4: v1, v2, v3, v4
» Message combining without mirroring u4
    M1 sends v2 a single combined message a(u1) + a(u2) + a(u3) + a(u4)
» Message combining with u4 mirrored
    M1 sends v2 the combined message a(u1) + a(u2) + a(u3)
    M1 separately sends a(u4) to the mirror of u4 on M2

Message Reduction
Vertex Mirroring vs. Message Combining
» Only mirror high-degree vertices
    Choice of degree threshold τ
    M machines, n vertices, m edges
    Average degree: deg_avg = m / n
    Optimal τ is M · exp{deg_avg / M}

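The threshold formula can be evaluated directly. The function name and the cluster and graph sizes below are hypothetical; only the formula itself comes from the slide.

```python
import math


def mirroring_threshold(num_machines, num_vertices, num_edges):
    """Degree threshold tau = M * exp(deg_avg / M), with deg_avg = m / n."""
    deg_avg = num_edges / num_vertices
    return num_machines * math.exp(deg_avg / num_machines)


# Hypothetical setting: M = 10 machines, n = 100 vertices, m = 1000 edges,
# so deg_avg = 10 and tau = 10 * e.
tau = mirroring_threshold(10, 100, 1000)
```

Only vertices whose degree exceeds tau would be mirrored; all other vertices keep relying on message combining.
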
Message Reduction
Effectiveness of Message Reduction
[Figure: number of messages sent by each worker in Pregel+; blue bars without mirroring, red bars with mirroring]

Message Reduction
Request-Respond Paradigm
» Motivation
    As a pointer-jumping algorithm goes on, there are fewer and fewer delegates communicating with more and more vertices
    E.g., PPA for computing connected components
        Merge small trees into large trees
        A vertex is the delegate of its children

Message Reduction
Request-Respond Paradigm
» Request-Respond API
    Retains all basic Pregel operations
    A vertex v can request attribute a(u) in superstep i, and a(u) will be available in superstep (i + 1)
    Here, u can be a delegate, and a(u) may be requested by many vertices v

Message Reduction
Request-Respond Paradigm
» Benefits
    Without Request-Respond
        v1, v2, v3, v4 each send their own request (<v1>, <v2>, <v3>, <v4>) to u on machine M2
        u sends a(u) back to each of v1, v2, v3, v4 separately
    Using Request-Respond
        Machine M1, which hosts v1, v2, v3, v4, sends a single request for u to M2
        M2 responds with a(u) once, and M1 makes it available to v1, v2, v3, v4

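The benefit can be quantified with a small message count. This is an illustrative sketch, not the Pregel+ API: both helper names are hypothetical, and each request and each response is counted as one network message.

```python
def messages_without_request_respond(requests, machine_of):
    """Each remote request is one message, and so is each reply."""
    remote = [(v, u) for v, u in requests if machine_of[v] != machine_of[u]]
    return 2 * len(remote)


def messages_with_request_respond(requests, machine_of):
    """Requests are merged per (requesting machine, requested vertex) pair.

    Each merged request costs one message and gets one response.
    """
    pairs = {
        (machine_of[v], u)
        for v, u in requests
        if machine_of[v] != machine_of[u]
    }
    return 2 * len(pairs)


# The slide's scenario: v1..v4 on M1 all request a(u) from u on M2.
machine_of = {"v1": "M1", "v2": "M1", "v3": "M1", "v4": "M1", "u": "M2"}
requests = [("v1", "u"), ("v2", "u"), ("v3", "u"), ("v4", "u")]
plain = messages_without_request_respond(requests, machine_of)
merged = messages_with_request_respond(requests, machine_of)
```

The saving grows as a pointer-jumping algorithm progresses, because more and more vertices on the same machine ask for the same delegate.
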
Message Reduction
Effectiveness of Request-Respond Paradigm
[Figure: number of messages sent by each worker using Pregel+; blue bars without req-resp, red bars with req-resp]

Other Improvements
Fault Tolerance
» Checkpointing time: reduced from 60 seconds to 2 seconds
Querying Workload
» From over 100 seconds per query to 3 queries per second
Out-of-core Execution
» Performance comparable to the fastest in-memory Pregel-like system
Survey on Big Graph Systems

Open-Source Systems
High ranking in Google, well indexed
Used by industrial partners
An ITF project funded with HK$ 1.4M

Open-Source Systems
Many times faster than CMU's GraphLab
» GraphLab is sold for US$ 6.7M
10x faster than Giraph, which is used by Facebook
» Facebook researchers closely follow our work
Taobao replaces Spark with our system
» Faster with 4 machines than Spark with 100 machines

Future Directions
Beyond Pregel
» Graph problems not suitable for Pregel
    Output size beyond linear
    Non-iterative
» Examples
    Graph matching
    Motif mining
    Frequent subgraph mining

Future Directions
Other Big Data Systems
» Urban Computing
    Taxi trajectories
    Octopus card records (bus, MTR, ferry, ...)
» Machine Learning
    Improving recommendation by Semantic Web
    Systems for deep learning

Thanks
YAN, Da
Contact Info
    Email: yanda@cse.cuhk.edu.hk
    Webpage: www.cse.cuhk.edu.hk/~yanda