Big Graph Processing: Some Background


1 Big Graph Processing: Some Background
Bo Wu, Colorado School of Mines
Part of slides from: Paul Burkhardt (National Security Agency) and Carlos Guestrin (University of Washington)
Mines CSCI-580, Bo Wu

2 Graphs are everywhere!
o A graph is a collection of binary relationships, i.e. networks of pairwise interactions, including social networks and digital networks
[Figures: part of the Internet; brain network]

3 Scale of the first graph
o Nearly 300 years ago, the first graph problem consisted of 4 vertices and 7 edges
  - Seven Bridges of Königsberg problem: is it possible to cross each of the seven bridges exactly once?
  - Not too hard to fit in memory
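
The Königsberg question can be checked with Euler's criterion: a connected graph has a walk that uses every edge exactly once only if zero or two of its vertices have odd degree. The following is a minimal sketch; the vertex labels and function names are illustrative choices, not taken from the slides.

```python
from collections import Counter

# Four land masses (vertices) joined by seven bridges (edges).
# The vertex labels are illustrative names chosen for this sketch.
KONIGSBERG = [
    ("island", "north"), ("island", "north"),
    ("island", "south"), ("island", "south"),
    ("island", "east"),
    ("north", "east"), ("south", "east"),
]


def degrees(edges):
    """Count how many bridge ends touch each vertex."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_connected(edges):
    """Depth-first search over the undirected multigraph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)


def has_euler_walk(edges):
    """Euler's criterion: connected, with 0 or 2 odd-degree vertices."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return is_connected(edges) and odd in (0, 2)
```

All four land masses have odd degree, so `has_euler_walk(KONIGSBERG)` is false: no such walk exists.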

4 Scale of real-world graphs
o Graph scale in current CS literature is on the order of billions of edges, tens of gigabytes

5 Big Data begets Big Graphs
o Increasing volume, velocity, and variety of Big Data are significant challenges to scalable algorithms
o How will graph applications adapt to Big Data at petabyte scale?
o Ability to store and process Big Graphs impacts typical data structures

6 Social scale
o 1 billion vertices, 100 billion edges
  - 111 PB adjacency matrix
  - 2.92 TB adjacency list
  - 2.92 TB edge list

7 Web scale
o 50 billion vertices, 1 trillion edges
  - 271 EB adjacency matrix
  - 29.5 TB adjacency list
  - 29.1 TB edge list

8 Brain scale
o 100 billion vertices, 100 trillion edges
  - 2.84 PB adjacency list
  - 2.84 PB edge list
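
The storage figures on the three scale slides can be roughly reproduced with a few lines of arithmetic. This is a sketch under assumptions that the slides do not state: binary units (1 TB = 2^40 bytes), 1 bit per adjacency-matrix entry, 8 bytes per vertex, and 32 bytes per edge (for example, two 64-bit endpoints stored in both directions). These constants were picked because they land close to the slide numbers; the function names are hypothetical.

```python
TIB, PIB, EIB = 2**40, 2**50, 2**60

BYTES_PER_VERTEX = 8   # assumption: one 64-bit word per vertex
BYTES_PER_EDGE = 32    # assumption: chosen to match the slide figures


def adjacency_matrix_bytes(n):
    """n x n matrix with one bit per entry."""
    return n * n // 8


def edge_list_bytes(m):
    """Flat list of edges at the assumed cost per edge."""
    return BYTES_PER_EDGE * m


def adjacency_list_bytes(n, m):
    """Per-vertex header plus the assumed cost per edge."""
    return BYTES_PER_VERTEX * n + BYTES_PER_EDGE * m


if __name__ == "__main__":
    scales = {
        "social": (10**9, 100 * 10**9),
        "web": (50 * 10**9, 10**12),
        "brain": (100 * 10**9, 100 * 10**12),
    }
    for name, (n, m) in scales.items():
        print(name,
              f"matrix={adjacency_matrix_bytes(n) / PIB:.3g} PiB",
              f"adj_list={adjacency_list_bytes(n, m) / TIB:.4g} TiB",
              f"edge_list={edge_list_bytes(m) / TIB:.4g} TiB")
```

The adjacency matrix grows with the square of the vertex count, which is why it reaches petabytes and exabytes while the list representations stay in terabytes.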

9 Benchmarking scalability on Big Graphs
o Big Graphs challenge our conventional thinking on both algorithms and computer architecture!
o New Graph500.org benchmark provides a foundation for conducting experiments on graph datasets

10 Graph algorithms are challenging
o Difficult to parallelize
  - irregular data accesses increase latency
  - skewed data distribution creates bottlenecks (e.g. celebrity nodes in social networks)
o Increased size imposes greater storage overhead
  - IO burden!

11 Problem: How do we store and process Big Graphs?
o Conventional approach is to store and compute in memory
o Shared memory
  - Parallel Random Access Machine (PRAM)
  - data in globally-shared memory
  - implicit communication by updating memory
  - fast random access
o Distributed memory
  - Bulk Synchronous Parallel (BSP)
  - data distributed to local, private memory
  - explicit communication by sending messages
  - easier to scale by adding more machines

12 Memory is fast but...
o Algorithms must exploit the computer memory hierarchy
  - designed for spatial and temporal locality
  - registers, L1, L2, L3 cache, TLB, pages, disk...
  - great for unit-stride access common in many scientific codes, e.g. linear algebra
o But common graph algorithm implementations have...
  - lots of random access to memory, causing...
  - many cache and TLB misses

13 Poor locality increases latency
o Question: What is the memory throughput with a 90% TLB hit rate and a 0.01% page-fault rate on a TLB miss?
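
One way to approach the question is an expected-value calculation over the three cases (TLB hit, TLB miss without a page fault, TLB miss with a page fault). The slide gives only the two rates, so the latencies below (20 ns TLB lookup, 100 ns RAM access, 10 ms disk access) and the 8-byte word size are assumptions made for this sketch.

```python
# Assumed latencies in nanoseconds; none of these appear on the slide.
TLB_NS = 20.0
RAM_NS = 100.0
DISK_NS = 10_000_000.0


def effective_access_ns(tlb_hit_rate, fault_rate_on_miss):
    """Expected time of one memory access under the assumed latencies."""
    hit = TLB_NS + RAM_NS                   # translate, then read
    miss = TLB_NS + RAM_NS + RAM_NS         # extra page-table read
    fault = miss + DISK_NS                  # page must come from disk
    miss_expected = ((1.0 - fault_rate_on_miss) * miss
                     + fault_rate_on_miss * fault)
    return tlb_hit_rate * hit + (1.0 - tlb_hit_rate) * miss_expected


def throughput_bytes_per_s(access_ns, word_bytes=8):
    """Bytes per second if each access fetches one word."""
    return word_bytes / (access_ns * 1e-9)


if __name__ == "__main__":
    t = effective_access_ns(0.90, 0.0001)
    print(f"{t:.1f} ns per access, "
          f"{throughput_bytes_per_s(t) / 1e6:.1f} MB/s")
```

Under these assumed numbers the rare page faults contribute about as much expected latency as all TLB hits combined, which is the point of the slide: a small fraction of slow accesses dominates throughput.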

14 If it fits...
o Graph problems that fit in memory can leverage excellent advances in architecture and libraries...
  - Cray XMT2, designed for latency hiding
  - SGI UV2, designed for large, cache-coherent shared memory
  - body of literature and libraries:
    Parallel Boost Graph Library (PBGL), Indiana University
    Multithreaded Graph Library (MTGL), Sandia National Labs
    GraphCT/STINGER, Georgia Tech
    GraphLab, Carnegie Mellon University
    Giraph, Apache Software Foundation

15 We can add more memory, but...
o Memory capacity is limited by...
  - number of CPU pins, memory controller channels, DIMMs per channel
  - memory bus width
o Globally-shared memory is limited by...
  - CPU address space
  - cache coherence

16 Larger systems, greater latency
o Increasing memory can increase latency
  - traverse more memory addresses
  - larger system with greater physical distance between machines
  - fundamental limitation: speed of light
o Latency causes significant inefficiency in new CPU architectures

17 Easier to increase capacity using disks
o Current Intel Xeon E5 architectures:
  - 384 GB max. per CPU (4 channels x 3 DIMMs x 32 GB)
  - 64 TB max. globally-shared memory (46-bit address space)
  - 3881 dual Xeon E5 motherboards to store Brain Graph: 98 racks
o Disk capacity is not unlimited but is higher than memory
  - Largest disk on market: 8 TB
  - needs 364 disks to store Brain Graph, which can fit in 5 racks
o Disk is not enough: applications will still require memory for processing
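
The capacity figures on this slide follow from short integer arithmetic. The sketch below assumes binary units (1 GB = 2^30 bytes) and a Brain Graph edge list of 32 bytes per edge for 100 trillion edges; the per-edge cost is an assumption chosen because it reproduces the slide's counts, not a value stated in the slides.

```python
GIB, TIB = 2**30, 2**40


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


# Memory per CPU: 4 channels x 3 DIMMs x 32 GB.
PER_CPU_BYTES = 4 * 3 * 32 * GIB

# A 46-bit physical address space.
SHARED_MAX_BYTES = 2**46

# Assumption: Brain Graph edge list at 32 bytes per edge.
BRAIN_GRAPH_BYTES = 32 * 100 * 10**12

# Dual-socket boards and 8 TB disks needed to hold the graph.
BOARDS = ceil_div(BRAIN_GRAPH_BYTES, 2 * PER_CPU_BYTES)
DISKS = ceil_div(BRAIN_GRAPH_BYTES, 8 * TIB)

if __name__ == "__main__":
    print(PER_CPU_BYTES // GIB, "GiB per CPU")
    print(SHARED_MAX_BYTES // TIB, "TiB shared address space")
    print(BOARDS, "dual-socket boards;", DISKS, "disks")
```

Storing the same graph takes roughly ten times fewer disks than motherboards, which matches the slide's contrast between 98 racks and 5 racks.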

18 Big Graph Processing Frameworks

19 Why not just MapReduce?
o Developed by Google
o Excellent for embarrassingly (massively) parallel computations
  - No communication needed
  - Many machine learning algorithms fall into this category
o Not efficient for iterative algorithms that have dependences
  - Unnecessary IO traffic

20 What's the natural way to program graph computation?

21 Most famous parallel graph processing framework

22 GAS (Gather, Apply, Scatter)

23 We still need parallelism

24 Graph partition: not easy at all at scale

25 Power-law distribution
[Plot: count vs. degree]
o More than 10^8 vertices have one neighbor.
o High-degree vertices: the top 1% of vertices are adjacent to 50% of the edges!

26 Power-law degree distribution

27 Random graph partitioning
o Graph-parallel abstractions rely on partitioning:
  - Minimize communication
  - Balance computation and storage
o 10 machines → 90% of edges cut
o 100 machines → 99% of edges cut!
[Figure: Machine 1, Machine 2]
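
The 90% and 99% figures follow from a simple probability argument: if each vertex is placed on one of p machines uniformly at random, the two endpoints of an edge land on the same machine with probability 1/p, so the expected fraction of cut edges is 1 - 1/p. The sketch below states that formula and checks it with a small simulation on a random graph; the graph model and function names are assumptions of this example.

```python
import random


def expected_cut_fraction(num_machines):
    """Expected share of edges whose endpoints sit on different machines."""
    return 1.0 - 1.0 / num_machines


def simulated_cut_fraction(num_vertices, num_edges, num_machines, seed=0):
    """Randomly place vertices, then count cut edges of a random graph."""
    rng = random.Random(seed)
    machine_of = [rng.randrange(num_machines) for _ in range(num_vertices)]
    cut = 0
    for _ in range(num_edges):
        u = rng.randrange(num_vertices)
        v = rng.randrange(num_vertices - 1)
        if v >= u:
            v += 1  # skip u so the edge is never a self-loop
        if machine_of[u] != machine_of[v]:
            cut += 1
    return cut / num_edges


if __name__ == "__main__":
    for p in (10, 100):
        print(p, expected_cut_fraction(p),
              simulated_cut_fraction(2000, 20000, p))
```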

28 Challenges of high-degree vertices
o Data transmitted across network: O(# cut edges)
[Figure: high-degree vertex Y with edges cut between Machine 1 and Machine 2]

29 Idea of vertex cut

30 GAS decomposition

31 Random Edge-Placement
o Randomly assign edges to machines
o Balanced vertex-cut
[Figure: Machine 1, Machine 2, Machine 3; Y spans 3 machines; Z spans 2 machines; label "Not cut!"]
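
Random edge placement can be sketched in a few lines: each edge goes to one machine, and a vertex "spans" every machine that holds at least one of its edges. The function names and the replication-factor helper below are illustrative assumptions, not an API from GraphLab.

```python
import random


def random_edge_placement(edges, num_machines, seed=0):
    """Assign every edge to a uniformly random machine.

    Returns the machine of each edge and, per vertex, the set of
    machines it spans.
    """
    rng = random.Random(seed)
    placement = [rng.randrange(num_machines) for _ in edges]
    spans = {}
    for (u, v), m in zip(edges, placement):
        spans.setdefault(u, set()).add(m)
        spans.setdefault(v, set()).add(m)
    return placement, spans


def replication_factor(spans):
    """Average number of machines spanned per vertex."""
    return sum(len(s) for s in spans.values()) / len(spans)


if __name__ == "__main__":
    # A star: one high-degree vertex "Y" with 1000 neighbors.
    star = [("Y", i) for i in range(1000)]
    _, spans = random_edge_placement(star, 3)
    print(len(spans["Y"]), replication_factor(spans))
```

In the star example each edge is stored exactly once, the low-degree vertices stay on a single machine, and only the high-degree vertex is replicated, so the work of that vertex is split across machines.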

32 Greedy Vertex-Cuts
o Place edges on machines which already have the vertices in that edge.
[Figure: vertices A, B, C, D, E with edges placed on Machine 1 and Machine 2]
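
The greedy rule can be sketched as follows. This is a simplified heuristic written for illustration, not the exact placement rule of any particular framework: it prefers a machine that already holds both endpoints, then one that holds either endpoint, then any machine, breaking ties by current load.

```python
def greedy_vertex_cut(edges, num_machines):
    """Place edges one at a time, preferring machines that already
    hold the edge's endpoints (simplified sketch)."""
    spans = {}                    # vertex -> machines holding it
    load = [0] * num_machines     # edges stored per machine
    placement = []
    for u, v in edges:
        su = spans.setdefault(u, set())
        sv = spans.setdefault(v, set())
        # Both endpoints present, else either one, else any machine.
        candidates = (su & sv) or (su | sv) or set(range(num_machines))
        # Least-loaded candidate; the lowest index wins ties.
        m = min(candidates, key=lambda k: (load[k], k))
        placement.append(m)
        load[m] += 1
        su.add(m)
        sv.add(m)
    return placement, spans


if __name__ == "__main__":
    edges = [("A", "B"), ("B", "C"), ("D", "E")]
    print(greedy_vertex_cut(edges, 2))
```

With the three edges in the usage example, A-B and B-C end up together because they share B, while D-E goes to the emptier machine, so no vertex is replicated.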

33 Example
o What's the popularity of this user?
[Figure label: Popular?]

34 PageRank Algorithm
R[i] = \sum_{j \in \mathrm{nbrs}(i)} w_{ji} R[j]
  - R[i]: rank of user i
  - right-hand side: weighted sum of neighbors' ranks
o Update ranks in parallel
o Iterate until convergence

35 PageRank in GraphLab
GraphLab_PageRank(i)
  // Compute sum over neighbors  [Gather Information About Neighborhood]
  total = 0
  foreach (j in in_neighbors(i)):
    total = total + R[j] * w_ji
  // Update the PageRank  [Update Vertex]
  R[i] = total
  // Trigger neighbors to run again  [Signal Neighbors & Modify Edge Data]
  if R[i] not converged then
    foreach (j in out_neighbors(i)):
      signal vertex-program on j
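
The vertex program above can be sketched in plain Python with a work queue standing in for the scheduler. This is not GraphLab's API: the function signature, the dictionary-based graph layout, and the `base` term are assumptions of this example. `base` is added so that the toy graph has a unique nonzero fixed point; setting it to 0 gives the slide's formula exactly.

```python
from collections import deque


def pagerank(in_nbrs, out_nbrs, w, base=0.15, tol=1e-6):
    """Queue-driven PageRank in the gather / update / signal style.

    in_nbrs, out_nbrs: vertex -> list of neighbor vertices
    w: (j, i) -> weight of the edge from j to i
    """
    R = {v: 0.0 for v in in_nbrs}
    queue = deque(in_nbrs)        # every vertex is scheduled once
    queued = set(in_nbrs)
    while queue:
        i = queue.popleft()
        queued.discard(i)
        # Gather: weighted sum over in-neighbors.
        total = sum(R[j] * w[(j, i)] for j in in_nbrs[i])
        # Update the vertex value.
        new_rank = base + total
        changed = abs(new_rank - R[i]) > tol
        R[i] = new_rank
        # Signal out-neighbors only if this vertex has not converged.
        if changed:
            for j in out_nbrs[i]:
                if j not in queued:
                    queued.add(j)
                    queue.append(j)
    return R


if __name__ == "__main__":
    # A directed 3-cycle a -> b -> c -> a with weight 0.85 per edge.
    in_nbrs = {"a": ["c"], "b": ["a"], "c": ["b"]}
    out_nbrs = {"a": ["b"], "b": ["c"], "c": ["a"]}
    w = {("c", "a"): 0.85, ("a", "b"): 0.85, ("b", "c"): 0.85}
    print(pagerank(in_nbrs, out_nbrs, w))
```

Vertices that have converged stop signalling their neighbors, so work concentrates on the parts of the graph that are still changing.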

36 Triangle counting on Twitter

37 What if I don't have a cluster?

38 GraphChi: disk-based GraphLab
o Challenge
  - Random disk accesses!
o Naive solutions
  - Graph clustering
  - Prefetching!
o Solution
  - Novel graph representation on disk
  - Parallel sliding window
  - Minimizes random accesses

39 Parallel sliding window layout
o A shard is easy to fit in memory

40 Parallel sliding window execution
o O(P^2) random accesses per pass on entire graph
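
The O(P^2) count can be illustrated with an in-memory sketch of the layout. It assumes the usual description of the scheme: vertices are split into P intervals, shard p stores the edges whose destination falls in interval p, and each shard is sorted by source. Processing one interval then needs its own shard in full plus one contiguous window from each of the other P - 1 shards. All names here are hypothetical, and lists stand in for files on disk.

```python
from bisect import bisect_left, bisect_right


def build_shards(edges, intervals):
    """Shard p holds every edge whose destination lies in interval p,
    sorted by source vertex."""
    shards = [[] for _ in intervals]
    for src, dst in edges:
        for p, (lo, hi) in enumerate(intervals):
            if lo <= dst <= hi:
                shards[p].append((src, dst))
                break
    for shard in shards:
        shard.sort()
    return shards


def window(shard, lo, hi):
    """Contiguous slice of a source-sorted shard with lo <= src <= hi."""
    sources = [src for src, _ in shard]
    return shard[bisect_left(sources, lo):bisect_right(sources, hi)]


def one_pass(shards, intervals):
    """Visit every interval once; return per-interval edges and the
    number of non-sequential block reads."""
    reads, result = 0, []
    for p, (lo, hi) in enumerate(intervals):
        in_edges = shards[p]      # the whole shard: one read
        reads += 1
        out_edges = []
        for q, shard in enumerate(shards):
            if q != p:
                reads += 1        # one sliding window per other shard
            out_edges.extend(window(shard, lo, hi))
        result.append((in_edges, out_edges))
    return result, reads
```

Because each shard is sorted by source, the window needed for interval p + 1 starts where the window for interval p ended, so each window slides forward through its shard over the course of a pass.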

41 Triangle counting on Twitter graph
