Large Scale Graph Processing with Apache Giraph


Large Scale Graph Processing with Apache Giraph Sebastian Schelter Invited talk at GameDuell Berlin 29th May 2012

the mandatory about-me slide: PhD student at the Database Systems and Information Management Group (DIMA) of TU Berlin; working on Stratosphere, a database-inspired approach to a next-generation large-scale processing system, a joint research project with HU Berlin and HPI Potsdam; European Research Project ROBUST, dealing with the analysis of huge-scale online business communities; involved in open source as committer and PMC member of Apache Mahout and Apache Giraph

Overview 1) Graphs 2) Graph processing with Hadoop/MapReduce 3) Google Pregel 4) Apache Giraph 5) Outlook: everything is a network

Graph recap: a graph is an abstract representation of a set of objects (vertices), where some pairs of these objects are connected by links (edges), which can be directed or undirected. Graphs can be used to model arbitrary things like road networks, social networks, flows of goods, etc. The majority of graph algorithms are iterative and traverse the graph in some way. (figure: example graph with vertices A, B, C, D)
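The adjacency-list representation described above can be sketched in a few lines. This is a minimal hypothetical example for illustration only (the class and method names are made up here, not part of any Giraph API):

```java
import java.util.*;

// Minimal directed graph using adjacency lists: each vertex id maps to
// the list of ids it has outgoing edges to.
class TinyGraph {

    private final Map<String, List<String>> adj = new HashMap<>();

    void addEdge(String from, String to) {
        adj.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        adj.computeIfAbsent(to, k -> new ArrayList<>()); // make sure the target vertex exists
    }

    List<String> neighbors(String v) {
        return adj.getOrDefault(v, Collections.emptyList());
    }

    int vertexCount() {
        return adj.size();
    }
}
```

Iterative graph algorithms then amount to repeatedly walking these neighbor lists, which is exactly the access pattern the systems discussed in this talk try to support at scale.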

The Web: the World Wide Web itself can be seen as a huge graph, the so-called web graph. Pages are vertices connected by edges that represent hyperlinks. The web graph has several billion vertices and several billion edges. The success of major internet companies such as Google is based on the ability to conduct computations on this huge graph.

Google's PageRank: the success factor of Google's search engine was a much better ranking of search results. The ranking is based on PageRank, a graph algorithm computing the importance of webpages. The simple idea: look at the structure of the underlying network; important pages have a lot of links from other important pages. A major technical success factor of Google is the ability to conduct web-scale graph processing.

Social Networks: on Facebook, Twitter, LinkedIn, etc., the users and their interactions form a social graph. Users are vertices connected by edges that represent some kind of interaction, such as friendship, following, or a business contact. Fascinating research questions: what is the structure of these graphs? How do they evolve over time? Analysis requires knowledge in both computer science and the social sciences.

six degrees of separation: the small world problem asks through how many social contacts people know each other on average. In the small world experiment by Stanley Milgram, the task was to deliver a letter to a recipient whom you don't know personally; you may forward the letter only to persons that you know on a first-name basis. How many contacts does it take on average until the letter reaches the target? Results: it took 5.5 to 6 contacts on average, a confirmation of the popular assumption of six degrees of separation between humans. The experiment was criticised due to the small number of participants and a possibly biased selection.

four degrees of separation: the small world problem as a graph problem in social network analysis: what is the average distance between two users in a social graph? In early 2011, scientists conducted a world-scale experiment using the Facebook social graph: 721 million users, 69 billion friendship links. Result: the average distance in Facebook is 4.74, four degrees of separation. Large-scale graph processing gives unprecedented opportunities for the social sciences.
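On a small graph, the average-distance measurement behind this slide can be sketched directly: run a BFS from every vertex and average the pairwise shortest-path lengths. This is a hypothetical toy sketch only; at Facebook's scale, exact all-pairs BFS is infeasible and the real measurement required approximate, distributed techniques.

```java
import java.util.*;

// Average shortest-path distance of a small undirected, unweighted graph,
// given as adjacency lists over vertex ids 0..n-1.
class AverageDistance {

    static double average(List<List<Integer>> adj) {
        int n = adj.size();
        long sum = 0, pairs = 0;
        for (int s = 0; s < n; s++) {
            int[] dist = new int[n];
            Arrays.fill(dist, -1); // -1 means "not reached yet"
            dist[s] = 0;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) { // standard BFS from s
                int v = queue.poll();
                for (int u : adj.get(v)) {
                    if (dist[u] < 0) {
                        dist[u] = dist[v] + 1;
                        queue.add(u);
                    }
                }
            }
            for (int t = 0; t < n; t++) {
                if (t != s && dist[t] > 0) { // count each reachable ordered pair
                    sum += dist[t];
                    pairs++;
                }
            }
        }
        return (double) sum / pairs;
    }
}
```

For the path graph 0–1–2 this yields 4/3, since the distances 1, 2, 1 average out over the three vertex pairs.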

Overview 1) Graphs 2) Graph processing with Hadoop/MapReduce 3) Google Pregel 4) Apache Giraph 5) Outlook: everything is a network

Why not use MapReduce/Hadoop? MapReduce/Hadoop is the current standard for data-intensive computing, so why not use it for graph processing? Example: PageRank, defined recursively: each vertex distributes its authority to its neighbors in equal proportions:

p_i = \sum_{j \mid (j,i) \in E} \frac{p_j}{d_j}

where d_j is the out-degree of vertex j.

Textbook approach to PageRank in MapReduce: PageRank p is the principal eigenvector of the Markov matrix M defined by the transition probabilities between web pages. It can be obtained by iteratively multiplying an initial PageRank vector by M (power method):

p_{i+1} = M p_i

(figure: the rows of M are multiplied with p_i to produce p_{i+1})
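The power method from this slide fits in a few lines of standalone code. The following is a hypothetical single-machine sketch (dense matrix, no damping factor, no sparsity) and not the MapReduce implementation the slide refers to:

```java
import java.util.Arrays;

// Power iteration on a small column-stochastic matrix M:
// repeatedly apply p' = M p until the PageRank vector stabilizes.
class PowerIteration {

    // one application of M to p
    static double[] multiply(double[][] m, double[] p) {
        double[] r = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < p.length; j++) {
                r[i] += m[i][j] * p[j];
            }
        }
        return r;
    }

    static double[] pageRank(double[][] m, int iterations) {
        double[] p = new double[m.length];
        Arrays.fill(p, 1.0 / m.length); // uniform initial vector
        for (int k = 0; k < iterations; k++) {
            p = multiply(m, p);
        }
        return p;
    }
}
```

The drawbacks listed on the next slide come from running exactly this loop as a sequence of MapReduce jobs, where every multiply step becomes a separately scheduled job reading the matrix from disk.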

Drawbacks: Not intuitive: only crazy scientists think in matrices and eigenvectors. Unnecessarily slow: each iteration is a single MapReduce job with lots of overhead; it is separately scheduled, the graph structure is read from disk, and the intermediary result is written to HDFS. Hard to implement: a join has to be implemented by hand, which is lots of work, and the best strategy is data dependent.

Overview 1) Graphs 2) Graph processing with Hadoop/MapReduce 3) Google Pregel 4) Apache Giraph 5) Outlook: everything is a network

Google Pregel: a distributed system especially developed for large-scale graph processing, with an intuitive API that lets you "think like a vertex", Bulk Synchronous Parallel (BSP) as the execution model, and fault tolerance by checkpointing.

Bulk Synchronous Parallel (BSP): (figure: processors execute a superstep consisting of local computation, followed by communication, followed by barrier synchronization)

Vertex-centric BSP: each vertex has an id, a value, a list of its adjacent neighbor ids and the corresponding edge values. Each vertex is invoked in each superstep, can recompute its value and send messages to other vertices, which are delivered over superstep barriers. Advanced features: termination votes, combiners, aggregators, topology mutations. (figure: vertices 1–3 exchanging messages across supersteps i, i+1 and i+2)

Master-slave architecture: vertices are partitioned and assigned to workers (default: hash partitioning; custom partitioning possible). The master assigns and coordinates, while the workers execute vertices and communicate with each other. (figure: one master coordinating workers 1–3)

PageRank in Pregel

class PageRankVertex {

  void compute(Iterator<DoubleWritable> messages) {
    if (getSuperstep() > 0) {
      // recompute own PageRank from the neighbors' messages
      pagerank = sum(messages);
      setVertexValue(pagerank);
    }

    if (getSuperstep() < k) {
      // send updated PageRank to each neighbor
      sendMessageToAllNeighbors(pagerank / getNumOutEdges());
    } else {
      voteToHalt(); // terminate
    }
  }
}

PageRank toy example: (figure: PageRank values of a three-vertex graph A, B, C over supersteps 0, 1 and 2, starting from the input graph and converging towards the stationary ranking)

Cool, where can I download it? Pregel is proprietary, but Apache Giraph is an open source implementation of Pregel: it runs on standard Hadoop infrastructure, computation is executed in memory, it can be a job in a pipeline (MapReduce, Hive), and it uses Apache ZooKeeper for synchronization.

Overview 1) Graphs 2) Graph processing with Hadoop/MapReduce 3) Google Pregel 4) Apache Giraph 5) Outlook: everything is a network

Giraph's Hadoop usage: (figure: the master and the workers run inside Hadoop TaskTrackers, next to the JobTracker and NameNode, and coordinate via ZooKeeper)

Anatomy of an execution:
Setup: load the graph from disk, assign vertices to workers, validate the workers' health.
Compute: assign messages to workers, iterate on active vertices, call the vertices' compute().
Synchronize: send messages to workers, compute aggregators, checkpoint.
Teardown: write back the result, write back aggregators.

Who is doing what?
ZooKeeper is responsible for computation state: partition/worker mapping, global state (superstep number), checkpoint paths, aggregator values, statistics.
The master is responsible for coordination: it assigns partitions to workers, coordinates synchronization, requests checkpoints, aggregates aggregator values, and collects health statuses.
A worker is responsible for vertices: it invokes the active vertices' compute() function, sends, receives and assigns messages, and computes local aggregation values.

Example: finding the connected components of an undirected graph. Algorithm: propagate the smallest vertex label to the neighbors until convergence. In the end, all vertices of a component will have the same label. (figure: labels spreading through a small graph over several steps until each component carries its smallest vertex id)
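The label-propagation idea can be sketched on a single machine before looking at the Giraph version. This hypothetical sketch updates labels in place within a sweep (not strictly superstep-synchronous as in BSP), which still converges to the same result:

```java
import java.util.*;

// Connected components by label propagation: every vertex repeatedly adopts
// the smallest label among itself and its neighbors until nothing changes.
class LabelPropagation {

    // graph given as undirected adjacency lists over vertex ids 0..n-1;
    // returns the smallest reachable vertex id per vertex
    static int[] connectedComponents(List<List<Integer>> adj) {
        int n = adj.size();
        int[] label = new int[n];
        for (int v = 0; v < n; v++) {
            label[v] = v; // each vertex starts with its own id as label
        }
        boolean changed = true;
        while (changed) { // one sweep roughly corresponds to one superstep
            changed = false;
            for (int v = 0; v < n; v++) {
                for (int u : adj.get(v)) {
                    if (label[u] < label[v]) {
                        label[v] = label[u];
                        changed = true;
                    }
                }
            }
        }
        return label;
    }
}
```

Two vertices end up in the same component exactly when they share a final label, which is the invariant the Giraph implementation in the following steps also relies on.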

Step 1: create a custom vertex

public class ConnectedComponentsVertex
    extends BasicVertex<IntWritable, IntWritable, NullWritable, IntWritable> {

  public void compute(Iterator<IntWritable> messages) {
    int currentLabel = getVertexValue().get();

    // compare with the neighbors' labels
    while (messages.hasNext()) {
      int candidate = messages.next().get();
      currentLabel = Math.min(currentLabel, candidate);
    }

    // propagation is necessary if we are in the first superstep
    // or if we found a new label
    if (getSuperstep() == 0 || currentLabel != getVertexValue().get()) {
      setVertexValue(new IntWritable(currentLabel));
      sendMsgToAllEdges(getVertexValue()); // propagate newly found label to neighbors
    }

    voteToHalt(); // terminate this vertex, new messages might reactivate it
  }
}

Step 2: create a custom input format. The input is a text file with adjacency lists; each line looks like: <vertex_id> <neighbor1_id> <neighbor2_id> ...

public class ConnectedComponentsInputFormat
    extends TextVertexInputFormat<IntWritable, IntWritable, NullWritable, IntWritable> {

  static class ConnectedComponentsVertexReader
      extends TextVertexReader<IntWritable, IntWritable, NullWritable, IntWritable> {

    public BasicVertex<IntWritable, IntWritable, NullWritable, IntWritable> getCurrentVertex() {
      // instantiate vertex
      BasicVertex<IntWritable, IntWritable, NullWritable, IntWritable> vertex = ...
      // parse a line from the input and initialize the vertex
      vertex.initialize(...);
      return vertex;
    }
  }
}

Step 3: create a custom output format. The output is a text file; each line looks like: <vertex_id> <component_id>

public class VertexWithComponentTextOutputFormat
    extends TextVertexOutputFormat<IntWritable, IntWritable, NullWritable> {

  public static class VertexWithComponentWriter
      extends TextVertexOutputFormat.TextVertexWriter<IntWritable, IntWritable, NullWritable> {

    public void writeVertex(BasicVertex<IntWritable, IntWritable, NullWritable, ?> vertex) {
      // write out the vertex id and the vertex value
      String output = vertex.getVertexId().get() + "\t" + vertex.getVertexValue().get();
      getRecordWriter().write(new Text(output), null);
    }
  }
}

Step 4: create a combiner (optional). We are only interested in the smallest label sent to a vertex, therefore we can apply a combiner:

public class MinimumIntCombiner
    extends VertexCombiner<IntWritable, IntWritable> {

  public Iterable<IntWritable> combine(IntWritable target, Iterable<IntWritable> messages) {
    // find the minimum label
    int minimum = Integer.MAX_VALUE;
    for (IntWritable message : messages) {
      minimum = Math.min(minimum, message.get());
    }
    return Lists.<IntWritable>newArrayList(new IntWritable(minimum));
  }
}

Experiments
Setup: 6 machines with 2x 8-core Opteron CPUs, 4x 1TB disks and 32GB RAM each; 1 Giraph worker was run per core.
Input: Wikipedia page link graph (6 million vertices, 200 million edges).
PageRank on Hadoop/Mahout: 10 iterations took approx. 29 minutes, i.e. an average of approx. 3 minutes per iteration.
PageRank on Giraph: 30 iterations took approx. 15 minutes, i.e. an average of approx. 30 seconds per iteration.
That is a 10x performance improvement.

hardware utilization

Connected Components: the execution takes approx. 4 minutes, with subsecond iterations after superstep 5. Giraph exploits the small average distance! (figure: bar chart of the duration in milliseconds per superstep, over supersteps 1–12)

Overview 1) Graphs 2) Graph processing with Hadoop/MapReduce 3) Google Pregel 4) Apache Giraph 5) Outlook: everything is a network

Everything is a network!

distributed matrix factorization: decompose a matrix A into the product of two lower-dimensional feature matrices R and C, A ≈ R Cᵀ. A master algorithm for dimension reduction, solving least squares problems, compressing data, collaborative filtering, latent semantic analysis, ...

Alternating Least Squares: minimize the squared error over all known entries:

f(R, C) = \sum_{(i,j) \in A} (a_{ij} - r_i^T c_j)^2

Alternating Least Squares: fix one side, solve for the other, repeat until convergence. It is an easily parallelizable, iterative algorithm. What does this all have to do with graphs?
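The alternating scheme can be made concrete with a rank-1 toy version. This is a hypothetical sketch under simplifying assumptions (fully observed matrix, one-dimensional feature "vectors", so each least squares subproblem has a closed-form solution); the real setting uses k-dimensional features and only the known entries:

```java
import java.util.Arrays;

// Rank-1 ALS: approximate a_ij by r_i * c_j. Fixing c, the best r_i is
// (sum_j a_ij c_j) / (sum_j c_j^2), and symmetrically for c_j; alternate
// these closed-form updates until the factors stabilize.
class ToyALS {

    // returns { r, c } with a_ij ~ r[i] * c[j]
    static double[][] factorize(double[][] a, int iterations) {
        int m = a.length, n = a[0].length;
        double[] r = new double[m], c = new double[n];
        Arrays.fill(r, 1.0);
        Arrays.fill(c, 1.0);
        for (int it = 0; it < iterations; it++) {
            // fix c, solve the 1-d least squares problem for each r_i
            double cc = 0;
            for (double x : c) cc += x * x;
            for (int i = 0; i < m; i++) {
                double s = 0;
                for (int j = 0; j < n; j++) s += a[i][j] * c[j];
                r[i] = s / cc;
            }
            // fix r, solve for each c_j
            double rr = 0;
            for (double x : r) rr += x * x;
            for (int j = 0; j < n; j++) {
                double s = 0;
                for (int i = 0; i < m; i++) s += a[i][j] * r[i];
                c[j] = s / rr;
            }
        }
        return new double[][] { r, c };
    }
}
```

The per-row and per-column updates are independent of each other, which is what makes the algorithm easily parallelizable, and, as the next slide shows, a natural fit for a vertex-centric program.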

matrices can be represented by graphs: represent the matrix as a bipartite graph, with row vertices on one side and column vertices on the other, and the matrix entries as edge values. (figure: a 3x3 example matrix drawn as a bipartite graph) Now the ALS algorithm can easily be implemented as a Giraph program: every vertex holds a row vector of one of the feature matrices, and in each superstep the vertices of one side recompute their vector and send it to the connected vertices on the other side.

What's to come? Current and future work in Giraph: out-of-core messaging and an algorithms library.

Thank you. Questions? Database Systems and Information Management Group (DIMA), TU Berlin