2013 IEEE International Conference on Big Data

Map-Based Graph Analysis on MapReduce

Upa Gupta, Leonidas Fegaras
University of Texas at Arlington, CSE
Arlington, TX 76019
{upa.gupta,fegaras}@uta.edu

Abstract: The MapReduce framework has become the de-facto framework for large-scale data analysis and data mining. One important area of data analysis is graph analysis. Many graphs of interest, such as the Web graph and social networks, are very large, with millions of vertices and billions of edges. To cope with this vast amount of data, researchers have been using the MapReduce framework to analyse these graphs extensively. Unfortunately, most of these graph algorithms are iterative in nature, requiring repetitive MapReduce jobs. We introduce a new design pattern for a family of iterative graph algorithms for the MapReduce framework. Our method is to separate the immutable graph topology from the graph analysis results. Each MapReduce node participating in the graph analysis task reads the same graph partition at each iteration step, which is made local to the node, but it also reads all the current analysis results from the distributed file system (DFS). These results are correlated with the local graph partition using a merge-join, and the new, improved analysis results associated with only the nodes in the graph partition are generated and dumped to the DFS. Our algorithm requires one MapReduce job for pre-processing the graph and the repetition of one map-based MapReduce job for the actual analysis.

Index Terms: MapReduce, Distributed Computing, Graph Algorithms

I. INTRODUCTION

Recently, the MapReduce programming model [1] has emerged as a popular framework for large-scale data analysis on the cloud. In particular, Hadoop, the most prevalent implementation of this framework, has been used extensively by many companies on a very large scale. Much of the data being generated at a fast rate takes the form of massive graphs containing millions of nodes and billions of edges. One graph application, which was one of the original motivations for the MapReduce framework, is page-rank [2], which calculates the relative importance of web pages based on the web graph topology. Analysis of such large graphs is a data-intensive process, which motivates the use of the MapReduce paradigm to analyse these graphs.

The execution time of a MapReduce job depends on the computation times of the map and reduce tasks, the disk I/O time, and the communication time for shuffling intermediate data between the mappers and reducers. The communication time dominates the computation time, and hence decreasing it will greatly improve the efficiency of a MapReduce job. Previous work required the whole graph to be shuffled to and sorted by the reducers, leading to inefficient graph analysis. This problem becomes even worse given that most of these algorithms are iterative in nature, where the computation in each iteration depends on the results of the previous iteration.

To improve the efficiency of graph analysis, some earlier work has been done on reducing the size of the input so that graph partitions are small enough to fit in the memory of a single cluster node. In addition, the Schimmy design pattern [14] has been introduced to avoid passing the graph topology across the network. Unfortunately, this method still requires the partial results computed for each node to be shuffled among the nodes. There is also earlier work on optimizing iterative MapReduce jobs, such as Twister [3] and HaLoop [4].
Furthermore, work has been done on implementing graph analysis in other parallel programming paradigms, such as the bulk synchronous parallel (BSP) [5] paradigm, on which Pregel [6] by Google, and Hama [7] and Giraph [8] by Apache, are based.

In this paper, we introduce a new design pattern for a family of iterative graph algorithms. Our method is to separate the immutable graph topology from the graph analysis results. Each MapReduce node participating in the graph analysis task always reads the same graph partition at each iteration step, which is made local to the node, but it also reads all the current analysis results from the distributed file system (DFS). These results are correlated with the local graph partition using a merge-join, and the new, improved analysis results associated with only the nodes in the graph partition are generated and dumped to the DFS. Our method requires that the partial analysis results associated with only those nodes that belong to the local graph partition be stored in memory, which is usually far smaller than the graph partition itself, since the number of nodes is usually far less than the number of edges. Our algorithm requires one MapReduce job for pre-processing the graph and the repetition of one map-based MapReduce job (i.e., a job without a reduce phase) for the actual analysis.

The rest of the paper is organized as follows. Section II describes related work. Section III introduces the MapReduce model, Section IV presents the family of iterative graph algorithms we target, and Section V describes the previous design patterns for graph analysis. Section VI describes our proposed map-based design pattern for graph analysis. Finally, Section VII evaluates the performance of our graph analysis using various data sets and compares it with the Schimmy approach.

II. RELATED WORK

The MapReduce model was first introduced by Google in 2004 [1]. Several large organizations have implemented this model, in systems including Apache Hadoop [9], Google's Sawzall [10], and Microsoft's Dryad [11]. Hadoop, an open source project developed by Apache, is the most popular MapReduce implementation. It is being used by many companies for data analysis at a large scale.

Since most graph processing algorithms, such as breadth-first search, are iterative in nature, they require repetitive MapReduce jobs. Earlier work [12], [13] on graph processing based on MapReduce used algorithms that read and shuffle the whole graph among the participating nodes at each iteration step, resulting in inefficient graph analysis. We briefly describe some related work that addresses the problem of inefficiency of graph processing in the MapReduce framework.

Twister [3] and HaLoop [4], as stated before, were developed to support iterative computations in the MapReduce framework. The main feature of Twister is its long-running mapper and reducer tasks with a cacheable distributed memory, which avoids repeatedly loading data from disk. Twister is also a stream-based system, where the output of mappers is directly streamed to reducers. HaLoop caches the loop-invariant data structures, thus reducing the cost of loading and shuffling them in subsequent iterations. Neither Twister nor HaLoop is scalable, because both require a memory cache for storing and indexing the data that is proportional to the size of the data. The Schimmy design pattern [14], introduced by Lin and Schatz, uses the notion of the parallel merge join to avoid the shuffling of the graph topology in MapReduce, but it still requires the shuffling of the partial results generated for each vertex or edge of the graph. MapReduce was also used to filter data and reduce the data size in a distributed fashion [15], so that the data analysis can be done on a single machine. This method is not applicable to some graph algorithms, such as page-rank, which requires the whole graph topology and the metadata of all vertices to compute the popularity of all the vertices of the graph. There have been recent developments in incremental processing systems, such as Incoop [16], a Hadoop-based incremental processing system, and Google's Percolator [17], an incremental processing system based on BigTable. These systems target applications that process their data in the form of small updates. They do not target general graph processing algorithms. Another programming paradigm for parallel graph processing is the bulk synchronous parallel (BSP) [5] programming paradigm. Google's Pregel [6], Apache Hama [7], and Apache Giraph [8] are systems based on the BSP model. The disadvantage of the BSP model is that it requires that the whole graph be stored in the collective memory of the cluster, which limits the size of graphs that can be analysed.

III. MAPREDUCE

MapReduce is a distributed processing framework that enables data-intensive computations. The framework, inspired by the functional programming paradigm, has two main components: a mapper and a reducer. A mapper works on each individual input record to generate intermediate results, which are grouped together based on some key and passed on to the reducers. A reducer works on the group of intermediate results associated with the same key and generates the final result using a result aggregation function. The processing units of the MapReduce framework are key-value pairs. An instance of the MapReduce framework with 3 mappers and 2 reducers is shown in Fig. 1. Developers build MapReduce applications by providing the implementations of the mapper and the reducer methods.

Fig. 1. An Example of a MapReduce Job Execution
The MapReduce framework handles all the other aspects of the execution on a cluster. It is responsible for scheduling tasks, handling faults, and sorting and shuffling the data between the mappers and the reducers, where the intermediate key-value pairs are grouped by key. The MapReduce framework works on top of a distributed file system, which is responsible for the distribution of the data among all the worker nodes of the cluster. After each mapper finishes its task, its intermediate results are passed to the reducers, a process known as shuffling. Each reducer is assigned a subset of the intermediate key space, called a partition. To control the assignment of the key-value pairs to reducers, the MapReduce framework uses a partitioning function. The intermediate values, after being grouped by key, are sorted by the reducers. The sort order can be controlled by a user-defined comparator function.
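To make this division of labour concrete, the following sketch shows the rough shape of a simple Hadoop job written against the org.apache.hadoop.mapreduce API of the Hadoop 1.x line used later in this paper. The job, which merely counts the outgoing edges of every vertex of an edge list, and all class names in it are illustrative assumptions rather than code taken from this work.

    // Hypothetical out-degree counter illustrating the mapper/reducer/driver structure.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class OutDegree {
        // Mapper: called once per input record; emits (source vertex, 1) for each edge "src<TAB>dst".
        public static class EdgeMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] edge = line.toString().split("\\s+");
                if (edge.length == 2) ctx.write(new Text(edge[0]), ONE);
            }
        }

        // Reducer: receives all values grouped under one key and aggregates them.
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text vertex, Iterable<LongWritable> ones, Context ctx)
                    throws IOException, InterruptedException {
                long degree = 0;
                for (LongWritable one : ones) degree += one.get();
                ctx.write(vertex, new LongWritable(degree));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "out-degree");
            job.setJarByClass(OutDegree.class);
            job.setMapperClass(EdgeMapper.class);
            job.setCombinerClass(SumReducer.class);   // optional map-side aggregation (see below)
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            // job.setPartitionerClass(...) and job.setSortComparatorClass(...) would override
            // the default hash partitioning and key ordering mentioned above.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Everything else, including the grouping, sorting, and shuffling described above, is handled by the framework.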

The MapReduce framework also allows developers to specify a function, called the combiner, to improve performance. It is similar to the reducer function, but it runs directly on the output of the mapper. The combiner output becomes the input to the reducer. As it is an optimization, there is no guarantee on the number of times it will be called. When there is a large amount of data shuffled between the map and the reduce phases, combiners can be used to aggregate partial results on the map side to reduce the network traffic.

IV. GRAPH ALGORITHMS

A graph is defined as G(V, E), where V is a set of vertices and E is a set of directed edges. A directed edge from v_i to v_j is represented as the pair of vertices (v_i, v_j), where v_i ∈ V and v_j ∈ V. Each vertex may have some information associated with it (such as a node label, a page-rank value, the number of out-links, etc.) and, similarly, each edge may have some information associated with it (such as an edge label and a relationship type).

The focus of this paper is on iterative graph algorithms on directed graphs where the partial results associated with nodes can be improved at each iteration. Such graph algorithms can be formulated as follows:

    repeat
        for all v_n ∈ V do
            R_n ← F_n
            F_n ← f({ F_m | (v_m, v_n) ∈ E })
        end for
    until ∀ v_i ∈ V : ρ(R_i, F_i) < θ

where F_n and F_m are the partial results at vertices n and m, respectively, f is the function that computes the partial result for each vertex of the graph, ρ is the function that computes the result improvement between iterations, and θ is the threshold determining the stopping condition. The algorithm given above repeats until this termination criterion is met.

Page-rank, a well-known algorithm for computing the importance of the vertices of a graph based on its structure, can be captured by the above algorithm. It computes the page-rank P_i for every vertex v_i ∈ V of the graph. P_i is the probability of reaching the vertex v_i through a random walk in the graph, which is computed based on the topology of the graph. The page-rank computation often includes a random periodic jump to any other vertex in the graph with probability 1 - d, where d is the damping factor. The page-rank of a vertex v_i ∈ V is calculated iteratively as follows:

    P_i = (1 - d) / |V| + d * Σ_{(v_j, v_i) ∈ E} P_j / |{ v_m | (v_j, v_m) ∈ E }|        (1)

where v_i, v_j and v_m are vertices of the graph, P_j is the page-rank of vertex v_j from the previous iteration, and P_i is the new page-rank of vertex v_i. The page-rank equation can be mapped onto the general iterative algorithm: calculating the page-rank of a vertex v_i in a single iteration is the function f, and the page-ranks calculated for all the vertices form the partial result that is used to calculate the page-ranks of all the vertices in the next iteration.

V. EARLIER WORK

Although it applies to other graph algorithms as well, we describe the earlier work on graph analysis based on MapReduce in terms of the page-rank algorithm.

A. Basic Implementation

A graph in a MapReduce framework is typically represented as a set of directed edges, where each edge is represented as a key-value pair with the source vertex as the key and the destination vertex as the value. Each vertex p contains the identifier of the vertex p.id and its metadata, which includes its current page-rank value p.pageRank and the number of outgoing edges p.numOfOutLinks from the vertex.
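To make this representation concrete, a small Java rendering of the vertex metadata and of the terms of equation (1) might look as follows. The class and field names are illustrative assumptions, not code from this paper, and in a real Hadoop job these fields would live in a custom Writable.

    // Hypothetical in-memory form of a vertex and the page-rank arithmetic of equation (1).
    public class Vertex {
        long id;              // vertex identifier
        double pageRank;      // current page-rank value P_i
        long numOfOutLinks;   // |{ v_m | (v_i, v_m) in E }|

        Vertex(long id, double pageRank, long numOfOutLinks) {
            this.id = id;
            this.pageRank = pageRank;
            this.numOfOutLinks = numOfOutLinks;
        }

        // The contribution this vertex sends along one outgoing edge, i.e. one summand
        // P_j / |{ v_m | (v_j, v_m) in E }| of equation (1).
        double contribution() {
            return pageRank / numOfOutLinks;
        }

        // New page-rank of a destination vertex, given the sum of its incoming
        // contributions, the damping factor d, and the number of vertices |V|.
        static double newRank(double sumOfContributions, double d, long numVertices) {
            return (1 - d) / numVertices + d * sumOfContributions;
        }
    }

An edge is then simply a key-value pair of two such vertices, with the source as the key and the destination as the value.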
Algorithm 1 The Mapper for the Basic Implementation of Page-Rank
 1: function MAP(Vertex from, Vertex to)
 2:   Emit(from.id, (from, to))
 3:   p ← from.pageRank / from.numOfOutLinks
 4:   Emit(to.id, p)
 5: end function

We first describe the basic approach of applying MapReduce to the graph algorithms described in Section IV. The mapper function given in Algorithm 1 is applied to each key-value pair, with the source vertex serving as the key. It computes the page-rank contribution from the source vertex to the destination vertex and emits the destination vertex id as the key and the corresponding fraction of page-rank as the value. In addition to the page-rank contributions, the mapper regenerates the graph structure by emitting the source vertex id as the key and the whole edge (a pair) as the value.

Algorithm 2 The Reducer for the Basic Implementation of Page-Rank
 1: function REDUCE(ID m, List [p_1, ..., p_n])
 2:   s ← 0
 3:   M ← null
 4:   N ← [ ]
 5:   for all p ∈ [p_1, ..., p_n] do
 6:     if IsPair(p) then
 7:       M ← p.from
 8:       insert p.to into N
 9:     else
10:       s ← s + p
11:     end if
12:   end for
13:   M.pageRank ← s
14:   for all n ∈ N do
15:     Emit(M, n)
16:   end for
17: end function
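A detail that the pseudo-code leaves implicit is that the values arriving at a single REDUCE call are of two different kinds, whole edges and numeric page-rank contributions, which is exactly what the IsPair test distinguishes. One hedged way to express such a tagged value in Hadoop is a custom Writable along the following lines; the class is an illustrative assumption, simplified to carry vertex ids rather than the full vertex metadata of the pseudo-code.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Hypothetical tagged value for the basic implementation: either a page-rank
    // contribution (a plain double) or an edge (a pair of vertex ids), so that a
    // single reduce call can receive both kinds of values, as Algorithm 2 assumes.
    public class RankOrEdge implements Writable {
        boolean isEdge;          // corresponds to the IsPair test in Algorithm 2
        double contribution;     // valid when isEdge == false
        long from, to;           // valid when isEdge == true

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeBoolean(isEdge);
            if (isEdge) { out.writeLong(from); out.writeLong(to); }
            else        { out.writeDouble(contribution); }
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            isEdge = in.readBoolean();
            if (isEdge) { from = in.readLong(); to = in.readLong(); }
            else        { contribution = in.readDouble(); }
        }
    }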

The reducer, described in Algorithm 2, receives the page-rank contributions from each of the incoming edges of a vertex, along with the graph topology associated with the vertex. These page-rank contributions are aggregated to get the updated page-rank value of the vertex. The reducer also updates the page-rank value of the source vertex, and the revised edge is written back to disk. This completes an iteration of the page-rank computation, and the output is then fed again to the mapper to begin the next iteration. Note that, in Algorithm 2, along with the intermediate partial result of the vertex M, the set N of the destinations of all outgoing edges of M is also passed. This is necessary in order to preserve the graph topology to be used in the next iteration steps. As a result, two types of data are passed from mappers to reducers: one is the partial computation of the vertex value and the other is the set of outgoing edges of the vertex, which passes the topology of the graph to the reduce phase. Note that this pseudo-code does not take the damping factor and dangling nodes into account.

B. The Schimmy Implementation

The basic implementation of a graph algorithm passes two types of data from mappers to reducers. One is the partial result computed for the vertex and the other is the graph topology itself. After receiving the partial results for a vertex and the graph topology associated with it, the reducer aggregates the partial results and updates the metadata of the nodes. The shuffling of the graph structure between the mapper and the reducer has a high overhead, especially in the case of iterative algorithms.

To address the inefficiency of the basic implementation, the Schimmy design pattern was introduced [14]. The Schimmy design pattern is based on the concept of the parallel merge join. A merge join between two given relations S and T is done by first sorting both relations on their join keys and then simultaneously scanning them, joining the rows having the same join key. This merge join can be processed in parallel by partitioning S and T into small files S_1, ..., S_n and T_1, ..., T_n, respectively, based on their join key and by sorting each partition on the join key. Then, each pair S_i/T_i is processed by a single node that performs a local merge join, and the node results are combined.

In the Schimmy design pattern, the graph G is partitioned into m partitions, so that each reducer R_i is assigned a different partition G_i, and the edges of each partition are sorted by the ID of the source vertex. The reducer R_i works on the intermediate partial results corresponding to the vertices in partition G_i and uses a merge-join between these results and the partition G_i to calculate new, improved results for the vertices (Algorithm 3). The implementation of page-rank based on the Schimmy design does not need to shuffle the graph structure, and hence the mapper remains the same as in Algorithm 1 but without line 2. In the reducer (Algorithm 3), the corresponding graph partition file is opened (line 2).
The reducer reads through this file until it finds the edge to be updated, then updates the page-rank of the source vertex of the edge, and then advances to the next edge. It updates all the edges with the same source vertex. Once an edge is updated, it is written back to the distributed file system.

Algorithm 3 The Reducer for the Schimmy Implementation
 1: function INITIALIZE
 2:   P.OpenGraphPartition()
 3: end function
 4: function REDUCE(ID m, List [p_1, ..., p_n])
 5:   s ← 0
 6:   for all p ∈ [p_1, ..., p_n] do
 7:     s ← s + p
 8:   end for
 9:   repeat
10:     (from, to) ← P.Read()
11:     if from.id < m then
12:       Emit(from, to)
13:     else if from.id = m then
14:       from.pageRank ← s
15:       Emit(from, to)
16:     end if
17:   until from.id > m
18: end function

In addition to the design pattern, the Schimmy approach introduced various improvements, such as using a regular MapReduce combiner or an in-mapper combiner, the latter of which was found to perform better than a regular combiner. For more details, refer to [14].

VI. MAP-BASED GRAPH ANALYSIS

Although the Schimmy design pattern improves the efficiency of graph algorithms by avoiding the shuffling and sorting of the graph topology, it still requires the shuffling and sorting of the partially computed results. To avoid this, we introduce a map-based design pattern for graph analysis. In contrast to the Schimmy approach, which requires both a map and a reduce stage at each iteration of the graph analysis, our method requires just a map stage. As in the case of Schimmy, our method too uses a parallel merge-join. In Schimmy, the merge join happens in the reduce stage between a partition of the graph and the intermediate partial results generated by the mappers. In our case, the merge-join is done at the map stage between a partition of the graph and a global file (stored in DFS) that contains all the partial results. Figure 2 illustrates this idea.

More specifically, we perform the graph analysis by doing a parallel merge-join between a partition of the graph and a global table that contains the partial results associated with all nodes. The graph G is partitioned into G_1, ..., G_m such that edges with the same destination go to the same partition. Also, each of the partitions is sorted by the source vertex of the edges. A global table in the form of a binary DFS file is created that contains the partial results of each node at the end of each iteration. This file is kept sorted by the vertex ID.
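As a purely conceptual, single-machine illustration of this layout (the actual pre-processing is a MapReduce job, described later in this section), the two invariants, edges with the same destination in the same partition and each partition sorted by source, can be sketched as follows. All names are illustrative assumptions, and the sketch uses Java records, so it assumes a recent JDK.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Conceptual illustration of the partition layout: edges with the same destination
    // land in the same partition G_i, and every G_i is kept sorted by source vertex.
    public class PartitionLayout {
        record Edge(long from, long to) {}

        static List<List<Edge>> partitionByDestination(List<Edge> edges, int m) {
            List<List<Edge>> parts = new ArrayList<>();
            for (int i = 0; i < m; i++) parts.add(new ArrayList<>());
            for (Edge e : edges)
                parts.get((Long.hashCode(e.to()) & Integer.MAX_VALUE) % m).add(e);
            for (List<Edge> g : parts)
                g.sort(Comparator.comparingLong(Edge::from));   // sort each G_i by source
            return parts;
        }
    }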

Fig. 2. Page-Rank Computation using Parallel Merge Join

Without loss of generality, we describe our graph analysis technique applied to the page-rank computation. Each mapper M_i reads the same unchanged partition G_i and combines it with the global page-rank table using a merge-join between G_i and the entire global table. Each mapper M_i generates new page-rank values only for those nodes that are destinations of the edges in G_i. Note that no other mapper has edges that have the same destination as those in G_i, due to the way the graph G is partitioned. Thus, each mapper M_i is responsible for generating its own subset P_i of the new page-rank values, which does not overlap with anyone else's. This is done by aggregating the incoming contributions computed from the current page-ranks stored in the global table. As we can see in Algorithm 4, each mapper uses a dictionary D to store its own partition P_i of the new page-rank values. The average size of the dictionary D is |V|/m, where |V| is the number of graph nodes and m is the number of mappers. Our assumption is that D can fit in the mapper's memory. To analyze the graph, a merge-join is performed between G_i and the global table during each iteration, and the new page-rank values are aggregated in D, which contains P_i. After G_i is processed, D is flushed to DFS and becomes one of the new partitions of the global table.

This algorithm is implemented in MapReduce as follows. An initial MapReduce job, which is a pre-processing step, is first called to partition the graph based on the destination vertex of each edge by using a user-defined Partitioner method (which uses uniform hashing on the destination vertex). At the same time, each partition G_i is sorted by the source node of the edges by using user-defined Key Comparator and Grouping Comparator methods. Once all the partitions are formed, they are saved on the distributed file system. In addition, each reducer node of this MapReduce job generates one partition P_i of the global table that contains the initial page-rank values and saves it in a sequence DFS file. All these files generated by the reducer nodes are saved in the same DFS directory, thus forming the initial global table.

The mapper that evaluates each iteration step is shown in Algorithm 4. It evaluates a merge-join between the graph partition G_i (the mapper input) and the global page-rank table, since G_i is read sorted by the source of the edges and the page-rank table is read sorted by the vertex ID. During this join, the mapper aggregates the incoming page-rank contributions for those vertices that are destinations of G_i edges. When the source vertex from of the input edge has the same ID as the current vertex n read from the page-rank table, it adds a page-rank contribution from the source vertex from to the destination vertex to of the edge. These contributions are aggregated together and saved in the dictionary D. After the entire G_i is processed, this dictionary contains the updated page-ranks of the vertices belonging to that partition. These values are sorted and flushed out to disk in a sequence file format to form a single partition P_i of the new global page-rank table, which is used by the next iteration. The mapper does not write anything to disk other than the new page-rank values.
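A hedged, single-partition sketch of this merge-join, given in pseudo-code as Algorithm 4 below, is shown here in plain Java rather than as a Hadoop mapper, to keep it self-contained. The record and method names are assumptions, and, as in the pseudo-code, the damping factor and dangling nodes are not handled.

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    // Conceptual merge-join between one graph partition G_i (edges sorted by source)
    // and the global page-rank table (entries sorted by vertex id). New page-rank
    // contributions are accumulated in the in-memory dictionary D.
    public class MapSideMergeJoin {
        record Edge(long from, long to) {}                    // one edge of G_i
        record Rank(long id, double rank, long outLinks) {}   // one entry of the global table

        static Map<Long, Double> newRanks(Iterator<Edge> partition, Iterator<Rank> rankTable) {
            Map<Long, Double> D = new HashMap<>();            // the partial result P_i
            Rank current = rankTable.hasNext() ? rankTable.next() : null;
            while (partition.hasNext()) {
                Edge e = partition.next();
                // advance the rank table until it catches up with this edge's source vertex
                while (current != null && current.id() < e.from() && rankTable.hasNext())
                    current = rankTable.next();
                if (current != null && current.id() == e.from())
                    D.merge(e.to(), current.rank() / current.outLinks(), Double::sum);
            }
            return D;   // flushed to DFS as one partition of the new global table
        }
    }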
It should be noted that, to decrease the disk write overhead, we set the data replication parameter of the DFS sequence file that contains the global page-rank table to one, so that there is always a single replica of the global page-rank table in DFS.

Algorithm 4 The Mapper for the Map-Based Parallel Merge-Join for Computing Page-Rank
 1: function INITIALIZE
 2:   P.OpenPageRankFile()
 3:   Dictionary D ← empty
 4:   (n, rank) ← P.Read()
 5: end function
 6: function MAP(Vertex from, Vertex to)
 7:   if from.id = n then
 8:     D[to] += rank / from.numOfOutLinks
 9:   end if
10:   if from.id ≠ n then
11:     repeat
12:       (n, rank) ← P.Read()
13:     until from.id ≤ n
14:   end if
15: end function
16: function CLOSE
17:   P.WritePageRankFile(D)
18: end function

VII. EXPERIMENTATION

To evaluate the different approaches to implementing the page-rank algorithm, we generated synthetic graphs. These graphs are generated by the R-MAT algorithm [18] using the parameters a = 0.57, b = 0.19, c = 0.19, and d = 0.05. We used the MRQL system [19] to generate large synthetic graphs of size 7 GB, 14 GB, and 28 GB. They contain 500 million, 1 billion, and 2 billion edges, respectively, and all of them have 15 million vertices. Each generated file is represented as a flat list of edges stored in text format.
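For readers unfamiliar with R-MAT, the recursive quadrant-splitting idea behind the generator [18] can be sketched as follows. This is a toy, single-machine illustration of the recursion only, not the MRQL-based generator used in the experiments, and the vertex count and random seed are arbitrary assumptions.

    import java.util.Random;

    // Toy R-MAT sketch [18]: each edge is placed by repeatedly choosing one quadrant
    // of the adjacency matrix with probabilities (a, b, c, d), halving the matrix at
    // every step, which yields the skewed degree distributions typical of web graphs.
    public class RmatSketch {
        public static void main(String[] args) {
            double a = 0.57, b = 0.19, c = 0.19, d = 0.05;    // R-MAT parameters (d = 1 - a - b - c)
            int scale = 20;                                    // 2^20 vertices (arbitrary)
            Random rng = new Random(42);
            for (int e = 0; e < 10; e++) {                     // emit a few sample edges
                long src = 0, dst = 0;
                for (int level = 0; level < scale; level++) {
                    double r = rng.nextDouble();
                    src = 2 * src + (r >= a + b ? 1 : 0);                               // bottom half?
                    dst = 2 * dst + ((r >= a && r < a + b) || r >= a + b + c ? 1 : 0);  // right half?
                }
                System.out.println(src + "\t" + dst);
            }
        }
    }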

Fig. 3. Evaluation of Various Design Patterns on a Synthetic Graph

The experiments were performed on a cluster consisting of 1 frontend server and 17 worker nodes, connected through a Gigabit Ethernet switch. The cluster is managed by Rocks Cluster 6.3 running CentOS-5 Linux. For our experiments, we used Hadoop 1.0.4, distributed by Apache. The cluster frontend was used exclusively as a NameNode/JobTracker, while the remaining 17 worker nodes were used as DataNodes/TaskTrackers. The frontend server and each of the worker nodes contain 3.2 GHz Intel Xeon quad-core CPUs and 4 GB of memory. As each worker machine has 4 cores, there is a total of 68 cores available for map/reduce tasks.

We evaluated the efficiency of the MapReduce optimizations by computing 5 iterations of the page-rank algorithm. We compared the time taken to perform the page-rank iterations using the basic, Schimmy, and our map-based implementations. We used the range-partitioning method to partition the graph for each of the methods. The computation time for 5 iterations of the page-rank algorithm on the different graphs is given in Figure 3. Before starting the page-rank iterations, the synthetic graph is pre-processed to compute the metadata of the vertices, such as the number of outgoing edges of each vertex, and to initialize the page-rank of each vertex. For the basic and Schimmy implementations, there is only one pre-processing step, but for our map-based implementation there are two pre-processing steps: one to compute the metadata and the other to initialize the first global table with initial page-rank values for every vertex of the graph.

Our map-based approach improves the performance of graph analysis over both the basic approach and the Schimmy approach. As our approach separates the immutable graph topology from the graph analysis results, there is no shuffling and sorting phase, and hence our approach improves the performance of the graph analysis. There is an improvement of around 10% over the Schimmy-based approach. It should be noted that our approach is applicable to a general class of graph algorithms, as discussed in Section IV. It should also be noted that the page-rank is computed for each vertex of the graph, and hence it is an exhaustive graph analysis. Graph analyses other than page-rank may be less exhaustive and hence can benefit more from our approach.

VIII. CONCLUSION

Graph analysis in a distributed framework, such as MapReduce, is a challenge. There have been several approaches to implementing graph algorithms, but most of them have a high communication cost because of the shuffling and sorting phases of MapReduce. Our approach detaches the immutable graph topology from the analysis, and as a result we get improved performance because there is less communication cost.

IX. ACKNOWLEDGEMENTS

This work is supported in part by the National Science Foundation under grant 1117369.

REFERENCES

[1] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, Jan. 2008. [Online]. Available: http://doi.acm.org/10.1145/1327452.1327492
[2] S. Brin and L. Page, "The anatomy of a large-scale hypertextual web search engine," in Proceedings of the Seventh International Conference on World Wide Web (WWW7). Amsterdam, The Netherlands: Elsevier Science Publishers B.V., 1998, pp. 107-117. [Online]. Available: http://dl.acm.org/citation.cfm?id=297805.297827
[3] J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S.-H. Bae, J. Qiu, and G. Fox, "Twister: A runtime for iterative MapReduce," in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC '10). New York, NY, USA: ACM, 2010, pp. 810-818. [Online]. Available: http://doi.acm.org/10.1145/1851476.1851593
[4] Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, "HaLoop: Efficient iterative data processing on large clusters," Proceedings of the VLDB Endowment, vol. 3, no. 1-2, pp. 285-296, Sep. 2010. [Online]. Available: http://dl.acm.org/citation.cfm?id=1920841.1920881
[5] L. G. Valiant, "A bridging model for parallel computation," Communications of the ACM, vol. 33, no. 8, pp. 103-111, Aug. 1990. [Online]. Available: http://doi.acm.org/10.1145/79173.79181
[6] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, "Pregel: A system for large-scale graph processing," in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10). New York, NY, USA: ACM, 2010, pp. 135-146. [Online]. Available: http://doi.acm.org/10.1145/1807167.1807184
[7] S. Seo, E. J. Yoon, J. Kim, S. Jin, J.-S. Kim, and S. Maeng, "HAMA: An efficient matrix computation with the MapReduce framework," in Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CLOUDCOM '10). Washington, DC, USA: IEEE Computer Society, 2010, pp. 721-726. [Online]. Available: http://dx.doi.org/10.1109/CloudCom.2010.17
[8] Apache Giraph, http://incubator.apache.org/giraph/.
[9] T. White, Hadoop: The Definitive Guide, 1st ed. O'Reilly Media, Inc., 2009.
[10] R. Pike, S. Dorward, R. Griesemer, and S. Quinlan, "Interpreting the data: Parallel analysis with Sawzall," Scientific Programming, vol. 13, no. 4, pp. 277-298, Oct. 2005. [Online]. Available: http://dl.acm.org/citation.cfm?id=1239655.1239658
[11] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, "Dryad: Distributed data-parallel programs from sequential building blocks," in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems (EuroSys '07). New York, NY, USA: ACM, 2007, pp. 59-72. [Online]. Available: http://doi.acm.org/10.1145/1272996.1273005
[12] U. Kang, C. Tsourakakis, A. P. Appel, C. Faloutsos, and J. Leskovec, "HADI: Fast diameter estimation and mining in massive graphs with Hadoop," 2008.
[13] U. Kang, C. E. Tsourakakis, and C. Faloutsos, "PEGASUS: A peta-scale graph mining system implementation and observations," in Proceedings of the 2009 Ninth IEEE International Conference on Data Mining (ICDM '09). Washington, DC, USA: IEEE Computer Society, 2009, pp. 229-238. [Online]. Available: http://dx.doi.org/10.1109/ICDM.2009.14
[14] J. Lin and M. Schatz, "Design patterns for efficient graph algorithms in MapReduce," in Proceedings of the Eighth Workshop on Mining and Learning with Graphs (MLG '10). New York, NY, USA: ACM, 2010, pp. 78-85. [Online]. Available: http://doi.acm.org/10.1145/1830252.1830263
[15] S. Lattanzi, B. Moseley, S. Suri, and S. Vassilvitskii, "Filtering: A method for solving graph problems in MapReduce," in SPAA, 2011, pp. 85-94.
[16] P. Bhatotia, A. Wieder, R. Rodrigues, U. A. Acar, and R. Pasquin, "Incoop: MapReduce for incremental computations," in SoCC, 2011, p. 7.
[17] D. Peng and F. Dabek, "Large-scale incremental processing using distributed transactions and notifications," in OSDI, 2010, pp. 251-264.
[18] D. Chakrabarti, Y. Zhan, and C. Faloutsos, "R-MAT: A recursive model for graph mining," in SDM, 2004.
[19] MRQL, http://lambda.uta.edu/mrql/.