Hadoop Optimizations for BigData Analytics



Similar documents
Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012

TCP/IP Implementation of Hadoop Acceleration. Cong Xu

Can High-Performance Interconnects Benefit Memcached and Hadoop?

DESIGN AND ESTIMATION OF BIG DATA ANALYSIS USING MAPREDUCE AND HADOOP-A

Different Technologies for Improving the Performance of Hadoop

Big Fast Data Hadoop acceleration with Flash. June 2013

Enabling High performance Big Data platform with RDMA

Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture

Lifetime Management of Cache Memory using Hadoop Snehal Deshmukh 1 Computer, PGMCOE, Wagholi, Pune, India

Accelerating Spark with RDMA for Big Data Processing: Early Experiences

Extending Hadoop beyond MapReduce

Hadoop Architecture. Part 1

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Accelerating and Simplifying Apache

Performance and Energy Efficiency of. Hadoop deployment models

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

White Paper. Big Data and Hadoop. Abhishek S, Java COE. Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP

Hadoop/MapReduce. Object-oriented framework presentation CSCI 5448 Casey McTaggart

Chapter 7. Using Hadoop Cluster and MapReduce

MapReduce Evaluator: User Guide

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

CSE-E5430 Scalable Cloud Computing Lecture 2

Performance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems

Survey on Scheduling Algorithm in MapReduce Framework

Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. ddn.com

Hadoop MapReduce over Lustre* High Performance Data Division Omkar Kulkarni April 16, 2013

Hadoop on the Gordon Data Intensive Cluster

DyScale: a MapReduce Job Scheduler for Heterogeneous Multicore Processors

Hadoop Distributed File System. Jordan Prosch, Matt Kipps

Application Development. A Paradigm Shift

Weekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay

Hadoop IST 734 SS CHUNG

A Framework for Performance Analysis and Tuning in Hadoop Based Clusters

Hadoop Scheduler w i t h Deadline Constraint

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

Hadoop implementation of MapReduce computational model. Ján Vaňo

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Record Setting Hadoop in the Cloud By M.C. Srivas, CTO, MapR Technologies

Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics

Hadoop Cluster Applications

marlabs driving digital agility WHITEPAPER Big Data and Hadoop

Performance Comparison of Intel Enterprise Edition for Lustre* software and HDFS for MapReduce Applications

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010

MapReduce with Apache Hadoop Analysing Big Data

Big Data With Hadoop

MAPREDUCE [1] is proposed by Google in 2004 and

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Large scale processing using Hadoop. Ján Vaňo

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging

Task Scheduling in Hadoop

How MapReduce Works 資碩一 戴睿宸

Energy Efficient MapReduce

Hadoop Parallel Data Processing

Big Data Analysis and HADOOP

Mesos: A Platform for Fine- Grained Resource Sharing in Data Centers (II)

MapReduce and Hadoop Distributed File System V I J A Y R A O

Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray VMware

Hadoop. Apache Hadoop is an open-source software framework for storage and large scale processing of data-sets on clusters of commodity hardware.

6. How MapReduce Works. Jari-Pekka Voutilainen

Hadoop Distributed File System. T Seminar On Multimedia Eero Kurkela

L1: Introduction to Hadoop

Map Reduce & Hadoop Recommended Text:

A Cost-Evaluation of MapReduce Applications in the Cloud

GraySort on Apache Spark by Databricks

GraySort and MinuteSort at Yahoo on Hadoop 0.23

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

An In-Memory RDMA-Based Architecture for the Hadoop Distributed Filesystem

Hadoop Deployment and Performance on Gordon Data Intensive Supercomputer!

Big Data in the Enterprise: Network Design Considerations

Data-intensive computing systems

MapReduce Online. Tyson Condie, Neil Conway, Peter Alvaro, Joseph Hellerstein, Khaled Elmeleegy, Russell Sears. Neeraj Ganapathy

Prepared By : Manoj Kumar Joshi & Vikas Sawhney

The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15

MapReduce, Hadoop and Amazon AWS

A Brief Outline on Bigdata Hadoop

Big Data Analytics - Accelerated. stream-horizon.com

Open source Google-style large scale data analysis with Hadoop

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

A very short Intro to Hadoop

Keywords: Big Data, HDFS, Map Reduce, Hadoop

High-Performance Networking for Optimized Hadoop Deployments

Big Data and Apache Hadoop s MapReduce

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing

Apache Hadoop. Alexandru Costan

Hadoop. History and Introduction. Explained By Vaibhav Agarwal

Towards MapReduce Performance Optimization: A Look into the Optimization Techniques in Apache Hadoop for BigData Analytics

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015

High Speed I/O Server Computing with InfiniBand

Transcription:

Hadoop Optimizations for BigData Analytics Weikuan Yu Auburn University

Outline WBDB, Oct 2012 S-2 Background Network Levitated Merge JVM-Bypass Shuffling Fast Completion Scheduler

WBDB, Oct 2012 S-3 Emerging Demand for BigData Analytics Big demand from many organizations in various domains Scalable computing power without worrying about system maintenance. Ubiquitously accessible computing and storage resources. Low cost, highly reliable, trusted computing infrastructure. Commercial companies are gearing up resources for BigData

MapReduce l l l l A simple data processing model to process big data Designed for commodity off-the-shelf hardware components. Strong merits for big data analytics l l Scalability: increase throughput by increasing # of nodes Fault-tolerance (quick and low cost recovery of the failures of tasks) Hadoop, An open-source implementation of MapReduce: l Widely deployed by many big data companies: AOL, Baidu, EBay, Facebook, IBM, NY Times, Yahoo!. WBDB, Oct 2012, S-4

WBDB, Oct 2012 S-5 High-Level Overview of Hadoop l l l HDFS and the MapReduce Framework Data processing with MapTasks and ReduceTasks Three main steps of data movement. l Intermediate data shuffling in the MapReduce is time-consuming Applications JobTracker Job Submission Task Tracker/Runner 1 HDFS 3 Task Tracker/Runner MapTasks 2 Shuffle Intermediate Data ReduceTasks

Outline WBDB, Oct 2012 S-6 Background Network Levitated Merge JVM-Bypass Shuffling Fast Completion Scheduler

WBDB, Oct 2012 S-7 Motivation for Network-Levitated Merge 1: Serialization between shuffle/merge and reduce phases shuffle map merge reduce Start First MOF Serialization Time

WBDB, Oct 2012 S-8 Repetitive Merges and Disk Access Hadoop data spilling controlled through parameters To limit the number of outstanding files An example with io.sort.factor=3 1: merge more 2: insert 3: merge 4: to merge soon

WBDB, Oct 2012 S-9 Hadoop Acceleration (Hadoop-A) Pipelined shuffle, merge and reduce Network-levitated data merge Hadoop JobTracker TaskTracker TaskTracker Java MapTask ReduceTask C++ NetMerger Data Engine MOFSupplier RDMA Server RDMA Client Fetch Manager Merged Data Merging Thread Merge Manager RDMA Interconnects Acceleration

WBDB, Oct 2012 S-10 Pipelined Data Shuffle, Merge and Reduce shuffle map header merge reduce PQ setup start First MOF Last MOF Time

WBDB, Oct 2012 S-11 Network-Levitated Merge Algorithm S1 S1 Merge Point <k1,v1>, S2 S3 S2 S3 <k2,v2>, <k3,v3>, (a) Fetching Header (b) Priority Queue Setup S2 <k2,v2>, <k2,v2 >, S1 <k1,v1>, <k1,v1 >, S1 <k1,v1>, <k1,v1 > S2 <k2,v2>, <k2,v2 >, S3 <k3,v3>, Merged Data: <k1,v1><k2,v2><k3,v3>, <k3,v3 >, S3 <k3,v3>, <k3,v3 >, Merged Data: <k1,v1><k2,v2><k3,v3>,,<k2,v2 ><k1,v1 ><k3,v3 >, (c) Concurrent Fetching & Merging (d) Towards Completion

WBDB, Oct 2012 S-12 Job Progression with Network-levitated Merge a) Hadoop-A speeds up the execution time by more than 47% b) Both MapTasks and ReduceTasks are improved 140 120 Hadoop-A (Map) Hadoop on IPoIB (Map) Hadoop on GigE (Map) 140 120 Hadoop-A (Reduce) Hadoop on IPoIB (Reduce) Hadoop on GigE (Reduce) Progress (%) 100 80 60 Progress (%) 100 80 60 40 40 20 20 0 0 500 1000 1500 2000 Time (sec) 0 0 500 1000 1500 2000 Time (sec) a) Map Progress of TeraSort b) Reduce Progress of TeraSort

WBDB, Oct 2012 S-13 Breakdown of ReduceTask Execution Time (sec) Significantly reduced the execution time of ReduceTasks Most came from reduced shuffle/merge time An improvement of 2.5 times Also improved the time to reduce data An improvement of 15% Category PQ-Setup Shuffle/Merge Reduce or Merge/Reduce Hadoop-GigE 1150.01 (65.0%) 599.97 (35.0%) Hadoop-IPoIB 1148.26 (65.9%) 597.09 (34.1%) Hadoop-A 460.01 (47.4%) 509.81 (52.6%)

Outline WBDB, Oct 2012 S-14 Background Network Levitated Merge JVM-Bypass Shuffling Fast Completion Scheduler

JVM-Dependent Intermediate Data Shuffling MapTask Map TaskTracker HttpServlet JobTracker TaskTracker ReduceTask MOF1 MOF2 MOF1 MOF2 MOF1 MOF2 Staging HttpServlet Staging HttpServlet Staging TCP/IP-Only MOFCopiers Sort/Merge Reduce HDFS Heavily relies on Java WBDB, Oct 2012, S-15

JVM-Bypass Shuffling (JBS) JBS removes JVM from the critical path of intermediate data shuffling JBS is a portable library supporting both TCP/IP and RDMA protocols Data Analytics Applications MapTask TaskTracker ReduceTask HTTP Servlet Java C Sockets TCP/IP HTTP GET JVM-Bypass C JVM-Bypass Shuffling (JBS) MOFSupplier NetMerger RDMA Verbs, TCP/IP Ethernet InfiniBand/Ethernet WBDB, Oct 2012, S-16

Benefits of JBS: 1/10 Gigabit Ethernets JBS is effective for intermediate data of different sizes Ø Using Terasort benchmark, size of intermediate data = size of input data JBS reduces the execution time by 20.9% on average in 1GigE, 19.3% on average in 10GigE Terasort Job Execution Time (sec) 2000 1500 1000 500 Hadoop on 1GigE JBS on 1GigE Terasort Job Execution Time (sec) 2000 1500 1000 500 Hadoop on 10GigE JBS on 10GigE 0 0 16 32 64 128 256 16 32 64 128 256 Input Data Size (GB) Input Data Size (GB) (a): 1 Gigabit Ethernet (b): 10 Gigabit Ethernet WBDB, Oct 2012, S-17

Benefits of JBS: InfiniBand Cluster JBS on IPoIB outperforms Hadoop on IPoIB and SDP by 14.1%, 14.8%, respectively. Hadoop performs similarly when using IPoIB or SDP. Terasort Job Execution Time (sec) 2000 1500 1000 500 Hadoop on IPoIB Hadoop on SDP JBS on IPoIB 0 16 32 64 128 256 Input Data Size (GB) WBDB, Oct 2012, S-18

Outline WBDB, Oct 2012 S-19 Background Network Levitated Merge JVM-Bypass Shuffling Fast Completion Scheduler

Hadoop Fair Scheduler WBDB, Oct 2012 S-20 Scheduler assigns tasks to the TaskTrackers Tasks occupy slots until completion or failure Slot-M5 J-2 J-3 J-3 Slot-M4 J-2 J-3 J-3 Slot-M3 Slot-M2 Slot-M1 J-1 J-1 J-1 J-2 J-3 J-2 J-2 J-2 J-2 shuffle reduce Slot-R3 Slot-R2 Slot-R1 1 2 3 Job Arrival Time

WBDB, Oct 2012 S-21 Fair Completion Scheduler Prioritize ReduceTasks based on the shortest remaining map phase When remaining map phases are equal, prioritize ReduceTasks of jobs with least remaining reduce data Track the slowdown of preempted ReduceTasks Prevent large jobs from being preempted for too long

Average ReduceTask Waiting Time WBDB, Oct 2012 S-22 ReduceTasks in small jobs are significantly speedup Average ReduceTask Waiting Time (sec) 10000 1000 100 10 1 19,5 FCS 12,4 22 HFS 21 32.2 27.3 1 2 3 4 5 6 7 8 9 10 Groups 1.22 4.51 0.52 0.8

WBDB, Oct 2012 S-23 Conclusions Examined the design and architecture of Hadoop MapReduce framework and reveal critical issues faced by the existing implementation Designed and implemented Hadoop-A as an extensible acceleration framework which addresses all these issues Provided JVM-Bypass Shuffling to avoid JVM overhead, meanwhile we enable it to be a portable library that can run on both TCP/IP and RDMA protocols. Designed and Implemented Fast Completion Scheduler for fast job completion and job fairness.

Sponsors of our research WBDB, Oct 2012, S-24