MapReduce Support in Slurm: Releasing the Elephant



Similar documents
MR+ A Technical Overview. Ralph H. Castain Wangda Tan. Greenplum/EMC. Copyright 2012 EMC Corporation. All rights reserved.

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Enabling High performance Big Data platform with RDMA

Pilot-Streaming: Design Considerations for a Stream Processing Framework for High- Performance Computing

Apache HBase. Crazy dances on the elephant back

Big Data Technology Core Hadoop: HDFS-YARN Internals

MPJ Express Meets YARN: Towards Java HPC on Hadoop Systems

Extending Hadoop beyond MapReduce

Take An Internal Look at Hadoop. Hairong Kuang Grid Team, Yahoo! Inc

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

A Brief Introduction to Apache Tez

Scaling Out With Apache Spark. DTL Meeting Slides based on

Hadoop in the Enterprise

6. How MapReduce Works. Jari-Pekka Voutilainen

YARN Apache Hadoop Next Generation Compute Platform

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Big Data With Hadoop

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC Denver

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

MapReduce and Lustre * : Running Hadoop * in a High Performance Computing Environment

Big Data in Test and Evaluation by Udaya Ranawake (HPCMP PETTT/Engility Corporation)

DeIC Watson Agreement - hvad betyder den for DeIC medlemmerne

Cloud Computing using MapReduce, Hadoop, Spark

The Hadoop Distributed File System

I/O Considerations in Big Data Analytics

Sector vs. Hadoop. A Brief Comparison Between the Two Systems

MASSIVE DATA PROCESSING (THE GOOGLE WAY ) 27/04/2015. Fundamentals of Distributed Systems. Inside Google circa 2015

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems

CSE-E5430 Scalable Cloud Computing Lecture 2

Hadoop Ecosystem B Y R A H I M A.

HADOOP MOCK TEST HADOOP MOCK TEST I

Hadoop Distributed File System. Jordan Prosch, Matt Kipps

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

Architecting Low Latency Cloud Networks

Gluster Filesystem 3.3 Beta 2 Hadoop Compatible Storage

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads

How To Scale Out Of A Nosql Database

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

Petascale Software Challenges. Piyush Chaudhary High Performance Computing

Cray XC30 Hadoop Platform Jonathan (Bill) Sparks Howard Pritchard Martha Dumler

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. Big Data Management and Analytics

GPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " 4 April 2013"

GraySort and MinuteSort at Yahoo on Hadoop 0.23

Enabling the Use of Data

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

A Performance Analysis of Distributed Indexing using Terrier

How To Use Big Data For Telco (For A Telco)

The Inside Scoop on Hadoop

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW

IBM Platform Computing : infrastructure management for HPC solutions on OpenPOWER Jing Li, Software Development Manager IBM

Performance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

HPC ABDS: The Case for an Integrating Apache Big Data Stack

Simple Introduction to Clusters

Big Data Testbed for Research and Education Networks Analysis. SomkiatDontongdang, PanjaiTantatsanawong, andajchariyasaeung

Hadoop. Sunday, November 25, 12

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems

Clusters: Mainstream Technology for CAE

Design and Evolution of the Apache Hadoop File System(HDFS)

Big Data Analytics - Accelerated. stream-horizon.com

Apache Hadoop YARN: The Nextgeneration Distributed Operating. System. Zhijie Shen & Jian Hortonworks

Architectures for Big Data Analytics A database perspective

Performance Comparison of Intel Enterprise Edition for Lustre* software and HDFS for MapReduce Applications

Introduction to Cloud Computing

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Platfora Big Data Analytics

HPCHadoop: MapReduce on Cray X-series

Jeffrey D. Ullman slides. MapReduce for data intensive computing

Distributed File Systems An Overview. Nürnberg, Dr. Christian Boehme, GWDG

Hadoop. Apache Hadoop is an open-source software framework for storage and large scale processing of data-sets on clusters of commodity hardware.

Big Data and Apache Hadoop s MapReduce

Large scale processing using Hadoop. Ján Vaňo

Unified Batch & Stream Processing Platform

A Brief Outline on Bigdata Hadoop

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Hadoop. History and Introduction. Explained By Vaibhav Agarwal

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam

Apache Hadoop. Alexandru Costan

Constructing a Data Lake: Hadoop and Oracle Database United!

Big + Fast + Safe + Simple = Lowest Technical Risk

Apache Hadoop: Past, Present, and Future

Fair Scheduler. Table of contents

Mr. Apichon Witayangkurn Department of Civil Engineering The University of Tokyo

Cloudera Backup and Disaster Recovery

Big Data for Big Science. Bernard Doering Business Development, EMEA Big Data Software

Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Transcription:

MapReduce Support in Slurm: Releasing the Elephant Ralph H. Castain, Wangda Tan, Jimmy Cao, and Michael Lv Greenplum/EMC 1

What is MR+? Port of Hadoop s MR classes to the general computing environment Allow execution of MapReduce programs on any cluster, under any resource manager, without modification Utilize common HPC capabilities MPI-based libraries Fault recovery, messaging Co-exist with other uses No dedicated Hadoop cluster required 2

Why would someone use it? Flexibility Let customer select their preferred environment (e.g., SLURM) Share resources Scalability Launch scaling: Hadoop (~N), MR+ (~logn) Wireup: Hadoop (~N2), MR+ (~logn) Performance Launches ~1000x faster, runs faster Enables interactive use-case MPI library access ScaLAPACK, CompLearn, PetSc, 3

How does it work? Overlay JobClient class JNI-based integration to Open MPI s run-time (ORTE) ORTE provides virtualized shim on top of native resource manager Launch, monitoring, and wireup at logn scaling Inherent MPI support, but can run non-mpi apps Staged execution to replicate MR behavior Preposition files using logn-scaled system Extend FileSystem class Remote access to intermediate files Open, close, read, write access Pre-wired TCP-based interconnect, other interconnects (e.g., Infiniband, UDP) automatically utilized to maximize performance 4

What about faults? Processes automatically restarted Time from failure to relocation and restart Hadoop: ~5-10 seconds MR+: ~5 milliseconds Tunable number of local restarts, relocations Sensors provided to monitor resource usage, progress Future state recovery based on HPC methods Process periodically saves bookmark Restart provided with bookmark so it knows where to start processing Prior intermediate results are preserved, appended to new results during communication 5

Resource management Currently need to get allocation prior to execution Mismatch with typical MapReduce procedure MR procedure Program identifies files needing to be accessed and queries file system for locations Files may be shared across multiple locations File system returns list of nodes housing files and/or shards Execution optimized if it can occur on those nodes, but can proceed if placed anywhere Preference: nodes on same switch (typically same rack) 6

SLURM extension (via socket) Resource queries Number of nodes and slots in system Number currently available (non-guaranteed) Resource allocations Provided a list of desired nodes, allocate a specified number to this job Flag indicates whether nodes are mandatory, or optional Mandatory: allocate all specified nodes Optional: allocate all immediately available desired nodes, fill rest with anything Async callback Rolling allocation (as nodes available), or block allocation 7

Benefits Allows SLURM to manage MapReduce jobs Avoids cross-licensing issues via use of defined socket-based message exchange Retains transparency to RM in MapReduce Allows MapReduce to operate on SLURM clusters without modifying code or user behavior Execution transparently ports to SLURM environment Avoids requiring SLURM integration to file system File location queries maintained at the MapReduce level 8

Status Proof of concept Running MR on example problems Benchmark results to be announced in December Initial SLURM integration code prototype complete Near-term plans Demonstrate on-the-fly fault recovery Performance comparisons at large scale Contribute SLURM-MR+ integration to SLURM Release MR+ (open source, probably Apache) 9