Running a typical ROOT HEP analysis on Hadoop/MapReduce. Stefano Alberto Russo Michele Pinamonti Marina Cobal



Running a typical ROOT HEP analysis on Hadoop/MapReduce Stefano Alberto Russo Michele Pinamonti Marina Cobal CHEP 2013 Amsterdam 14-18/10/2013

Topics: The Hadoop/MapReduce model; Hadoop and High Energy Physics; How to run ROOT on Hadoop; A real case: a top quark analysis; Results and conclusions. DISCLAIMER: this talk is about computing architecture, it is not a performance study.

Background. The standard distributed computing model treats the storage and computational resources of a cluster as two independent, logically well-separated components. This is a bottleneck for data-intensive applications!

The Hadoop/MapReduce model. New idea: overlap the storage elements with the computing ones, so that the computation can be scheduled on the cluster elements holding a copy of the data to analyze: data locality. Two components: 1. The Hadoop Distributed File System (HDFS); 2. The MapReduce computational model and framework.

The Hadoop Distributed File System (HDFS). On HDFS, files are stored by slicing them into chunks (e.g. 64 MB, 1 GB), which are replicated across the cluster for redundancy and workload distribution. No RAID; commodity hardware: a disk can (and will) fail, sooner or later.
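The slicing-and-replication idea can be sketched in a few lines (an illustrative simulation, not Hadoop code; the round-robin placement below is a simplification of HDFS's actual rack-aware policy):

```python
# Illustrative sketch: slice a file into fixed-size chunks and assign
# each chunk to several distinct nodes, as HDFS does conceptually.
CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, the classic HDFS default
REPLICAS = 3                    # HDFS's default replication factor

def slice_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs covering the whole file."""
    return [(off, min(chunk_size, file_size - off))
            for off in range(0, file_size, chunk_size)]

def place_replicas(chunks, nodes, replicas=REPLICAS):
    """Simplified round-robin placement on `replicas` distinct nodes."""
    return {chunk: [nodes[(i + r) % len(nodes)] for r in range(replicas)]
            for i, chunk in enumerate(chunks)}

chunks = slice_into_chunks(200 * 1024 * 1024)   # a 200 MB file
print(len(chunks))                              # 4 (3 full chunks + 1 partial)
placement = place_replicas(chunks, ["node1", "node2", "node3", "node4"])
```

Note how the last chunk is smaller than the others: chunking is purely byte-oriented, which is exactly what makes it unsafe for binary formats, as discussed later in the talk.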

The MapReduce model and framework. The Map() functions are executed in-place on the chunks, on the nodes where the data is stored. [Diagram: a file's chunks spread over Node 1 (chunks 1, 3, 7), Node 2 (chunks 2, 5) and Node 3 (chunks 4, 6), each with its own Map(); a single Reduce(All) collects the results.] Example: word count! You do not ask Hadoop for CPU slots, you ask it to analyze a dataset.
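The word count example mentioned on the slide is the canonical illustration of the model. Here is a tiny, framework-free simulation: each Map processes one chunk where it is stored, and the Reduce merges the partial results (the real framework adds the shuffle, scheduling and data locality):

```python
# Word count as a minimal MapReduce simulation.
from collections import Counter
from functools import reduce

def map_fn(chunk):                       # runs where the chunk is stored
    return Counter(chunk.split())

def reduce_fn(partials):                 # trivial, almost I/O-free merge
    return reduce(lambda a, b: a + b, partials, Counter())

chunks = ["the quick brown fox", "the lazy dog", "the fox"]
partials = [map_fn(c) for c in chunks]   # one Map per chunk, no communication
totals = reduce_fn(partials)
print(totals["the"])                     # 3
```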

The MapReduce model and framework. MapReduce requires an embarrassingly parallel problem: no communication between Maps. Another basic assumption is a trivial Reduce phase, easy to compute and almost I/O-free: the framework is NOT I/O-optimized for anything heavier.

Hadoop and HEP (1). In High Energy Physics (HEP): particle collision events are independent: embarrassingly parallel problems. Simple merging operations: sum numbers, sum histograms... Usually, the data to analyse is accessed over and over again to finalize physics results: a potential advantage from data locality (store once, read many).
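The "sum histograms" merge the slide mentions is exactly the kind of trivial Reduce HEP needs. A minimal sketch, representing histograms as plain lists of bin counts (conceptually what ROOT's TH1::Add and the hadd tool do):

```python
# Merging partial histograms bin by bin: the typical HEP Reduce step.
def merge_histograms(partials):
    """Sum same-binning histograms, each given as a list of bin counts."""
    n_bins = len(partials[0])
    assert all(len(h) == n_bins for h in partials), "binning must match"
    return [sum(h[i] for h in partials) for i in range(n_bins)]

h1 = [0, 4, 7, 2]   # partial histogram from Map task 1
h2 = [1, 3, 5, 0]   # partial histogram from Map task 2
print(merge_histograms([h1, h2]))  # [1, 7, 12, 2]
```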

Hadoop and HEP (2). Natural approach: the Map processes a chunk of the data set, analysing it event by event; the Reduce collects the Maps' partial results, merging them.

Drawbacks: 1) Events in plain text, CSV style: lots of unnecessary I/O reads, since this is not column-based storage (a typical HEP analysis requires only a few out of the many available variables). Ref: Maaike Limper, An SQL-based approach to Physics Analysis, CHEP 2013. 2) HEP frameworks (ROOT) have been developed, maintained and used by large communities over several years: porting code could be very challenging and time consuming, and non-optimised MapReduce code can easily lead to wasted CPU. Ref: Zbigniew Baranowski et al., Sequential Data Access with Oracle and Hadoop: a Performance Comparison, CHEP 2013.

Hadoop and HEP (3). IDEA: run ROOT on Hadoop, and use its original data format, which provides column-based storage. GOALS: 1) Transparency for the data: let binary datasets be uploaded on HDFS without changing format; 2) Transparency for the code: let the original code run without having to modify a single line; 3) Transparency for the user: spare users from having to learn Hadoop/MapReduce, and let them interact with Hadoop in a classic, batch-like fashion.

Hadoop and HEP (4). PROBLEMS: the Hadoop/MapReduce framework and its native API are written in the Java programming language. Support for other programming languages is provided, but with serious limitations on the input/output side when working with binary data sets (Hadoop was developed with textual analyses in mind). ROOT data is binary, and chunking binary files without corrupting the data is NOT possible!

ROOT on Hadoop/MapReduce (1). SOLUTIONS: transparency for the (binary) data. NO chunking: one Map = one file = one HDFS block (chunk), obtained by setting the chunk size >= the file size, per file. Map tasks are in charge of analyzing one file, in its entirety. Corruptions due to chunking binary data are avoided, and data can be stored on the Hadoop cluster without conversions, in its original format. Other approaches are possible, but with much more effort required.
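The "chunk size >= file size" trick amounts to choosing a per-file block size at upload time (in Hadoop this would be passed as a per-file block size setting when copying onto HDFS; the helper below is an illustrative sketch, not part of the talk's code):

```python
# Pick a per-file HDFS block size at least as large as the file, so each
# file occupies exactly one block and one Map task analyses one whole file.
def block_size_for(file_size, granularity=64 * 1024 * 1024):
    """Smallest multiple of `granularity` that holds the whole file."""
    n = (file_size + granularity - 1) // granularity  # ceil division
    return max(1, n) * granularity

MB = 1024 * 1024
print(block_size_for(48 * MB) // MB)    # 64  (fits in one default chunk)
print(block_size_for(100 * MB) // MB)   # 128 (needs a larger block size)
```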

ROOT on Hadoop/MapReduce (1.1). SOLUTIONS: ...and what about parallelism? Working conditions imposed: one Map task = one chunk = one file to analyze.

Natural approach: input = file(s); parallelizable unit = the chunk.
Proposed approach: input = a set of files; parallelizable unit = the file.

Now the parallelization degree goes with the number of files!

ROOT on Hadoop/MapReduce (1.2). SOLUTIONS: ...and what about parallelism? HEP datasets are usually composed of several files, e.g. the ATLAS D3PDs' storage schema:

Object           | Order of magnitude | Type               | On Hadoop/MapReduce
Event            | 1                  | ROOT data          | Unknown (binary)
File             | 10^2-10^4          | ROOT file          | One chunk
Luminosity block | 10^4               | Set of files       | Directory
LHC Run          | 10^5-10^6          | Set of lum. blocks | Directory
Data set         | 10^5-10^9          | Set of LHC Runs    | Directory (input dataset)

A dataset: ~10^3-10^5 files.

ROOT on Hadoop/MapReduce (2). SOLUTIONS: transparency for the code. 1. Java Map and Reduce tasks act as wrappers for ROOT; 2. ROOT accesses the data from a standard file system. For every Map task: if a local replica is available, the HDFS file (block) to analyze can be found, and therefore accessed, on the local, standard file system (e.g. ext3): transparency for ROOT. If a local replica is not available, the file to analyze is accessed via the network, using Hadoop's file system tools, or via FUSE. Bottom line: bypass Hadoop's HDFS layer for the data access.
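The local-vs-network decision the wrapper makes could be sketched as follows (a hypothetical helper for illustration, not the authors' actual Java code; the replica list and local path are assumptions standing in for what the wrapper would query from Hadoop):

```python
# Illustrative sketch of the Map wrapper's access-method decision.
import os
import socket

def pick_access_method(block_replicas, local_block_path):
    """block_replicas: hostnames holding a replica of this file's block.
    Returns ('local', path) if this node holds a replica and the block
    file is visible on the local file system (e.g. ext3 under the
    DataNode's data directory); otherwise ('network', None), meaning:
    fall back to Hadoop's file system tools or a FUSE mount."""
    if socket.gethostname() in block_replicas and os.path.exists(local_block_path):
        return ("local", local_block_path)
    return ("network", None)
```

With 100% data locality (achieved later in the talk via the scheduler), the local branch is taken for every Map, and ROOT reads a plain file at native speed.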

ROOT on Hadoop/MapReduce (3). SOLUTIONS: transparency for the user. It is easy to write a Java MapReduce job acting as a wrapper for the user's code, e.g. RootOnHadoop.java:

# hadoop run RootOnHadoop <user Map code> <user Reduce code> <HDFS input dataset> <HDFS output location>

Just a few guidelines for the user code to make it work.

Under the hood... # hadoop run RootOnHadoop <user Map code> <user Reduce code> <HDFS input dataset> <HDFS output location>

[Diagram sequence, inside the Hadoop/MapReduce framework:] 1) the Java Map task (wrapper) obtains the file location and sets the access method (HDFS over the network, or the local ext3 file system); 2) the user Map code reads its file from the binary input data set and runs on it; 3) the Map's output is stored as binary output at an HDFS location; 4) the Java Reduce task (wrapper) collects all the Maps' outputs and feeds them to the user Reduce code; 5) the user Reduce code merges them into the final output, again at a binary output HDFS location.
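The wrapper flow sketched in the diagrams can be condensed into a minimal, framework-free simulation (file names and the toy "user code" below are invented for illustration; in the real system the wrappers are Java tasks and the user code is a ROOT executable):

```python
# A driver that mimics the RootOnHadoop wrappers: one Map per input
# file, outputs staged, then a single Reduce over all Map outputs.
def run_root_on_hadoop(user_map, user_reduce, input_files):
    map_outputs = []
    for path in input_files:               # in Hadoop: one Map task per file
        map_outputs.append(user_map(path)) # user code sees a plain file path
    return user_reduce(map_outputs)        # one Reduce task merges everything

# Toy "user code": count events per file, then sum the counts.
events = {"run1.root": 120, "run2.root": 80}   # stand-in for real ROOT files
user_map = lambda path: events[path]
user_reduce = lambda outs: sum(outs)
print(run_root_on_hadoop(user_map, user_reduce, list(events)))  # 200
```

The point of the design is visible even in this toy: the user supplies only the two callables, and never touches the framework plumbing.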

A real case: a top quark analysis (1). ROOT on Hadoop has been tested on a real case: the top quark pair production search and cross section measurement analysis performed by the ATLAS collaboration. Basics of the analysis: it is based on a cut-and-count code: every event undergoes a series of selection criteria, and at the end is accepted or not (the Map step). The cross section is obtained by comparing numbers: the number of selected events with the luminosity, the efficiency in the selection of signal events, and the expected background events (the Reduce step).
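Cut-and-count maps very naturally onto the model: the Map applies the cuts event by event, and the Reduce turns the merged count into a cross section via the standard estimate sigma = (N_sel - N_bkg) / (efficiency x luminosity). A sketch (the cut values and event fields below are invented for illustration, not the analysis's actual selection):

```python
# Cut-and-count as Map and Reduce.
def map_cut_and_count(events):
    """Accept events passing all cuts; return the per-chunk count."""
    passes = lambda e: e["n_jets"] >= 4 and e["lepton_pt"] > 25.0
    return sum(1 for e in events if passes(e))

def reduce_cross_section(counts, n_bkg, efficiency, luminosity):
    n_sel = sum(counts)                     # merge the Maps' partial counts
    return (n_sel - n_bkg) / (efficiency * luminosity)

chunk1 = [{"n_jets": 5, "lepton_pt": 40.0},   # passes
          {"n_jets": 2, "lepton_pt": 50.0}]   # fails the jet cut
chunk2 = [{"n_jets": 4, "lepton_pt": 30.0}]   # passes
counts = [map_cut_and_count(c) for c in (chunk1, chunk2)]
sigma = reduce_cross_section(counts, n_bkg=0.5, efficiency=0.05, luminosity=30.0)
print(sigma)  # 1.0
```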

A real case: a top quark analysis (2). The dataset, data taking conditions: data was taken with all the subsystems of the ATLAS detector fully operational, with the LHC producing proton-proton collisions at a centre of mass energy of 7 TeV in stable beams condition, during the 2011 run up to August. The dataset, in numbers: 338.6 GB (only the electron channel D3PDs), 8830 files, average size ~38 MB, maximum file size ~48 MB. Every file fits in a default HDFS chunk size of 64 MB! The data was copied straight from the CERN Tier-0 to the Hadoop cluster.

A real case: a top quark analysis (3). The test cluster: provided by the CERN IT-DSS group; 10 nodes, 8 CPUs per node; max 10 Map tasks per node; 2 replicas per file. The top quark analysis code: ROOT-based, treated as a black magic box; compiled without any modification! It has been stored on the Hadoop file system as well.

Results (1). Worked as expected: data locality ratio of 100% (every file is read locally), using the Delayed Fair Scheduler by Facebook, designed for (and tested to) give data locality ratios close to 100% in the majority of use cases.

Results (2). With 100% data locality, no data transfers at runtime. Performance in terms of time is still to be evaluated: comparison is hard (an apples vs. bananas issue).

Conclusions, pros: Typical HEP analyses can be easily ported to a MapReduce model. In Hadoop, network usage for accessing the data is reduced by several orders of magnitude thanks to the data locality feature. Transparency can be achieved quite easily. Bypassing some Hadoop components permits running standard code on standard, local file systems at maximum speed, with room for fine tuning (SSD caching, BLAS/LAPACK...), while still exploiting the innovative features of Hadoop/MapReduce and HDFS: an easy to manage, fault-tolerant and scalable infrastructure (plug/unplug), open source, widely used and well maintained. And the method actually works: positive feedback has been received, e.g. from the LMU ATLAS group, with a poster here at CHEP 2013: Evaluation of Apache Hadoop for parallel data analysis with ROOT.

Conclusions, cons: Java and ROOT overhead to start many jobs; performance still to be evaluated (JVM reuse, Map startup improvement); tuning: latency (heartbeat) optimization... Bottom line: Hadoop is forced to work unnaturally, and bugs when working with block sizes > 2 GB remain to be fixed (already investigated by the community). Is it worth investigating further, spending time on tuning, and finding a metric to measure performance?

Thanks for your attention! Demo code: stefano.alberto.russo@cern.ch ...questions?