Automating Big Data Benchmarking for Different Architectures with ALOJA


Automating Big Data Benchmarking for Different Architectures with ALOJA
Nicolas Poggi, Postdoc Researcher
Jan 2016, www.bsc.es

Agenda
1. Intro on Hadoop performance
   1. Current scenario and problems
2. ALOJA project
   1. Background
   2. Open source tools
3. Benchmarking
   1. Benchmarking workflow
   2. DEMO
4. Results
   1. HW and SW speedups
   2. Cost/Performance
   3. Scalability
5. Predictive Analytics and conclusions

Hadoop design
- Hadoop was designed to solve complex data problems
  - Structured and unstructured data
  - With [close to] linear scalability and application reliability
- Simplifies the programming model relative to MPI, OpenMP, CUDA, ...
- Operates as a black box for data analysts, but is a complex runtime for admins
- YARN abstracts even more
Image source: Hadoop: The Definitive Guide

Hadoop: highly scalable, but not a high-performance solution!
- Requires
  - Design: clusters, topology
  - Setup: OS, Hadoop config
  - Fine tuning: an iterative, time-consuming approach
  - Extensive benchmarking

Setting up your Big Data system
- Hadoop: 100+ tunable parameters, obscure and interrelated
  - mapred.map/reduce.tasks.speculative.execution
  - io.sort.mb 100 (300)
  - io.sort.record.percent 5% (15%)
  - io.sort.spill.percent 80% (95-100%)
- Similar for Hive, Spark, HBase
- Dominated by rules of thumb
  - Number of containers in parallel: 0.5-2 per CPU core
- Large stack for tuning
Image source: Intel Distribution for Apache Hadoop
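The containers rule of thumb above can be sketched as a small calculation. This is only an illustration: the function name and the rounding choices are assumptions, not part of ALOJA or Hadoop.

```python
import math


def container_range(cpu_cores, low=0.5, high=2.0):
    """Return (min, max) parallel containers for one node.

    Applies the 0.5-2 containers per CPU core rule of thumb.
    Rounding down and the floor of 1 container are assumptions.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    minimum = max(1, math.floor(cpu_cores * low))
    maximum = max(minimum, math.floor(cpu_cores * high))
    return minimum, maximum


for cores in (4, 8, 16):
    print(cores, "cores ->", container_range(cores))
```

The range is a starting point for benchmarking, not a final setting; the best value within it depends on the job.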

How do I set up my system? Too many options!
- Default values in the Apache source are not ideal
- Large and spread-out ecosystem
  - Different distributions
  - Product claims
- Each job is different
  - No one-size-fits-all solution
- Cloud vs. on-premise
  - IaaS: tens of different VMs to choose from
  - PaaS: HDInsight, CloudBigData, EMR
- Newly affordable HW
  - SSDs, InfiniBand networking

BSC's project ALOJA: towards cost-effective Big Data
- Open research project for improving the cost-effectiveness of Big Data deployments
- Benchmarking and analysis tools
  - Big Data Benchmarking
  - Online Repository
  - Web Analytics
- Online repository and largest Big Data repo: http://aloja.bsc.es
  - 50,000+ runs of HiBench, TPC-H, and [some] BigBench
  - Over 100 HW configurations tested, with different nodes/VMs, disks, and networks
  - Cloud: multiple cloud providers, including both IaaS and PaaS
  - On-premise: high-end, HPC, commodity, low-power
- Community
  - Collaborations with industry and academia
  - Presented at different conferences and workshops
  - Visibility: 47 different countries

Workflow in ALOJA
1. Cluster(s) definition: VM sizes, # nodes, OS, disks, capabilities
2. Execution plan: start cluster, setup, exec benchmarks, cleanup
3. Import data: convert perf metrics, parse logs, import into DB (historic repo)
4. Evaluate data: data views in Vagrant VM or at http://aloja.bsc.es
5. PA and KD: Predictive Analytics, Knowledge Discovery
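The "parse logs, import into DB" stage of the workflow might look roughly like the sketch below. Everything specific here is an assumption for illustration: the key=value log format, the `runs` table, and the use of SQLite (the ALOJA repository itself is listed as using MySQL).

```python
import sqlite3

# Hypothetical log lines; real benchmark logs are richer and formatted differently.
SAMPLE_LOG = """\
benchmark=terasort cluster=demo-a exec_time_s=120.5
benchmark=wordcount cluster=demo-a exec_time_s=80.0
"""


def parse_line(line):
    """Turn 'key=value key=value' into a dict of strings."""
    return dict(field.split("=", 1) for field in line.split())


def import_runs(log_text, conn):
    """Parse each non-empty line and insert it into the (hypothetical) runs table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs "
        "(benchmark TEXT, cluster TEXT, exec_time_s REAL)"
    )
    rows = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    conn.executemany(
        "INSERT INTO runs VALUES (:benchmark, :cluster, :exec_time_s)",
        [{**row, "exec_time_s": float(row["exec_time_s"])} for row in rows],
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the historic repo
imported = import_runs(SAMPLE_LOG, conn)
print(imported, "runs imported")
print(conn.execute("SELECT benchmark, exec_time_s FROM runs ORDER BY exec_time_s").fetchall())
```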

Challenges (circa end 2013)
- Test different clusters and architectures
  - On-premise and HPC: commodity, high-end, appliance, low-power (ARM)
  - Cloud IaaS: 32 different VMs in Azure, similar in other providers
  - Cloud PaaS: HDInsight (Windows and Linux), EMR, CloudBigData
- Different access levels
  - Full admin, user-only, request-to-install, everything ready, queuing systems (SGE)
- Different versions
  - Hadoop, JVM, Spark, Hive, etc.
- Other benchmarks
- Problems
  - All systems are designed for PROD, not for comparison
  - No Azure support
  - Many different packages
  - No one-size-fits-all solution
- Dev environments and testing
  - Big Data usually requires a cluster to develop and test
- Solution
  - Custom implementation
  - Abstracting differences
  - Based on simple components
  - Wrapping commands

ALOJA Platform main components
1. Big Data Benchmarking (BASH, Unix tools, CLIs)
   - Deploy & Provision
   - Conf Management
   - Parameter selection & Queuing
   - Perf counters
   - Low-level instrumentation
   - App logs
2. Online Repository (NGINX, PHP, MySQL)
   - Explore results
   - Execution details
   - Cluster details
   - Costs
   - Data sharing
3. Web Analytics (R, SQL, JS)
   - Data views and evaluations
   - Aggregates
   - Abstracted Metrics
   - Job characterization
   - Machine Learning
   - Predictions and clustering

Extending and collaborating in ALOJA
Setting up a DEV environment:
- Installs a web server with sample data
- Sets up a local cluster to test benchmarking
1. Install prerequisites: git, vagrant, VirtualBox
2. git clone https://github.com/aloja/aloja.git
3. cd aloja
4. vagrant up
5. Open your browser at: http://localhost:8080
6. Optional: start the benchmarking cluster with vagrant up /.*/

Commands and providers
- Provisioning commands
  - Connect: node and cluster; builds SSH cmd line; SSH proxies
  - Deploy: creates a cluster; sets SSH credentials; if created, updates config as needed; if stopped, starts nodes
  - Start, Stop
  - Delete
  - Queue jobs to clusters
- Providers
  - On-premise and HPC: custom settings for clusters; multiple disk types; different architectures; resource/job control
  - Cloud IaaS: Azure, OpenStack, Rackspace, AWS
  - Cloud PaaS: HDInsight, Cloud Big Data, EMR soon
Code at: https://github.com/aloja/aloja/tree/master/aloja-deploy

Running benchmarks in ALOJA
- Benchmarking with defaults: /repo_location/aloja-bench/run_benchs.sh
- To queue jobs: /repo_location/shell/exeq.sh
Code at: https://github.com/aloja/aloja/blob/master/aloja-bench/run_benchs.sh

ALOJA-WEB
- Entry point for exploring the results collected from the executions
- Provides insights into the obtained results through continuously evolving data views
Online DEMO at: http://aloja.bsc.es

Impact of HW configurations in Speedup
[Figure, panel "Disks and Network": HDD-ETH, SSD-ETH, HDD-IB, SSD-IB]
[Figure, panel "Cloud remote volumes": local only; 1, 2, and 3 remotes; 1, 2, and 3 remotes with /tmp local]
Y axis: Speedup (higher is better)
Results using: http://hadoop.bsc.es/configimprovement
Details: https://raw.githubusercontent.com/aloja/aloja/master/publications/bsc-msr_aloja.pdf
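A speedup chart of this kind can be reproduced from raw execution times. The sketch below assumes speedup is defined as baseline time divided by configuration time, which the slide does not state explicitly; the execution times are placeholders, not measurements from the talk.

```python
def speedups(exec_times, baseline):
    """Speedup of each configuration relative to the baseline (higher is better)."""
    base = exec_times[baseline]
    return {name: base / seconds for name, seconds in exec_times.items()}


# Placeholder execution times in seconds, NOT results from ALOJA.
times = {"HDD-ETH": 400.0, "SSD-ETH": 250.0, "HDD-IB": 320.0, "SSD-IB": 200.0}

result = speedups(times, baseline="HDD-ETH")
for name, value in sorted(result.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}x")
```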

Clusters by cost-effectiveness
[Figure: clusters io1-30, performance2-30, io1-15, general1-8, and performance1-8 compared by cost-effectiveness, with the cheapest and the fastest executions marked]

Cost/Performance Scalability of cluster size
[Figure: execution time and execution cost against cluster size, with the recommended size marked]
- X axis: number of data nodes (cluster size)
- Left Y axis: execution time (lower is better)
- Right Y axis: execution cost (lower is better)
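The relationship between the two Y axes can be sketched with a simple cost model. The model (time x nodes x hourly price per node), the choice of the lowest-cost size as the "recommended" one, and all numbers are assumptions for illustration; the slide does not define how the recommended size is chosen.

```python
def execution_cost(exec_time_s, nodes, price_per_node_hour):
    """Cost of one run: hours used, times node count, times hourly node price."""
    return exec_time_s / 3600.0 * nodes * price_per_node_hour


def recommend_size(times_by_nodes, price_per_node_hour):
    """Return the cluster size with the lowest execution cost, plus all costs."""
    costs = {
        nodes: execution_cost(seconds, nodes, price_per_node_hour)
        for nodes, seconds in times_by_nodes.items()
    }
    return min(costs, key=costs.get), costs


# Placeholder execution times (seconds) per cluster size, NOT measured values.
times = {2: 3600.0, 4: 1500.0, 8: 900.0, 16: 700.0}

best, costs = recommend_size(times, price_per_node_hour=1.0)
for nodes in sorted(costs):
    print(f"{nodes:>2} nodes: {times[nodes]:>6.0f} s, cost {costs[nodes]:.2f}")
print("lowest-cost size:", best)
```

With these placeholder numbers, adding nodes keeps reducing execution time but the cost starts rising again once scaling becomes sublinear, which is the trade-off the chart is meant to expose.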

Predictive Analytics and automated learning

Modeling and predicting Hadoop time
- Methodology: 3-step learning process
  - The ALOJA data set is split into training, validation, and testing sets
  - Training: train the model
  - Validation: test the model, then decide whether to select it
    - NO: tune the algorithm and re-train
    - YES: it becomes the final model
  - Testing: test the final model
- Use cases
  - Anomaly detection
  - Predict best configurations
  - Guided benchmarking
  - Knowledge Discovery
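The 3-step process can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions: the data is synthetic, and a tiny k-nearest-neighbours regressor stands in for whatever learning algorithms ALOJA actually uses; "tuning" here just means trying several values of k.

```python
import random


def split(data, seed=0, train=0.5, valid=0.25):
    """Shuffle and split rows into training, validation, and testing sets."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    a = int(len(rows) * train)
    b = a + int(len(rows) * valid)
    return rows[:a], rows[a:b], rows[b:]


def knn_predict(train_rows, x, k):
    """Predict y as the mean of the k training points closest to x."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_error(train_rows, eval_rows, k):
    """Average absolute prediction error over eval_rows."""
    errors = [abs(knn_predict(train_rows, x, k) - y) for x, y in eval_rows]
    return sum(errors) / len(errors)


# Synthetic data: x is a made-up config value, y a made-up execution time.
data = [(x, 1000.0 / x + 5.0) for x in range(10, 210, 5)]
train_rows, valid_rows, test_rows = split(data)

# Steps 1 and 2: train each candidate, check it on the validation set,
# and keep tuning (here: trying another k) until the best one is selected.
candidates = (1, 3, 5, 7)
best_k = min(candidates, key=lambda k: mean_abs_error(train_rows, valid_rows, k))

# Step 3: report the selected model's error on the untouched testing set.
test_error = mean_abs_error(train_rows, test_rows, best_k)
print("selected k:", best_k, "test MAE:", round(test_error, 2))
```

Keeping the testing set out of the selection loop is the point of the third step: the reported error is then not biased by the tuning.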

Concluding remarks
- In ALOJA we are benchmarking everything from low-powered systems to cloud and supercomputers
  - Testing both HW components and SW configs
- Each system has its own peculiarities and failures!
  - Different access levels
  - Sharing
    - Public cloud is very difficult to measure correctly!
  - Versions of software
- Benchmarking is fun! Or at least
  - It will save you and allow you to scale
  - But it is also tough
- The industry needs more transparency
  - We still have a lot to do
- In ALOJA we provide the benchmarking scripts
  - And also the results, which should be your first entry point
- We are constantly adding new features
  - Benchmarks, systems, providers
- It is an open initiative; you're invited to participate
- Find me around the conference for more details on the tools

More info
- ALOJA benchmarking platform and online repository
  - http://aloja.bsc.es
  - http://aloja.bsc.es/publications
- Benchmarking Big Data
  - http://www.slideshare.net/ni_po/benchmarking-hadoop
- BDOOP meetup group in Barcelona
- Big Data Benchmarking Community (BDBC) mailing list (~200 members from ~80 organizations)
  - http://clds.sdsc.edu/bdbc/community
- Workshop on Big Data Benchmarking (WBDB)
  - Next: http://clds.sdsc.edu/wbdb2015.ca
- SPEC Research Big Data working group
  - http://research.spec.org/working-groups/big-data-workinggroup.html
- Slides and video
  - Michael Frank on Big Data benchmarking: http://www.tele-task.de/archive/podcast/20430/
  - Tilmann Rabl, Big Data Benchmarking Tutorial: http://www.slideshare.net/tilmann_rabl/ieee2014-tutorialbarurabl

The MareNostrum 3 Supercomputer
- Over 10^15 floating point operations per second
- Nearly 50,000 cores
- 100.8 TB of main memory
- 2 PB of disk storage
- 70% distributed through PRACE
- 24% distributed through RES
- 6% for BSC-CNS use

www.bsc.es Thanks! Q&A Contact: nicolas.poggi@bsc.es