Dell Apache Hadoop Performance Analysis
Dell PowerEdge R720/R720XD Benchmarking Report

Nicholas Wakou
Hadoop/Big Data Benchmarking Engineer
Dell Revolutionary Cloud and Big Data Engineering

November 2013
Revisions

Date            Description
November 2013   Initial release

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

2013 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.

PRODUCT WARRANTIES APPLICABLE TO THE DELL PRODUCTS DESCRIBED IN THIS DOCUMENT MAY BE FOUND AT: http://www.dell.com/learn/us/en/19/terms-of-sale-commercial-and-public-sector

Performance of network reference architectures discussed in this document may vary with differing deployment conditions, network loads, and the like. Third-party products may be included in reference architectures for the convenience of the reader. Inclusion of such third-party products does not necessarily constitute Dell's recommendation of those products. Please consult your Dell representative for additional information.

Trademarks used in this text: Dell, the Dell logo, Dell Boomi, Dell Precision, OptiPlex, Latitude, PowerEdge, PowerVault, PowerConnect, OpenManage, EqualLogic, Compellent, KACE, FlexAddress, Force10 and Vostro are trademarks of Dell Inc. Other Dell trademarks may be used in this document. Cisco Nexus, Cisco MDS, Cisco NX-OS, and Cisco Catalyst are registered trademarks of Cisco Systems Inc. EMC VNX and EMC Unisphere are registered trademarks of EMC Corporation. Intel, Pentium, Xeon, Core and Celeron are registered trademarks of Intel Corporation in the U.S. and other countries. AMD is a registered trademark and AMD Opteron, AMD Phenom and AMD Sempron are trademarks of Advanced Micro Devices, Inc. Microsoft, Windows, Windows Server, Internet Explorer, MS-DOS, Windows Vista and Active Directory are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Red Hat and Red Hat Enterprise Linux are registered trademarks of Red Hat, Inc. in the United States and/or other countries. Novell and SUSE are registered trademarks of Novell Inc. in the United States and other countries. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Citrix, Xen, XenServer and XenMotion are either registered trademarks or trademarks of Citrix Systems, Inc. in the United States and/or other countries. VMware, Virtual SMP, vMotion, vCenter and vSphere are registered trademarks or trademarks of VMware, Inc. in the United States or other countries. IBM is a registered trademark of International Business Machines Corporation. Broadcom and NetXtreme are registered trademarks of Broadcom Corporation. QLogic is a registered trademark of QLogic Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and/or names or their products and are the property of their respective owners. Dell disclaims proprietary interest in the marks and names of others.
Table of Contents

Revisions
Executive Summary
1 Introduction
  1.1 Strategic Goals
  1.2 Benchmark Testing
  1.3 Functionality
  1.4 Performance Characterization Tests
    1.4.1 Characterization Workloads
  1.5 Stress Tests
    1.5.1 MapReduce Stress Testing
    1.5.2 HDFS Stress Testing
  1.6 Bottleneck Investigation
  1.7 Performance Tuning
2 System under Test (SUT)
  2.1 Hardware Configuration
  2.2 Software Stack
  2.3 Network Configuration
3 Run Time Environment
  3.1 TestDFSIO
  3.2 Terasort
  3.3 K-Means
  3.4 Flush Memory
4 Performance Results
  4.1 MapReduce Performance
    4.1.1 Performance Characterization
    4.1.2 MapReduce Job: CPU Profile
    4.1.3 MapReduce Networking
  4.2 Bottleneck Investigation
    4.2.1 Performance Tuning
    4.2.2 MapReduce Resource Utilization
    4.2.3 Analysis of a Teragen/Terasort Job
  4.3 MapReduce Performance under a Complex Application: K-Means
    4.3.1 K-Means Job Time
    4.3.2 K-Means CPU Utilization
    4.3.3 Throughput
    4.3.4 MapReduce Resource Utilization Under a Complex Application
    4.3.5 Analysis of a K-Means Job
  4.4 HDFS Performance
  4.5 HDFS Write Performance
    4.5.1 HDFS Read Performance
    4.5.2 IO Bottleneck Investigation
    4.5.3 HDFS Distributed Processing
    4.5.4 Analysis of a TestDFSIO Job
5 Conclusion
6 Appendix A
    6.1.1 Apache Hadoop TestDFSIO
    6.1.2 Intel HiBench
    6.1.3 HiBench: Terasort
    6.1.4 Intel HiBench: K-Means
  6.2 Appendix A2: Test Methodology
    6.2.1 Hadoop Infrastructure Deployment
    6.2.2 Cloudera Manager Deployment
  6.3 Appendix A3: Installing Benchmarks
    6.3.1 TestDFSIO
    6.3.2 Intel HiBench
7 Appendix B: Hadoop Configuration Parameters
Executive Summary

The purpose of this document is to help you gain better insight when deploying and tuning Hadoop clusters by understanding the performance of the Reference Architecture (RA) using data points and workloads that are typical of a big data environment. This white paper discusses the HiBench 2.2 benchmarking tests that were conducted on the R720/R720XD Reference Architecture of the Dell Cloudera Apache Hadoop Solution, focusing on the performance of MapReduce and HDFS. The reference architecture discussed in this document improves performance by tuning OS parameters, the HDFS block size, and Hadoop configuration settings within CDH.
1 Introduction

This report is based on benchmarking tests that were carried out on the R720/R720xd Reference Architecture (RA) of the Dell Cloudera Apache Hadoop Solution. This performance review focused on the performance of the Hadoop core components, MapReduce and HDFS.

Note: The performance of ecosystem components (Hive, Impala, HBase, etc.) is not part of this review.

1.1 Strategic Goals

- Gain an in-depth understanding of the performance of the RA using data points and workloads that are typical of a big data environment.
- Obtain baseline performance data for the hardware platform.
- Assess the performance impact of some hardware components on the RA.
- Tune and optimize the performance of the cluster.
- Identify and clear bottlenecks.

1.2 Benchmark Testing

The benchmarking plan proposes three categories of benchmark tests, listed below. This benchmark review focuses on the engineering analysis tests that were used to obtain an in-depth understanding of the RA.

- Engineering analysis
  o Functionality QA
  o Characterization
  o Stress testing
  o Bottleneck investigation
  o Performance tuning
- Business recovery
  o Not performed in this iteration
- Comparative analysis for marketing purposes
  o Not performed in this iteration

1.3 Functionality

These benchmarking tests were performed after QA tests in the release cycle. The hardware platforms and software stacks of the System under Test (SUT) were stable and ready for shipping.
1.4 Performance Characterization Tests

These are benchmark tests that were used to characterize the performance of the RA. Performance data from these tests is typically used for architectural designs and modifications, capacity analysis, and identification of bottlenecks. To get a good understanding of the cluster, these tests were modular and the following hardware components were characterized (stressed):

- IO
- Network
- CPU

The goals of these tests were to:

- Record and analyze MapReduce and HDFS performance of the RA under varying loads.
- Analyze RA behavior under the Map, Shuffle, Reduce and Replication phases of MapReduce jobs.

1.4.1 Characterization Workloads

Standard, open-source workloads were used:

- Teragen/Terasort
  o Data generator and sort workload
  o Primary metric: Latency (s)
  o Secondary metrics: CPU utilization (%), Network utilization (%), Network throughput (MB/s)
- TestDFSIO
  o Read/write IO characterization tool
  o Primary metrics: Throughput (MB/s), Latency (s)
- K-Means
  o Machine-learning tool
  o Cluster analysis: partitions (n) samples into (k) clusters. The dimension (d) of each sample can be varied to obtain desired levels of complexity
  o Primary metric: Wall clock time

1.5 Stress Tests

These are basically characterization tests performed under peak load conditions to analyze the behavior of the RA at full load and to identify possible bottlenecks.
1.5.1 MapReduce Stress Testing

The goal was to obtain 100% CPU utilization on the slave nodes of the cluster. A MapReduce job was submitted using Teragen/Terasort. The load (dataset size in MB) was increased until 100% CPU utilization was observed on the slave nodes. Performance at 100% CPU was sustained and monitored for long job durations.

1.5.2 HDFS Stress Testing

The goal was to obtain peak IO throughput (MB/s) while targeting the SAS limit of the disk controller on the slave nodes and/or 100% of the network utilization of the cluster. TestDFSIO was used to vary the file size of a dataset that was read from or written to the cluster until peak throughput was attained and sustained.

1.6 Bottleneck Investigation

Identifying bottlenecks was a prime goal of this review. Ideally, the SUT must run at full capacity (optimal utilization of available resources). When the SUT does not perform as expected, it is essential to identify the source of the problem (bottlenecks). Bottlenecks can be caused by hardware limitations, by inefficient software configurations, or by both.

MapReduce jobs used in this review were CPU-intensive. The inability of a MapReduce job to fully maximize the CPU resources (attain 100% CPU utilization) on the slave nodes at full load is an indication of a bottleneck on the SUT. HDFS jobs used in this review were IO-intensive. Attaining the SAS throughput limit of the disk controller is a good indication of a bottleneck-free SUT.

1.7 Performance Tuning

It is imperative that bottlenecks are eliminated or their impact mitigated in order to fully utilize the resources of the SUT. In this review, software parameters (OS and Hadoop) were tuned in order to attain the desired CPU and IO profiles.
2 System under Test (SUT)

This benchmark review was undertaken on the Dell PowerEdge R720/R720xd hardware platform.

Figure 1 System Under Test
2.1 Hardware Configuration

Table 1: Hardware Configuration

Machine Function     Active and Secondary Name Node,       Data Node
                     Admin Node, HA Node, Edge Node
Platform             PowerEdge R720                        PowerEdge R720xd
CPU                  2 x E5-2630 (6-core)                  2 x E5-2640 (6-core)
RAM (minimum)        96 GB                                 36 GB
LOM                  4 x 1GbE                              4 x 1GbE
Disk                 6 x 600-GB 10K SAS 3.5-inch           24 x 1-TB SATA 7.2K 2.5-inch
Storage Controller   PERC H710                             PERC H710
RAID                 RAID 10                               Single-drive RAID 0

2.2 Software Stack

Table 2: Software Stack

Component          Version
Operating System   Red Hat Enterprise Linux 6.2
Hadoop             Cloudera Distribution of Hadoop (CDH) 4.1.1
Cloudera Manager   Cloudera Manager 4.1.2
Java               Sun Oracle Java version 6 Update 31
2.3 Network Configuration

Figure 2 Network Configuration

Table 3 Server Side Cabling
(NIC-to-switch-port mapping of the LOM1-LOM4 and BMC ports on the Admin, Name, Data and Edge nodes across the Production, Management and Public LANs)

In order to segregate network traffic and enable dedicated network links, the Dell Cloudera solution configures three distinct VLANs:

Network      Description                                                            VLAN tag   Tagged
Production   Used by the Hadoop system to handle traffic between all nodes for     100        802.1q tagged
             HDFS operations, MapReduce jobs, and other Hadoop traffic. Network
             links are bonded in a team of 2 or more links.
Management   Used for connecting to the BMC of each node. Additionally used for    300        Not tagged
             administrative functions such as Crowbar node installation, backups
             and other monitoring.
Public       Used for connections to devices external to the Hadoop cluster.       500        802.1q tagged
             Bonded in a team of 2 links.
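As an illustration of how a bonded, 802.1q-tagged Production interface could be expressed on a RHEL 6 data node, the sketch below writes the relevant ifcfg files. It is only an assumed example (interface names, bonding mode and the specific 172.16.2.x address are illustrative); in the Dell Cloudera solution this configuration is generated automatically by Crowbar.

# Sketch only: bond two 1GbE LOM ports and attach the VLAN 100 (Production) interface.
# File names and values are assumptions for illustration; Crowbar generates the real ones.
cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"
EOF

cat > /etc/sysconfig/network-scripts/ifcfg-bond0.100 <<'EOF'
DEVICE=bond0.100
VLAN=yes
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.16.2.22
NETMASK=255.255.255.0
EOF

# Enslave the two LOM ports (eth0/eth1 assumed) to bond0
for nic in eth0 eth1; do
  cat > /etc/sysconfig/network-scripts/ifcfg-$nic <<EOF
DEVICE=$nic
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
EOF
done

service network restart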
3 Run Time Environment

Performance tests were executed on the Master node from the command line or from scripts.

3.1 TestDFSIO

Executed from the command line. See Appendix A1 (Apache Hadoop TestDFSIO).

Before every performance run, remove previous test data by running the command:

# sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-test-2.0.0-mr1-cdh4.1.1.jar TestDFSIO -clean

Dataset size was used to characterize and stress the SUT. The following command-line options were used to vary the dataset size from 100GB to 5000GB:

- -nrFiles (number of files)
- -fileSize (size of each file)

Performance metrics provided at program completion:

- Throughput (MB/s)
- IO rate (MB/s)
- Execution time (s)
- Standard deviation

Secondary metrics:

- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia

3.2 Terasort

Executed from HiBench scripts as shown in Appendix A1 (HiBench: Terasort). Follow the instructions in 3.4 Flush Memory to flush the cache before any performance run. Performance characterization and stress testing was done by varying the dataset size from 10GB to 10,000GB.

Modify ~/HiBench-2.1/terasort/conf/configure.sh to set the dataset size, for example for a 1TB run:

# for prepare (total) - 1TB
DATASIZE=10000000000

Data generation (teragen): run ~/HiBench-2.1/terasort/bin/prepare.sh
Program execution (terasort): run ~/HiBench-2.1/terasort/bin/run.sh

Primary performance metric:

- Latency: job duration shown by the Cloudera Manager Activity Monitor.

Secondary metrics:

- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia
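The steps above can be combined into a small driver script. The following is a minimal sketch, assuming the HiBench 2.1 paths used in this review and the doall.sh flush-cache helper described in 3.4; the dataset size shown is the 1TB example.

# Sketch: flush caches on the slave nodes, generate 1TB of teragen data and run terasort
./doall.sh sync
./doall.sh 'echo 3 > /proc/sys/vm/drop_caches'

# Set the dataset size (10,000,000,000 rows = 1TB) before running
sed -i 's/^DATASIZE=.*/DATASIZE=10000000000/' ~/HiBench-2.1/terasort/conf/configure.sh

~/HiBench-2.1/terasort/bin/prepare.sh     # teragen: generate the input data
~/HiBench-2.1/terasort/bin/run.sh         # terasort: sort the generated data

# Job duration is reported by the Cloudera Manager Activity Monitor and appended to hibench.report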
3.3 K-Means

Executed from HiBench scripts as shown in Appendix A1 (Intel HiBench: K-Means). Follow the instructions in 3.4 Flush Memory to flush the cache before any performance run. Performance characterization and stress testing using a complex application was done by varying the dataset size from 0.3GB to 2,500GB.

Modify ~/HiBench-2.1/kmeans/conf/configure.sh as shown in Appendix A1 to define the dataset:

- Number of samples (n): 10^3, 10^4, 10^10
- Dimension of each sample (d): 2, 4, 8, 16, 32, 64
- Number of clusters (k): 2, 4, 8, 16, 32, 64, 128
- Samples per input file: (number of samples / 5)

Data generation: run ~/HiBench-2.1/kmeans/bin/prepare.sh
Program execution: run ~/HiBench-2.1/kmeans/bin/run.sh

Primary performance metrics (obtained from hibench.report):

- Throughput (MB/s)
- Latency

Secondary metrics:

- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia

3.4 Flush Memory

Before any performance run, it is necessary to flush cache memory on all slave nodes. A script to run a command on all slave nodes with IP addresses ranging from 172.16.2.22 to 172.16.2.30:

1. doall.sh

#!/bin/bash
# Run the given command on every slave node (172.16.2.22 - 172.16.2.30)
for i in 22 23 24 25 26 27 28 29 30; do
    echo -ne "172.16.2.$i: "
    ssh 172.16.2.$i "$@"
done

2. From the command line:

# ./doall.sh free -m
# ./doall.sh sync
# ./doall.sh 'echo 3 > /proc/sys/vm/drop_caches'
# ./doall.sh free -m
4 Performance Results

4.1 MapReduce Performance

The performance of the MapReduce layer was analyzed by varying the dataset size. At each instance, the CPU characteristics, network utilization and performance metrics were obtained. Performance results were analyzed for evidence of bottlenecks. The SUT was tuned to improve performance.

4.1.1 Performance Characterization

Terasort was used to characterize MapReduce performance. The instructions in 3.2 Terasort were followed to vary the size of the dataset from 10GB to 10,000GB. At each instance, performance data was collected and recorded.

Figure 3 Performance Characterization Chart
(Terasort latency in seconds and CPU utilization in % versus dataset sizes of 10GB to 5000GB)
4.1.2 MapReduce Job: CPU Profile

The CPU profile of the Map and Reduce phases of a 1TB sort job was captured.

Figure 4 MapReduce Job CPU Profile

4.1.3 MapReduce Networking

The nodes in a Hadoop cluster are interconnected through the network. Typically, one or more of the following phases of MapReduce jobs transfers data over the network:

1. Writing data: This phase occurs when the initial data is either streamed or bulk-delivered to HDFS. Data blocks of the loaded files are replicated, transferring additional data over the network.
2. Workload execution: The MapReduce algorithm is run.
   a. Map phase: In the map phase of the algorithm, almost no traffic is sent over the network. The network is used at the beginning of the map phase only if an HDFS locality miss occurs (the data block is not locally available and has to be requested from another data node).
   b. Shuffle phase: This is the phase of workload execution in which traffic is sent over the network, the degree to which depends on the workload. Data is transferred over the network when the output of the mappers is shuffled to the reducers.
   c. Reduce phase: In this phase, almost no traffic is sent over the network because the reducers have all the data they need from the shuffle phase.
   d. Output replication: MapReduce output is stored as a file in HDFS. The network is used when the blocks of the result file have to be replicated by HDFS for redundancy.
3. Reading data: This phase occurs when the final data is read from HDFS for consumption by the end application, such as a website, an indexing service, or a SQL database.
Figure 5 Network Utilization by MapReduce Phases

4.2 Bottleneck Investigation

MapReduce jobs are CPU-intensive. In this review it was possible to stress the slave nodes to attain 100% CPU utilization, indicating the absence of MapReduce performance bottlenecks (particularly IO and network bottlenecks). Further analysis of the CPU profile showed very high CPU system time (> 30%). Typically, CPU system time should be < 15% and user time should be > 80%. Using the techniques described in 4.2.1 Performance Tuning, the CPU system time was reduced to < 15%. There was evidence of memory swapping with dataset sizes > 3TB, indicating that memory becomes an issue at those sizes.
Figure 6 Tuning CPU System Time

4.2.1 Performance Tuning

Typically, performance tuning is performed to fix bottlenecks or to mitigate their impact. For the dataset sizes considered in this review (10GB-10,000GB), CPU utilization was identified as the only bottleneck for MapReduce performance, indicating that the SUT was already optimized for the best CPU and memory performance. Further analysis (see Figure 6) of the CPU profile indicated that CPU system time was very high and had to be reduced to further improve performance. Based on best practices adopted from Cloudera performance engineers, the impact of tuning several hardware and software parameters was investigated. In this review, the block size and the Hadoop configuration parameters provided the most significant boost to performance. These parameters were tuned as shown below (a configuration sketch follows Table 5) and performance characterization tests were repeated with the dataset size of the MapReduce jobs being varied from 100GB to 1000GB. See the results in Figure 7.

1. OS parameters: Use the doall.sh script shown in 3.4 Flush Memory to apply these settings to all the slave nodes.
   a. Turn down swappiness:
      # ./doall.sh 'echo 0 > /proc/sys/vm/swappiness'
   b. Turn off transparent huge page defrag:
      # ./doall.sh 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag'
2. Block size: The block size was increased from 128MB to 512MB using the configuration parameter dfs.blocksize=536870912.

3. Hadoop configuration parameters. The following parameters were tuned:

Table 4: Tuning Hadoop Configuration Parameters

Parameter                                   Value         Default
mapred.map.tasks                            180           1
mapred.reduce.tasks                         64            1
mapred.tasktracker.map.tasks.maximum        24            2
mapred.tasktracker.reduce.tasks.maximum     8             2
MapReduce Child Java Maximum Heap Size      1073741824    NULL
Datanode Java Heap Size                     1073741824    NULL
Task Tracker Java Heap Size                 1073741824    NULL
Namenode Java Heap Size                     6442450944    1073741824
Secondary Namenode Java Heap Size           6442450944    1073741824
io.sort.mb                                  610           256
io.sort.record.percent                      0.1379        0.05
io.sort.spill.percent                       0.98          0.80

The overall performance improvement due to tuning CPU system time, the block size and the Hadoop configuration parameters was found to be 50.84%, based on the reduction in job duration times.

Table 5 Performance Boost by Tuning

Parameter                           Performance boost
OS parameters                       38.10%
Block size                          12.20%
Hadoop configuration parameters     0.54%
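As a rough illustration of how these tunings can be applied, the sketch below combines the OS settings with per-job command-line overrides of a few of the MRv1 parameters from Table 4. It is only an assumed example: cluster-wide settings such as the heap sizes, task slots and dfs.blocksize are normally changed in the service configuration (for example through Cloudera Manager) and require a service restart, the jar path follows the one used in Appendix A1, and the HDFS input/output paths are illustrative.

# Sketch: apply the OS tunings on all slave nodes (doall.sh from section 3.4)
./doall.sh 'echo 0 > /proc/sys/vm/swappiness'
./doall.sh 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag'

# Sketch: per-job overrides of a few Table 4 parameters for a terasort run
# (input/output paths are placeholders)
sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-2.0.0-mr1-cdh4.1.1-examples.jar terasort \
    -D mapred.reduce.tasks=64 \
    -D io.sort.mb=610 \
    -D io.sort.record.percent=0.1379 \
    -D io.sort.spill.percent=0.98 \
    /HiBench/Terasort/Input /HiBench/Terasort/Output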
Figure 7 Comparing the Performance of a Tuned to a Non-tuned SUT
(job duration in seconds versus data sizes of 100GB to 1000GB)

4.2.2 MapReduce Resource Utilization

Resource utilization by MapReduce jobs was observed for one day with Teragen/Terasort jobs running on the cluster. It was observed that the datanode servers are the workhorses of the Hadoop cluster and that they fully utilized their memory, network and CPU resources. There was evidence of memory swapping on the datanode servers when large datasets (> 3TB) were sorted using terasort. The namenode and other infrastructure servers used these resources very lightly. This could have an impact on how these servers are scoped for small clusters.

Figure 8 Memory Used by a Datanode Server
Figure 9 Memory Used by a Namenode Server

Figure 10 CPU Utilization by a Datanode Server
Figure 11 CPU Utilization by a Namenode Server

4.2.3 Analysis of a Teragen/Terasort Job

The result of a single Teragen/Terasort job was captured as a data point for comparison with future benchmark reviews.

Table 6: Analysis of a Teragen/Terasort Job

Parameter                                 Teragen              Terasort
Input parameters
  Number of rows                          10000000000
  Dataset size                            1TB
Results
  Job duration                            1032s (17 min 12 s)  725s (12 min 5 s)
  Network utilization                     83.4%                47.9%
  CPU utilization - Map phase             90%                  97%
  CPU utilization - Shuffle/Reduce phase  31%                  45%
4.3 MapReduce Performance under a Complex Application: K-Means

K-Means provides an application that can be configured to match the complexity of real-world use cases. Refer to Appendix A1 (Intel HiBench: K-Means) and section 3.3 for details on how K-Means was installed and run. It was observed that performance characteristics were largely impacted by the sample size (n). Performance characterization tests were performed for small samples with n <= 10^3, and then repeated for large samples with n >= 10^10. In all cases and for each job, the sample size (n), the dimension (d) and the number of clusters (k) were varied. For each sample size (small or large) and number of dimensions, the cluster count was varied.

4.3.1 K-Means Job Time

It was observed that, for a small sample, job time grows exponentially with the product of the number of clusters and the number of samples (where d = dimension, k = clusters, n = samples).

Figure 12 Small Sample Duration
(run time in seconds versus number of clusters k = 2 to 128, for samples n = 10^3 and dimensions d = 2 to 64)

For large samples, job completion time is almost linearly proportional to the number of clusters, k.
Figure 13 Large Sample Duration and CPU
(run time in seconds and CPU utilization in % versus number of clusters k = 2 to 64, for samples n = 10^10; CPU utilization stays at 96-98% across all cluster counts)

4.3.2 K-Means CPU Utilization

For small samples, 80% CPU utilization was attained when d > 16 and k > 32.

Figure 14 K-Means Small Sample CPU Utilization
(CPU utilization in % versus number of clusters k = 2 to 128, for dimensions d = 2 to 64)

For large samples, 80% CPU utilization is attained even with k = 1. Refer to Figure 15 (CPU Profile Large Sample Jobs).
Figure 15 CPU Profile Large Sample Jobs
- Large samples = 10^10
- CPU utilization ~95%
- User time ~90%
- System time ~5%

4.3.3 Throughput

MapReduce throughput under K-Means, a complex application, was obtained and analyzed. K-Means throughput was provided by the HiBench report and is an indication of the rate at which data was analyzed. It is based on the amount of data and the time taken to compute centroids (for each iteration) before convergence.

For low-complexity (d < 8) small samples, throughput increases linearly with the number of clusters k. For high-complexity (d > 16) small samples, throughput starts to drop for large numbers of clusters.

Figure 16 Small Sample Throughput
(throughput in MB/s versus number of clusters k = 2 to 128, for dimensions d = 2 to 64)

For large samples, throughput drops linearly with the number of clusters.
Figure 17 Large Sample Throughput
(throughput in MB/s versus number of clusters k = 2 to 64, for d = 8)
4.3.4 MapReduce Resource Utilization Under a Complex Application

Figure 18 Resource Utilization Under K-Means
The figure above shows how the CPU and IO resources of the SUT were utilized by a large-sample K-Means job (d=2, k=2) running for about 1 hour. The charts show that CPU utilization was high (98%) throughout the 5 iterations. There was significant read I/O activity during the run, but write I/O activity kicked in only after the 5 iterations had completed. Network traffic was noticeable after the iterations. Memory utilization was high (~80%) throughout the run.

4.3.5 Analysis of a K-Means Job

The result of a single K-Means job was captured as a data point for comparison with future benchmark reviews.

Table 7 Analysis of a K-Means Job

Parameter                         Value
Input parameters
  Sample size (n)                 10^10
  Dimensions (d)                  32
  Clusters (k)                    8
  Samples per input file          6000000
  Maximum number of iterations    5
Results
  Input size                      2648448007960 bytes (2.4 TB)
  Total time                      13,874.786 seconds (~3 hrs 51 min)
  Throughput                      190882079 bytes/second (182.04 MB/s)
  CPU                             100%
  Network utilization             84%

4.4 HDFS Performance

The TestDFSIO benchmark was used to analyze the performance of the HDFS layer. For instructions on how to run this benchmark refer to Appendix A1 (Apache Hadoop TestDFSIO) and section 3.1.

4.5 HDFS Write Performance

For this SUT, job execution times rise linearly with the size of the dataset up to 1000GB. For dataset sizes of 1000GB and larger, job execution times rise exponentially. This is mainly due to replication and the limitations of the network bandwidth. The default replication factor of 3 was maintained for all HDFS tests. As dataset sizes increase, the multiplier effect of the replication factor comes into play, requiring more data
to be transferred across the network. As the network becomes the bottleneck, data transfer is constrained, leading to increased job execution times.

Figure 19 HDFS Write Performance Chart
(IO throughput in MB/s and job duration in seconds versus data sizes of 100GB to 5000GB, with nrFiles = 1000)

4.5.1 HDFS Read Performance

TestDFSIO read jobs are processed locally within each data node with no significant transfer of data across the network. Network traffic is therefore not as significant as that expected in write jobs. In addition to the underlying I/O hardware architecture (disks, controllers), the number of files to process and the number of available Hadoop map and reduce slots have a significant impact on read performance.

Figure 20 HDFS Read Performance Chart
(IO throughput in MB/s and job duration in seconds versus data sizes of 100GB to 5000GB, with nrFiles = 1000)
4.5.2 IO Bottleneck Investigation

Three possible IO limits were considered.

4.5.2.1 SAS Limit

An LSI white paper, "Switched SAS: Sharable, Scalable SAS Infrastructure," shows how to calculate the SAS limit of an 8-lane controller port with a SAS bandwidth of 6Gbps:

6Gb/s x 8 lanes = 48Gb/s per x8 port
48Gb/s (8b/10b encoding) = 4.8GB/s per port (per node)
4.8GB/s per port x 88.33% (arbitration delays and additional framing) = 4320MB/s per port

4.5.2.2 PCI-E Slot

The Dell R720 provides integrated PCI-E Gen-3 capable slots. Gen-3 is defined at 8 GT/s per lane; with scrambling and 128b/130b encoding (instead of 8b/10b), this gives a usable bandwidth of close to 1 GB/s per lane, so a PCIe Gen-3 x8 link delivers an aggregate bandwidth of approximately 8 GB/s.

4.5.2.3 Network

Each slave node has 2 x 1GbE bonded NIC interfaces. The full-duplex bandwidth (BW) per node is:

BW = 1 Gb/s x 2 (interfaces) x 2 (full duplex) / 8 (bits per byte) = 0.5 GB/s

Allowing for 20% transmission overhead, the nominal BW is expected to be ~400MB/s per node.

The IO limits per node are summarized in the following table.

Table 8 IO Bottlenecks

Component                   Max Bandwidth
SAS Controller              4.8 GB/s
PCI-E Gen-3 Slot            8.0 GB/s
2 x 1GbE NIC Interfaces     400 MB/s

It is clear that the network has the lowest bandwidth limit. Write IO performance is severely impacted by the network limit due to the requirement to transfer data across the network. Read IO performance is more dependent on the IO bandwidth limitations of the underlying IO components (SAS controller, PCI slots, disks, etc.). Since these limits are high for each node, the read performance of this SUT depended more on TestDFSIO parameters (the number of files, nrFiles) and Hadoop parameters (the number of map slots).

4.5.3 HDFS Distributed Processing

The number of distributed partitions significantly impacts HDFS performance. The more partitions (distributed files) there are, the better the performance, as shown in the chart below, which illustrates how write performance is affected by the number of files (nrFiles).
Figure 21 HDFS Distributed Processing Chart
(IO write throughput in MB/s and job duration in seconds versus nrFiles = 1 to 1000)
4.5.4 Analysis of a TestDFSIO Job

The results of read and write TestDFSIO jobs were captured as a data point that could be used for future reference.

Table 9 Analysis of a TestDFSIO Job

Parameter                                 Write Performance    Read Performance
Input parameters
  nrFiles                                 1000                 1000
  fileSize                                1000MB               1000MB
  option                                  -write               -read
Results
  Throughput                              3310 MB/s            8820 MB/s
  IO rate                                 4000 MB/s            19940 MB/s
  Execution time                          1224.3s              488.39s
  Network utilization                     85.4%                79.2%
  CPU utilization - Map phase             100%                 70%
  CPU utilization - Shuffle/Reduce phase  25%                  27%
5 Conclusion

This R720/R720xd benchmarking review is the first in a series that are planned to be conducted with every major release of the Dell Cloudera Hadoop solution. Every attempt was made to perform all tests recommended in the Benchmarking Plan and Guide, but for various reasons a number of them could not be performed. The results obtained in this review will be used as baseline data for comparing the performance of subsequent RA revisions, configurations and performance optimizations.

The main achievements of this benchmarking review are:

- Performance characterization of the R720/R720xd RA using CPU- and IO-intensive workloads.
- Stress testing the RA. Understanding the behavior of the RA as the load increases up to the point when bottlenecks become evident.
- Bottleneck investigation. For the size of the Hadoop cluster under review, the main bottlenecks are the CPU and the network. Results from this review show when each bottleneck comes into play.
- Performance tuning. Software (OS and Hadoop) tuning techniques were employed to mitigate the impact of the CPU bottleneck. These tweaks provided a performance boost of 50.84% over a non-tuned system. This implies that a Terasort job will run at least 50% faster after the tweaks. These tuning tweaks should be incorporated into the solution.

Based on the performance results and analysis, some recommendations have been made and should be considered in order to improve the performance of the Dell Cloudera Hadoop Solution:

1. Performance tuning: use the techniques in 4.2.1 Performance Tuning to improve the performance of the solution by over 50%.
   a. Apply the Hadoop configuration parameters.
   b. Apply the OS parameters.
   c. Increase the block size from the default 128MB to 512MB or more.
2. RA changes, subject to a cost/benefit analysis:
   a. More memory on the slave nodes (> 64GB).
   b. Less memory on the infrastructure nodes.
   c. Fewer processing elements (CPUs/cores) on the infrastructure nodes.
   d. More processing elements (CPUs/cores) on the slave nodes.
6 Appendix A

Appendix A: Test Environment

Open-source workloads were used to generate data and submit jobs to the Hadoop cluster.

Appendix A1: Test Suites

The high-level goal of this benchmarking review was to test the architectural components of Hadoop (MapReduce and HDFS) and how they interact with the underlying hardware infrastructure. Workloads were selected based on how best they could exercise the hardware components (IO, CPU and Network) that have the biggest impact on MapReduce and HDFS:

Table 10 Test Suites

Benchmark    Distribution    Hadoop Component Stressed    Hardware Component Stressed
TestDFSIO    Apache Hadoop   HDFS                         IO, Network
Teragen      HiBench 2.2     HDFS                         IO, Network
Terasort     HiBench 2.2     MapReduce                    CPU
K-means      HiBench 2.2     MapReduce                    Application level, CPU

6.1.1 Apache Hadoop TestDFSIO

The TestDFSIO benchmark is a read and write test for HDFS. TestDFSIO is used to measure the performance of HDFS and stresses both the network and IO subsystems. The command reads and writes files in HDFS, which is useful in measuring system-wide performance and exposing network bottlenecks on the Hadoop cluster. A majority of HDFS workloads are more IO-bound than compute-bound, and hence TestDFSIO can provide an accurate initial picture of such scenarios. Nevertheless, because this test is run as a MapReduce job, the MapReduce stack of the cluster must be working correctly. In other words, this test cannot be used to benchmark HDFS in isolation from MapReduce.

The benchmark is run with the -write switch for the write test and the -read switch for the read test. The command line accepts the number of files and the size of each file in HDFS. The command used to generate and write 1000 files of 1000MB each is:

# sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-test-2.0.0-mr1-cdh4.1.1.jar TestDFSIO -write -nrFiles 1000 -fileSize 1000 -resFile /tests/testdfsio/results.txt

TestDFSIO generates 1 map task per file, and splits are defined such that each map processes a single file. After every run, the command generates a log file indicating performance in terms of 4 metrics: throughput in MB/s, average IO rate in MB/s, IO rate standard deviation, and execution time. The most notable metrics are throughput and average IO rate, both of which are based on the file size read or written by the individual map task and the elapsed time in performing the task. The throughput and IO rate for N map tasks are defined as shown below.
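In their standard form (assumed here), TestDFSIO computes these two metrics over the N map tasks (one file per map) as:

$$\mathrm{Throughput}(N) = \frac{\sum_{i=1}^{N} \mathrm{filesize}_i}{\sum_{i=1}^{N} \mathrm{time}_i}
\qquad
\mathrm{Average\ IO\ rate}(N) = \frac{1}{N}\sum_{i=1}^{N} \frac{\mathrm{filesize}_i}{\mathrm{time}_i}$$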
If the cluster has 50 map slots and TestDFSIO creates 1000 files, the files are processed in waves of 50 concurrent map tasks, and the aggregate throughput can be estimated as:

Concurrent Throughput = Reported Throughput x Number of Map Slots

For example, if the reported per-map throughput were 80 MB/s, the concurrent throughput on 50 map slots would be roughly 4,000 MB/s. The IO rate can be calculated in a similar fashion. While measuring cluster performance using TestDFSIO may be considered sufficient, the HDFS replication factor (the value of dfs.replication) also plays an important role. A lower replication factor leads to higher throughput performance due to reduced background replication traffic.

6.1.2 Intel HiBench

HiBench is a benchmarking suite for Hadoop. It consists of a set of Hadoop programs, including both synthetic micro-benchmarks and real-world Hadoop applications. An overview of the benchmark can be obtained at GitHub (https://github.com/intel-hadoop/hibench). This review used the following HiBench 2.1 micro-benchmarks:

- Terasort: a CPU-intensive workload used to characterize the performance of, and stress-test, the MapReduce layer.
- K-means: a CPU-intensive workload used to characterize MapReduce performance on a SUT running complex Hadoop applications.

The HiBench suite is hierarchically organized, with each micro-benchmark having a similar directory structure. Each micro-benchmark has the following files with tunable parameters:

- ~/conf/configure.sh: sets the environment, data size, compression and run-time Hadoop parameters
- ~/bin/prepare.sh: data generation and run-time Hadoop parameters
- ~/bin/run.sh: benchmark execution and run-time Hadoop parameters
6.1.3 HiBench: Terasort

Terasort is part of the Apache Hadoop distribution and is available on any cluster. This review used the package that was distributed with the HiBench suite. It is distributed as a 2-part package:

- Teragen is a map/reduce data generator. Given a dataset size, it divides the desired number of rows by the desired number of tasks and assigns ranges of rows to each map. The map uses the random number generator to jump to the correct value for the first row and generates the subsequent rows. Teragen is executed by the prepare.sh script.
- Terasort is a standard sort program that samples the input data generated by teragen and uses map/reduce to sort the data into a total order. Terasort is executed via run.sh.

6.1.3.1 Run-time Scripts

These were the tunable parameters implemented in running Terasort:

~/HiBench/bin/hibench-config.sh

# switch on/off compression: 0-off, 1-on
export COMPRESS_GLOBAL=0
export COMPRESS_CODEC_GLOBAL=org.apache.hadoop.io.compress.DefaultCodec

~/HiBench/terasort/conf/configure.sh

# for prepare (total) - 1TB
DATASIZE=10000000000

# Number of Map tasks
NUM_MAPS=180

# Number of Reduce tasks
NUM_REDS=64

~/HiBench/terasort/bin/prepare.sh
# Generate the terasort data
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-2.0.0-mr1-cdh4.1.1-examples.jar teragen \
    -D mapred.map.tasks=${NUM_MAPS} \
    ${DATASIZE} ${INPUT_HDFS}

~/HiBench/terasort/bin/run.sh

# run bench
hadoop jar ${HADOOP_HOME}/hadoop-2.0.0-mr1-cdh4.1.2-examples.jar terasort \
    -D mapred.reduce.tasks=${NUM_REDS} ${INPUT_HDFS} ${OUTPUT_HDFS}

# post-running
END_TIME=`timestamp`
gen_report "TERASORT" ${START_TIME} ${END_TIME} ${SIZE} >> ${HIBENCH_REPORT}

6.1.4 Intel HiBench: K-Means

K-means is a data mining, cluster analysis algorithm that aims to partition n observations (x_1, x_2, ..., x_n) into k sets (clusters) S = {S_1, S_2, ..., S_k}, where k <= n. Each observation belongs to the cluster with the nearest mean, i.e. the one with the most similar items:

1. k centroids are selected.
2. Each item in the sample is placed in the cluster with the least distance (nearest centroid).
3. For each group of points assigned to the same center, compute a new center by taking the centroid of the points.
4. Repeat until there is convergence.

This review used the K-means benchmark from the Intel HiBench 2.1 suite.
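Formally, this iteration searches for the partition that minimizes the within-cluster sum of squares; the standard K-means objective (stated here for reference) is:

$$\underset{S}{\arg\min} \; \sum_{i=1}^{k} \sum_{x \in S_i} \left\| x - \mu_i \right\|^2$$

where $\mu_i$ is the mean (centroid) of the points in cluster $S_i$.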
6.1.4.1 Runtime Scripts

~/HiBench/bin/hibench-config.sh

###################### Global Paths ##################
export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
HADOOP_CONF_DIR=$HADOOP_HOME/conf
HADOOP_EXAMPLES_JAR=$HADOOP_HOME/hadoop-examples*.jar

if [ -z "$HIBENCH_HOME" ]; then
    export HIBENCH_HOME=/var/lib/hadoop-hdfs/hibench
fi

if [ -z "$HIBENCH_CONF" ]; then
    export HIBENCH_CONF=${HIBENCH_HOME}/conf
fi

if [ -f "${HIBENCH_CONF}/funcs.sh" ]; then
    source "${HIBENCH_CONF}/funcs.sh"
fi

if [ -z "$HIVE_HOME" ]; then
    export HIVE_HOME=/usr/lib/hive
fi

if [ -z "$MAHOUT_HOME" ]; then
    export MAHOUT_HOME=/usr/lib/mahout
fi
if [ -z "$DATATOOLS" ]; then export DATATOOLS=${HIBENCH_HOME}/common/autogen/dist/datatools.jar fi ~/HiBench/kmeans/conf/configure.sh for prepare # Number of clusters (k) NUM_OF_CLUSTERS=128 # Number of samples (n) NUM_OF_SAMPLES=100 #SAMPLES_PER_INPUTFILE=4000000 SAMPLES_PER_INPUTFILE=6000000 # Number of dimensions (d) DIMENSIONS=4 # for running MAX_ITERATION=5 ~/HiBench/kmeans/bin/prepare.sh # generate data 40 Dell Apache Hadoop Performance Analysis
OPTION="-sampleDir ${INPUT_SAMPLE} -clusterdir ${INPUT_CLUSTER} -numclusters ${NUM_OF_CLUSTERS} -numsamples ${NUM_OF_SAMPLES} -samplesperfile ${SAMPLES_PER_INPUTFILE} -sampledimension ${DIMENSIONS}" export HADOOP_CLASSPATH=`mahout classpath tail -1` hadoop jar /var/lib/hadoop-hdfs/hibench/common/autogen/dist/datatools.jar org.apache.mahout.clustering.kmeans.genkmeansdataset -libjars $MAHOUT_HOME/mahout-examples- 0.7-cdh4.1.2-job.jar ${OPTION} ~/HiBench/kmeans/bin/run.sh OPTION="-i ${INPUT_SAMPLE} -c ${INPUT_CLUSTER} -o ${OUTPUT_HDFS} -x ${MAX_ITERATION} -ow - cl -cd 0.5 -dm org.apache.mahout.common.distance.euclideandistancemeasure -xm mapreduce" START_TIME=`timestamp` #START_TIME=date +%s echo $MAHOUT_HOME # run bench mahout kmeans ${OPTION} # post-running END_TIME=`timestamp` echo $END_TIME gen_report "KMEANS" ${START_TIME} ${END_TIME} ${SIZE}>> ${HIBENCH_REPORT 41 Dell Apache Hadoop Performance Analysis
6.2 Appendix A2: Test Methodology

The hardware and network configurations were set up first. Crowbar was used to deploy the Hadoop cluster infrastructure, and Cloudera Manager was used to deploy Hadoop.

6.2.1 Hadoop Infrastructure Deployment

The Hadoop cluster infrastructure was set up and configured by Crowbar. Please refer to the Dell Cloudera Solution Deployment Guide and the Dell Cloudera Solution Crowbar Administrator User Guide for details.

1. Follow the instructions to install the Crowbar Admin node.
2. Use a browser to connect to Crowbar.
3. Power up the servers that will be part of the Hadoop cluster.
4. Allow the servers to PXE boot from the Crowbar admin node.
5. Note that the nodes get discovered in Crowbar.
6. Create, edit, save and apply the Cloudera Manager barclamp.
7. Crowbar configures the BIOS and RAID settings and installs the OS on all the nodes.
8. After completing the OS install, the nodes should transition to the Ready state in the Crowbar UI.
9. Note that all the Hadoop cluster nodes and the Cloudera Manager barclamp are in the Ready state (green LED) in the Crowbar UI.

6.2.2 Cloudera Manager Deployment

Instructions on how to deploy Cloudera Manager can be found in the Dell Cloudera Solution Crowbar Administrator User Guide. Follow the link available from the Cloudera Manager server to log in to Cloudera Manager. Provide a license in order to use the Enterprise Edition; this review used the Enterprise Edition of Cloudera Manager for its monitoring capabilities and tools. Install the Hadoop core services (HDFS, MapReduce, Hue and Oozie) and verify that these services are up and running with good health.

6.3 Appendix A3: Installing Benchmarks

The benchmarks specified in Appendix A1: Test Suites were installed as shown in this section.

6.3.1 TestDFSIO

TestDFSIO was used in this review and was available with the CDH 4.1.1 distribution of Hadoop. The program can be run from the command line.

6.3.2 Intel HiBench

HiBench 2.1 was used to provide and manage the Teragen, Terasort and K-Means packages.

1. Download the latest version of HiBench 2.x (ZIP) from GitHub: https://github.com/intel-hadoop/hibench.
2. For the HiBench 2.1 used in this review, the download is available at: https://github.com/hibench/hibench-2.1.
3. Download the zipball to the following suggested directory of the Master node server: /home/hibench.
4. Unzip and extract the zipball.
5. Rename (mv) ~/hibench/hibench-hibench-2.1-* to ~/hibench/hibench-2.1 and set ownership:
   # chown -R hdfs:hdfs /home/hibench/hibench-2.1/
6. Mahout packages are required for the implementation of K-Means. The default installation of Hadoop does not provide Mahout. The version of Mahout that is downloaded with HiBench 2.1 has compatibility issues with CDH4; the Cloudera version of Mahout has those issues settled. Download the Mahout package from the following link: http://www.cloudera.com/content/cloudera-content/clouderadocs/cdhtarballs/3.25.2013/cdh4-downloadable-tarballs/cdh4-downloadable-tarballs.html
7. Search for mahout-0.7+16.
8. Unzip and untar the package.
9. Modify the configuration and run scripts as shown in Appendix A1: Test Suites.
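For convenience, the installation steps above can be scripted roughly as follows. This is only a sketch under the assumptions listed in the comments; in particular the zipball and Mahout tarball file names are illustrative and should be checked against the actual downloads.

# Sketch: install HiBench 2.1 under /home/hibench on the Master node
mkdir -p /home/hibench && cd /home/hibench

# Assumes the HiBench 2.1 zipball was saved locally as hibench-2.1.zip
unzip hibench-2.1.zip
mv hibench-hibench-2.1-* hibench-2.1
chown -R hdfs:hdfs /home/hibench/hibench-2.1/

# Mahout for K-Means: extract the Cloudera mahout-0.7+16 tarball (file name assumed)
# and point MAHOUT_HOME at the extracted directory (hibench-config.sh expects /usr/lib/mahout)
tar -xzf mahout-0.7+16*.tar.gz -C /usr/lib/
ln -s /usr/lib/mahout-0.7+16* /usr/lib/mahout   # adjust to the actual directory name

# Finally, edit the configure.sh / prepare.sh / run.sh scripts as shown in Appendix A1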
7 Appendix B: Hadoop Configuration Parameters A complete listing of the hadoop configuration parameters for a 1TB Terasort job is shown in the following table. name value job.end.retry.interval 30000 mapred.job.tracker.retiredjobs.cache.size 1000 mapred.queue.default.acl-administer-jobs * dfs.image.transfer.bandwidthpersec 0 mapred.task.profile.reduces 0-2 mapreduce.jobtracker.staging.root.dir ${hadoop.tmp.dir}/mapred/staging mapred.job.reuse.jvm.num.tasks -1 dfs.block.access.token.lifetime 600 fs.abstractfilesystem.file.impl org.apache.hadoop.fs.local.localfs mapred.reduce.tasks.speculative.execution hadoop.ssl.keystores.factory.class FALSE org.apache.hadoop.security.ssl.filebasedkeystor esfactory mapred.job.name hadoop.http.authentication.kerberos.keyta b TeraSort ${user.home}/hadoop.keytab io.seqfile.sorter.recordlimit 1000000 s3.blocksize 67108864 dfs.namenode.num.checkpoints.retained 2 hadoop.relaxed.worker.version.check TRUE mapred.task.tracker.http.address 0.0.0.0:50060 dfs.namenode.delegation.token.renewinterval 86400000 io.map.index.interval 128 s3.client-write-packet-size 65536 dfs.namenode.http-address dd4-ae-52-89-6e-31.dell.com:50070 ha.zookeeper.session-timeout.ms 5000 mapred.system.dir ${hadoop.tmp.dir}/mapred/system hadoop.hdfs.configuration.version 1 s3.replication 3 dfs.datanode.balance.bandwidthpersec 1048576 44 Dell Apache Hadoop Performance Analysis
mapred.task.tracker.report.address 127.0.0.1:0 mapred.jobtracker.plugins org.apache.hadoop.thriftfs.thriftjobtrackerplugi n jobtracker.thrift.address dd4-ae-52-89-6e-31.dell.com:9290 mapreduce.reduce.shuffle.connect.timeou 180000 t dfs.journalnode.rpc-address 0.0.0.0:8485 hadoop.ssl.enabled FALSE mapreduce.job.counters.max 120 dfs.datanode.readahead.bytes 4193404 ipc.client.connect.max.retries.on.timeouts 45 mapred.healthchecker.interval 60000 mapreduce.job.complete.cancel.delegatio TRUE n.tokens dfs.client.failover.max.attempts 15 dfs.namenode.checkpoint.dir file://${hadoop.tmp.dir}/dfs/namesecondary dfs.namenode.replication.work.multiplier.p 2 er.iteration fs.trash.interval 0 hadoop.jetty.logs.serve.aliases TRUE mapred.skip.map.auto.incr.proc.count TRUE hadoop.http.authentication.kerberos.princi HTTP/_HOST@LOCALHOST pal terasort.num-rows 10000000000 s3native.blocksize 67108864 mapred.child.tmp./tmp mapred.tasktracker.taskmemorymanager. 5000 monitoring-interval dfs.namenode.edits.dir ${dfs.namenode.name.dir} dfs.encrypt.data.transfer FALSE dfs.datanode.http.address 0.0.0.0:50075 io.sort.spill.percent 0.98 dfs.client.use.datanode.hostname FALSE mapred.job.shuffle.input.buffer.percent 0.7 hadoop.skip.worker.version.check FALSE hadoop.security.instrumentation.requires.a FALSE 45 Dell Apache Hadoop Performance Analysis
dmin mapred.skip.map.max.skip.records 0 mapreduce.reduce.shuffle.maxfetchfailure 10 s hadoop.security.authorization FALSE user.name hdfs dfs.client.failover.connection.retries.on.tim 0 eouts hadoop.security.group.mapping.ldap.searc (objectclass=group) h.filter.group dfs.namenode.safemode.extension 30000 mapred.task.profile.maps 0-2 dfs.datanode.sync.behind.writes FALSE dfs.https.server.keystore.resource ssl-server.xml mapred.local.dir ${hadoop.tmp.dir}/mapred/local hadoop.security.group.mapping.ldap.searc cn h.attr.group.name mapred.merge.recordsbeforeprogress 10000 mapred.job.tracker.http.address 0.0.0.0:50030 dfs.namenode.replication.min 1 mapred.compress.map.output TRUE mapred.userlog.retain.hours 24 s3native.bytes-per-checksum 512 tfile.fs.output.buffer.size 262144 mapred.tasktracker.reduce.tasks.maximum 8 fs.abstractfilesystem.hdfs.impl org.apache.hadoop.fs.hdfs dfs.namenode.safemode.min.datanodes 0 mapred.disk.healthchecker.interval 60000 dfs.client.https.need-auth FALSE dfs.client.https.keystore.resource ssl-client.xml dfs.namenode.max.objects 0 mapred.cluster.map.memory.mb -1 hadoop.ssl.client.conf ssl-client.xml dfs.namenode.safemode.threshold-pct 0.999f dfs.blocksize 536870912 dfs.thrift.threads.max 20 mapreduce.job.submithost dd4-ae-52-89-6e-31.dell.com hue.kerberos.principal.shortname hue 46 Dell Apache Hadoop Performance Analysis
mapreduce.tasktracker.outofband.heartbe FALSE at io.native.lib.available TRUE dfs.client-write-packet-size 65536 mapred.jobtracker.restart.recover FALSE mapred.reduce.child.log.level INFO mapreduce.shuffle.ssl.address 0.0.0.0 dfs.namenode.name.dir file://${hadoop.tmp.dir}/dfs/name dfs.ha.log-roll.period 120 dfs.client.failover.sleep.base.millis 500 dfs.datanode.directoryscan.threads 1 dfs.permissions.enabled TRUE dfs.support.append TRUE mapred.inmem.merge.threshold 1000 ipc.client.connection.maxidletime 10000 mapreduce.shuffle.ssl.enabled ${hadoop.ssl.enabled} dfs.namenode.invalidate.work.pct.per.iterat 0.32f ion dfs.blockreport.intervalmsec 21600000 fs.s3.sleeptimeseconds 10 dfs.namenode.replication.considerload TRUE dfs.client.block.write.retries 3 hadoop.ssl.server.conf ssl-server.xml mapred.jobtracker.retirejob.interval 86400000 dfs.namenode.name.dir.restore FALSE dfs.datanode.hdfs-blocksmetadata.enabled TRUE mapred.reduce.tasks 0 ha.zookeeper.parent-znode /hadoop-ha mapred.queue.names default io.seqfile.lazydecompress TRUE dfs.https.enable FALSE mapred.fairscheduler.preemption FALSE 47 Dell Apache Hadoop Performance Analysis
mapred.hosts.exclude /var/run/cloudera-scm-agent/process/705- mapreduce- JOBTRACKER/mapred_hosts_exclude.txt dfs.replication 3 ipc.client.tcpnodelay FALSE dfs.namenode.accesstime.precision 3600000 mapred.output.format.class org.apache.hadoop.examples.terasort.teraoutpu tformat mapred.acls.enabled FALSE s3.stream-buffer-size 4096 mapred.tasktracker.dns.nameserver default mapred.submit.replication 3 io.compression.codecs org.apache.hadoop.io.compress.defaultcodec,o rg.apache.hadoop.io.compress.gzipcodec,org.a pache.hadoop.io.compress.bzip2codec,org.apa che.hadoop.io.compress.deflatecodec,org.apac he.hadoop.io.compress.snappycodec io.file.buffer.size 65536 mapred.map.tasks.speculative.execution FALSE dfs.namenode.checkpoint.txns 40000 mapred.map.child.log.level INFO kfs.replication 3 rpc.engine.org.apache.hadoop.hdfs.protoc org.apache.hadoop.ipc.protobufrpcengine olpb.clientnamenodeprotocolpb mapred.map.max.attempts 4 dfs.ha.tail-edits.period 60 kfs.stream-buffer-size 4096 mapred.job.shuffle.merge.percent 0.66 hadoop.security.authentication simple fs.s3.buffer.dir ${hadoop.tmp.dir}/s3 mapred.skip.reduce.auto.incr.proc.count mapred.job.tracker.jobhistory.lru.cache.siz e TRUE 5 48 Dell Apache Hadoop Performance Analysis
dfs.client.file-block-storagelocations.timeout 60 dfs.datanode.drop.cache.behind.writes FALSE tfile.fs.input.buffer.size 262144 dfs.block.access.token.enable FALSE dfs.journalnode.http-address 0.0.0.0:8480 mapreduce.job.acl-view-job mapred.job.queue.name default ftp.blocksize 67108864 dfs.datanode.data.dir file://${hadoop.tmp.dir}/dfs/data mapred.job.tracker.persist.jobstatus.hours 0 dfs.https.port 50470 dfs.namenode.replication.interval 3 mapred.fairscheduler.assignmultiple TRUE mapreduce.tasktracker.cache.local.numbe 10000 rdirectories dfs.namenode.https-address dd4-ae-52-89-6e-31.dell.com:50470 dfs.ha.automatic-failover.enabled FALSE ipc.client.kill.max 10 mapred.healthchecker.script.timeout 600000 mapred.tasktracker.map.tasks.maximum 24 hadoop.proxyuser.oozie.hosts * dfs.client.failover.sleep.max.millis 15000 jobclient.completion.poll.interval 5000 mapred.job.tracker.persist.jobstatus.dir /jobtracker/jobsinfo mapreduce.shuffle.ssl.port 50443 dfs.default.chunk.view.size 32768 kfs.bytes-per-checksum 512 mapred.reduce.slowstart.completed.maps 0.8 hadoop.http.filter.initializers org.apache.hadoop.http.lib.staticuserwebfilter mapred.mapper.class org.apache.hadoop.examples.terasort.teragen$ SortGenMapper dfs.datanode.failed.volumes.tolerated 0 io.sort.mb 256 49 Dell Apache Hadoop Performance Analysis
mapred.hosts /var/run/cloudera-scm-agent/process/705- mapreduce- JOBTRACKER/mapred_hosts_allow.txt hadoop.http.authentication.type simple dfs.datanode.data.dir.perm 700 ipc.server.listen.queue.size 128 file.stream-buffer-size 4096 dfs.namenode.fs-limits.max-directoryitems 0 io.mapfile.bloom.size 1048576 ftp.replication 3 dfs.datanode.dns.nameserver default mapred.child.java.opts -Xmx1073741824 dfs.replication.max 512 mapred.queue.default.state RUNNING map.sort.class org.apache.hadoop.util.quicksort dfs.stream-buffer-size 4096 hadoop.job.history.location file:////var/log/hadoop-0.20-mapreduce/history dfs.namenode.backup.address 0.0.0.0:50100 mapred.jobtracker.instrumentation org.apache.hadoop.mapred.jobtrackermetricsin st hadoop.util.hash.type murmur dfs.block.access.key.update.interval 600 dfs.datanode.use.datanode.hostname FALSE dfs.datanode.dns.interface default dfs.namenode.backup.http-address 0.0.0.0:50105 mapred.output.compression.type BLOCK dfs.thrift.timeout 60 mapred.skip.attempts.to.start.skipping 2 kfs.client-write-packet-size 65536 ha.zookeeper.acl world:anyone:rwcda 50 Dell Apache Hadoop Performance Analysis
mapreduce.job.dir hdfs://dd4-ae-52-89-6e- 31.dell.com:8020/user/hdfs/.staging/job_201304 152124_0001 io.map.index.skip 0 net.topology.node.switch.mapping.impl org.apache.hadoop.net.scriptbasedmapping mapred.cluster.max.map.memory.mb -1 fs.s3.maxretries 4 dfs.namenode.logging.level info s3native.client-write-packet-size 65536 mapred.task.tracker.task-controller org.apache.hadoop.mapred.defaulttaskcontroll er mapred.userlog.limit.kb 0 hadoop.http.staticuser.user dr.who mapred.input.format.class org.apache.hadoop.examples.terasort.teragen$ RangeInputFormat mapreduce.ifile.readahead.bytes 4194304 hadoop.http.authentication.simple.anonym TRUE ous.allowed hadoop.fuse.timer.period 5 dfs.namenode.num.extra.edits.retained 1000000 hadoop.rpc.socket.factory.class.default org.apache.hadoop.net.standardsocketfactory dfs.namenode.handler.count 10 fs.automatic.close TRUE mapreduce.job.submithostaddress 172.16.2.21 dfs.datanode.directoryscan.interval 21600 mapred.map.tasks 180 mapred.local.dir.minspacekill 0 mapred.job.map.memory.mb -1 mapred.jobtracker.completeuserjobs.maxi 100 mum mapreduce.jobtracker.split.metainfo.maxsi 10000000 ze 51 Dell Apache Hadoop Performance Analysis
mapred.cluster.max.reduce.memory.mb -1 mapred.cluster.reduce.memory.mb -1 s3native.replication 3 mapred.task.profile mapred.reduce.parallel.copies 10 dfs.heartbeat.interval 3 FALSE dfs.ha.fencing.ssh.connect-timeout 30000 local.cache.size 10737418240 net.topology.script.file.name dfs.client.file-block-storagelocations.num-threads 10 jobclient.progress.monitor.poll.interval 1000 dfs.bytes-per-checksum 512 ftp.stream-buffer-size 4096 mapred.fairscheduler.allow.undeclared.po TRUE ols hadoop.security.group.mapping.ldap.searc member h.attr.member dfs.blockreport.initialdelay 0 mapred.min.split.size 0 hadoop.http.authentication.token.validity 36000 dfs.namenode.delegation.token.maxlifetime 604800000 mapred.output.compression.codec org.apache.hadoop.io.compress.defaultcodec /var/run/cloudera-scm-agent/process/705- mapreduce-jobtracker/topology.py io.sort.factor 64 kfs.blocksize 67108864 mapred.task.timeout 600000 mapred.fairscheduler.poolnameproperty user.name dfs.namenode.secondary.http-address 0.0.0.0:50090 ipc.client.idlethreshold 4000 ipc.server.tcpnodelay FALSE ftp.bytes-per-checksum 512 mapred.output.dir hdfs://dd4-ae-52-89-6e- 31.dell.com:8020/HiBench/Terasort/Input 52 Dell Apache Hadoop Performance Analysis
group.name hdfs s3.bytes-per-checksum 512 mapred.heartbeats.in.second 100 fs.s3.block.size 67108864 dfs.client.failover.connection.retries 0 mapred.map.output.compression.codec org.apache.hadoop.io.compress.snappycodec hadoop.rpc.protection mapred.task.cache.levels 2 mapred.tasktracker.dns.interface hadoop.security.auth_to_local dfs.secondary.namenode.kerberos.internal. spnego.principal authentication default DEFAULT ${dfs.web.authentication.kerberos.principal} hadoop.proxyuser.hue.hosts * ftp.client-write-packet-size 65536 mapred.output.key.class org.apache.hadoop.io.text fs.defaultfs hdfs://dd4-ae-52-89-6e-31.dell.com:8020 file.client-write-packet-size 65536 mapred.job.reduce.memory.mb -1 mapred.max.tracker.failures 4 fs.trash.checkpoint.interval 0 mapred.fairscheduler.allocation.file fair-scheduler.xml hadoop.http.authentication.signature.secre t.file ${user.home}/hadoop-http-auth-signaturesecret s3native.stream-buffer-size 4096 mapreduce.reduce.shuffle.read.timeout 180000 mapred.tasktracker.tasks.sleeptimebefore-sigkill 5000 dfs.namenode.checkpoint.edits.dir ${dfs.namenode.checkpoint.dir} fs.permissions.umask-mode 22 mapred.max.tracker.blacklists 4 hadoop.common.configuration.version 0.23.0 jobclient.output.filter FAILED hadoop.security.group.mapping.ldap.ssl FALSE 53 Dell Apache Hadoop Performance Analysis
mapreduce.ifile.readahead io.serializations TRUE org.apache.hadoop.io.serializer.writableserializat ion,org.apache.hadoop.io.serializer.avro.avrospe cificserialization,org.apache.hadoop.io.serializer. avro.avroreflectserialization fs.df.interval 60000 io.seqfile.compress.blocksize 1000000 mapred.jobtracker.taskscheduler org.apache.hadoop.mapred.jobqueuetasksche duler job.end.retry.attempts 0 ipc.client.connect.max.retries 10 hadoop.security.groups.cache.secs 300 dfs.namenode.delegation.key.updateinterval 86400000 webinterface.private.actions FALSE mapred.tasktracker.indexcache.mb 10 hadoop.security.group.mapping.ldap.searc (&(objectclass=user)(samaccountname={0})) h.filter.user mapreduce.reduce.input.limit -1 dfs.image.compress FALSE mapred.output.value.class org.apache.hadoop.io.text tasktracker.http.threads 40 dfs.namenode.kerberos.internal.spnego.pri ${dfs.web.authentication.kerberos.principal} ncipal fs.s3n.block.size 67108864 mapred.job.tracker.handler.count 10 fs.ftp.host 0.0.0.0 keep.failed.task.files FALSE mapred.output.compress FALSE hadoop.security.group.mapping org.apache.hadoop.security.shellbasedunixgrou psmapping mapred.jobtracker.job.history.block.size 3145728 mapred.skip.reduce.max.skip.groups 0 dfs.datanode.address 0.0.0.0:50010 54 Dell Apache Hadoop Performance Analysis
dfs.datanode.https.address 0.0.0.0:50475 file.replication 1 dfs.datanode.drop.cache.behind.reads FALSE hadoop.fuse.connection.timeout 300 mapred.jar /user/hdfs/.staging/job_201304152124_0001/job.jar hadoop.work.around.non.threadsafe.getp wuid mapreduce.client.genericoptionsparser.us ed hadoop.tmp.dir FALSE TRUE /tmp/hadoop-${user.name} dfs.client.block.write.replace-datanodeon-failure.policy DEFAULT mapred.line.input.format.linespermap 1 hadoop.kerberos.kinit.command kinit dfs.webhdfs.enabled FALSE dfs.datanode.du.reserved 0 file.bytes-per-checksum 512 dfs.thrift.socket.timeout 60000 mapred.local.dir.minspacestart 0 mapred.jobtracker.maxtasks.per.job -1 dfs.client.block.write.replace-datanodeon-failure.enable TRUE dfs.thrift.threads.min 10 mapred.user.jobconf.limit 5242880 mapred.reduce.max.attempts 4 net.topology.script.number.args 100 dfs.namenode.decommission.interval 30 mapred.job.tracker dd4-ae-52-89-6e-31.dell.com:8021 dfs.image.compression.codec org.apache.hadoop.io.compress.defaultcodec dfs.namenode.support.allow.format hadoop.ssl.hostname.verifier mapred.tasktracker.instrumentation TRUE DEFAULT org.apache.hadoop.mapred.tasktrackermetricsi nst io.mapfile.bloom.error.rate 0.005 55 Dell Apache Hadoop Performance Analysis
dfs.permissions.superusergroup supergroup mapred.tasktracker.expiry.interval 600000 hadoop.proxyuser.hue.groups * io.sort.record.percent 0.1379 mapred.job.tracker.persist.jobstatus.active FALSE dfs.namenode.checkpoint.check.period 60 io.seqfile.local.dir ${hadoop.tmp.dir}/io/local tfile.io.chunk.size 1048576 file.blocksize 67108864 hadoop.proxyuser.oozie.groups * mapreduce.job.acl-modify-job io.skip.checksum.errors FALSE dfs.namenode.edits.journal-plugin.qjournal org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager mapred.temp.dir ${hadoop.tmp.dir}/mapred/temp dfs.datanode.handler.count 10 dfs.namenode.decommission.nodes.per.interval 5 fs.ftp.host.port 21 dfs.namenode.checkpoint.period 3600 dfs.namenode.fs-limits.max-component-length 0 fs.abstractfilesystem.viewfs.impl org.apache.hadoop.fs.viewfs.ViewFs dfs.datanode.ipc.address 0.0.0.0:50020 mapred.working.dir hdfs://dd4-ae-52-89-6e-31.dell.com:8020/user/hdfs hadoop.ssl.require.client.cert FALSE dfs.datanode.max.transfer.threads 4096 mapred.job.reduce.input.buffer.percent 0