Dell Apache Hadoop Performance Analysis
Dell Apache Hadoop Performance Analysis
Dell PowerEdge R720/R720XD Benchmarking Report

Nicholas Wakou, Hadoop/Big Data Benchmarking Engineer
Dell Revolutionary Cloud and Big Data Engineering

November 2013
Revisions

Date: November 2013. Description: Initial release.

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2013 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell. PRODUCT WARRANTIES APPLICABLE TO THE DELL PRODUCTS DESCRIBED IN THIS DOCUMENT MAY BE FOUND AT:

Performance of network reference architectures discussed in this document may vary with differing deployment conditions, network loads, and the like. Third-party products may be included in reference architectures for the convenience of the reader. Inclusion of such third-party products does not necessarily constitute Dell's recommendation of those products. Please consult your Dell representative for additional information.

Trademarks used in this text: Dell, the Dell logo, Dell Boomi, Dell Precision, OptiPlex, Latitude, PowerEdge, PowerVault, PowerConnect, OpenManage, EqualLogic, Compellent, KACE, FlexAddress, Force10 and Vostro are trademarks of Dell Inc. Other Dell trademarks may be used in this document. Cisco Nexus, Cisco MDS, Cisco NX-OS and other Cisco Catalyst marks are registered trademarks of Cisco Systems Inc. EMC VNX and EMC Unisphere are registered trademarks of EMC Corporation. Intel, Pentium, Xeon, Core and Celeron are registered trademarks of Intel Corporation in the U.S. and other countries. AMD is a registered trademark and AMD Opteron, AMD Phenom and AMD Sempron are trademarks of Advanced Micro Devices, Inc. Microsoft, Windows, Windows Server, Internet Explorer, MS-DOS, Windows Vista and Active Directory are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Red Hat and Red Hat Enterprise Linux are registered trademarks of Red Hat, Inc. in the United States and/or other countries. Novell and SUSE are registered trademarks of Novell Inc. in the United States and other countries. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Citrix, Xen, XenServer and XenMotion are either registered trademarks or trademarks of Citrix Systems, Inc. in the United States and/or other countries. VMware, Virtual SMP, vMotion, vCenter and vSphere are registered trademarks or trademarks of VMware, Inc. in the United States or other countries. IBM is a registered trademark of International Business Machines Corporation. Broadcom and NetXtreme are registered trademarks of Broadcom Corporation. QLogic is a registered trademark of QLogic Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and/or names or their products and are the property of their respective owners. Dell disclaims proprietary interest in the marks and names of others.
Table of Contents

Revisions
Executive Summary
1 Introduction
1.1 Strategic Goals
1.2 Benchmark Testing
1.3 Functionality
1.4 Performance Characterization Tests
1.4.1 Characterization Workloads
1.5 Stress Tests
1.5.1 MapReduce Stress Testing
1.5.2 HDFS Stress Testing
1.6 Bottleneck Investigation
1.7 Performance Tuning
2 System under Test (SUT)
2.1 Hardware Configuration
2.2 Software Stack
2.3 Network Configuration
3 Run Time Environment
3.1 TestDFSIO
3.2 Terasort
3.3 K-Means
3.4 Flush Memory
4 Performance Results
4.1 MapReduce Performance
4.1.1 Performance Characterization
4.1.2 MapReduce Job: CPU Profile
4.1.3 MapReduce Networking
4.2 Bottleneck Investigation
4.2.1 Performance Tuning
4.2.2 MapReduce Resource Utilization
4.2.3 Analysis of a Teragen/Terasort Job
4.3 MapReduce Performance under a Complex Application: K-Means
4.3.1 K-Means Job Time
4.3.2 K-Means CPU Utilization
4.3.3 Throughput
4.3.4 MapReduce Resource Utilization under a Complex Application
4.3.5 Analysis of a K-Means Job
4.4 HDFS Performance
4.5 HDFS Write Performance
4.5.1 HDFS Read Performance
4.5.2 IO Bottleneck Investigation
4.5.3 HDFS Distributed Processing
4.5.4 Analysis of a TestDFSIO Job
5 Conclusion
6 Appendix A: Test Environment
6.1 Appendix A1: Test Suites (Apache Hadoop TestDFSIO, Intel HiBench Terasort, Intel HiBench K-Means)
6.2 Appendix A2: Test Methodology
6.3 Appendix A3: Installing Benchmarks
7 Appendix B: Hadoop Configuration Parameters
Executive Summary

The purpose of this document is to help you gain better insight when deploying and tuning Hadoop clusters by understanding the performance of the Reference Architecture (RA) using data points and workloads that are typical of a big data environment. This white paper discusses the HiBench 2.2 benchmarking tests that were conducted on the R720/R720XD Reference Architecture of the Dell Cloudera Apache Hadoop Solution, focusing on the performance of MapReduce and HDFS. The reference architecture discussed in this document improves performance by tuning OS parameters, the HDFS block size, and Hadoop configuration settings within CDH.
1 Introduction

This report is based on benchmarking tests that were carried out on the R720/R720xd Reference Architecture (RA) of the Dell Cloudera Apache Hadoop Solution. This performance review focused on the performance of the Hadoop core components, MapReduce and HDFS.

Note: The performance of ecosystem components (Hive, Impala, HBase, etc.) is not part of this review.

1.1 Strategic Goals

- Gain an in-depth understanding of the performance of the RA using data points and workloads that are typical of a big data environment.
- Obtain baseline performance data for the hardware platform.
- Assess the performance impact of some hardware components on the RA.
- Tune and optimize the performance of the cluster.
- Identify and clear bottlenecks.

1.2 Benchmark Testing

The benchmarking plan proposes three categories of benchmark tests, listed below. This benchmark review focuses on the engineering analysis tests that were used to obtain an in-depth understanding of the RA.

- Engineering Analysis
  o Functionality QA
  o Characterization
  o Stress testing
  o Bottleneck Investigation
  o Performance Tuning
- Business Recovery
  o Not performed in this iteration
- Comparative analysis for marketing purposes
  o Not performed in this iteration

1.3 Functionality

These benchmarking tests were performed after the QA tests in the release cycle. The hardware platforms and software stacks of the System under Test (SUT) were stable and ready for shipping.
1.4 Performance Characterization Tests

These are benchmark tests that were used to characterize the performance of the RA. Performance data from these tests is typically used for architectural designs and modifications, capacity analysis, and identification of bottlenecks. To get a good understanding of the cluster, these tests were modular, and the following hardware components were characterized (stressed):

- IO
- Network
- CPU

The goal of these tests was to:

- Record and analyze MapReduce and HDFS performance of the RA under varying loads.
- Analyze RA behavior under the Map, Shuffle, Reduce and Replication phases of MapReduce jobs.

1.4.1 Characterization Workloads

Standard, open-source workloads were used:

- Teragen/Terasort
  o Data generator
  o Primary metric: Latency (s)
  o Secondary metrics: CPU utilization (%), Network utilization (%), Network throughput (MB/s)
- TestDFSIO
  o Read/Write IO characterization tool
  o Primary metrics: Throughput (MB/s), Latency (s)
- K-Means
  o Machine-learning tool
  o Cluster analysis: partitions (n) samples into (k) clusters. The dimension (d) of each sample can be varied to obtain desired levels of complexity.
  o Primary metric: Wall clock time

1.5 Stress Tests

These are basically characterization tests performed under peak load conditions to analyze the behavior of the RA at full load and to identify possible bottlenecks.
1.5.1 MapReduce Stress Testing

The goal was to obtain 100% CPU utilization on the slave nodes of the cluster. A MapReduce job was submitted using Teragen/Terasort. The load (dataset size in MB) was increased until 100% CPU utilization was observed on the slave nodes. Performance at 100% CPU was sustained and monitored for long job durations.

1.5.2 HDFS Stress Testing

The goal was to obtain peak IO throughput (MB/s) while targeting the SAS limit of the disk controller on the slave nodes and/or 100% of the network utilization of the cluster. TestDFSIO was used to vary the file size of a dataset that was read from or written to the cluster until peak throughput was attained and sustained.

1.6 Bottleneck Investigation

Identifying bottlenecks was a prime goal of this review. Ideally, the SUT must run at full capacity (optimal utilization of available resources). When the SUT does not perform as expected, it is essential to identify the source of the problem (bottlenecks). Bottlenecks can be caused by hardware limitations, by inefficient software configurations, or both.

MapReduce jobs used in this review were CPU-intensive. The inability of a MapReduce job to fully maximize the CPU resources (attain 100% CPU utilization) on the slave nodes at full load is an indication of a bottleneck on the SUT. HDFS jobs used in this review were IO-intensive. Attaining the SAS throughput limit of the disk controller is a good indication of a bottleneck-free SUT.

1.7 Performance Tuning

It is imperative that bottlenecks are eliminated, or their impact mitigated, in order to fully utilize the resources of the SUT. In this review, software parameters (OS and Hadoop) were tuned in order to attain the desired CPU and IO profiles.
2 System under Test (SUT)

This benchmark review was undertaken on the Dell PowerEdge R720/R720xd hardware platform.

Figure 1: System Under Test
2.1 Hardware Configuration

Table 1: Hardware Configuration

Infrastructure nodes (Active and Secondary Name Node; Admin Node; HA Node; Edge Node):
- Platform: PowerEdge R720
- CPU: 2 x Xeon E5 (6-core)
- RAM (minimum): 96 GB
- LOM: 4 x 1GbE
- Disk: 6 x 600-GB 10K SAS 3.5-inch
- Storage controller: PERC H710
- RAID: RAID 10

Data nodes:
- Platform: PowerEdge R720xd
- CPU: 2 x Xeon E5 (6-core)
- RAM (minimum): 36 GB
- LOM: 4 x 1GbE
- Disk: 24 x 1-TB SATA 7.2K 2.5-inch
- Storage controller: PERC H710
- RAID: Single Drive RAID 0

2.2 Software Stack

Table 2: Software Stack

- Operating System: Red Hat Enterprise Linux 6.2
- Hadoop: Cloudera Distribution of Hadoop (CDH)
- Cluster management: Cloudera Manager
- Java: Sun Oracle Java version 6
2.3 Network Configuration

Figure 2: Network Configuration

Table 3: Server-Side Cabling (NICs to switch ports: LOM1, LOM2, LOM3, LOM4 and BMC for the Admin node, Name Nodes, Data Nodes and Edge Node; legend: Production LAN, Management LAN, Public LAN)

In order to segregate network traffic and enable dedicated network links, the Dell Cloudera solution configures three distinct VLANs:

- Production (802.1q tagged): Used by the Hadoop system to handle traffic between all nodes for HDFS operations, MapReduce jobs, and other Hadoop traffic. Network links are bonded in a team of 2 or more links.
- Management (VLAN 300, not tagged): Used for connecting to the BMC of each node. Additionally used for administrative functions such as Crowbar node installation, backups and other monitoring.
- Public (802.1q tagged): Used for connections to devices external to the Hadoop cluster. Bonded in a team of 2 links.
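For reference, a minimal RHEL 6 bonding setup for the production LAN might look like the sketch below. This is an illustration only: the interface names (bond0, em1), the bonding mode and the file contents are assumptions, not values taken from the RA deployment.

  # /etc/sysconfig/network-scripts/ifcfg-bond0  (hypothetical example)
  DEVICE=bond0
  BOOTPROTO=none
  ONBOOT=yes
  BONDING_OPTS="mode=balance-alb miimon=100"   # bonding mode is an assumption

  # /etc/sysconfig/network-scripts/ifcfg-em1  (one of the two bonded LOMs)
  DEVICE=em1
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none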
3 Run Time Environment

Performance tests were executed on the Master node from the command line or from scripts.

3.1 TestDFSIO

Executed from the command line; see Appendix A1 (Apache Hadoop TestDFSIO).

Before every performance run, remove previous test data by running the command:

  # sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-test-*-mr1-cdh4.1.1.jar TestDFSIO -clean

Dataset size was used to characterize and stress the SUT. The following command-line options were used to vary the dataset size from 100GB to 5000GB:

- -nrFiles (number of files)
- -fileSize (size of each file)

Performance metrics provided at program completion:
- Throughput (MB/s)
- IO rate (MB/s)
- Execution time (s)
- Standard deviation

Secondary metrics:
- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia

3.2 Terasort

Executed from HiBench scripts as shown in Appendix A1 (HiBench: Terasort). Follow the instructions in section 3.4 (Flush Memory) to flush the cache before any performance run. Performance characterization and stress testing was done by varying the dataset size from 10GB to 10,000GB.

Modify ~/HiBench-2.1/terasort/conf/configure.sh to set the dataset size:
- # for prepare (total) - 1TB
- DATASIZE=<rows>  (teragen generates 100-byte rows, so 1TB corresponds to 10^10 rows)

Data generation (teragen): run ~/HiBench-2.1/terasort/bin/prepare.sh
Program execution (terasort): run ~/HiBench-2.1/terasort/bin/run.sh

Primary performance metric:
- Latency: job duration shown by the Cloudera Manager Activity Monitor

Secondary metrics:
- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia
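As an illustration of how the TestDFSIO characterization runs can be batched, a minimal sketch follows. It assumes the doall.sh helper from section 3.4 and the CDH jar location quoted above (located by a glob rather than a pinned version); the size list and result-file paths are hypothetical, not the exact harness used in this review.

  #!/bin/bash
  # Hypothetical TestDFSIO write-characterization sweep: nrFiles stays at 1000
  # while the per-file size grows, so the total dataset runs 100GB to 5TB.
  JAR=$(ls /usr/lib/hadoop-0.20-mapreduce/hadoop-test-*.jar)
  for FILESIZE in 100 500 1000 5000; do                       # MB per file
      sudo -u hdfs hadoop jar $JAR TestDFSIO -clean           # remove previous test data
      ./doall.sh "sync; echo 3 > /proc/sys/vm/drop_caches"    # flush caches (section 3.4)
      sudo -u hdfs hadoop jar $JAR TestDFSIO -write \
          -nrFiles 1000 -fileSize $FILESIZE \
          -resFile /tests/testdfsio/write_${FILESIZE}MB.txt
  done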
3.3 K-Means

Executed from HiBench scripts as shown in Appendix A1 (Intel HiBench: K-Means). Follow the instructions in section 3.4 (Flush Memory) to flush the cache before any performance run. Performance characterization and stress testing using a complex application was done by varying the dataset size from 0.3GB to 2,500GB.

Modify ~/HiBench-2.1/kmeans/conf/configure.sh as shown in Appendix A1 to define the dataset:
- Number of samples (n): 10^3, 10^4, ..., 10^10
- Dimension of each sample (d): 2, 4, 8, 16, 32, 64
- Number of clusters (k): 2, 4, 8, 16, 32, 64, 128
- Samples per input file: (number of samples / 5)

Data generation: run ~/HiBench-2.1/kmeans/bin/prepare.sh
Program execution: run ~/HiBench-2.1/kmeans/bin/run.sh

Primary performance metrics (obtained from hibench.report):
- Throughput (MB/s)
- Latency

Secondary metrics:
- CPU utilization: Ganglia, Cloudera Manager host statistics
- Network utilization: Ganglia

3.4 Flush Memory

Before any performance run, it is necessary to flush the cache memory on all slave nodes. A helper script runs the supplied command on every slave node (substitute the cluster's slave IP range for the placeholder):

1. doall.sh

  #!/bin/bash
  # Run the supplied command on every slave node in turn.
  for i in <slave-ip-range>; do
      echo -ne "$i: "
      ssh $i "$@"
  done

2. From the command line:

  # ./doall.sh free -m
  # ./doall.sh sync
  # ./doall.sh "echo 3 > /proc/sys/vm/drop_caches"
  # ./doall.sh free -m
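To give a concrete picture of how the K-Means sweep can be driven, a hypothetical script is sketched below. The sed edits and value lists are assumptions for illustration; the variable names (NUM_OF_CLUSTERS, DIMENSIONS) come from the configure.sh listing in Appendix A1.

  #!/bin/bash
  # Hypothetical K-Means characterization sweep over dimensions (d) and clusters (k).
  CONF=~/HiBench-2.1/kmeans/conf/configure.sh
  for D in 2 4 8 16 32 64; do
    for K in 2 4 8 16 32 64 128; do
      sed -i "s/^DIMENSIONS=.*/DIMENSIONS=$D/"            $CONF
      sed -i "s/^NUM_OF_CLUSTERS=.*/NUM_OF_CLUSTERS=$K/"  $CONF
      ./doall.sh "sync; echo 3 > /proc/sys/vm/drop_caches"   # flush caches first (3.4)
      ~/HiBench-2.1/kmeans/bin/prepare.sh                    # generate the dataset
      ~/HiBench-2.1/kmeans/bin/run.sh                        # run the benchmark
    done
  done
  # Results (throughput, latency) accumulate in the hibench.report file.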
4 Performance Results

4.1 MapReduce Performance

The performance of the MapReduce layer was analyzed by varying the dataset size. At each instance, the CPU characteristics, network utilization and performance metrics were obtained. Performance results were analyzed for evidence of bottlenecks. The SUT was tuned to improve performance.

4.1.1 Performance Characterization

Terasort was used to characterize MapReduce performance. The instructions in section 3.2 were followed to vary the size of the dataset from 10GB to 10,000GB. At each instance, performance data was collected and recorded.

Figure 3: Performance Characterization Chart (latency in seconds and CPU utilization in % vs. data size in GB)
4.1.2 MapReduce Job: CPU Profile

The CPU profile of the Map and Reduce phases of a 1TB sort job was captured.

Figure 4: MapReduce Job CPU Profile

4.1.3 MapReduce Networking

The nodes in a Hadoop cluster are interconnected through the network. Typically, one or more of the following phases of a MapReduce job transfers data over the network:

1. Writing data: This phase occurs when the initial data is either streamed or bulk-delivered to HDFS. Data blocks of the loaded files are replicated, transferring additional data over the network.
2. Workload execution: The MapReduce algorithm is run.
   a. Map phase: In the map phase of the algorithm, almost no traffic is sent over the network. The network is used at the beginning of the map phase only if an HDFS locality miss occurs (the data block is not locally available and has to be requested from another data node).
   b. Shuffle phase: This is the phase of workload execution in which traffic is sent over the network, the degree to which depends on the workload. Data is transferred over the network when the output of the mappers is shuffled to the reducers.
   c. Reduce phase: In this phase, almost no traffic is sent over the network because the reducers have all the data they need from the shuffle phase.
   d. Output replication: MapReduce output is stored as a file in HDFS. The network is used when the blocks of the result file have to be replicated by HDFS for redundancy.
3. Reading data: This phase occurs when the final data is read from HDFS for consumption by the end application, such as a website, indexing, or a SQL database.
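To correlate the phases above with actual wire traffic per node (beyond the aggregate Ganglia view in Figure 5 below), interface counters can be sampled while a job runs. A rough sketch, assuming the sysstat package is installed, the doall.sh helper from section 3.4, and a bond0 production interface:

  # Sample bonded-interface throughput on every slave node every 5s for ~10 minutes
  # while a sort job runs, then line the peaks up against the job's phase timeline.
  ./doall.sh "sar -n DEV 5 120 | grep bond0" > network_by_phase.log &
  hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-*examples*.jar terasort \
      /HiBench/Terasort/Input /HiBench/Terasort/Output
  wait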
Figure 5: Network Utilization by MapReduce Phases

4.2 Bottleneck Investigation

MapReduce jobs are CPU-intensive. In this review it was possible to stress the slave nodes to attain 100% CPU utilization, indicating the absence of MapReduce performance bottlenecks (particularly IO and network bottlenecks). Further analysis of the CPU profile showed very high CPU system time (> 30%). Typically, CPU system time should be < 15% and user time should be > 80%. Using the techniques described in section 4.2.1, the CPU system time was reduced to < 15%. There was evidence of memory swapping with dataset sizes > 3TB, indicating that memory is an issue at those sizes.
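A quick way to reproduce the user/system split discussed above (illustrated in Figure 6 below) is to sample the CPU and swap counters on all slave nodes mid-run, again using the doall.sh helper from section 3.4:

  # One-minute CPU profile per slave node: high %sys (>30%) with modest %usr
  # points at OS-level overhead rather than at the MapReduce workload itself.
  ./doall.sh "mpstat 60 1 | tail -1"
  # Swap check for large (>3TB) datasets: non-zero si/so columns indicate swapping.
  ./doall.sh "vmstat 5 3 | tail -2"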
Figure 6: Tuning CPU System Time

4.2.1 Performance Tuning

Typically, performance tuning is performed to fix bottlenecks or to mitigate their impact. For the dataset sizes considered in this review (10GB-10,000GB), CPU utilization was identified as the only bottleneck for MapReduce performance, indicating that the SUT was already optimized for the best CPU and memory performance. Further analysis of the CPU profile (see Figure 6) indicated that CPU system time was very high and had to be reduced to further improve performance.

Based on best practices adopted from Cloudera performance engineers, the impact of tuning several hardware and software parameters was investigated. In this review, the block size and the Hadoop configuration parameters provided the most significant boost to performance. These parameters were tuned as shown below, and the performance characterization tests were repeated with the dataset size of the MapReduce jobs varied from 100GB to 1000GB. See the results in Figure 7.

1. OS parameters: Use the doall.sh script shown in section 3.4 (Flush Memory) to apply these to all the slave nodes.
   a. Turn down swappiness:
      # ./doall.sh "echo 0 > /proc/sys/vm/swappiness"
   b. Turn off huge page defrag:
      # ./doall.sh "echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag"
2. Block size: The block size was increased from 128MB to 512MB using the configuration parameter dfs.blocksize=536870912 (512MB expressed in bytes).

3. Hadoop configuration parameters. The following parameters were tuned (cells left blank where the tuned value was not recorded):

Table 4: Tuning Hadoop Configuration Parameters

Parameter | Tuned Value | Default
mapred.map.tasks | 180 | 2
mapred.reduce.tasks | 64 | 1
mapred.tasktracker.map.tasks.maximum | 24 | 2
mapred.tasktracker.reduce.tasks.maximum | 8 | 2
MapReduce Child Java Maximum Heap Size | | NULL
Datanode Java Heap Size | | NULL
Task Tracker Java Heap Size | | NULL
Namenode Java Heap Size | |
Secondary Namenode Java Heap Size | |
io.sort.mb | |
io.sort.record.percent | |
io.sort.spill.percent | 0.98 |

The overall performance improvement due to tuning the CPU system time, the block size and the Hadoop configuration parameters was found to be 50.84%, based on the reduction in job duration times.

Table 5: Performance Boost by Tuning

Parameter | Performance boost
OS parameters | 38.10%
Block size | 12.20%
Hadoop configuration parameters | 0.54%

Figure 7: Comparing the Performance of a Tuned to a Non-tuned SUT (job duration in seconds vs. data size in GB)
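For experimentation, the same parameters can be overridden per job before being committed to the cluster-wide configuration. The sketch below is an assumed invocation (the jar is located by glob and the HDFS paths follow Appendix A1/B); 536870912 is 512MB in bytes, as above.

  # Hypothetical tuned terasort run: override block size and reduce-task count
  # with -D flags instead of editing the cluster configuration.
  hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-*examples*.jar terasort \
      -D dfs.blocksize=536870912 \
      -D mapred.reduce.tasks=64 \
      /HiBench/Terasort/Input /HiBench/Terasort/Output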
4.2.2 MapReduce Resource Utilization

Resource utilization by MapReduce jobs was observed for one day with Teragen/Terasort jobs running on the cluster. It was observed that the datanode servers are the workhorses of the Hadoop cluster; they fully utilized their memory, network and CPU resources. There was evidence of memory swapping on datanode servers when large datasets (> 3TB) were sorted using Terasort. The namenode and other infrastructure servers used these resources very lightly. This could have an impact on how these servers are scoped for small clusters.

Figure 8: Memory Used by a Datanode Server
Figure 9: Memory Used by a Namenode Server

Figure 10: CPU Utilization by a Datanode Server
Figure 11: CPU Utilization by a Namenode Server

4.2.3 Analysis of a Teragen/Terasort Job

The result of a single Teragen/Terasort job was captured as a data point for comparison with future benchmark reviews.

Table 6: Analysis of a Teragen/Terasort Job

Parameter | Teragen | Terasort
Number of rows | 10^10 (100-byte rows) | same dataset
Dataset size | 1TB | 1TB
Job duration | 1032s (17 mins 12s) | 725s (12 mins 5s)
Network utilization | 83.4% | 47.9%
CPU utilization, Map phase | 90% | 97%
CPU utilization, Shuffle/Reduce phase | 31% | 45%
4.3 MapReduce Performance under a Complex Application: K-Means

K-Means provides an application that can be configured to match the complexity of real-world use cases. Refer to Appendix A1 (Intel HiBench: K-Means) and section 3.3 (K-Means) for details on how K-Means was installed and run. It was observed that the performance characteristics were largely impacted by the sample size (n). Performance characterization tests were performed for small samples with n <= 10^3, and then repeated for large samples with n >= 10^10. In all cases and for each job, the sample size (n), the dimension (d) and the number of clusters (k) were varied. For each sample size (small or large) and number of dimensions, the cluster count was varied.

4.3.1 K-Means Job Time

It was observed that, for a small sample, run time grew exponentially in the product of the number of clusters and samples (where d = dimension, k = clusters, n = samples).

Figure 12: Small Sample Duration (run time in seconds vs. number of clusters, for d = 2, 4, 8, 16, 32, 64; number of samples = 10^3)

For large samples, job completion time is almost linearly proportional to the number of clusters, k.
Figure 13: Large Sample (10^10) Duration and CPU (run time in seconds and CPU utilization in % vs. number of clusters)

4.3.2 K-Means CPU Utilization

For small samples, 80% CPU was attained when d > 16 and k > 32.

Figure 14: K-Means Small Sample CPU Utilization (CPU utilization in % vs. number of clusters, for d = 2, 4, 8, 16, 32, 64)

For large samples, 80% CPU is attained even with k = 1. Refer to Figure 15 (CPU Profile, Large Sample Jobs).
Figure 15: CPU Profile, Large Sample Jobs (large samples, 10^10: CPU utilization ~95%, user time ~90%, system time ~5%)

4.3.3 Throughput

MapReduce throughput under K-Means, a complex application, was obtained and analyzed. K-Means throughput was provided by the HiBench report and was an indication of the rate at which data was analyzed. This was based on the amount of data and the time taken to compute the centroids (for each iteration) before convergence. For low-complexity (d < 8) small samples, throughput increases linearly with the number of clusters, k. For high-complexity (d > 16) small samples, throughput starts to drop for large numbers of clusters.

Figure 16: Small Sample Throughput (throughput in MB/s vs. number of clusters, for d = 2, 4, 8, 16, 32, 64)

For large samples, throughput drops linearly with the number of clusters.
Figure 17: Large Sample Throughput (throughput in MB/s vs. number of clusters)
4.3.4 MapReduce Resource Utilization under a Complex Application

Figure 18: Resource Utilization under K-Means
The figure above shows how the CPU and IO resources of the SUT were utilized with a large-sample K-Means job (d=2, k=2) running for about 1 hour. The charts show that CPU utilization was high (98%) throughout the 5 iterations. There was significant read I/O activity during the run, but write I/O activity kicked in after the 5 iterations had completed. Network traffic was noticeable after the iterations. Memory utilization was high (~80%) throughout the run.

4.3.5 Analysis of a K-Means Job

The result of a single K-Means job was captured as a data point for comparison with future benchmark reviews.

Table 7: Analysis of a K-Means Job

Parameter | Value
Sample size (n) | 10^10
Dimensions (d) | 32
Clusters (k) | 8
Samples per input file | 2 x 10^9 (n/5, per section 3.3)
Maximum number of iterations | 5
Input size | 2.4 TB
Total time | ~13,680 seconds (~3 hrs 48 mins)
Throughput | ~175 MB/s (derived: 2.4 TB over the total run time)
CPU | 100%
Network utilization | 84%

4.4 HDFS Performance

The TestDFSIO benchmark was used to analyze the performance of the HDFS layer. For instructions on how to run this benchmark, refer to Appendix A1 (Apache Hadoop TestDFSIO) and section 3.1 (TestDFSIO).

4.5 HDFS Write Performance

For this SUT, job execution times rise linearly with the size of the dataset up to 1000GB. For dataset sizes of 1000GB and bigger, job execution times rise much faster than linearly. This is mainly due to replication and the limitations of the network bandwidth. The default replication factor of 3 was maintained for all HDFS tests. As dataset sizes increase, the multiplier effect of the replication factor comes into play, requiring more data
to be transferred across the network. As the network becomes the bottleneck, data transfer is constrained, leading to increased job execution times.

Figure 19: HDFS Write Performance Chart (IO throughput in MB/s and job duration time in s vs. data size in GB; nrFiles = 1000)

4.5.1 HDFS Read Performance

TestDFSIO read jobs are processed locally within each data node, with no significant transfer of data across the network. Network traffic is therefore not as significant as that expected in write jobs. In addition to the underlying I/O hardware architecture (disks, controllers), the number of files to process and the number of available Hadoop map and reduce slots have a significant impact on read performance.

Figure 20: HDFS Read Performance Chart (IO throughput in MB/s and job duration time in s vs. data size in GB; nrFiles = 1000)
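The write-side sensitivity to replication described above can be probed directly. This is a what-if sketch, not a test performed in this review, and it assumes the TestDFSIO build accepts generic -D options; if it does not, dfs.replication can be lowered temporarily in hdfs-site.xml for the same experiment.

  JAR=$(ls /usr/lib/hadoop-0.20-mapreduce/hadoop-test-*.jar)
  # Baseline write with the default replication factor of 3 (network-bound at scale)
  sudo -u hdfs hadoop jar $JAR TestDFSIO -write -nrFiles 1000 -fileSize 1000
  # Replication 1: no replica traffic crosses the network, so throughput should
  # move toward the local disk/controller limits discussed in 4.5.2.
  sudo -u hdfs hadoop jar $JAR TestDFSIO -D dfs.replication=1 \
      -write -nrFiles 1000 -fileSize 1000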
4.5.2 IO Bottleneck Investigation

Three possible IO limits were considered.

SAS limit: An LSI white paper, "Switched SAS: Sharable, Scalable SAS Infrastructure", shows how to calculate the SAS limit of an 8-lane controller port with a SAS bandwidth of 6Gbps:

  6Gb/s x 8 lanes = 48Gb/s per x8 port
  48Gb/s (8b/10b encoding) = 4.8GB/s per port (per node)
  4.8GB/s per port x 88.33% (arbitration delays and additional framing) = 4320MB/s per port

PCI-E slot: The Dell R720 provides integrated PCI-E Gen-3 capable slots. Gen-3 is defined at 8 Gbps per lane (scrambling plus 128b/130b encoding instead of 8b/10b encoding), so, for example, a PCIe Gen-3 x8 link delivers an aggregate bandwidth of 8 GB/s.

Network: Each slave node has 2 x 1GbE bonded NIC interfaces. The full-duplex bandwidth (BW) per node is:

  BW = 1 Gb/s x 2 (interfaces) x 2 (full duplex) / 8 (bits) = 0.5 GB/s

Allowing for 20% transmission overhead, the nominal BW is expected to be ~400MB/s per node.

The IO limits per node are summarized in the following table.

Table 8: IO Bottlenecks

Component | Max Bandwidth
SAS Controller | 4.8 GB/s
PCI-E Gen-3 Slot | 8.0 GB/s
2 x 1GbE NIC Interfaces | 400 MB/s

It is clear that the network has the lowest bandwidth limit. Write IO performance is severely impacted by the network limit due to the requirement to transfer data across the network. Read IO performance is more dependent on the IO bandwidth limitations of the underlying IO components (SAS controller, PCI slots, disks, etc.). Since these limits are high for each node, the read performance of this SUT depended more on TestDFSIO parameters (number of files, -nrFiles) and Hadoop parameters (number of map slots).

4.5.3 HDFS Distributed Processing

The number of distributed partitions significantly impacts HDFS performance. The more partitions (distributed files), the better the performance, as shown in the chart below, which shows how write performance is impacted by the number of files (nrFiles).
Figure 21: HDFS Distributed Processing Chart (IO write throughput in MB/s and job duration time in s vs. nrFiles)
4.5.4 Analysis of a TestDFSIO Job

The results of read and write TestDFSIO jobs were captured as a data point that could be used for future reference (cells left blank where the figure was not recorded).

Table 9: Analysis of a TestDFSIO Job

Parameter | Write Performance | Read Performance
nrFiles | 1000 | 1000
fileSize | 1000MB | 1000MB
option | -write | -read
Throughput | 3310 MB/s | 8820 MB/s
IO Rate | 4000 MB/s |
Execution time (s) | |
Network utilization | 85.4% | 79.2%
CPU utilization, Map phase | 100% | 70%
CPU utilization, Shuffle/Reduce phase | 25% | 27%
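For reference, the write and read invocations behind Table 9 follow the pattern in Appendix A1; they are reconstructed below as an assumption (the jar is located by a glob rather than a pinned version), together with a result-parsing one-liner that assumes the standard "Throughput mb/sec:" line in the TestDFSIO result file.

  JAR=$(ls /usr/lib/hadoop-0.20-mapreduce/hadoop-test-*.jar)
  # Write test: 1000 files x 1000MB
  sudo -u hdfs hadoop jar $JAR TestDFSIO -write -nrFiles 1000 -fileSize 1000 \
      -resFile /tests/testdfsio/write_results.txt
  # Read test over the same files
  sudo -u hdfs hadoop jar $JAR TestDFSIO -read -nrFiles 1000 -fileSize 1000 \
      -resFile /tests/testdfsio/read_results.txt
  # Scale the reported per-task figure by the cluster's map slots (50 assumed here)
  awk -F': ' '/Throughput mb\/sec/ {print $2 * 50, "MB/s concurrent (50 slots)"}' \
      /tests/testdfsio/write_results.txt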
5 Conclusion

This R720/R720xd benchmarking review is the first in a series that is planned to be conducted with every major release of the Dell Cloudera Hadoop solution. Every attempt was made to perform all tests recommended in the Benchmarking Plan and Guide, but for various reasons a number of them could not be performed. The results obtained in this review will be used as baseline data for comparing the performance of subsequent RA revisions, configurations and performance optimizations.

The main achievements of this benchmarking review are:

- Performance characterization of the R720/720xd RA using CPU- and IO-intensive workloads.
- Stress testing of the RA: understanding the behavior of the RA as the load increases, up to the point where bottlenecks become evident.
- Bottleneck investigation. For the size of the Hadoop cluster under review, the main bottlenecks are the CPU and the network. Results from this review show when each bottleneck comes into play.
- Performance tuning. Software (OS and Hadoop) tuning techniques were employed to mitigate the impact of the CPU bottleneck. These tweaks provided a performance boost of 50.84% over a non-tuned system. This implies that a Terasort job will run at least 50% faster after the tweaks. These tuning tweaks should be incorporated into the solution.

Based on the performance results and analysis, some recommendations have been made and should be considered in order to improve the performance of the Dell Cloudera Hadoop Solution:

1. Performance tuning: use the techniques in section 4.2.1 (Performance Tuning) to improve the performance of the solution by over 50%.
   a. Apply the Hadoop configuration parameters.
   b. Apply the OS parameters.
   c. Increase the block size from the default 128MB to 512MB or more.
2. RA changes, subject to a cost/benefit analysis:
   a. More memory on the slave nodes (> 64GB).
   b. Less memory on infrastructure nodes.
   c. Fewer processing elements (CPUs/cores) on infrastructure nodes.
   d. More processing elements (CPUs/cores) on slave nodes.
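To make the OS part of recommendation 1 stick across reboots, the settings can be persisted on each slave node. A minimal sketch for RHEL 6, using the doall.sh helper from section 3.4; the file locations are standard for this OS release, but treat the snippet as an assumption to validate against the deployment guide.

  # Persist swappiness via sysctl.conf and apply it immediately
  ./doall.sh "echo 'vm.swappiness = 0' >> /etc/sysctl.conf && sysctl -p"
  # Transparent hugepage defrag has no sysctl knob on RHEL 6; re-apply it at boot
  ./doall.sh "echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag' >> /etc/rc.local"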
6 Appendix A: Test Environment

Open-source workloads were used to generate data and submit jobs to the Hadoop cluster.

6.1 Appendix A1: Test Suites

The high-level goal of this benchmarking review was to test the architectural components of Hadoop (MapReduce, HDFS) and how they interact with the underlying hardware infrastructure. Workloads were selected based on how well they could exercise the hardware components (IO, CPU and network) that have the biggest impact on MapReduce and HDFS:

Table 10: Test Suites

Benchmark | Distribution | Hadoop Component Stressed | Hardware Component Stressed
TestDFSIO | Apache Hadoop | HDFS | IO, Network
Teragen | HiBench 2.2 | HDFS | IO, Network
Terasort | HiBench 2.2 | MapReduce | CPU
K-Means | HiBench 2.2 | MapReduce | Application level, CPU

6.1.1 Apache Hadoop TestDFSIO

The TestDFSIO benchmark is a read and write test for HDFS. TestDFSIO is used to measure the performance of HDFS and stresses both the network and IO subsystems. The command reads and writes files in HDFS, which is useful in measuring system-wide performance and exposing network bottlenecks on the Hadoop cluster. A majority of HDFS workloads are more IO-bound than compute-bound, and hence TestDFSIO can provide an accurate initial picture of such scenarios. Nevertheless, because this test is run as a MapReduce job, the MapReduce stack of the cluster must be working correctly. In other words, this test cannot be used to benchmark HDFS in isolation from MapReduce.

The benchmark can be run for writing, using the -write switch, and with -read for the read test. The command line accepts a number of files and the size of each file in HDFS. The command used to generate and write 1000 files, each 1000MB, is:

  # sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-test-*-mr1-cdh4.1.1.jar TestDFSIO -write -nrFiles 1000 -fileSize 1000 -resFile /tests/testdfsio/results.txt

TestDFSIO generates 1 map task per file, and splits are defined such that each map processes a single file. After every run, the command generates a log file indicating performance in terms of 4 metrics: throughput in MB/s, average IO rate in MB/s, IO rate standard deviation, and execution time. The most notable metrics are throughput and average IO rate, both of which are based on the file size read or written by the individual map task and the elapsed time taken to perform the task. For N map tasks, where task i processes size(i) bytes in time(i) seconds, the throughput and IO rate are defined as:

  Throughput(N) = (sum over i of size(i)) / (sum over i of time(i))
  Average IO rate(N) = (sum over i of rate(i)) / N, where rate(i) = size(i) / time(i)
If the cluster has 50 map slots and TestDFSIO creates 1000 files, the concurrent throughput can be calculated as:

  Concurrent Throughput = Reported Throughput x Number of Map Slots

For example, a reported per-task throughput of 66 MB/s across 50 concurrent map slots corresponds to 3300 MB/s cluster-wide. The IO rate can be calculated in a similar fashion. While measuring cluster performance using TestDFSIO may be considered sufficient, the HDFS replication factor (the value of dfs.replication) also plays an important role. A lower replication factor leads to higher throughput performance due to reduced background traffic.

6.1.2 Intel HiBench

HiBench is a benchmarking suite for Hadoop. It consists of a set of Hadoop programs, including both synthetic micro-benchmarks and real-world Hadoop applications. An overview of the benchmark can be obtained at github. This review used the following HiBench 2.1 micro-benchmarks:

- Terasort: CPU-intensive workload used to characterize the performance of, and stress-test, the MapReduce layer.
- K-Means: CPU-intensive workload used to characterize MapReduce performance on a SUT running complex Hadoop applications.

The HiBench suite is hierarchically organized, with each micro-benchmark having a similar directory structure. Each micro-benchmark has the following files with tunable parameters:

- ~/conf/configure.sh: sets the environment, data size, compression, and run-time Hadoop parameters
- ~/bin/prepare.sh: data generation; run-time Hadoop parameters
- ~/bin/run.sh: benchmark execution; run-time Hadoop parameters
6.1.3 HiBench: Terasort

Terasort is part of the Apache Hadoop distribution and is available on any cluster. This review used the package that was distributed with the HiBench suite. It is distributed as a 2-part package:

- Teragen is a map/reduce data generator. Given a dataset size, it divides the desired number of rows by the desired number of tasks and assigns ranges of rows to each map. The map uses the random number generator to jump to the correct value for the first row and generates the subsequent rows. Teragen is executed by the prepare.sh script.
- Terasort is a standard sort program that samples the input data generated by teragen and uses map/reduce to sort the data into a total order. Terasort is executed via run.sh.

Run-time scripts. These were the tunable parameters implemented in running Terasort:

~/HiBench/bin/hibench-config.sh

  # switch on/off compression: 0-off, 1-on
  export COMPRESS_GLOBAL=0
  export COMPRESS_CODEC_GLOBAL=org.apache.hadoop.io.compress.DefaultCodec

~/HiBench/terasort/conf/configure.sh

  # for prepare (total) - 1TB
  DATASIZE=<rows>    # teragen rows are 100 bytes; 1TB corresponds to 10^10 rows
  # Number of Map tasks
  NUM_MAPS=180
  # Number of Reduce tasks
  NUM_REDS=64

~/HiBench/terasort/bin/prepare.sh
  # Generate the terasort data
  hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-*mr1-cdh4.1.1-examples.jar teragen \
      -D mapred.map.tasks=$NUM_MAPS \
      $DATASIZE $INPUT_HDFS

~/HiBench/terasort/bin/run.sh

  # run bench
  hadoop jar $HADOOP_HOME/hadoop-*mr1-cdh4.1.2-examples.jar terasort \
      -D mapred.reduce.tasks=$NUM_REDS $INPUT_HDFS $OUTPUT_HDFS

  # post-running
  END_TIME=`timestamp`
  gen_report "TERASORT" ${START_TIME} ${END_TIME} ${SIZE} >> ${HIBENCH_REPORT}

6.1.4 Intel HiBench: K-Means

K-Means is a data mining, cluster analysis algorithm that aims to partition n observations (x_1, x_2, ..., x_n) into k sets (clusters) S = {S_1, S_2, ..., S_k}, where k <= n. Each observation belongs to the cluster with the nearest mean, i.e. the one with the most similar items:

1. k centroids are selected.
2. Each item in the sample is placed in the cluster with the least distance (nearest centroid).
3. For each group of points assigned to the same center, compute a new center by taking the centroid of the points.
4. Repeat until there is convergence.

This review used the K-Means benchmark from the Intel HiBench 2.1 suite.

Runtime scripts:

~/HiBench/bin/hibench-config.sh
  ###################### Global Paths ##################
  export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
  HADOOP_CONF_DIR=$HADOOP_HOME/conf
  HADOOP_EXAMPLES_JAR=$HADOOP_HOME/hadoop-examples*.jar

  if [ -z "$HIBENCH_HOME" ]; then
      export HIBENCH_HOME=/var/lib/hadoop-hdfs/hibench
  fi

  if [ -z "$HIBENCH_CONF" ]; then
      export HIBENCH_CONF=${HIBENCH_HOME}/conf
  fi

  if [ -f "${HIBENCH_CONF}/funcs.sh" ]; then
      source "${HIBENCH_CONF}/funcs.sh"
  fi

  if [ -z "$HIVE_HOME" ]; then
      export HIVE_HOME=/usr/lib/hive
  fi

  if [ -z "$MAHOUT_HOME" ]; then
      export MAHOUT_HOME=/usr/lib/mahout
  fi
  if [ -z "$DATATOOLS" ]; then
      export DATATOOLS=${HIBENCH_HOME}/common/autogen/dist/datatools.jar
  fi

~/HiBench/kmeans/conf/configure.sh

  # for prepare
  # Number of clusters (k)
  NUM_OF_CLUSTERS=128
  # Number of samples (n)
  NUM_OF_SAMPLES=100
  #SAMPLES_PER_INPUTFILE=
  SAMPLES_PER_INPUTFILE=    # set to NUM_OF_SAMPLES/5 (see section 3.3)
  # Number of dimensions (d)
  DIMENSIONS=4

  # for running
  MAX_ITERATION=5

~/HiBench/kmeans/bin/prepare.sh

  # generate data
  OPTION="-sampleDir ${INPUT_SAMPLE} -clusterDir ${INPUT_CLUSTER} -numClusters ${NUM_OF_CLUSTERS} -numSamples ${NUM_OF_SAMPLES} -samplesPerFile ${SAMPLES_PER_INPUTFILE} -sampleDimension ${DIMENSIONS}"
  export HADOOP_CLASSPATH=`mahout classpath | tail -1`
  hadoop jar /var/lib/hadoop-hdfs/hibench/common/autogen/dist/datatools.jar \
      org.apache.mahout.clustering.kmeans.GenKMeansDataset \
      -libjars $MAHOUT_HOME/mahout-examples-0.7-cdh4.1.2-job.jar ${OPTION}

~/HiBench/kmeans/bin/run.sh

  OPTION="-i ${INPUT_SAMPLE} -c ${INPUT_CLUSTER} -o ${OUTPUT_HDFS} -x ${MAX_ITERATION} -ow -cl -cd 0.5 -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure -xm mapreduce"
  START_TIME=`timestamp`
  #START_TIME=`date +%s`
  echo $MAHOUT_HOME

  # run bench
  mahout kmeans ${OPTION}

  # post-running
  END_TIME=`timestamp`
  echo $END_TIME
  gen_report "KMEANS" ${START_TIME} ${END_TIME} ${SIZE} >> ${HIBENCH_REPORT}
6.2 Appendix A2: Test Methodology

The hardware and network configurations were set up first; Crowbar was used to deploy the Hadoop cluster infrastructure, and Cloudera Manager deployed Hadoop.

6.2.1 Hadoop Infrastructure Deployment

The Hadoop cluster infrastructure was set up and configured by Crowbar. Please refer to the Dell Cloudera Solution Deployment Guide and the Dell Cloudera Solution Crowbar Administrator User Guide for details.

1. Follow the instructions to install the Crowbar Admin node.
2. Use a browser to connect to Crowbar.
3. Power up the servers that will be part of the Hadoop cluster.
4. Allow the servers to PXE boot from the Crowbar admin node.
5. Note that the nodes get discovered in Crowbar.
6. Create, edit, save and apply the Cloudera Manager barclamp.
7. Crowbar applies the BIOS and RAID configurations and installs the OS on all the nodes.
8. After completing the OS install, the nodes should transition to the Ready state in the Crowbar UI.
9. Note that all the Hadoop cluster nodes and the Cloudera Manager barclamp are in the Ready state (green LED) in the Crowbar UI.

6.2.2 Cloudera Manager Deployment

Instructions on how to deploy Cloudera Manager can be found in the Dell Cloudera Solution Crowbar Administrator User Guide.

- Follow the link available from the Cloudera Manager server to log in to Cloudera Manager.
- Provide a license in order to use the Enterprise Edition. This review used the Enterprise Edition of Cloudera Manager for its monitoring capabilities and tools.
- Install the Hadoop core services (HDFS, MapReduce, HUE and Oozie).
- Verify that these services are up and running with good health.

6.3 Appendix A3: Installing Benchmarks

The benchmarks specified in Appendix A1 (Test Suites) are installed as shown in this section.

6.3.1 TestDFSIO

TestDFSIO was used in this review and is available with the CDH distribution of Hadoop. The program can be run from the command line.

6.3.2 Intel HiBench

HiBench 2.1 was used to provide and manage the Teragen, Terasort and K-Means packages.

1. Download the latest version of HiBench 2.x (ZIP) from github.
2. For the HiBench 2.1 used in this review, the download is available from the same github project.
3. Download the zipball to the suggested directory on the Master node server, /home/hibench.
4. Unzip and extract the zipball.
5. Rename (mv) ~/hibench/hibench-HiBench-2.1-* to ~/hibench/HiBench-2.1 and set ownership:
   # chown -R hdfs:hdfs /home/hibench/HiBench-2.1/
6. Mahout packages are required for the implementation of K-Means. The default installation of Hadoop does not provide Mahout. The version of Mahout that is downloaded with HiBench 2.1 has compatibility issues with CDH4; the Cloudera version of Mahout has those issues settled. Download the Mahout package from the Cloudera repository.
7. Search for mahout 0.7.
8. Unzip and untar.
9. Modify the configuration and run scripts as shown in Appendix A1 (Test Suites).
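The steps above map to roughly the following command sequence; the archive names and the local download step are assumptions for illustration:

  # Hypothetical command sequence for the install steps above
  mkdir -p /home/hibench && cd /home/hibench
  unzip hibench-2.1.zip                         # the HiBench 2.1 zipball from github
  mv hibench-HiBench-2.1-* HiBench-2.1
  chown -R hdfs:hdfs /home/hibench/HiBench-2.1/
  # Cloudera build of Mahout for CDH4 compatibility
  tar -xzf mahout-0.7-cdh4.1.2.tar.gz           # archive name is an assumption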
44 7 Appendix B: Hadoop Configuration Parameters A complete listing of the hadoop configuration parameters for a 1TB Terasort job is shown in the following table. name value job.end.retry.interval mapred.job.tracker.retiredjobs.cache.size 1000 mapred.queue.default.acl-administer-jobs * dfs.image.transfer.bandwidthpersec 0 mapred.task.profile.reduces 0-2 mapreduce.jobtracker.staging.root.dir ${hadoop.tmp.dir}/mapred/staging mapred.job.reuse.jvm.num.tasks -1 dfs.block.access.token.lifetime 600 fs.abstractfilesystem.file.impl org.apache.hadoop.fs.local.localfs mapred.reduce.tasks.speculative.execution hadoop.ssl.keystores.factory.class FALSE org.apache.hadoop.security.ssl.filebasedkeystor esfactory mapred.job.name hadoop.http.authentication.kerberos.keyta b TeraSort ${user.home}/hadoop.keytab io.seqfile.sorter.recordlimit s3.blocksize dfs.namenode.num.checkpoints.retained 2 hadoop.relaxed.worker.version.check TRUE mapred.task.tracker.http.address :50060 dfs.namenode.delegation.token.renewinterval io.map.index.interval 128 s3.client-write-packet-size dfs.namenode.http-address dd4-ae e-31.dell.com:50070 ha.zookeeper.session-timeout.ms 5000 mapred.system.dir ${hadoop.tmp.dir}/mapred/system hadoop.hdfs.configuration.version 1 s3.replication 3 dfs.datanode.balance.bandwidthpersec Dell Apache Hadoop Performance Analysis
45 mapred.task.tracker.report.address :0 mapred.jobtracker.plugins org.apache.hadoop.thriftfs.thriftjobtrackerplugi n jobtracker.thrift.address dd4-ae e-31.dell.com:9290 mapreduce.reduce.shuffle.connect.timeou t dfs.journalnode.rpc-address :8485 hadoop.ssl.enabled FALSE mapreduce.job.counters.max 120 dfs.datanode.readahead.bytes ipc.client.connect.max.retries.on.timeouts 45 mapred.healthchecker.interval mapreduce.job.complete.cancel.delegatio TRUE n.tokens dfs.client.failover.max.attempts 15 dfs.namenode.checkpoint.dir file://${hadoop.tmp.dir}/dfs/namesecondary dfs.namenode.replication.work.multiplier.p 2 er.iteration fs.trash.interval 0 hadoop.jetty.logs.serve.aliases TRUE mapred.skip.map.auto.incr.proc.count TRUE hadoop.http.authentication.kerberos.princi HTTP/_HOST@LOCALHOST pal terasort.num-rows s3native.blocksize mapred.child.tmp./tmp mapred.tasktracker.taskmemorymanager monitoring-interval dfs.namenode.edits.dir ${dfs.namenode.name.dir} dfs.encrypt.data.transfer FALSE dfs.datanode.http.address :50075 io.sort.spill.percent 0.98 dfs.client.use.datanode.hostname FALSE mapred.job.shuffle.input.buffer.percent 0.7 hadoop.skip.worker.version.check FALSE hadoop.security.instrumentation.requires.a FALSE 45 Dell Apache Hadoop Performance Analysis
46 dmin mapred.skip.map.max.skip.records 0 mapreduce.reduce.shuffle.maxfetchfailure 10 s hadoop.security.authorization FALSE user.name hdfs dfs.client.failover.connection.retries.on.tim 0 eouts hadoop.security.group.mapping.ldap.searc (objectclass=group) h.filter.group dfs.namenode.safemode.extension mapred.task.profile.maps 0-2 dfs.datanode.sync.behind.writes FALSE dfs.https.server.keystore.resource ssl-server.xml mapred.local.dir ${hadoop.tmp.dir}/mapred/local hadoop.security.group.mapping.ldap.searc cn h.attr.group.name mapred.merge.recordsbeforeprogress mapred.job.tracker.http.address :50030 dfs.namenode.replication.min 1 mapred.compress.map.output TRUE mapred.userlog.retain.hours 24 s3native.bytes-per-checksum 512 tfile.fs.output.buffer.size mapred.tasktracker.reduce.tasks.maximum 8 fs.abstractfilesystem.hdfs.impl org.apache.hadoop.fs.hdfs dfs.namenode.safemode.min.datanodes 0 mapred.disk.healthchecker.interval dfs.client.https.need-auth FALSE dfs.client.https.keystore.resource ssl-client.xml dfs.namenode.max.objects 0 mapred.cluster.map.memory.mb -1 hadoop.ssl.client.conf ssl-client.xml dfs.namenode.safemode.threshold-pct 0.999f dfs.blocksize dfs.thrift.threads.max 20 mapreduce.job.submithost dd4-ae e-31.dell.com hue.kerberos.principal.shortname hue 46 Dell Apache Hadoop Performance Analysis
47 mapreduce.tasktracker.outofband.heartbe FALSE at io.native.lib.available TRUE dfs.client-write-packet-size mapred.jobtracker.restart.recover FALSE mapred.reduce.child.log.level INFO mapreduce.shuffle.ssl.address dfs.namenode.name.dir file://${hadoop.tmp.dir}/dfs/name dfs.ha.log-roll.period 120 dfs.client.failover.sleep.base.millis 500 dfs.datanode.directoryscan.threads 1 dfs.permissions.enabled TRUE dfs.support.append TRUE mapred.inmem.merge.threshold 1000 ipc.client.connection.maxidletime mapreduce.shuffle.ssl.enabled ${hadoop.ssl.enabled} dfs.namenode.invalidate.work.pct.per.iterat 0.32f ion dfs.blockreport.intervalmsec fs.s3.sleeptimeseconds 10 dfs.namenode.replication.considerload TRUE dfs.client.block.write.retries 3 hadoop.ssl.server.conf ssl-server.xml mapred.jobtracker.retirejob.interval dfs.namenode.name.dir.restore FALSE dfs.datanode.hdfs-blocksmetadata.enabled TRUE mapred.reduce.tasks 0 ha.zookeeper.parent-znode /hadoop-ha mapred.queue.names default io.seqfile.lazydecompress TRUE dfs.https.enable FALSE mapred.fairscheduler.preemption FALSE 47 Dell Apache Hadoop Performance Analysis
48 mapred.hosts.exclude /var/run/cloudera-scm-agent/process/705- mapreduce- JOBTRACKER/mapred_hosts_exclude.txt dfs.replication 3 ipc.client.tcpnodelay FALSE dfs.namenode.accesstime.precision mapred.output.format.class org.apache.hadoop.examples.terasort.teraoutpu tformat mapred.acls.enabled FALSE s3.stream-buffer-size 4096 mapred.tasktracker.dns.nameserver default mapred.submit.replication 3 io.compression.codecs org.apache.hadoop.io.compress.defaultcodec,o rg.apache.hadoop.io.compress.gzipcodec,org.a pache.hadoop.io.compress.bzip2codec,org.apa che.hadoop.io.compress.deflatecodec,org.apac he.hadoop.io.compress.snappycodec io.file.buffer.size mapred.map.tasks.speculative.execution FALSE dfs.namenode.checkpoint.txns mapred.map.child.log.level INFO kfs.replication 3 rpc.engine.org.apache.hadoop.hdfs.protoc org.apache.hadoop.ipc.protobufrpcengine olpb.clientnamenodeprotocolpb mapred.map.max.attempts 4 dfs.ha.tail-edits.period 60 kfs.stream-buffer-size 4096 mapred.job.shuffle.merge.percent 0.66 hadoop.security.authentication simple fs.s3.buffer.dir ${hadoop.tmp.dir}/s3 mapred.skip.reduce.auto.incr.proc.count mapred.job.tracker.jobhistory.lru.cache.siz e TRUE 5 48 Dell Apache Hadoop Performance Analysis
49 dfs.client.file-block-storagelocations.timeout 60 dfs.datanode.drop.cache.behind.writes FALSE tfile.fs.input.buffer.size dfs.block.access.token.enable FALSE dfs.journalnode.http-address :8480 mapreduce.job.acl-view-job mapred.job.queue.name default ftp.blocksize dfs.datanode.data.dir file://${hadoop.tmp.dir}/dfs/data mapred.job.tracker.persist.jobstatus.hours 0 dfs.https.port dfs.namenode.replication.interval 3 mapred.fairscheduler.assignmultiple TRUE mapreduce.tasktracker.cache.local.numbe rdirectories dfs.namenode.https-address dd4-ae e-31.dell.com:50470 dfs.ha.automatic-failover.enabled FALSE ipc.client.kill.max 10 mapred.healthchecker.script.timeout mapred.tasktracker.map.tasks.maximum 24 hadoop.proxyuser.oozie.hosts * dfs.client.failover.sleep.max.millis jobclient.completion.poll.interval 5000 mapred.job.tracker.persist.jobstatus.dir /jobtracker/jobsinfo mapreduce.shuffle.ssl.port dfs.default.chunk.view.size kfs.bytes-per-checksum 512 mapred.reduce.slowstart.completed.maps 0.8 hadoop.http.filter.initializers org.apache.hadoop.http.lib.staticuserwebfilter mapred.mapper.class org.apache.hadoop.examples.terasort.teragen$ SortGenMapper dfs.datanode.failed.volumes.tolerated 0 io.sort.mb Dell Apache Hadoop Performance Analysis
50 mapred.hosts /var/run/cloudera-scm-agent/process/705- mapreduce- JOBTRACKER/mapred_hosts_allow.txt hadoop.http.authentication.type simple dfs.datanode.data.dir.perm 700 ipc.server.listen.queue.size 128 file.stream-buffer-size 4096 dfs.namenode.fs-limits.max-directoryitems 0 io.mapfile.bloom.size ftp.replication 3 dfs.datanode.dns.nameserver default mapred.child.java.opts -Xmx dfs.replication.max 512 mapred.queue.default.state RUNNING map.sort.class org.apache.hadoop.util.quicksort dfs.stream-buffer-size 4096 hadoop.job.history.location file:////var/log/hadoop-0.20-mapreduce/history dfs.namenode.backup.address :50100 mapred.jobtracker.instrumentation org.apache.hadoop.mapred.jobtrackermetricsin st hadoop.util.hash.type murmur dfs.block.access.key.update.interval 600 dfs.datanode.use.datanode.hostname FALSE dfs.datanode.dns.interface default dfs.namenode.backup.http-address :50105 mapred.output.compression.type BLOCK dfs.thrift.timeout 60 mapred.skip.attempts.to.start.skipping 2 kfs.client-write-packet-size ha.zookeeper.acl world:anyone:rwcda 50 Dell Apache Hadoop Performance Analysis
51 mapreduce.job.dir hdfs://dd4-ae e- 31.dell.com:8020/user/hdfs/.staging/job_ _0001 io.map.index.skip 0 net.topology.node.switch.mapping.impl org.apache.hadoop.net.scriptbasedmapping mapred.cluster.max.map.memory.mb -1 fs.s3.maxretries 4 dfs.namenode.logging.level info s3native.client-write-packet-size mapred.task.tracker.task-controller org.apache.hadoop.mapred.defaulttaskcontroll er mapred.userlog.limit.kb 0 hadoop.http.staticuser.user dr.who mapred.input.format.class org.apache.hadoop.examples.terasort.teragen$ RangeInputFormat mapreduce.ifile.readahead.bytes hadoop.http.authentication.simple.anonym TRUE ous.allowed hadoop.fuse.timer.period 5 dfs.namenode.num.extra.edits.retained hadoop.rpc.socket.factory.class.default org.apache.hadoop.net.standardsocketfactory dfs.namenode.handler.count 10 fs.automatic.close TRUE mapreduce.job.submithostaddress dfs.datanode.directoryscan.interval mapred.map.tasks 180 mapred.local.dir.minspacekill 0 mapred.job.map.memory.mb -1 mapred.jobtracker.completeuserjobs.maxi 100 mum mapreduce.jobtracker.split.metainfo.maxsi ze 51 Dell Apache Hadoop Performance Analysis
52 mapred.cluster.max.reduce.memory.mb -1 mapred.cluster.reduce.memory.mb -1 s3native.replication 3 mapred.task.profile mapred.reduce.parallel.copies 10 dfs.heartbeat.interval 3 FALSE dfs.ha.fencing.ssh.connect-timeout local.cache.size net.topology.script.file.name dfs.client.file-block-storagelocations.num-threads 10 jobclient.progress.monitor.poll.interval 1000 dfs.bytes-per-checksum 512 ftp.stream-buffer-size 4096 mapred.fairscheduler.allow.undeclared.po TRUE ols hadoop.security.group.mapping.ldap.searc member h.attr.member dfs.blockreport.initialdelay 0 mapred.min.split.size 0 hadoop.http.authentication.token.validity dfs.namenode.delegation.token.maxlifetime mapred.output.compression.codec org.apache.hadoop.io.compress.defaultcodec /var/run/cloudera-scm-agent/process/705- mapreduce-jobtracker/topology.py io.sort.factor 64 kfs.blocksize mapred.task.timeout mapred.fairscheduler.poolnameproperty user.name dfs.namenode.secondary.http-address :50090 ipc.client.idlethreshold 4000 ipc.server.tcpnodelay FALSE ftp.bytes-per-checksum 512 mapred.output.dir hdfs://dd4-ae e- 31.dell.com:8020/HiBench/Terasort/Input 52 Dell Apache Hadoop Performance Analysis
group.name  hdfs
s3.bytes-per-checksum  512
mapred.heartbeats.in.second  100
fs.s3.block.size
dfs.client.failover.connection.retries  0
mapred.map.output.compression.codec  org.apache.hadoop.io.compress.SnappyCodec
hadoop.rpc.protection  authentication
mapred.task.cache.levels  2
mapred.tasktracker.dns.interface  default
hadoop.security.auth_to_local  DEFAULT
dfs.secondary.namenode.kerberos.internal.spnego.principal  ${dfs.web.authentication.kerberos.principal}
hadoop.proxyuser.hue.hosts  *
ftp.client-write-packet-size
mapred.output.key.class  org.apache.hadoop.io.Text
fs.defaultFS  hdfs://dd4-ae e-31.dell.com:8020
file.client-write-packet-size
mapred.job.reduce.memory.mb  -1
mapred.max.tracker.failures  4
fs.trash.checkpoint.interval  0
mapred.fairscheduler.allocation.file  fair-scheduler.xml
hadoop.http.authentication.signature.secret.file  ${user.home}/hadoop-http-auth-signature-secret
s3native.stream-buffer-size  4096
mapreduce.reduce.shuffle.read.timeout
mapred.tasktracker.tasks.sleeptime-before-sigkill  5000
dfs.namenode.checkpoint.edits.dir  ${dfs.namenode.checkpoint.dir}
fs.permissions.umask-mode  22
mapred.max.tracker.blacklists  4
hadoop.common.configuration.version
jobclient.output.filter  FAILED
hadoop.security.group.mapping.ldap.ssl  FALSE
mapreduce.ifile.readahead  TRUE
io.serializations  org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,org.apache.hadoop.io.serializer.avro.AvroReflectSerialization
fs.df.interval
io.seqfile.compress.blocksize
mapred.jobtracker.taskScheduler  org.apache.hadoop.mapred.JobQueueTaskScheduler
job.end.retry.attempts  0
ipc.client.connect.max.retries  10
hadoop.security.groups.cache.secs  300
dfs.namenode.delegation.key.update-interval
webinterface.private.actions  FALSE
mapred.tasktracker.indexcache.mb  10
hadoop.security.group.mapping.ldap.search.filter.user  (&(objectClass=user)(sAMAccountName={0}))
mapreduce.reduce.input.limit  -1
dfs.image.compress  FALSE
mapred.output.value.class  org.apache.hadoop.io.Text
tasktracker.http.threads  40
dfs.namenode.kerberos.internal.spnego.principal  ${dfs.web.authentication.kerberos.principal}
fs.s3n.block.size
mapred.job.tracker.handler.count  10
fs.ftp.host
keep.failed.task.files  FALSE
mapred.output.compress  FALSE
hadoop.security.group.mapping  org.apache.hadoop.security.ShellBasedUnixGroupsMapping
mapred.jobtracker.job.history.block.size
mapred.skip.reduce.max.skip.groups  0
dfs.datanode.address  :
dfs.datanode.https.address  :50475
file.replication  1
dfs.datanode.drop.cache.behind.reads  FALSE
hadoop.fuse.connection.timeout  300
mapred.jar  /user/hdfs/.staging/job_ _0001/job.jar
hadoop.work.around.non.threadsafe.getpwuid  FALSE
mapreduce.client.genericoptionsparser.used  TRUE
hadoop.tmp.dir  /tmp/hadoop-${user.name}
dfs.client.block.write.replace-datanode-on-failure.policy  DEFAULT
mapred.line.input.format.linespermap  1
hadoop.kerberos.kinit.command  kinit
dfs.webhdfs.enabled  FALSE
dfs.datanode.du.reserved  0
file.bytes-per-checksum  512
dfs.thrift.socket.timeout
mapred.local.dir.minspacestart  0
mapred.jobtracker.maxtasks.per.job  -1
dfs.client.block.write.replace-datanode-on-failure.enable  TRUE
dfs.thrift.threads.min  10
mapred.user.jobconf.limit
mapred.reduce.max.attempts  4
net.topology.script.number.args  100
dfs.namenode.decommission.interval  30
mapred.job.tracker  dd4-ae e-31.dell.com:8021
dfs.image.compression.codec  org.apache.hadoop.io.compress.DefaultCodec
dfs.namenode.support.allow.format  TRUE
hadoop.ssl.hostname.verifier  DEFAULT
mapred.tasktracker.instrumentation  org.apache.hadoop.mapred.TaskTrackerMetricsInst
io.mapfile.bloom.error.rate
dfs.permissions.superusergroup  supergroup
mapred.tasktracker.expiry.interval
hadoop.proxyuser.hue.groups  *
io.sort.record.percent
mapred.job.tracker.persist.jobstatus.active  FALSE
dfs.namenode.checkpoint.check.period  60
io.seqfile.local.dir  ${hadoop.tmp.dir}/io/local
tfile.io.chunk.size
file.blocksize
hadoop.proxyuser.oozie.groups  *
mapreduce.job.acl-modify-job
io.skip.checksum.errors  FALSE
dfs.namenode.edits.journal-plugin.qjournal  org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager
mapred.temp.dir  ${hadoop.tmp.dir}/mapred/temp
dfs.datanode.handler.count  10
dfs.namenode.decommission.nodes.per.interval  5
fs.ftp.host.port  21
dfs.namenode.checkpoint.period  3600
dfs.namenode.fs-limits.max-component-length  0
fs.AbstractFileSystem.viewfs.impl  org.apache.hadoop.fs.viewfs.ViewFs
dfs.datanode.ipc.address  :50020
mapred.working.dir  hdfs://dd4-ae e-31.dell.com:8020/user/hdfs
hadoop.ssl.require.client.cert  FALSE
dfs.datanode.max.transfer.threads  4096
mapred.job.reduce.input.buffer.percent  0
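Properties such as those listed above are supplied to the cluster through Hadoop's standard XML configuration files (core-site.xml, hdfs-site.xml and mapred-site.xml). As a minimal sketch of the convention, the map-output compression and TaskTracker HTTP thread settings from the listing would be expressed in mapred-site.xml as follows; the two name/value pairs are taken from the table, while the surrounding file layout is the generic Hadoop format rather than the exact file deployed on this cluster:

  <configuration>
    <!-- Compress intermediate map output with Snappy (value from the listing above) -->
    <property>
      <name>mapred.map.output.compression.codec</name>
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
    <!-- Worker threads serving map output to reducers during the shuffle -->
    <property>
      <name>tasktracker.http.threads</name>
      <value>40</value>
    </property>
  </configuration>

On a running cluster, an individual effective value can be cross-checked with the getconf utility, for example: hdfs getconf -confKey dfs.datanode.max.transfer.threads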