Linux Performance Optimizations for Big Data Environments Dominique A. Heger Ph.D. DHTechnologies (Performance, Capacity, Scalability) www.dhtusa.com Data Nubes (Big Data, Hadoop, ML) www.datanubes.com
Performance & Capacity Studies
Availability & Reliability Studies
Systems Modeling
Scalability & Speedup Studies
Linux & UNIX Internals
Design, Architecture & Feasibility Studies
Systems Stress-Testing & Benchmarking
Cloud Computing
Research, Education & Training
Machine Learning
Operations Research
BI, Data Analytics, Data Mining, Predictive Analytics
Hadoop Ecosystem & MapReduce
www.dhtusa.com
Agenda
Linux & Big Data (Hadoop Ecosystem)
Performance Management Methodology
Linux 3.x Task & I/O Framework
Quantifying Linux & Application Performance
Q&A
Linux Engineers - Big Demand & Small Talent Pool
Big Data, the Hadoop ecosystem, and cloud computing in general are powered by Linux
91.4% of the top 500 supercomputers are Linux-based (Source: TOP500, 2012)
Linux talent needed now:
A 2013 job report compiled by Dice showed that 93% of the contacted US companies (850 firms) are hiring Linux professionals this year (2013)
The same study revealed that 90% of the firms stated that it is currently very difficult to even find Linux talent in the US. This number is up from 80% in the 2012 study
According to Dice, the average salary increase for a Linux professional in the US is approximately 9% this year, while the average IT salary increase in the US is approximately 5%
Hadoop Ecosystem (Partial View)
[Diagram: ecosystem components including real-time processing (Twitter), data handlers, a data serialization system, configuration management tools, Kafka (distributed messaging system), schedulers, and RDBMS & NoSQL databases]
Hadoop-Linux Interaction
Hadoop stack: Language Abstraction -> Java API -> MapReduce Framework(*) -> Hadoop Distributed Filesystem
Linux OS node: OS & Local FS -> HW Components
(*) Some Hadoop projects bypass MapReduce
Hadoop MR2 Environment
Performance Management - Building Blocks
Phase 1: Understand Goals & Objectives
Phase 2: HW Profiles, Workload Profiles
Phase 3: Application & OS Traces, Data Post-Processing
Phase 4: Performance Study, CSA Study
Phase 5: Capacity Study, Scalability Study, Speedup Study
Performance Evaluation - Goals & Objectives
Identify bottlenecks, predict future capacity shortcomings, and determine the most adequate (cost-effective) way to configure, tune, and optimize computing environments to overcome performance problems and cope with increasing application workload demands.
A combination of analytical, simulation, and empirical study-based approaches that utilizes tracing techniques, HW profiles, actual application workload profiles, application log files, and performance data collected either in a lab or a production environment. If no empirical data is available, performance budgets are used (PE).
Application Centric Systems Analysis
System hierarchy: Application Abstraction -> Operating System Abstraction -> Hardware Abstraction
Performance hierarchy (application vector -> OS vector):
Performance code path: application to OS interface (application primitives) - process/thread monitors, application trace tools & macro benchmarks
Performance code path: high- to low-level OS interface (high-level and low-level OS primitives) - OS performance tools & micro benchmarks
Performance code path: OS to HW interface (hardware primitives)
Linux & Hadoop Tools & Techniques
Linux performance evaluation tools (code path analysis):
strace
nmon, blktrace, blkparse, btt, blkiomon, iostat
perf
valgrind, kcachegrind
Workload generators (macro & micro benchmarks):
DHTUX toolset (UNIX/Linux, 46 systems benchmarks)
TeraSort (Hadoop)
K-Means Clustering (Hadoop)
Bayesian Classification (Hadoop)
Performance by the Numbers (Ballpark Figures)
L1 cache reference: 1 ns
TLB miss: 4 ns
Branch misprediction: 5 ns
L2 cache reference: 7 ns
Mutex lock/unlock: 25 ns
Main memory reference: 100 ns
Compress 1 KB with Zippy: 3,000 ns
Send 2 KB over a 1 Gbps network: 20,000 ns
Read 1 MB sequentially from memory: 250,000 ns
Round trip within the same datacenter: 500,000 ns
Disk seek (HD): 10,000,000 ns
Read 1 MB sequentially from disk (HD): 20,000,000 ns
Send packet CA->UK->CA: 150,000,000 ns
Execute micro & macro benchmarks to baseline the HW and the OS.
Hadoop MapReduce: with large-scale projects, the performance focus is on disk and interconnect/network performance rather than on the CPU and DRAM subsystems.
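The ballpark figures above explain the MapReduce tuning priorities: a quick sketch of the ratios (using the slide's order-of-magnitude numbers, not measurements of any specific system) shows why disk dominates.

```python
# Ballpark latency figures from the slide (nanoseconds); these are
# order-of-magnitude estimates, not measurements of a specific system.
LATENCY_NS = {
    "main_memory_ref": 100,
    "read_1mb_memory": 250_000,
    "disk_seek": 10_000_000,
    "read_1mb_disk": 20_000_000,
}

# Sequentially reading 1 MB from disk is ~80x slower than from memory.
disk_vs_mem = LATENCY_NS["read_1mb_disk"] / LATENCY_NS["read_1mb_memory"]
print(f"1 MB disk read vs. memory read: {disk_vs_mem:.0f}x slower")

# A single disk seek costs as much as ~100,000 main-memory references,
# which is why MapReduce favors large sequential reads over random I/O.
seek_vs_mem = LATENCY_NS["disk_seek"] / LATENCY_NS["main_memory_ref"]
print(f"Disk seek vs. memory reference: {seek_vs_mem:.0f}x slower")
```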
Micro & Macro Benchmarks Benchmarking & Stress-Testing the HW & the OS prior to deploying the Cluster Nodes Establish a Sound Performance Baseline
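As a toy illustration of baselining (a trivial stand-in for dedicated suites such as the DHTUX toolset; all function names here are hypothetical), the sketch below times a sequential write and read through a temporary file. Note the read likely comes from the page cache, so it really measures memory, not disk - itself a useful lesson when interpreting micro-benchmark results.

```python
import os
import tempfile
import time

def file_throughput_mb_s(size_mb=64, block=1 << 20):
    """Write then read size_mb of data through a temp file; return
    (write_mb_s, read_mb_s). Illustrative only - not a rigorous benchmark."""
    data = b"\0" * block
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        t0 = time.perf_counter()
        for _ in range(size_mb):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())            # force data to the device
        write_s = time.perf_counter() - t0
    t0 = time.perf_counter()
    with open(path, "rb") as f:         # likely served from the page cache
        while f.read(block):
            pass
    read_s = time.perf_counter() - t0
    os.unlink(path)
    return size_mb / write_s, size_mb / read_s

w, r = file_throughput_mb_s()
print(f"write: {w:.0f} MB/s, read: {r:.0f} MB/s")
```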
Linux I/O Requests
Application (user space) -> File System Layer -> Linux bio Layer -> enqueue function -> I/O task queue / I/O scheduler -> dequeue function -> Device Driver -> Disk/RAID/SAN Subsystem
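One job of the enqueue step in this path is request merging: a new request that is contiguous with one already queued is coalesced, so the driver issues fewer, larger I/Os. A minimal illustrative sketch (not kernel code; the data layout is invented for clarity):

```python
# Illustrative sketch of back-merging at enqueue time: a new request
# that starts exactly where the previous one ends is merged into it.

def enqueue(queue, sector, nsectors):
    """Append (start_sector, length); merge with the tail if contiguous."""
    if queue:
        last_sector, last_len = queue[-1]
        if last_sector + last_len == sector:        # contiguous: back-merge
            queue[-1] = (last_sector, last_len + nsectors)
            return
    queue.append((sector, nsectors))

queue = []
enqueue(queue, 100, 8)   # request A
enqueue(queue, 108, 8)   # contiguous with A -> merged into one request
enqueue(queue, 500, 8)   # not contiguous -> queued separately
print(queue)             # [(100, 16), (500, 8)]
```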
Linux 3.x I/O Schedulers
CFQ (default): distinguishes synchronous versus asynchronous requests, supports I/O priorities, favors read over write requests, uses time-out values
noop: unordered FIFO queue, performs merging only; good for environments where I/O is optimized at a lower level
Deadline: 5 I/O queues, reorders requests, attaches a deadline value to each request, favors read over write requests
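The contrast between noop and Deadline can be sketched with a toy model (this is not the kernel's algorithm - just the core idea that noop dispatches in arrival order, while a deadline-style scheduler serves requests in sector order but promotes any request whose deadline has expired):

```python
# Toy model: each request is (arrival_time, sector, deadline).

def noop_order(requests):
    """noop: dispatch in arrival (FIFO) order, no reordering."""
    return [sector for _, sector, _ in requests]

def deadline_order(requests, now):
    """Deadline-style: sector-sorted service, but expired requests
    jump the queue to bound worst-case latency."""
    expired = sorted((r for r in requests if r[2] <= now), key=lambda r: r[2])
    fresh = sorted((r for r in requests if r[2] > now), key=lambda r: r[1])
    return [r[1] for r in expired] + [r[1] for r in fresh]

reqs = [(0, 900, 50), (1, 100, 5), (2, 500, 60)]
noop = noop_order(reqs)               # [900, 100, 500] - FIFO
dl = deadline_order(reqs, now=10)     # [100, 500, 900] - sector 100 expired,
print(noop, dl)                       # the rest served in sector order
```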
Application Layer - strace
Kernel Layer - blktrace/blkparse
Kernel Layer - blktrace - Summary
Kernel Layer - btt
Kernel Layer - btt (time-line)
perf Linux Performance Tool
valgrind memcheck (Memory Leaks)
valgrind kcachegrind (Call Profiler)
Q & A