Big Fast Data: Hadoop Acceleration with Flash
June 2013
Agenda
- The Big Data Problem
- What is Hadoop
- Hadoop and Flash
- The Nytro Solution
- Test Results
The Big Data Problem
[Image: Facebook friend map]
Big Data approaches:
- Information comes from a wide variety of sources.
- Value can often be derived by combining this with other sources of information.
Traditional relational database approaches:
- Data models are developed based on queries and data-source requirements.
- The traditional process involves significant expense and time.
Traditional approach challenges: time to insight, scale, and importing large quantities of data.
Big Data Answer
Hadoop architecture allows a cluster of commodity servers to work together to solve big data analytical problems.
Hadoop architecture:
- Save everything.
- Scan all of the data via brute force.
- Focus on making brute-force scanning efficient.
Traditional architecture:
- Massage data into a structured database, discarding everything outside of the data model.
- Build an efficient data model to process queries efficiently.
Hadoop can be best understood as a two-step process, Structure & Query, which corresponds to the Hadoop nomenclature of Map & Reduce.
Hadoop
Hadoop architecture is a combination of three components:
1. An implementation of Map-Reduce to utilize clusters more effectively.
2. HDFS, a distributed file system.
3. Bringing the processing to the data, rather than the alternative of bringing the data to the processing.
Hadoop architecture and clusters go together:
- Hadoop utilizes computer hardware components that are cheap and powerful.
- It was developed to allow efficient use of thousands of CPU cores and disks.
- Hadoop is rigid in its processing steps (Map & Reduce) to enable massive horizontal cluster scaling, and it uses multiple passes over a dataset.
Hadoop Design & Flash
Hadoop Data Flow
1. Map (Structure)
2. Shuffle, Sort & Merge (organize structured intermediate data to query)
3. Reduce (Query)
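The three stages above can be sketched in plain Java using word count, the canonical Map-Reduce example. This is an in-memory illustration of the data flow only, not Hadoop's distributed, disk-backed implementation; the class and method names are ours:

```java
import java.util.*;

// Minimal in-memory sketch of the Map -> Shuffle/Sort -> Reduce data flow.
// Plain Java collections stand in for Hadoop's real distributed machinery.
public class MapReduceSketch {
    // 1. Map (Structure): turn raw input into (key, value) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+"))
            out.add(new AbstractMap.SimpleEntry<>(word, 1));
        return out;
    }

    // 2. Shuffle, Sort & Merge: group all values by key, ordered by key.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs)
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        return grouped;
    }

    // 3. Reduce (Query): answer the question per key (here: sum the counts).
    static int reduce(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : new String[]{"big fast data", "big data"})
            mapped.addAll(map(line));
        shuffle(mapped).forEach((word, counts) ->
            System.out.println(word + "\t" + reduce(counts)));
    }
}
```

In the real cluster, the map and reduce functions run on different nodes, and the shuffle step in the middle is what moves the intermediate data between them.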
Where to use Flash: Shuffle, Sort and Merge!
The shuffle, sort and merge (the Shuffle Phase) uses local temporary storage on each node, outside of HDFS. The results of the maps have to be committed to disk before the reduce processes start, and the reducers then fetch this intermediate data over the network. This can be very IO intensive and cannot leverage bringing the processing to the data; instead, the data is brought to the processing nodes.
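A minimal sketch of why this phase hits local disk, again using a toy word-count job: each mapper spills its sorted intermediate output to a temporary local file, and the reducer must read every spill back and merge them before reducing can begin. This is plain Java for illustration, not Hadoop's actual shuffle code:

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// Illustrative sketch of the shuffle phase's local-disk traffic:
// mappers spill sorted intermediate data to temporary files outside HDFS,
// and the reducer fetches and merges all of the spills.
public class ShuffleSpillSketch {
    // A mapper sorts its (word, count) output and spills it to local storage.
    static Path spill(Map<String, Integer> mapOutput) throws IOException {
        Path file = Files.createTempFile("map-spill-", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(file)) {
            for (Map.Entry<String, Integer> e : new TreeMap<>(mapOutput).entrySet())
                w.write(e.getKey() + "\t" + e.getValue() + "\n");
        }
        return file; // committed to disk before any reducer can start
    }

    // The reducer reads every spill back and merges counts by key.
    static SortedMap<String, Integer> fetchAndMerge(List<Path> spills) throws IOException {
        SortedMap<String, Integer> merged = new TreeMap<>();
        for (Path spill : spills)
            for (String line : Files.readAllLines(spill)) {
                String[] kv = line.split("\t");
                merged.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
            }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        List<Path> spills = List.of(
            spill(Map.of("big", 2, "data", 1)),
            spill(Map.of("data", 3, "fast", 1)));
        System.out.println(fetchAndMerge(spills));
        for (Path p : spills) Files.delete(p); // intermediate data is temporary
    }
}
```

Every spill is written once and read back at least once, which is exactly the balanced, IO-heavy read/write pattern that flash caching on the local volumes can absorb.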
Apache Hadoop Map Reduce Local IO Access Pattern
I/O is both random and sequential in different parts of the job. Shuffle reads are random with temporal locality (cache friendly).
Map Reduce Requirements and Guidelines
- Requires high IOPS and high bandwidth for different parts of the shuffle phase.
- The local storage must be large enough to handle the biggest intermediate data set that a cluster node will run; if the directory fills, the job fails.
- Intermediate data is deleted when it is no longer needed.
- Hadoop presents a balanced read/write workload, for which eMLC is the ideal flash media.
The Solution: Nytro MegaRAID
Key features:
- Transparent to applications, file system, OS and device drivers
- Based on industry-hardened MegaRAID technology
- Supports read and write caching
- Integrated in the HBA and runs locally on the controller
- Limited CPU and memory overhead
- Accelerates rebuilds
- Accelerates workloads spanning analytics, OLTP and virtualized servers
- Works with a local HDD array (DAS)
- Seamless, plug-and-play, transparent acceleration for server/workstation storage
Test Environment
Test Environment
Worker nodes:
- 12 cores, 32 GB RAM
- 7 × 500 GB SAS disks
- Mirrored boot drives
- 10 GigE networking
- Apache Hadoop 1.0.2
Volume layout: boot; Map-Reduce local: 7 volumes (1 per disk); HDFS: 7 volumes (1 per disk)
Full Test Setup
- 3 worker nodes
- 1 name node / worker node
- 10 GigE interconnect
Nytro MegaRAID 100 GB TeraSort Run
- 7 disks per node
- No caching: 18 minutes 23 seconds
- With LSI Nytro caching enabled: 12 minutes 15 seconds
- 33% reduction in job completion time
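The quoted 33% reduction can be checked directly from the two job times:

```java
// Verify the reported job-time reduction from the TeraSort run above.
public class SpeedupCheck {
    public static void main(String[] args) {
        int baseline = 18 * 60 + 23; // no caching: 18 min 23 s = 1103 s
        int cached   = 12 * 60 + 15; // Nytro caching: 12 min 15 s = 735 s
        double reduction = 100.0 * (baseline - cached) / baseline;
        System.out.printf("%.0f%% reduction%n", reduction); // prints "33% reduction"
    }
}
```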
Other Requirements for Effective Flash Usage
- No CPU bottlenecks: enough cores per node to keep the storage and network saturated.
- Faster network interfaces to support the shuffle phase: with flash raising storage capabilities, higher-performance networking is recommended (10 GigE or IB).
- Enough local disks to avoid HDFS becoming a bottleneck.
Once these other requirements are met, substantial acceleration with Flash is possible.
LSI Proprietary
Updated Config
Migrate the boot volume onto a small Flash partition, freeing up drives for HDFS. This can cover the cost of the flash caching completely.
- Boot partition: ~20 GB, mirrored
- Map-Reduce local: 9 volumes (1 per disk)
- HDFS: 9 volumes (1 per disk)
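In Hadoop 1.x terms, a per-disk volume layout like the one above maps to comma-separated directory lists via the `mapred.local.dir` and `dfs.data.dir` properties. The mount-point paths below are illustrative placeholders, not the configuration actually used in this test:

```xml
<!-- mapred-site.xml: one Map-Reduce local directory per disk
     (paths are illustrative) -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local,...,/disk9/mapred/local</value>
</property>

<!-- hdfs-site.xml: one HDFS data directory per disk -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/hdfs/data,/disk2/hdfs/data,...,/disk9/hdfs/data</value>
</property>
```

Spreading both the shuffle's local directories and the HDFS data directories across all nine spindles lets Hadoop round-robin IO over every disk.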
Key Takeaways
- Hadoop leverages the fact that computer hardware components are cheap and powerful.
- Hadoop requires high IOPS and high bandwidth for different parts of the shuffle phase.
- Using Flash as a cache is both an effective and cost-effective way to improve Hadoop performance.