Securely explore your data PERFORMANCE MODELS FOR APACHE ACCUMULO: THE HEAVY TAIL OF A SHARED-NOTHING ARCHITECTURE Chris McCubbin Director of Data Science Sqrrl Data, Inc.
I'M NOT ADAM FUCHS But perhaps I'm still an interesting guy. MS in CS from UMBC in Network Security and Quantum Computing. 8 years at JHU/APL working on UxV Swarms. 4 years at JHU/APL and TexelTek creating Big Data Applications for the NSA. Co-founder and Director of Data Science at Sqrrl. 2
SO, YOUR DISTRIBUTED APPLICATION IS SLOW Today's distributed applications run on tens or hundreds of library components. Many versions, so internet advice could be ineffective, or worse, flat-out wrong. Hundreds of settings, some of which, shall we say, could be better documented. Shared-nothing architectures are usually shared-little architectures with tricky interactions. Profiling is hard and time-consuming. What do we do? 3
TODAY'S TALK 1. Quick intro to performance optimization 2. Tricks and techniques for modeling distributed applications to target performance improvements 3. A deep dive into improving bulk load application performance 4
The Apache Accumulo sorted, distributed key/value store is a secure, robust, scalable, high-performance data storage and retrieval system. Many applications in real-time storage and analysis of big data: "Spatio-temporal indexing in non-relational distributed databases" - Fox et al., 2013 IEEE International Congress on Big Data; "Big Data Dimensional Analysis" - Gadepally et al., IEEE HPEC 2014. Leading its peers in performance and scalability: "Achieving 100,000,000 database inserts per second using Accumulo and D4M" - Kepner et al., IEEE HPEC 2014; "An NSA Big Graph experiment" (Technical Report NSA-RD-2013-056002v1); "Benchmarking Apache Accumulo BigData Distributed Table Store Using Its Continuous Test Suite" - Sen et al., 2013 IEEE International Congress on Big Data. For more papers and presentations, see http://accumulo.apache.org/papers.html 5
SCALING UP: DIVIDE & CONQUER Collections of KV pairs form Tables. Tables are partitioned into Tablets. Metadata tablets hold info about other tablets, forming a 3-level hierarchy. A Tablet is a unit of work for a Tablet Server. [Diagram: a well-known location (ZooKeeper) points to the Root Tablet; the Root Tablet points to Metadata Tablets; Metadata Tablets point to the Data Tablets of tables such as "Adam's Table", "Encyclopedia" (split at rows "Ocelot" and "Yak"), and "Foo".] 6
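The tablet lookup described above can be sketched in a few lines. This is an illustrative toy, not Accumulo's real metadata format: each level maps sorted tablet end-rows to tablets of the level below, and a read walks root -> metadata -> data. The table name and split points are taken from the diagram; the tablet labels are hypothetical.

```python
def find_tablet(tablets, key):
    """tablets: list of (end_row, tablet) sorted by end_row; None means +infinity.
    A tablet covers the key range (previous end_row, end_row]."""
    for end_row, tablet in tablets:
        if end_row is None or key <= end_row:
            return tablet

# Data tablets of the "Encyclopedia" table, split at rows "Ocelot" and "Yak":
data_tablets = [("Ocelot", "tablet-A"), ("Yak", "tablet-B"), (None, "tablet-C")]
print(find_tablet(data_tablets, "Penguin"))  # tablet-B
```

The same function would serve each level of the 3-level hierarchy, with metadata tablets holding (end_row, child-tablet) entries instead of user data.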
PERFORMANCE ANALYSIS CYCLE [Diagram: Start by creating a model; then simulate & experiment, analyze, refine the model, and modify code, repeating the cycle. Outputs: better code + models.] 7
MAKING A MODEL Determine low-impact points to collect metrics; add some if needed. Create parallel state-machine models with components driven by these metrics. Estimate running times and bottlenecks from a priori information and/or apply measured statistics. Focus testing on validation of the initial model and the (estimated) pain points. Apply Amdahl's Law. Rinse, repeat. 8
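The Amdahl's Law step is just arithmetic, and it is worth running before optimizing anything. A minimal sketch, using the hypothetical phase times from the bulk-ingest model later in this talk (46 s map + 168 s reduce + 17 s register = 231 s):

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: overall speedup when a fraction p of the runtime
    is accelerated by a factor of n."""
    return 1.0 / ((1.0 - p) + p / n)

# Even an effectively unbounded speedup of the 168 s reduce phase
# caps the whole 231 s job at roughly 3.7x.
print(round(amdahl_speedup(168 / 231, 1e12), 2))  # 3.67
```

This is why the model matters: it tells you the maximum payoff of attacking any single component before you spend time profiling it.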
BULK INGEST OVERVIEW Accumulo supports two mechanisms to bring data in: streaming ingest and bulk ingest. Bulk ingest goal: maximize throughput without constraining latency. Create a set of Accumulo RFiles, then register those files with Accumulo. RFiles are groups of sorted key-value pairs with some indexing information. MapReduce has a built-in key-sorting phase: a good fit to produce RFiles. 9
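The RFile idea (sorted pairs plus an index) can be illustrated with a toy in-memory version. This is only a sketch of the concept, not Accumulo's actual on-disk format: a sorted run of key/value pairs and a sparse index of every Nth key that narrows a lookup to one block before scanning.

```python
from bisect import bisect_left

class ToyRFile:
    """Toy illustration of an RFile-like structure: sorted KV pairs
    plus a sparse index (every Nth key) for fast seeks."""
    def __init__(self, pairs, index_interval=2):
        self.pairs = sorted(pairs)                 # RFiles store keys sorted
        self.index = self.pairs[::index_interval]  # sparse "index blocks"
        self.interval = index_interval

    def lookup(self, key):
        # Binary-search the index to find the block, then scan it linearly.
        i = bisect_left([k for k, _ in self.index], key)
        start = max(i - 1, 0) * self.interval
        for k, v in self.pairs[start:start + 2 * self.interval]:
            if k == key:
                return v
        return None
```

Because MapReduce hands each reducer its keys already sorted, writing such a file is a single sequential pass, which is what makes it a good fit for bulk ingest.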
BULK INGEST MODEL [Timeline: Map -> Reduce -> Register.] 10
BULK INGEST MODEL Hypothetical resource usage: Map: 100% CPU, 20% disk, 0% network, 46 seconds. Reduce: 40% CPU, 100% disk, 20% network, 168 seconds. Register: 10% CPU, 20% disk, 40% network, 17 seconds. 11
INSIGHT Spare disk here, spare CPU there: can we even out resource consumption? Why did reduce take 168 seconds? It should be more like 40 seconds. No clear bottleneck during registration: is there a synchronization or serialization problem? 12
LOOKING DEEPER: REFINED BULK INGEST MODEL [Diagram: Map thread: Map Setup, Map, Sort, Spill, Merge, Serve. In parallel, after a latch, Reduce thread: Shuffle, Sort, Reduce, Output.] 13
BULK INGEST MODEL PREDICTIONS We can constrain parts of the model by physical throughput limitations: Disk -> memory (100 MB/s avg. 7200 rpm sequential read rate): input reader. Memory -> disk (100 MB/s): spill, output writer. Disk -> disk (50 MB/s): merge. Network (gigabit = 125 MB/s): shuffle. And/or algorithmic limitations: sort, (our) map, (our) reduce, SerDe. 14
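These physical limits turn directly into lower bounds on phase times. A back-of-envelope sketch, using the ~7.3 GB of spilled map output and the 8 disks reported on the instrumentation slide (the per-disk bandwidth and the assumption that disks are used in parallel are the model's, not measured):

```python
MB = 1e6
limits = {"read": 100 * MB, "write": 100 * MB, "merge": 50 * MB, "net": 125 * MB}
disks = 8

def lower_bound_seconds(nbytes, limit, lanes=1):
    """Fastest possible time if the phase is purely bound by one resource,
    with `lanes` independent copies of that resource (e.g. disks)."""
    return nbytes / (limit * lanes)

# Spilling ~7.3 GB of map output across 8 disks at 100 MB/s each:
print(round(lower_bound_seconds(7_325_671_992, limits["write"], disks), 1))
```

If a measured phase runs far above its bound, as reduce did (168 s vs. an expectation near 40 s), the model says to look for an algorithmic or serialization cost rather than a hardware limit.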
PERFORMANCE GOAL MODEL Performance goals obtained through: simulation of individual components; prediction of available resources at runtime. 15
INSTRUMENTATION
SYSTEM DATA: application version 1.3.3; application sha 8d17baf8; description baseline; input type arcsight; node num 1; cores physical 12; cores logical 24; disk num 8; disk bandwidth 100; replication 1; monitoring TRUE; map num containers 20; red num containers 20
INPUT/OUTPUT: input block size 32; input block count 20; input total 672054649; output map 9313303723; output map:combine 7301374577; output map:combine input records 243419324; output map:combine records out 209318830; output map:spill 7325671992; output final 573802787
TIME (seconds): map:setup avg 8; map:map avg 12; map:sort avg 12; map:spill avg 12; map:spill count 7; map:merge avg 46; map total 290; red:shuffle avg 6; red:merge avg 38; red:reduce avg 68; red:total avg 112; red:reducer count 20; job:total 396
RATIOS: input explosion factor 13.877904; compression intermediate 1.003327786; load combiner output 0.783972562; total ratio 0.786581455; effective MB/sec 1.618488025
CONSTANTS: avg schema entry size (bytes) 59
HADOOP/YARN SETTINGS: yarn.nodemanager.resource.memory-mb 43008; yarn.scheduler.minimum-allocation-mb 2048; yarn.scheduler.maximum-allocation-mb 43008; yarn.app.mapreduce.am.resource.mb 2048; yarn.app.mapreduce.am.command-opts -Xmx1536m; mapreduce.map.memory.mb 2048; mapreduce.map.java.opts -Xmx1638m; mapreduce.reduce.memory.mb 2048; mapreduce.reduce.java.opts -Xmx1638m; mapreduce.task.io.sort.mb 100; mapreduce.map.sort.spill.percent 0.8; mapreduce.task.io.sort.factor 10; mapreduce.reduce.shuffle.parallelcopies 5; mapreduce.job.reduce.slowstart.completedmaps 1; mapreduce.map.output.compress FALSE; mapred.map.output.compression.codec n/a
16
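Several of the derived ratios on this slide can be cross-checked from the raw counters. The definitions below are inferred from the numbers (they reproduce the slide's values to within rounding), including the assumption that "effective MB/sec" means MiB of input per second of total job time:

```python
# Raw counters copied from the instrumentation table (bytes and seconds).
output_map        = 9_313_303_723   # total map output
combine_out       = 7_301_374_577   # map output after the combiner
spill_out         = 7_325_671_992   # bytes actually spilled to disk
input_total       =   672_054_649   # input bytes
job_total_seconds =           396

compression_intermediate = spill_out / combine_out    # spill expansion vs. combined output
load_combiner_output     = combine_out / output_map   # combiner's size reduction
total_ratio              = compression_intermediate * load_combiner_output
effective_mib_per_sec    = input_total / 2**20 / job_total_seconds
```

Wiring the derived metrics to the raw counters like this is cheap insurance: if a later run's ratios stop reproducing, the instrumentation itself is suspect.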
PERFORMANCE MEASUREMENT Baseline (naive implementation) [Measured timeline: Map thread (Map Setup, Map, Sort, Spill, Merge, Serve); Reduce thread (Shuffle, Sort, Reduce, Output).] 17
PATH TO IMPROVEMENT 1. Profiling revealed much time spent serializing/deserializing Keys. 2. With proper configuration, MapReduce supports comparison of keys in serialized form. 3. Rewriting Key's serialization led to an order-preserving encoding, easy to compare in serialized form. 4. Configure MapReduce to use native code to compare Keys. 5. Tweak map input size and spill memory for as few spills as possible. 18
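The core trick in steps 2-3 is an encoding whose raw bytes sort the same way as the decoded fields, so the sort never has to deserialize. A minimal illustrative sketch (not Accumulo's actual Key format): a NUL-terminated row followed by a big-endian fixed-width timestamp, assuming rows contain no NUL bytes.

```python
import struct

def encode(row: bytes, ts: int) -> bytes:
    """Order-preserving encoding: comparing the raw bytes (memcmp-style)
    gives the same order as comparing the decoded (row, ts) fields."""
    # Row first, terminated by 0x00 so shorter rows sort before extensions;
    # then the timestamp as big-endian unsigned, so byte order == numeric order.
    return row + b"\x00" + struct.pack(">Q", ts)

keys = [(b"ocelot", 5), (b"aardvark", 9), (b"ocelot", 2), (b"yak", 1)]
by_bytes  = sorted(keys, key=lambda k: encode(*k))
by_fields = sorted(keys)          # logical order: row, then timestamp
assert by_bytes == by_fields
```

Once this property holds, the shuffle sort can compare serialized buffers directly (and even hand the job to native byte comparators), which is where the big sort speedup came from.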
PERFORMANCE MEASUREMENT Optimized sorting Improvements: Time for map-side merge went down. Sort performance drastically improved in both map and reduce phases: 300% faster. 19
PERFORMANCE MEASUREMENT Optimized sorting [Measured timeline: Map thread (Map Setup, Map, Sort, Spill, Merge, Serve); Reduce thread (Shuffle, Sort, Reduce, Output).] Insights: Map is slower than expected. Output is disk bound; maybe we can move more processing to Reduce (reverse Amdahl's law). Intermediate data inflation ratio (output/input for map) is very high. 20
PATH TO IMPROVEMENT 1. Profiling revealed much time spent copying data. 2. Evaluation of data passed from map to reduce revealed inefficiencies: Constant timestamp cost 8 bytes per key. Repeated column names could be encoded/compressed. Some Key/Value pairs didn't need to be created until reduce. 21
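One way to attack the "repeated column names" cost is simple dictionary encoding of the intermediate records. A hypothetical sketch (column names here are invented ArcSight-style fields, not taken from the actual application): each distinct column name is assigned a small integer id on first use, and only the id travels through the shuffle.

```python
def dictionary_encode(records):
    """Replace repeated column names with small dictionary ids."""
    columns, encoded = {}, []
    for column, value in records:
        cid = columns.setdefault(column, len(columns))  # first use assigns an id
        encoded.append((cid, value))
    return columns, encoded

records = [("srcAddress", "10.0.0.1"), ("dstAddress", "10.0.0.9"),
           ("srcAddress", "10.0.0.2")]
columns, encoded = dictionary_encode(records)
```

Combined with dropping the constant 8-byte timestamp until reduce, this kind of slimming is what cut the intermediate inflation ratio and sped up every step between map and reduce.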
PERFORMANCE MEASUREMENT Optimized map code Improvements: Big speedup in map function: twice as fast. Reduced intermediate inflation sped up all steps between map and reduce. 22
DO TRY THIS AT HOME Hints for Accumulo application optimization. With these steps, we achieved a 6X speedup: Perform comparisons on serialized objects. With MapReduce, calculate how many merge steps are needed. Avoid premature data inflation. Leverage compression to shift bottlenecks. Always consider how fast your code should run. 23
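The "calculate how many merge steps are needed" hint is a one-liner. This is a simplified estimate (it ignores MapReduce's uneven-first-pass optimization): with `mapreduce.task.io.sort.factor` = f, each merge pass combines up to f runs, so s spill files need about ceil(log_f(s)) passes.

```python
import math

def merge_passes(spills, sort_factor):
    """Approximate number of merge passes over s spill files
    when each pass can merge up to sort_factor runs."""
    if spills <= 1:
        return 0          # a single spill needs no merging at all
    return math.ceil(math.log(spills, sort_factor))

# The baseline run above had 7 spills with the default factor of 10:
print(merge_passes(7, 10))  # 1
```

This is why the talk tweaks spill memory: getting down to a single spill file eliminates the map-side merge (a disk -> disk pass at only ~50 MB/s) entirely.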
SOME CURRENT ACCUMULO PERFORMANCE PROJECTS Optimize metadata operations Batch to improve throughput (ACCUMULO-2175, ACCUMULO-2889) Remove from critical path where possible Optimize write-ahead log performance Maximize throughput Reduce flushes Parallelize WALs (ACCUMULO-1083) Avoid downtime by pre-allocating 24
SQRRL IS HIRING! QUESTIONS? Chris McCubbin Director of Data Science Sqrrl Data, Inc.