Introducing EEMBC Cloud and Big Data Server Benchmarks




Quick Background: Industry-Standard Benchmarks for the Embedded Industry
- EEMBC formed in 1997 as non-profit consortium
- Defining and developing application-specific benchmarks
- Targeting processors and systems
- Expansive industry support
  - >47 members
  - >90 commercial licensees
  - >120 university licensees

General Characteristics of Cloud and Big Data
- Drinking from the fire hose
- Distribute data to many compute nodes
- Graph analytics
- Hadoop MapReduce
- Unstructured data search and indexing
[Diagram labels: IoT, big data influx, data center interconnect]

Traditional Method of Measuring Server Performance
- Single-threaded program(s)
  - Databases
  - Compilers
  - Interpreters
- Single or a few machines
- Most successful are CPU/memory benchmarks (examples)
  - LINPACK
  - SPECint
  - LMbench
  - CoreMark
SPECint is a registered trademark of the Standard Performance Evaluation Corporation (SPEC)

How Cloud and Big Data Workloads Differ
- CPU: CPU/memory speed
- Transaction: access and update data
- Scale-out analysis: generate insight
- Data sets typically larger
  - Trending towards petabytes
  - Rapid growth
- Many-node environment
  - Distributed data (e.g. HDFS)
  - Distributed computation
- Nodes often special purpose
  - Web server
  - Database server
  - Caching layer
  - MapReduce cluster
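The "distributed data, distributed computation" pattern above can be sketched in a few lines. This is only an illustration under stated assumptions: the "nodes" are plain Python lists in one process, and the function names (partition, map_phase, reduce_phase, word_count) are hypothetical, not part of Hadoop or any EEMBC benchmark.

```python
from collections import Counter
from functools import reduce


def partition(records, num_nodes):
    """Spread records round-robin across num_nodes simulated nodes."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def map_phase(shard):
    """Each node counts words in its own shard only (local computation)."""
    counts = Counter()
    for line in shard:
        counts.update(line.split())
    return counts


def reduce_phase(partials):
    """Merge the per-node partial counts into one global result."""
    return reduce(lambda a, b: a + b, partials, Counter())


def word_count(records, num_nodes=3):
    shards = partition(records, num_nodes)
    return reduce_phase(map_phase(shard) for shard in shards)
```

The result should not depend on how many nodes the data is spread over; only the amount of work each node does changes.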

Introducing EEMBC Cloud and Big Data Server Benchmark Working Group
- Goal: Provide an industry-standard suite of performance and efficiency benchmarks that address the needs of ODMs and OEMs providing compute systems to the scale-out datacenter marketplace and their consumers.
- Phased rollout starting with standalone workloads
- First phase will comprise graph analytics, memory caching, and media serving
- Chaired by Narayan Iyengar, Lead Software Engineer at Cavium, Inc.

Industry Benchmark Qualifications
- Automated install and build process ensures consistent execution (multiplatform support)
- Relatively low cost to implement (does not require a large or expensive infrastructure)
- Predictable performance at scale
- Repeatable, verifiable, and certifiable, as in other EEMBC benchmarks

Memory Caching Analysis
Basics
- Caching is used in data centers to optimize performance and energy usage
- Memcached is middleware that provides a caching layer to a web framework
  http://en.wikipedia.org/wiki/memcached
EEMBC version
- Provide web workloads that mimic real-world scenarios
- Provide a mechanism to run repeatable and verifiable experiments
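A minimal sketch of the caching-layer idea follows. It assumes an in-process dictionary as a stand-in for Memcached and a hypothetical slow_backend_lookup function in place of a real web framework or database; it is not the EEMBC workload. The fixed random seed shows one simple way to keep an experiment repeatable.

```python
import random


class CacheAside:
    """Look-aside cache in front of a slower backend (a dict stands in for Memcached)."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        # Miss: fetch from the backend and keep a copy for later requests.
        self.misses += 1
        value = self.backend(key)
        self.store[key] = value
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def slow_backend_lookup(key):
    # Hypothetical backend; a real one would query a database or render a page.
    return f"page-for-{key}"


def run_workload(num_requests=1000, num_keys=50, seed=42):
    rng = random.Random(seed)  # fixed seed so every run issues the same requests
    cache = CacheAside(slow_backend_lookup)
    for _ in range(num_requests):
        cache.get(rng.randrange(num_keys))
    return cache
```

Calling run_workload().hit_rate() reports the fraction of requests served from the cache; with far fewer keys than requests, almost every request after warm-up is a hit.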

Media Serving
Basics
- Real-time video streaming function for on-demand access, using large server clusters to packetize and transmit media files
- Automatically adjusts quality based on various pre-encoded formats and bit rates to suit a wide client base
- Example media streaming services include Netflix, YouTube, and Pandora
EEMBC version
- Simulate multiple users making requests simultaneously and asynchronously
- Provide a mechanism to run repeatable and verifiable experiments for how well clients are being serviced
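The two ideas on this slide, choosing among pre-encoded bit rates and issuing requests from many clients asynchronously, can be sketched as below. The bit-rate ladder values and all function names are assumptions made for illustration; no real streaming server or EEMBC code is involved, and no network traffic is generated.

```python
import asyncio

# Hypothetical pre-encoded bit-rate ladder in kbit/s (illustrative values only).
BITRATE_LADDER_KBPS = [400, 1200, 2500, 5000]


def pick_bitrate(client_kbps, ladder=BITRATE_LADDER_KBPS):
    """Return the highest pre-encoded rate the client can sustain, else the lowest."""
    eligible = [rate for rate in ladder if rate <= client_kbps]
    return max(eligible) if eligible else min(ladder)


async def serve_client(client_id, client_kbps, segments):
    """Simulate one client requesting a number of media segments."""
    rate = pick_bitrate(client_kbps)
    for _ in range(segments):
        await asyncio.sleep(0)  # yield so other clients' requests interleave
    return client_id, rate, segments


async def run_clients(bandwidths_kbps, segments=3):
    """Run all simulated clients concurrently and collect their results in order."""
    tasks = [serve_client(i, bw, segments) for i, bw in enumerate(bandwidths_kbps)]
    return await asyncio.gather(*tasks)
```

A run such as asyncio.run(run_clients([300, 1500, 8000])) returns one (client, bit rate, segments) tuple per simulated client, which a harness could compare against expected service levels.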

Graph Analytics
Basics
- Take big-data data sets (e.g. social media output) and analyze them using graph algorithms (find connectivity, qualities common to nodes)
- Example is PageRank: deriving website popularity from social data
- Also used for applications such as Facebook and Twitter
EEMBC version
- Standardized implementation of PageRank using GraphLab
- Provide a mechanism to run repeatable and verifiable experiments on a multi-node platform
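For readers unfamiliar with PageRank, here is a minimal single-machine sketch using power iteration. It is plain Python, not GraphLab and not the EEMBC implementation; the damping factor of 0.85 and the iteration count are conventional defaults chosen here as assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of outgoing links."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

On a small graph such as {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}, node "c" ends up with the highest rank because three of the four nodes link to it.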

EEMBC's Expanding Scope
[Diagram: four benchmark targets, each shown against the components CPU, Memory, Storage, Network I/O, and Data Center I/O]
- Traditional EEMBC target: CPU vendor
- Expanded EEMBC target: SoC vendor
- EEMBC transition: system vendor
- Requires benchmark scaling: cloud vendor
Processors -> SoCs -> Systems

EEMBC's Expanding Scope
- SoC integration requires testing more than CPU and memory
- Focus on real-world benchmarking
- Single-purpose servers/clusters run a small set of applications
- Hardware configured for an application
  - Memory size
  - CPU scalar performance vs. throughput
  - Storage capacity
  - Hardware accelerators

Cloud and Big Data Benchmarks the EEMBC Way
- EEMBC has a long track record of producing reliable, equitable benchmarks
- Open, multi-partner cooperative working group
- Participating members include Cavium, Imagination Technologies, Intel, and others (pending permission to announce)
- Join this working group and help influence the future of cloud and big data benchmarking
- Contact Markus.levy@eembc.org

Backup

CPU Benchmarks Aren't Fit for Big Data and Cloud
- SPECint2006: today's server CPU benchmark standard
  - A mixture of cache-friendly and very memory-intensive applications from a variety of fields
  - CPU focused (scalar performance)
  - Not a distributed application
  - Essentially no I/O (network or disk)
  - No operating system or hypervisor impact
  - SPECrate is a simple aggregation of SPECint
    - No cooperative tasks
    - No sharing, no communication
- EEMBC MultiBench
  - Similar to SPECint2006 with the exception of operating system impact and inclusion of cooperative tasks

Why Transaction-Oriented Benchmarks Are Not Suitable for Cloud and Big Data
- TPC
  - Includes system overhead
  - Can be large (and expensive to set up and run)
  - Generally requires a big system
- SPECjbb
  - Requires Java: is it a Java benchmark?
  - Similar transaction model to TPC-like benchmarks

Other Benchmarks
- SPEC OSG Working Group*
  - Addresses cloud environment (SaaS, PaaS, IaaS)
  - Hardware and cloud providers and cloud customers
  - Black-box and white-box environments
  - Agility, elasticity, provisioning, etc.
- EPFL CloudSuite
  - Specific sets of workloads
  - Does not address SaaS, PaaS, or IaaS specifically
  - Great for academic focus, but not designed for ease of use, verification, and validity
* As described by OSG Cloud Subcommittee Report

Significantly Different Instruction Miss Rate
[Chart: instruction misses per thousand instructions (axis 0 to 160) for the CloudSuite workloads data caching, data serving, map reduce, media streaming, SAT solver, web frontend, and web search, compared with SPECint2006, TPC-C, and TPC-E]
See Ferdman et al., ACM Transactions on Computer Systems, Nov. 2012 (compares CloudSuite characteristics to SPEC, TPC, PARSEC)
- Large I-cache footprint
- Lower IPC
- Lower MLP (memory-level parallelism)
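The metric on this chart, instruction misses per thousand instructions, is a simple ratio of two hardware counter readings. A small sketch, assuming the two counts have already been collected by some profiling tool; the function name and the sample numbers in the usage note are hypothetical and are not taken from the chart or the cited paper.

```python
def misses_per_kilo_instruction(instruction_misses, instructions_retired):
    """Instruction-cache misses per 1,000 retired instructions."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return instruction_misses * 1000 / instructions_retired
```

For example, 30,000 instruction-cache misses over 1,000,000 retired instructions works out to 30 misses per thousand instructions.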