Introducing EEMBC Cloud and Big Data Server Benchmarks
Quick Background: Industry-Standard Benchmarks for the Embedded Industry
- EEMBC formed in 1997 as a non-profit consortium
- Defining and developing application-specific benchmarks
- Targeting processors and systems
- Expansive industry support
  - >47 members
  - >90 commercial licensees
  - >120 university licensees
General Characteristics of Cloud and Big Data
- Drinking from the fire hose
- Distribute data to many compute nodes
- Graph analytics
- Hadoop MapReduce
- Unstructured data search and indexing
- IoT big data influx
- Data center interconnect
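As a rough illustration of the map reduce pattern named above, here is a minimal single-process sketch in Python. It is not Hadoop code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and each input string stands in for a data chunk that a real cluster would hand to a separate compute node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one data chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by key across chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map, shuffle, and reduce over a list of text chunks."""
    return reduce_phase(shuffle(map_phase(chunk) for chunk in chunks))
```

In a real deployment the map and reduce steps run on different nodes and the shuffle moves data over the network, which is why interconnect and I/O matter for these workloads.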
Traditional Method of Measuring Server Performance
- Single-threaded program(s)
  - Databases
  - Compilers
  - Interpreters
- Single or a few machines
- Most successful are CPU/memory benchmarks (examples)
  - LINPACK
  - SPECint
  - LMbench
  - CoreMark
SPECint is a registered trademark of the Standard Performance Evaluation Corporation (SPEC).
How Cloud and Big Data Workloads Differ
- CPU: CPU/memory speed
- Transaction: access and update data
- Scale-out: analysis, generate insight
- Data sets typically larger
  - Trending towards petabytes
  - Rapid growth
- Many-node environment
  - Distributed data (e.g. HDFS)
  - Distributed computation
- Nodes often special purpose
  - Webserver
  - Database server
  - Caching layer
  - Map reduce cluster
Introducing EEMBC Cloud and Big Data Server Benchmark Working Group
- Goal: Provide an industry-standard suite of performance and efficiency benchmarks that address the needs of ODMs and OEMs providing compute systems to the scale-out datacenter marketplace and their consumers.
- Phased rollout starting with standalone workloads
- First phase will comprise graph analytics, memory caching, and media serving
- Chaired by Narayan Iyengar, Lead Software Engineer at Cavium, Inc.
Industry Benchmark Qualifications
- Automated install and build process ensures consistent execution (multiplatform support)
- Relatively low cost to implement (does not require a large or expensive infrastructure)
- Predictable performance at scale
- Repeatable, verifiable, and certifiable, as in other EEMBC benchmarks
Memory Caching Analysis Basics
- Caching is used in data centers to optimize performance and energy usage
- Memcached is middleware that provides a caching layer to a web framework
  http://en.wikipedia.org/wiki/memcached
- EEMBC version
  - Provide web workloads that mimic real-world scenarios
  - Provide a mechanism to run repeatable and verifiable experiments
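To make the idea of a repeatable caching experiment concrete, here is a minimal sketch in Python. It does not use memcached or the EEMBC workload; `LRUCache` and `run_workload` are hypothetical names, the cache is an in-process stand-in for a caching layer, and the skewed key distribution is an assumption chosen only to resemble web traffic where a few items are very popular. Fixing the random seed is what makes the run repeatable.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny in-process stand-in for a memcached-style caching layer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)  # mark as most recently used
        return self.store[key]

    def set(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used


def run_workload(num_requests, num_keys, capacity, seed):
    """Replay a seeded, skewed key stream and return the cache hit rate."""
    rng = random.Random(seed)
    cache = LRUCache(capacity)
    hits = 0
    for _ in range(num_requests):
        # Assumed popularity model: low-numbered keys are requested most.
        key = int(rng.paretovariate(1.2)) % num_keys
        if cache.get(key) is not None:
            hits += 1
        else:
            # Cache miss: simulate fetching from the backing store.
            cache.set(key, f"value-{key}")
    return hits / num_requests
```

Calling `run_workload` twice with the same arguments returns the same hit rate, which is the property a verifiable experiment needs; changing `capacity` shows how cache size affects the result.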
Media Serving Basics
- Real-time video streaming function for on-demand access, using large server clusters to packetize and transmit media files
- Automatically adjusts quality based on various pre-encoded formats and bit-rates to suit a wide client base
- Example media streaming services include Netflix, YouTube, and Pandora
- EEMBC version
  - Simulate multiple users making requests simultaneously and asynchronously
  - Provide a mechanism to run repeatable and verifiable experiments for how well clients are being serviced
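The "many users making asynchronous requests" idea can be sketched with Python's `asyncio`. This is only an illustration under stated assumptions, not the EEMBC harness: `serve_segment`, `client`, and `simulate` are hypothetical names, no network traffic is generated, and the bit-rate ladder and 2-second segment length are made-up values.

```python
import asyncio
import random


async def serve_segment(bitrate_kbps, segment_seconds):
    """Hypothetical server: pretend to packetize one media segment."""
    await asyncio.sleep(0)  # yield to other clients; no real I/O happens
    return bitrate_kbps * segment_seconds * 1000 // 8  # segment size in bytes


async def client(client_id, bitrates, segments, seed):
    """One simulated viewer picking a seeded bit-rate for each segment."""
    rng = random.Random(seed + client_id)
    total_bytes = 0
    for _ in range(segments):
        total_bytes += await serve_segment(rng.choice(bitrates), 2)
    return total_bytes


async def run_clients(num_clients, bitrates, segments, seed):
    """Start all clients concurrently and collect bytes served per client."""
    tasks = [client(i, bitrates, segments, seed) for i in range(num_clients)]
    return await asyncio.gather(*tasks)


def simulate(num_clients, bitrates, segments, seed):
    """Synchronous entry point returning a list of per-client byte totals."""
    return asyncio.run(run_clients(num_clients, bitrates, segments, seed))
```

For example, `simulate(8, [400, 1200, 4800], 5, seed=1)` returns eight byte totals, one per simulated client. A real experiment would replace `serve_segment` with requests to an actual media server and record per-client latency to judge how well clients are being serviced.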
Graph Analytics Basics
- Take big-data data sets (e.g. social media output) and analyze them using graph algorithms (find connectivity, common qualities of nodes)
- Example is PageRank: deriving website popularity from social data
- Also used for applications such as Facebook and Twitter
- EEMBC version
  - Standardized implementation of PageRank using GraphLab
  - Provide a mechanism to run repeatable and verifiable experiments on a multi-node platform
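For readers unfamiliar with PageRank, here is a minimal single-machine sketch of the power-iteration form of the algorithm in plain Python. It is not the GraphLab-based EEMBC implementation and does not use the GraphLab API; the function name, the adjacency-dictionary input format, and the default damping factor of 0.85 are assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank over a dict of node -> list of out-links."""
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            for v in targets:
                # Each node shares its rank equally among its out-links.
                new_rank[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new_rank[x] - rank[x]) for x in nodes)
        rank = new_rank
        if delta < tol:  # stop once the ranks have settled
            break
    return rank
```

On a small graph such as `{"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}`, node `c` ends up with the highest rank because three other nodes link to it. A multi-node version partitions the graph across machines and exchanges rank contributions each iteration.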
EEMBC's Expanding Scope
[Figure: four stages of benchmark scope, each drawn over the same component stack: CPU, Memory, Storage, Network I/O, Data Center I/O]
- Traditional EEMBC Target - CPU Vendor
- Expanded EEMBC Target - SoC Vendor
- EEMBC Transition - System Vendor
- Requires Benchmark Scaling - Cloud Vendor
Processors -> SoCs -> Systems
EEMBC's Expanding Scope
- SoC integration requires testing more than CPU and memory
- Focus on real-world benchmarking
  - Single-purpose servers/clusters run a small set of applications
  - Hardware configured for an application
    - Memory size
    - CPU scalar performance vs. throughput
    - Storage capacity
    - Hardware accelerators
Cloud and Big Data Benchmarks the EEMBC Way
- EEMBC has a long track record of producing reliable, equitable benchmarks
- Open, multi-partner cooperative working group
- Participating members include Cavium, Imagination Technologies, Intel, and others (pending permission to announce)
- Join this working group and help influence the future of cloud and big data benchmarking
- Contact Markus.levy@eembc.org
Backup
CPU Benchmarks Aren't Fit for Big Data and Cloud
- SPECint2006: today's server CPU benchmark standard
  - A mixture of cache-friendly and very memory-intensive applications from a variety of fields
  - CPU focused (scalar performance)
  - Not a distributed application
  - Essentially no I/O (network or disk)
  - No operating system or hypervisor impact
- SPECrate is a simple aggregation of SPECint
  - No cooperative tasks
  - No sharing, no communication
- EEMBC MultiBench
  - Similar to SPECint2006, with the exception of operating system impact and inclusion of cooperative tasks
Why Transaction-Oriented Benchmarks Are Not Suitable for Cloud and Big Data
- TPC
  - Includes system overhead
  - Can be large (and expensive to set up and run)
  - Generally requires a big system
- SPECjbb
  - Requires Java - is it a Java benchmark?
  - Similar transaction model to TPC-like benchmarks
Other Benchmarks
- SPEC OSG Working Group*
  - Addresses cloud environment (SaaS, PaaS, IaaS)
  - Hardware and cloud providers and cloud customers
  - Black box and white box environments
  - Agility, elasticity, provisioning, etc.
- EPFL CloudSuite
  - Specific sets of workloads
  - Does not address SaaS, PaaS, or IaaS specifically
  - Great for academic focus, but not designed for ease of use, verification, and validity
* As described by OSG Cloud Subcommittee Report
Significantly Different Instruction Miss Rate
[Figure: bar chart of instruction misses per thousand instructions (y-axis, 0 to 160). CloudSuite workloads shown: data caching, data serving, map reduce, media streaming, sat solver, web frontend, web search. Compared against SPECint2006, TPC-C, and TPC-E.]
- Large I-cache footprint
- Lower IPC
- Lower MLP (memory-level parallelism)
See Ferdman et al., ACM Transactions on Computer Systems, Nov. 2012 (compares CloudSuite characteristics to SPEC, TPC, PARSEC)
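The metric on the chart's y-axis, instruction misses per thousand instructions, and the IPC figure mentioned alongside it are both simple ratios of hardware counter values. The helper names below (`mpki`, `ipc`) are hypothetical, and any counter values passed in are assumed to come from a profiler of the reader's choice; none are taken from the cited paper.

```python
def mpki(misses, instructions):
    """Misses per thousand (kilo) instructions retired."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * 1000 / instructions


def ipc(instructions, cycles):
    """Instructions retired per clock cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles
```

As a made-up example, 25,000 instruction-cache misses over 1,000,000 retired instructions gives `mpki(25_000, 1_000_000)`, which is 25 misses per thousand instructions.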