BigOP: Generating Comprehensive Big Data Workloads as a Benchmarking Framework [1]
May 17, 2014

Outline
1 OBJECTIVE
2 SUMMARY
3 OBSERVATIONS
4 POSITIVE POINTS
5 NEGATIVE POINTS
6 PROBLEMS IDENTIFIED
7 FEEDBACK
8 References

OBJECTIVE
BigOP is part of BigDataBench, an open-source big data benchmarking project. It features the abstraction of representative operation sets, workload patterns, and prescribed tests.

SUMMARY
We have entered an era of big data and thus face the problem of choosing the right system for processing it. Benchmarking is an effective way to evaluate and compare systems. BigOP is an end-to-end benchmarking framework that automatically generates tests with comprehensive workloads for big data systems. In BigOP, a benchmarking test is specified by a prescription of one or more applications. A prescription incorporates a subset of operations and processing patterns, a data set, a workload generation method, and metrics.
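To make the prescription idea concrete, here is a minimal sketch of how such a prescription could be represented as a data structure. This is not BigOP's actual API; the class and field names (Prescription, operations, patterns, data_set, workload_generator, metrics) and the example values are illustrative assumptions.

```python
# Hypothetical sketch of a BigOP-style prescription; the class and field
# names are illustrative assumptions, not the framework's actual API.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Prescription:
    operations: List[str]                    # subset of abstracted operations, e.g. filter, sort
    patterns: List[str]                      # processing patterns, e.g. single, multi, iterative
    data_set: str                            # identifier or path of the input data set
    workload_generator: Callable[[], None]   # method that drives load against the system under test
    metrics: List[str] = field(default_factory=lambda: ["latency", "throughput"])


# Example: prescribe an end-to-end test for a hypothetical search-style application.
search_test = Prescription(
    operations=["filter", "sort", "aggregate"],
    patterns=["multi-operation"],
    data_set="wikipedia_dump",
    workload_generator=lambda: None,         # placeholder for a real load driver
)
```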

OBSERVATIONS
The end-to-end model and adequate level of abstraction in BigOP leave room for various system implementations and optimizations.
Due to the large volume of data, big data systems span many nodes and even multiple datacenters, so communication must be considered in benchmarks.
Big data processing operations are classified as element operations, single-set operations, and double-set operations.
Pattern abstractions are classified as single-operation, multi-operation, and iterative patterns (see the sketch below).
A prescribed test consists of a subset of operations and processing patterns, a data set, a workload generation method, and the measured metrics.
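The operation and pattern taxonomy above can be summarized in a short sketch; the enum names below are assumptions chosen only to illustrate the classification, not identifiers from the paper or its code.

```python
# Minimal sketch of the operation/pattern taxonomy described above; the
# names are illustrative assumptions, not BigOP's actual identifiers.
from enum import Enum


class Operation(Enum):
    ELEMENT = "element"        # acts on individual elements, e.g. map or filter
    SINGLE_SET = "single-set"  # acts on one data set as a whole, e.g. sort or aggregate
    DOUBLE_SET = "double-set"  # combines two data sets, e.g. join or union


class Pattern(Enum):
    SINGLE_OPERATION = "single-operation"  # one operation applied once
    MULTI_OPERATION = "multi-operation"    # a pipeline of several operations
    ITERATIVE = "iterative"                # operations repeated until a stop condition holds
```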

POSITIVE POINTS
BigBench focuses on big data analytics; it adopts TPC-DS as its basis and adds new data types such as semi-structured and unstructured data, as well as non-relational workloads.

NEGATIVE POINTS
BigBench targets only a specific big data application and does not cover the variety of big data processing workloads. Its workload is too simple to meet the various needs of data processing.

PROBLEMS IDENTIFIED
Existing big data benchmarks cover only a part of BigOP's abstraction of processing operations and patterns. They are not as flexible as BigOP, and they do not include the iterative pattern.

FEEDBACK
Benchmarking tests can be conducted using BigOP's abstracted operations and patterns. System users can prescribe tests aimed at a specific application, while developers can carry out general tests, as sketched below.
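Assuming the hypothetical Prescription sketched earlier, the two usage styles mentioned above might look as follows; the operation names and data set identifiers are made up for illustration.

```python
# A system user targets one application with a narrow prescription, while a
# developer sweeps a broad operation/pattern mix; all names are illustrative.
user_test = Prescription(
    operations=["filter", "aggregate"],      # only what the target application needs
    patterns=["iterative"],
    data_set="clickstream_logs",
    workload_generator=lambda: None,
)

developer_test = Prescription(
    operations=["map", "filter", "sort", "aggregate", "join", "union"],
    patterns=["single-operation", "multi-operation", "iterative"],
    data_set="synthetic_uniform",
    workload_generator=lambda: None,
)
```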

References
[1] Y. Zhu, J. Zhan, C. Weng, R. Nambiar, J. Zhang, X. Chen, and L. Wang, "BigOP: Generating comprehensive big data workloads as a benchmarking framework," arXiv preprint arXiv:1401.6628, 2014.