Towards Energy Efficient Query Processing in Database Management System




Report by: Ajaz Shaik, Ervina Cergani

Abstract

With rising concerns about the amount of energy consumed by data centers, several computer hardware manufacturers are developing energy-efficient components that can vary the amount of power they consume under different workloads. These features, however, have not been adopted by database management systems, which are one of the large sources of energy consumption. This report presents two techniques, Processor Voltage/Frequency Control (PVC) and Explicit Query Delays (QED), for trading performance for energy efficiency in database systems, together with their experimental evaluation. It also presents an energy-efficient query processing framework, which shows how easily an energy cost model can be deployed in a current query optimization system.

What is Energy Efficiency?

For computing systems and data centers, energy is delivered as electricity. Power is the instantaneous rate at which energy is used. Energy efficiency can be defined as

$$\text{EnergyEfficiency} = \frac{\text{FinishedWorkloads}}{\text{Energy}} = \frac{\text{FinishedWorkloads}}{\text{Power} \times \text{Time}} = \frac{\text{Throughput}}{\text{Power}}$$

To increase energy efficiency, either the throughput is increased while the power is kept constant, or the power consumption is decreased while the throughput is kept at an acceptable level for the system. This report concentrates on the second method.

CPU Power Management

It is observed in [1] that servers are rarely completely idle and rarely operate at their maximum utilization. Instead, most of the time they operate between 10 and 50 percent of their maximum utilization.
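As a small numeric check of the energy-efficiency definition given earlier, the sketch below evaluates both forms of the ratio (finished workloads per joule, and throughput per watt) and shows the effect of lowering power at constant throughput. All numbers and function names are hypothetical and chosen only for illustration; they are not measurements from the report.

```python
import math


def energy_efficiency(finished_workloads: float, power_watts: float, time_seconds: float) -> float:
    """Finished workloads per joule: workloads / (power * time)."""
    energy_joules = power_watts * time_seconds
    return finished_workloads / energy_joules


def throughput(finished_workloads: float, time_seconds: float) -> float:
    """Finished workloads per second."""
    return finished_workloads / time_seconds


# Hypothetical run: 10 workloads finished in 50 s at an average draw of 200 W.
workloads, power, seconds = 10.0, 200.0, 50.0

ee_direct = energy_efficiency(workloads, power, seconds)
ee_from_throughput = throughput(workloads, seconds) / power

# Both forms of the definition agree (up to floating-point rounding).
forms_agree = math.isclose(ee_direct, ee_from_throughput)

# Second method from the text: same throughput, lower power (hypothetical 160 W).
ee_lower_power = energy_efficiency(workloads, 160.0, seconds)

print(f"efficiency at 200 W: {ee_direct:.6f} workloads/J")
print(f"efficiency at 160 W: {ee_lower_power:.6f} workloads/J")
print(f"both forms agree:    {forms_agree}")
```

Holding throughput fixed and reducing power is the lever that the rest of the report explores.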

Modern hardware provides a dynamic CPU power range of more than sixty percent, which is handled internally by the system under varying server workloads. Processors use a technique called Dynamic Voltage and Frequency Scaling (DVFS) to move between power/performance states (p-states). These states are implementation dependent, with p0 being the highest power/performance state and p1 to pn being successively lower power states. Some processors support built-in DVFS, such as Intel's SpeedStep and AMD's PowerNow!, designed for mobile chips, and AMD's Cool'n'Quiet for desktop and server chips. Several external packages that allow programs to customize the performance states are also available.

Different CPU speeds are obtained by varying either the CPU multiplier, which is the ratio of the internal clock rate to the externally supplied clock and is dictated by the p-states, or the Front Side Bus (FSB), which carries data between the CPU and the Northbridge:

FSB * Multiplier = frequency

Underclocking is the process of lowering the CPU frequency by reducing the FSB speed. This gives fine-grained modulation of the base clock on which the multipliers act. Frequency changes through the multipliers (p-state transitions), in contrast, put a hard upper limit on the CPU frequency. For example, consider a CPU on a 333 MHz FSB with p-state multipliers 9, 8, 7 and 6. If the multiplier is capped at 7, the top frequency the CPU can reach is 2.3 GHz (= 7 * 333 MHz) instead of 3 GHz (= 9 * 333 MHz), and only two transition states (multipliers 7 and 6) remain at an FSB speed of 333 MHz. Modulating the base factor (the FSB speed) instead lowers the frequency without removing any of the multipliers.

Towards Eco DBMS

A database system should exploit the capabilities of modern hardware and continuously monitor its energy consumption characteristics. This report presents two methods that show how changing system settings can reduce energy consumption with a small performance penalty.

Processor Voltage/Frequency Control (PVC)

This technique uses underclocking to trade performance for lower energy consumption.
To explore the energy savings of this method, an experimental evaluation is performed on two database systems, a commercial DBMS and MySQL, with a TPC-H Q5 workload. Each workload consists of ten Q5 queries using 2 region and 5 date-range non-overlapping predicates. Each workload is run five times and the average of the middle three readings is reported; the runs are repeated to account for CPU power fluctuation during the actual run. The experimental setup uses an ASUS motherboard and processor. The 6 Engine software provided with the system is used to underclock the FSB by 5%, 10%, and 15% and to lower the CPU voltage to its preset small and medium voltage downgrades. Results are reported for the commercial DBMS at a scale factor of 1.0 and for MySQL at a scale factor of 0.125.
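The relation FSB * Multiplier = frequency and the difference between capping the multiplier and underclocking the FSB can be worked through numerically. The sketch below reuses the 333 MHz FSB and the multipliers 9, 8, 7 and 6 from the example in the text, together with the 5%, 10% and 15% underclocking levels of the experiment. The function names are hypothetical, and the code only does the arithmetic; it does not change any hardware setting.

```python
STOCK_FSB_MHZ = 333.0
MULTIPLIERS = (9, 8, 7, 6)  # p-state multipliers from the example in the text


def cpu_frequency_mhz(fsb_mhz: float, multiplier: int) -> float:
    """CPU frequency = FSB speed * multiplier."""
    return fsb_mhz * multiplier


def underclocked_fsb(fsb_mhz: float, percent: float) -> float:
    """FSB speed after lowering it by the given percentage."""
    return fsb_mhz * (1.0 - percent / 100.0)


def frequency_table(fsb_mhz, multipliers, max_multiplier=None):
    """Reachable frequencies per multiplier, optionally with a multiplier cap."""
    allowed = [m for m in multipliers if max_multiplier is None or m <= max_multiplier]
    return {m: cpu_frequency_mhz(fsb_mhz, m) for m in allowed}


# Capping the multiplier at 7 keeps the FSB at 333 MHz but leaves only two states.
capped = frequency_table(STOCK_FSB_MHZ, MULTIPLIERS, max_multiplier=7)

# Underclocking the FSB keeps all four multipliers and shifts every frequency down.
underclocked = {
    pct: frequency_table(underclocked_fsb(STOCK_FSB_MHZ, pct), MULTIPLIERS)
    for pct in (5, 10, 15)
}

print("capped at multiplier 7:", capped)
for pct, table in underclocked.items():
    top = max(table.values())
    print(f"FSB -{pct}%: {len(table)} states, top frequency {top:.1f} MHz")
```

In this arithmetic a 5% FSB underclock lowers the top frequency far less than the multiplier cap does, which is consistent with the text's point that FSB modulation is the finer-grained control.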

The graphs plot the ratio of the workload response time under the changed settings to that under the stock settings on the X axis, and the ratio of the energy consumption under the changed settings to that under the stock settings on the Y axis. The stock settings correspond to the top left corner of the graphs. The solid line represents a metric called the Energy Delay Product (EDP), which is the product of the energy and the delay and remains constant along the line. As figure 2 shows, the 5% setting dramatically reduces the energy at the cost of only a small drop in performance on the commercial database. The change is smaller for MySQL, possibly because this database applies fewer optimization methods than the commercial one.

Explicit Query Delays (QED)

In a traditional DBMS, a workload of single-table, select-only queries that have the same structure but different selection predicate ranges results in each query being evaluated individually, one after the other. In QED, queries are delayed and placed into a queue on arrival. When the queue reaches a certain threshold, all the queries in the queue are examined to determine whether they can be aggregated into a small number of groups, such that the queries in each group can be evaluated together. Evaluating the workload this way trades average query response time for reduced overall energy consumption: a single grouped query can be run in the DBMS at a lower energy cost than the individual queries.

To evaluate QED, a simple workload is used: a series of single-table queries, each with a selectivity of 2% on the l_quantity attribute of the lineitem table, with the predicate value drawn from 50 uniformly distributed integer values, on the TPC-H benchmark at a scale factor of 0.5. The workload is run on the MySQL database.
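The queueing behaviour of QED described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name QedQueue, the use of an in-memory SQLite table in place of MySQL, the toy lineitem data, and the choice to merge equality predicates on l_quantity into a single IN list are all assumptions made here. The report itself only states that queued queries are grouped and evaluated together.

```python
import sqlite3
from collections import defaultdict


class QedQueue:
    """Hypothetical QED-style queue: delay queries, then evaluate them as one group."""

    def __init__(self, conn: sqlite3.Connection, threshold: int):
        self.conn = conn
        self.threshold = threshold
        self.pending = []  # l_quantity values requested by the waiting queries

    def submit(self, quantity: int):
        """Queue one 'WHERE l_quantity = ?' query; flush once the batch is full."""
        self.pending.append(quantity)
        if len(self.pending) >= self.threshold:
            return self.flush()
        return None  # the query is delayed until the threshold is reached

    def flush(self):
        """Run one merged query for the batch and split its rows per predicate value."""
        if not self.pending:
            return {}
        batch, self.pending = self.pending, []
        distinct = sorted(set(batch))
        placeholders = ", ".join("?" for _ in distinct)
        sql = (
            "SELECT l_orderkey, l_quantity FROM lineitem "
            f"WHERE l_quantity IN ({placeholders})"
        )
        by_quantity = defaultdict(list)
        for orderkey, quantity in self.conn.execute(sql, distinct):
            by_quantity[quantity].append(orderkey)
        return {q: sorted(by_quantity[q]) for q in distinct}


# Toy stand-in for the lineitem table: 200 rows, l_quantity cycling through 1..50.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lineitem (l_orderkey INTEGER, l_quantity INTEGER)")
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?)",
    [(k, k % 50 + 1) for k in range(1, 201)],
)

queue = QedQueue(conn, threshold=3)
first = queue.submit(5)     # delayed
second = queue.submit(17)   # delayed
results = queue.submit(42)  # third query fills the batch and triggers one merged scan

print(first, second)
print(results)
```

The table is scanned once for three queries instead of three times. The first two queries pay for this with extra waiting time, which is the response-time penalty that QED accepts in exchange for lower energy.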

The figure shows the ratio of the average response time under the changed settings to that under the stock settings on the X axis, and the energy consumption ratio on the Y axis. As the batch size (threshold) increases, the energy consumption decreases with very little change in response time. This is because queries run individually consume more energy overall than queries run in groups.

Query Processing Framework

Typical servers have many opportunities to execute a query more slowly if the Service Level Agreement (SLA) permits the additional response time penalty. Since SLAs are written to meet peak or near-peak demand, they often have some slack in the frequent low-utilization periods, which could be exploited by the DBMS. In this framework it is assumed that queries have some response time goal, potentially driven by an SLA. The query optimization problem now becomes: find and execute the most energy-efficient plan that meets the SLA.

The Energy Response Time Profile (ERP) is a structure that details the energy and response time cost of executing each query plan at every possible power/performance setting under consideration. Given a query Q with a set of possible plans P = {P1, ..., Pn} to be executed on a machine with system operating settings H = {H1, ..., Hm}, the Energy Cost Estimator generates the ERP, which contains one point for every plan at every system operating setting. The generated ERP is then used by the combined energy-enhanced query optimizer to choose the most energy-efficient plan that meets the SLA constraints. Then a command to switch to the chosen system operating state is sent to the hardware, followed by sending the optimal query plan to the execution engine.

Energy Cost Model

The figure gives an overview of the energy cost model. The operator model estimates the query parameters (CPU instructions: I_Q, disk page reads: R_Q, disk page writes: W_Q, and memory page accesses: M_Q) for the selection, projection and join operators of each query plan from the database statistics. The hardware abstraction model uses the query parameters together with system parameters (CPU: C_cpu, disk read: C_R, disk write: C_W, memory access: C_mm, remaining system: C_other), which can be learnt through training procedures, to estimate the energy cost accurately. The equation below gives the model for the average power (in watts) during a query with response time T.

$$P_{av} = C_{cpu} \cdot \frac{I_Q}{T} + C_R \cdot \frac{R_Q}{T} + C_W \cdot \frac{W_Q}{T} + C_{mm} \cdot \frac{M_Q}{T} + C_{other}$$
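The average power model and the ERP-based plan choice described above can be combined in a short sketch. It is an illustration under stated assumptions rather than the authors' optimizer: the class and function names are hypothetical, every parameter value is invented, and the response time of each plan at each setting is simply given as input, whereas the report leaves its estimation to the cost models. Energy is taken as average power times response time.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class QueryParams:
    i_q: float  # CPU instructions
    r_q: float  # disk page reads
    w_q: float  # disk page writes
    m_q: float  # memory page accesses


@dataclass(frozen=True)
class SystemParams:
    c_cpu: float
    c_r: float
    c_w: float
    c_mm: float
    c_other: float  # power of the remaining system, in watts


@dataclass(frozen=True)
class ErpPoint:
    plan: str
    setting: str
    response_time: float  # seconds
    energy: float         # joules


def average_power(sys: SystemParams, q: QueryParams, t: float) -> float:
    """P_av = C_cpu*I_Q/T + C_R*R_Q/T + C_W*W_Q/T + C_mm*M_Q/T + C_other."""
    return (sys.c_cpu * q.i_q / t + sys.c_r * q.r_q / t
            + sys.c_w * q.w_q / t + sys.c_mm * q.m_q / t + sys.c_other)


Candidate = Tuple[str, str, QueryParams, SystemParams, float]


def build_erp(candidates: Iterable[Candidate]) -> List[ErpPoint]:
    """One ERP point per (plan, setting) pair: energy = average power * time."""
    return [ErpPoint(plan, setting, t, average_power(sys, q, t) * t)
            for plan, setting, q, sys, t in candidates]


def choose_plan(erp: List[ErpPoint], sla_seconds: float) -> Optional[ErpPoint]:
    """Lowest-energy point whose response time meets the SLA, if any."""
    feasible = [p for p in erp if p.response_time <= sla_seconds]
    return min(feasible, key=lambda p: p.energy) if feasible else None


# Hypothetical parameters for one plan at a stock and an underclocked setting.
plan_a = QueryParams(i_q=1e9, r_q=1000, w_q=0, m_q=1e6)
stock = SystemParams(c_cpu=2e-8, c_r=0.01, c_w=0.02, c_mm=1e-6, c_other=50.0)
slow = SystemParams(c_cpu=1.5e-8, c_r=0.01, c_w=0.02, c_mm=1e-6, c_other=45.0)

erp = build_erp([
    ("P1", "H_stock", plan_a, stock, 10.0),
    ("P1", "H_underclocked", plan_a, slow, 11.0),
])

relaxed = choose_plan(erp, sla_seconds=12.0)     # slack available
tight = choose_plan(erp, sla_seconds=10.5)       # only the stock setting is fast enough
impossible = choose_plan(erp, sla_seconds=5.0)   # no point meets the SLA

for point in erp:
    print(point)
print("relaxed SLA ->", relaxed.setting, "| tight SLA ->", tight.setting)
```

With slack in the SLA the slower, lower-energy setting is chosen; with a tight SLA the choice falls back to the stock setting, which mirrors the trade-off the framework is built around.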

Conclusion

This report on [2, 3] shows how performance can be traded for energy in current database management systems by using emerging hardware mechanisms, which can provide significant energy savings for a DBMS. It also presents energy-aware query optimization, which introduces a Hardware Abstraction Model in addition to the traditional Operator Model. Because the ERP analyses the relationship between different query plans and different hardware power/performance settings, decreased query performance can be traded for increased energy efficiency when there is slack in the SLAs.

Remark

The papers make clear that a gain in energy efficiency comes at the cost of some performance degradation. The authors give good evaluations of how tuning database systems according to the hardware settings can reduce the energy consumption of servers with a small loss in application performance. One example is the QED method, where executing queries individually consumes more energy than executing them in a batch. However, the authors do not discuss how efficient QED is with several tables and disjoint predicates. The proposed energy-saving query processing model can be easily deployed in existing database systems, but the explanation of the energy cost model is abstract: details about how the hardware and query parameter values are obtained are missing. The papers also consider only one application running on the server, which is not the case in a normal scenario. This raises the question of how applicable the proposed methods and model are when other applications are running on the same server. Overall, the authors give an initial direction towards eco-friendly database management systems.

References

[1] The Case for Energy-Proportional Computing, Luiz Barroso and Urs Hölzle, Google.
[2] Towards Eco-friendly Database Management Systems, Willis Lang and Jignesh M. Patel, Computer Sciences Department, University of Wisconsin-Madison, USA.
[3] Rethinking Query Processing for Energy Efficiency: Slowing Down to Win the Race, Willis Lang, Ramakrishnan Kandhan, Jignesh M. Patel, Computer Sciences Department, University of Wisconsin-Madison.