Cost-Efficient Job Scheduling Strategies



Similar documents
Energy-aware job scheduler for highperformance

Grid Computing Approach for Dynamic Load Balancing

Utilization Driven Power-Aware Parallel Job Scheduling

International Journal of Advance Research in Computer Science and Management Studies

Multifaceted Resource Management for Dealing with Heterogeneous Workloads in Virtualized Data Centers

Data Center Specific Thermal and Energy Saving Techniques

A Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters

SoSe 2014 Dozenten: Prof. Dr. Thomas Ludwig, Dr. Manuel Dolz Vorgetragen von Hakob Aridzanjan

Survey on Job Schedulers in Hadoop Cluster

Analysis and Optimization Techniques for Sustainable Use of Electrical Energy in Green Cloud Computing

Optimizing Shared Resource Contention in HPC Clusters

Power efficiency and power management in HP ProLiant servers

Dynamic resource management for energy saving in the cloud computing environment

W4118 Operating Systems. Instructor: Junfeng Yang

Windows Server Performance Monitoring

LSKA 2010 Survey Report Job Scheduler

Final Report. Cluster Scheduling. Submitted by: Priti Lohani

Energy Constrained Resource Scheduling for Cloud Environment

Advanced Electricity Storage Technologies Program. Smart Energy Storage (Trading as Ecoult) Final Public Report

Efficient Parallel Processing on Public Cloud Servers Using Load Balancing

Dynamic Power Variations in Data Centers and Network Rooms

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems

Generational Performance Comparison: Microsoft Azure s A- Series and D-Series. A Buyer's Lens Report by Anne Qingyang Liu

Demand Response Market Overview. Glossary of Demand Response Services

The International Journal Of Science & Technoledge (ISSN X)

Green HPC - Dynamic Power Management in HPC

Computing in High- Energy-Physics: How Virtualization meets the Grid

Fair Scheduling Algorithm with Dynamic Load Balancing Using In Grid Computing

Resource Models: Batch Scheduling

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

Task Scheduling in Hadoop

Mitglied der Helmholtz-Gemeinschaft. System monitoring with LLview and the Parallel Tools Platform

Newsletter 4/2013 Oktober

Power Aware and Temperature Restraint Modeling for Maximizing Performance and Reliability Laxmikant Kale, Akhil Langer, and Osman Sarood

Dynamic Power Variations in Data Centers and Network Rooms

A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing

Load Balancing in cloud computing

CHAPTER 1 INTRODUCTION

Towards energy-aware scheduling in data centers using machine learning

Minimization of Costs and Energy Consumption in a Data Center by a Workload-based Capacity Management

Operating Systems Concepts: Chapter 7: Scheduling Strategies

Low Power AMD Athlon 64 and AMD Opteron Processors

Recommended hardware system configurations for ANSYS users

IMPROVEMENT OF RESPONSE TIME OF LOAD BALANCING ALGORITHM IN CLOUD ENVIROMENT

Models of Life Cycle Management in a Server

Cloud Management: Knowing is Half The Battle

Towards Energy Efficient Query Processing in Database Management System

Scheduling Algorithms for Dynamic Workload

Enhancing the Scalability of Virtual Machines in Cloud

The Data Center as a Grid Load Stabilizer

Energy Optimized Virtual Machine Scheduling Schemes in Cloud Environment

Adaptive Resource Optimizer For Optimal High Performance Compute Resource Utilization

Various Schemes of Load Balancing in Distributed Systems- A Review

EVALUATING POWER MANAGEMENT CAPABILITIES OF LOW-POWER CLOUD PLATFORMS. Jens Smeds

The Importance of Software License Server Monitoring

Ecole des Mines de Nantes. Journée Thématique Emergente "aspects énergétiques du calcul"

Genetic Algorithms for Energy Efficient Virtualized Data Centers

Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

1. Simulation of load balancing in a cloud computing environment using OMNET

Batch Scheduling on the Cray XT3

Energy-Efficient Application-Aware Online Provisioning for Virtualized Clouds and Data Centers

Benchmarking Hadoop & HBase on Violin

Ready Time Observations

Power Management in Cloud Computing using Green Algorithm. -Kushal Mehta COP 6087 University of Central Florida

OPERATING SYSTEMS SCHEDULING

Flash Corruption: Software Bug or Supply Voltage Fault?

Web Server Software Architectures

Icepak High-Performance Computing at Rockwell Automation: Benefits and Benchmarks

Smarter Data Centers. Integrated Data Center Service Management

Real-Time Scheduling (Part 1) (Working Draft) Real-Time System Example

Process Scheduling CS 241. February 24, Copyright University of Illinois CS 241 Staff

A highly configurable and efficient simulator for job schedulers on supercomputers

A Hybrid Load Balancing Policy underlying Cloud Computing Environment

PARALLELS CLOUD STORAGE

Job scheduler details

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory

EPOBF: ENERGY EFFICIENT ALLOCATION OF VIRTUAL MACHINES IN HIGH PERFORMANCE COMPUTING CLOUD

TU-E2020. Capacity Planning and Management, and Production Activity Control

The Load Balancing Strategy to Improve the Efficiency in the Public Cloud Environment

Symfoware Server Reliable and Scalable Data Management

<Insert Picture Here> Best Practices for Extreme Performance with Data Warehousing on Oracle Database

CHAPTER 1 INTRODUCTION

Grid Scheduling Dictionary of Terms and Keywords

Road Map. Scheduling. Types of Scheduling. Scheduling. CPU Scheduling. Job Scheduling. Dickinson College Computer Science 354 Spring 2010.

A High Performance Computing Scheduling and Resource Management Primer

Leveraging Radware s ADC-VX to Reduce Data Center TCO An ROI Paper on Radware s Industry-First ADC Hypervisor

The Probabilistic Model of Cloud Computing

Highly Available Mobile Services Infrastructure Using Oracle Berkeley DB

A Game Theory Modal Based On Cloud Computing For Public Cloud

The Association of System Performance Professionals

2. Research and Development on the Autonomic Operation. Control Infrastructure Technologies in the Cloud Computing Environment

Scheduling. Monday, November 22, 2004

International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April ISSN

LOAD BALANCING IN CLOUD COMPUTING USING PARTITIONING METHOD

Principles and characteristics of distributed systems and environments

Stability of Load Sharing in a Distributed Computer System

Scheduling Algorithms in MapReduce Distributed Mind

Hardware Configuration Guide

Keywords- Cloud Computing, Green Cloud Computing, Power Management, Temperature Management, Virtualization. Fig. 1 Cloud Computing Architecture

Transcription:

Cost-Efficient Job Scheduling Strategies Alexander Leib Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg 2015-12-07 Alexander Leib Cost-Efficient Job Scheduling Strategies 1 / 25

Agenda 1 Introduction 2 Energy-efficiency 3 Cost-efficiency 4 Conclusion and evaluation 5 Literature Alexander Leib Cost-Efficient Job Scheduling Strategies 2 / 25

Efficiency vs. Effectiveness Introduction Effectiveness Capability of producing or achieving a desired result [5] A means of achieving a set goal faster is more effective than other ones The term is focused on the outcome Efficiency "The ability to do things well, successfully, and without waste"[5] Can be expressed as a measure of how well input is used to produce a desired output Optimizing the ratio between input and output E.g. in physics: Energy conversion efficiency Context HPC Higher Performance and throughput can be more effective at the cost of efficiency. Alexander Leib Cost-Efficient Job Scheduling Strategies 3 / 25

The economic perstpective: Costs Introduction Main expenses in HPC Initial Investment High costs for the aquisition of the machine(-components) Suitable facilities must be availible or aquired Electrical and network infrastructure The initial investment is a one-time expense and relatively fixed Running costs Maintenance Spare and replacement parts Energy Consumption Variable and regular expenses Cost reduction potential The variability of the running costs results in a high saving potential by efficient running Power consumption is probably the most flexible type of running costs Alexander Leib Cost-Efficient Job Scheduling Strategies 4 / 25

The economic perstpective: Costs Introduction Cost-efficiency = energy-efficiency? Cost-efficiency: Reduction of input (costs) at stable output (performance) Energy-efficiency: Reduction of Energy Consumption Power demand of HPC centers can reach 20 MW or more [1] High impact on electricity service providers and grid reliability Power fluctuations Peak power demands Intuitive assumption: Lower power consumption equals lower costs Alexander Leib Cost-Efficient Job Scheduling Strategies 5 / 25

Relationship between electricity service providers and HPC-centers Introduction Electricity service provider (ESP) Goals Prevention of transmission and distribution congestion Frequency response Peak and reserve capacity Renewable integration Programs Energy efficiency Peak load reduction Dynamic pricing Up and down regulation Alexander Leib Cost-Efficient Job Scheduling Strategies 6 / 25

Relationship between electricity service providers and HPC-centers Introduction HPC-Centers The HPC-landscape is changing Instead of using higher performance components, systems use higher numbers of more energy-efficient components Overall power demand is still rising In compliance to ESP programs and/or requests, strategies to control electricity demand should be implemented Adaptation strategies Power management Load migration Job scheduling Back-up scheduling Lighting control Thermal management Alexander Leib Cost-Efficient Job Scheduling Strategies 7 / 25

The question how this was achieved becomes apparent Alexander Leib Cost-Efficient Job Scheduling Strategies 8 / 25 Reducing power consumption Energy-efficiency Power consumption has been rising significantly slower than expected over the past years Abbildung: Energy useage in HPC

Reducing power consumption Energy-efficiency Achieving energy-efficiency in HPC-environments Energy-efficient or energy proportional hardware The design of new hardware is not solely focused on higher performance, but also on optimizing the utilisation of power By using energy-efficient components, the machines themselves become more energy-efficient Dynamic Power Management (DPM) Devices (like CPUs, memory etc.) powered on and off dynamically Idle machines consume about 2/3 of peak power and the average workload in data centers amounts to around 30% => 70% of resources can be kept in sleep mode Alexander Leib Cost-Efficient Job Scheduling Strategies 9 / 25

Reducing power consumption Energy-efficiency Power Management Dynamic Voltage and Frequency Scaling (DVFS) Basic Idea: Adjustment of CPU clock frequency through supply voltage to reduce power consumption Energy is consumed proportional to workload Power Capping The operator can set a threshold for power consumption Total power consumption can be kept under a defined budget Sudden rises in power demand can be prevented Achieved by i.e. throttling CPUs and de- or rescheduling tasks Thermal Management Higher temperatures can reduce the system s reliability while increasing cooling expenses Similar to Power Capping but the temperature is monitored and a threshold predefined Alexander Leib Cost-Efficient Job Scheduling Strategies 10 / 25

Reducing power consumption Energy-efficiency Highly parallel systems allow adjustment of power consumption to workload through job scheduling Job scheduling policies can be static or dynamic Static: Jobs are known beforehand Dynamic: Scheduling is done at job-arrival At the time of design energy efficiency has traditionally been a rare consideration, since the focus was on high throughput and performance Alexander Leib Cost-Efficient Job Scheduling Strategies 11 / 25

Job Schedulers Energy-efficiency In most HPC-systems user requests are collected by a job scheduler The scheduler assigns jobs to nodes and specifies the time of execution The job scheduling algorithms offer a wide variety of opportunities to optimize the system for desired characteristics Optimizing the scheduler for energy-efficiency can drastically reduce power consumption In [2] Mämmelä et al. used different algorithms to compare results of different approaches Alexander Leib Cost-Efficient Job Scheduling Strategies 12 / 25

Job Schedulers Energy-efficiency Following algorithms were used in [2] First In, First Out (FIFO) In this simplest of scheduling algorithms jobs are scheduled in a queue If enough resources for the first job (job 1) are available it is executed, otherwise all jobs have to wait Energy aware FIFO (E-FIFO) FIFO is expanded by a the power-off threshold T If the estimated waiting time is longer than T seconds, idle nodes are powered off Alexander Leib Cost-Efficient Job Scheduling Strategies 13 / 25

Job Schedulers Energy-efficiency Backfilling (first fit and best fit) Backfilling basically works like FIFO with an important exception If the available resources are insufficient for job 1, the queue is scanned for jobs that can be executed The execution time of a potential backfill job must not exceed the waiting time of job 1 to prevent delays Time estimations for job 1 are based on running time estimations of currently running jobs which are given by the user User estimations can result in delays Alexander Leib Cost-Efficient Job Scheduling Strategies 14 / 25

Job Schedulers Energy-efficiency Two basic ways of selecting a backfill job are described in [2] Backfill first fit (BFF) The first queued job meeting the resource and time restrictions is chosen Backfill best fit (BBF) Additional criteria can be defined (i.e. shortest backfill, or most processors used) Backfill job is selected accordingly The energy-aware versions of backfilling (E-BFF and E-BBF use the same energy saving mechanisms as E-FIFO Idle nodes are powered off, if the wait time of job 1 exceeds T Backfilling offers less opportunities to power off idle nodes, by utilizing them for other jobs instead Wait times of jobs are however shorter than FIFO resulting in higher efficiency Alexander Leib Cost-Efficient Job Scheduling Strategies 15 / 25

Job Schedulers Energy-efficiency Two basic ways of selecting a backfill job are described in [2] Backfill first fit (BFF) The first queued job meeting the resource and time restrictions is chosen Backfill best fit (BBF) Additional criteria can be defined (i.e. shortest backfill, or most processors used) Backfill job is selected accordingly The energy-aware versions of backfilling (E-BFF and E-BBF use the same energy saving mechanisms as E-FIFO Idle nodes are powered off, if the wait time of job 1 exceeds T Backfilling offers less opportunities to power off idle nodes, by utilizing them for other jobs instead Wait times of jobs are however shorter than FIFO resulting in higher efficiency Alexander Leib Cost-Efficient Job Scheduling Strategies 16 / 25

Job Schedulers Energy-efficiency Simulation results using the 6 mentioned algorithms Abbildung: Energy Consumption Wait and simulation times: BBF < BFF < < FIFO Alexander Leib Cost-Efficient Job Scheduling Strategies 17 / 25

Job Schedulers Energy-efficiency Selected simulation results The increase in average wait times of energy-aware scheduling algorithms did not exceed 3.2% (FIFO vs. E-FIFO tested with only small jobs requiring 1-5 nodes) Highest average simulation time increase was 2.3% when comparing E-algorithms to their counterparts Using a backfilling algorithm cut the simulation time by 23% compared to FIFO Results on a cluster at Jülich Supercomputing Centre (JSC) JSC s testing environment uses the default scheduler of Torque RMS which is relatively close to BFF Testing with the E-BFF algorithm resulted in a marginally higher (0.63%) completion time In contrast: Energy consumption was 6.3% lower Alexander Leib Cost-Efficient Job Scheduling Strategies 18 / 25

Partnership with electricity service providers Cost-efficiency Approaches for working together with ESPs Flat Power Consumption By flattening (i.e. keeping constant) the overall power consumption providers have a predictable power demand Reduced fluctuations have less impact on electrical grid stability Some ESPs offer flat fees for relatively constant power demand Dynamically Adjusted Power Consumption Power demand regulation in response to grid capacity and ESP requests ESPs often provide financial incentives to reduce power demand during peak times By working closely together with providers, HPC-sites can actually be used as a resource to reduce the impact of grid fluctuations produced by other customers Alexander Leib Cost-Efficient Job Scheduling Strategies 19 / 25

Partnership with electricity service providers Cost-efficiency In [1] Bates et al. (2014) submitted a questionnaire to Supercomputing centers (SC) in the U.S. to evaluate the grade of interaction between SCs and ESPs 11 of 19 targeted U.S. Top100 SCs sent answers to the authors of [1] Results are more representative of the Top50, since 9 of 11 respondents are listed there SC s experiences concerning total power load and intra-hour fluctuations varied greatly, i.e.: 4 sites with had a total power load of over 10 MW with fluctuations between < 3 MW to 8 MW Of two 5 MW sites, one reported 4 MW fluctuations Alexander Leib Cost-Efficient Job Scheduling Strategies 20 / 25

Partnership with electricity service providers Cost-efficiency Questionnaire results Approx. half the respondents have had some interaction/discussion, but mostly limited to ESP-programs like dynamic pricing About 20% have had discussions about congestion, regulation and frequency response Abbildung: Questionnaire results Alexander Leib Cost-Efficient Job Scheduling Strategies 21 / 25

Partnership with electricity service providers Cost-efficiency Interpretation of the results Due to local differences, these results are only representative of the U.S. The current low grade of interaction between SCs and ESPs has serveral reasons: Organisational reasons (budgeting is done somewhere else) Some power management strategies are considered to have negligible effects The inherent trade-off in performance associated with most energy-efficience measures is deemed uneconomically high Measures like thermal and lighting management do not seem to make much of a difference The effects of highly sophisticated and automated job scheduling strategies is underestimated Alexander Leib Cost-Efficient Job Scheduling Strategies 22 / 25

Conclusion and evaluation Different means of reducing power consumption and combinations thereof can greatly enhance overall efficiency Utilization of these approaches is surprisingly uncommon Implementation of theses strategies may have to be realized on different levels Top-down: Regulations by the decision makers of HPC-sites Bottom-up: Initiatives by users Reaching understandings and closing deals with ESPs offer high cost saving potentials The scope of possible actions is influenced by ESPs characteristics like grid infrastructure Availible agreement opportunities may vary depending on the size and power demand of HPC-sites Smaller sites may utilize flat power consumption strategies SCs with higher demand my be forced to use dynamic strategies Alexander Leib Cost-Efficient Job Scheduling Strategies 23 / 25

Conclusion and evaluation Cooperation with ESPs can result in lower prices for higher overall amounts of energy compared to purely energy-aware strategies Suggestions for compliance to ESP requests Flat power consumption: Fluctuations can be reduced at the job scheduler level The wattages of different scheduler implementations differ Power consumption can be influenced accordingly by switching algorithms This process probably could be automated Dynamic power consumption Power consumption should be adjusted depending on current grid capacity and ESP requests Can be achieved by utilizing more or less energy-efficiency means and scheduler algorithms Alexander Leib Cost-Efficient Job Scheduling Strategies 24 / 25

Literature 1 N. Bates et al.: Electrical Grid and Supercomputing Centers: An Investigative Analysis of Emerging Opportunities and Challenges 2 O. Mämmelä et al.: Energy-aware job scheduler for high-performance computing 3 A. Chandio et al.: A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems 4 D. Yang et al.: Parallel-machine scheduling with controllable processing times and rate-modifying activities to minimise total cost involving total completion time and job compressions 5 Wikipedia Alexander Leib Cost-Efficient Job Scheduling Strategies 25 / 25