Resource Models: Batch Scheduling



Similar documents
Proceedings of the 4th Annual Linux Showcase & Conference, Atlanta

Final Report. Cluster Scheduling. Submitted by: Priti Lohani

Scheduling Algorithms for Dynamic Workload

CPU Scheduling. Core Definitions

Utilization Driven Power-Aware Parallel Job Scheduling

Process Scheduling CS 241. February 24, Copyright University of Illinois CS 241 Staff

A High Performance Computing Scheduling and Resource Management Primer

Grid Scheduling Dictionary of Terms and Keywords

OPERATING SYSTEMS SCHEDULING

Improving and Stabilizing Parallel Computer Performance Using Adaptive Backfilling

W4118 Operating Systems. Instructor: Junfeng Yang

Scheduling. Monday, November 22, 2004

ICS Principles of Operating Systems

Batch Systems. provide a mechanism for submitting, launching, and tracking jobs on a shared resource

New Issues and New Capabilities in HPC Scheduling with the Maui Scheduler

How To Use A Cloud For A Local Cluster

CPU Scheduling. Basic Concepts. Basic Concepts (2) Basic Concepts Scheduling Criteria Scheduling Algorithms Batch systems Interactive systems

VSched: Mixing Batch And Interactive Virtual Machines Using Periodic Real-time Scheduling

Scheduling 0 : Levels. High level scheduling: Medium level scheduling: Low level scheduling

Contributions to Gang Scheduling

Adaptive Processor Allocation for Moldable Jobs in Computational Grid

On Job Scheduling for HPC-Clusters and the dynp Scheduler

Network Contention Aware HPC job scheduling with Workload Precongnition

The Moab Scheduler. Dan Mazur, McGill HPC Aug 23, 2013

A two-level scheduler to dynamically schedule a stream of batch jobs in large-scale grids

Road Map. Scheduling. Types of Scheduling. Scheduling. CPU Scheduling. Job Scheduling. Dickinson College Computer Science 354 Spring 2010.

A Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters

Workload Characteristics of the DAS-2 Supercomputer

Operating Systems Concepts: Chapter 7: Scheduling Strategies

Deciding which process to run. (Deciding which thread to run) Deciding how long the chosen process can run

Optimizing Shared Resource Contention in HPC Clusters

CPU Scheduling Outline

Microsoft HPC. V 1.0 José M. Cámara (checam@ubu.es)

BACKFILLING STRATEGIES FOR SCHEDULING STREAMS OF JOBS ON COMPUTATIONAL FARMS

Overview of Presentation. (Greek to English dictionary) Different systems have different goals. What should CPU scheduling optimize?

Scheduling and Resource Management in Computational Mini-Grids

GC3: Grid Computing Competence Center Cluster computing, I Batch-queueing systems

Cobalt: An Open Source Platform for HPC System Software Research

Network Infrastructure Services CS848 Project

Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems

A CP Scheduler for High-Performance Computers

Adaptive Resource Optimizer For Optimal High Performance Compute Resource Utilization

The Lattice Project: A Multi-Model Grid Computing System. Center for Bioinformatics and Computational Biology University of Maryland

Cloud Management: Knowing is Half The Battle

An Architecture for Dynamic Allocation of Compute Cluster Bandwidth

Cost-Efficient Job Scheduling Strategies

A Multi-criteria Job Scheduling Framework for Large Computing Farms

Towards energy-aware scheduling in data centers using machine learning

Process Scheduling. Process Scheduler. Chapter 7. Context Switch. Scheduler. Selection Strategies

C-Meter: A Framework for Performance Analysis of Computing Clouds

A Multi-criteria Class-based Job Scheduler for Large Computing Farms

Performance Workload Design

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

Main Points. Scheduling policy: what to do next, when there are multiple threads ready to run. Definitions. Uniprocessor policies

Delay Scheduling. A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling

Abstract. 1. Introduction

Characterizing Task Usage Shapes in Google s Compute Clusters

LSKA 2010 Survey Report Job Scheduler

Benefits of Global Grid Computing for Job Scheduling

Evaluation of a Workflow Scheduler Using Integrated Performance Modelling and Batch Queue Wait Time Prediction

Performance Monitoring of Parallel Scientific Applications

A Job Self-Scheduling Policy for HPC Infrastructures

Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6

Managing the performance of large, distributed storage systems

CPU Scheduling. CPU Scheduling

Lecture 3 Theoretical Foundations of RTOS

International Journal of Advance Research in Computer Science and Management Studies

Lecture Outline Overview of real-time scheduling algorithms Outline relative strengths, weaknesses

Research in Operating Systems Sparrow

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture

Load Balancing in Distributed System. Prof. Ananthanarayana V.S. Dept. Of Information Technology N.I.T.K., Surathkal

WebSphere Application Server V6.1 Extended Deployment: Overview and Architecture

Understanding Linux on z/vm Steal Time

enanos: Coordinated Scheduling in Grid Environments

Improving Compute Farm Efficiency for EDA

Case Study I: A Database Service

Batch Scheduling and Resource Management

In Cloud, Do MTC or HTC Service Providers Benefit from the Economies of Scale?

Using WestGrid. Patrick Mann, Manager, Technical Operations Jan.15, 2014

Objectives. Chapter 5: CPU Scheduling. CPU Scheduler. Non-preemptive and preemptive. Dispatcher. Alternating Sequence of CPU And I/O Bursts

Operating Systems, 6 th ed. Test Bank Chapter 7

CPU SCHEDULING (CONT D) NESTED SCHEDULING FUNCTIONS

Load Balancing of Web Server System Using Service Queue Length

Batch Queueing in the WINNER Resource Management System

Ten Reasons to Switch from Maui Cluster Scheduler to Moab HPC Suite Comparison Brief

IMPROVED FAIR SCHEDULING ALGORITHM FOR TASKTRACKER IN HADOOP MAP-REDUCE

Towards an understanding of oversubscription in cloud

Automatic Workload Management in Clusters Managed by CloudStack

Operating Systems. III. Scheduling.

The Importance of Software License Server Monitoring

.:!II PACKARD. Performance Evaluation ofa Distributed Application Performance Monitor

Integer Programming Based Heterogeneous CPU-GPU Cluster Scheduler for SLURM Resource Manager

C-Meter: A Framework for Performance Analysis of Computing Clouds

Chapter 2: Getting Started

Operating Systems OBJECTIVES 7.1 DEFINITION. Chapter 7. Note:

Sparrow: Distributed, Low Latency Scheduling

Grid Computing Approach for Dynamic Load Balancing

New Metrics for Scheduling Jobs on a Cluster of Virtual Machines

Thomas Fahrig Senior Developer Hypervisor Team. Hypervisor Architecture Terminology Goals Basics Details

Synchronization. Todd C. Mowry CS 740 November 24, Topics. Locks Barriers

Transcription:

Resource Models: Batch Scheduling Last Time» Cycle Stealing Resource Model Large Reach, Mass Heterogeneity, complex resource behavior Asynchronous Revocation, independent, idempotent tasks» Resource Sharing in Utilities What is possible: lots of sharing! Statistical models of performance and sharing Today» Batch Scheduling Resource Model» Batch Queue Wait Prediction Reminders/Announcements» Project Writeups back today Today s Readings Portable Batch System Web Site http://www.pbspro.com/tech_overview.html Brett Bode, David M. Halstead, Ricky Kendall, and Zhou Lei, The Portable Batch Scheduler and the Maui Scheduler on Linux Clusters Optional Reading» John Brevik, Daniel Nurmi, and Rich Wolski, Predicting Bounds on Queuing Delay in Space-shared Computing Environments, University of California, Santa Barbara Technical Report CS2005-09.

Resource Management Problem Application Perspective:» Given my application, find and bind an appropriate ( best, acceptable, best below acceptable ) set of resources» => Optimize for application quality or performance System Perspective:» For a set of resources, identify a set of applications which make good use of the resources ( best, acceptable, high utilization, etc.)» => Optimize for system utilization or total value Resource Management Models Cycle Stealing» Volatile, Asynchronous Preemption Dedicated Resource Scheduling TODAY» Batch schedulers, dedicated resources, advanced reservation Time-sharing» Slice schedulers, proportional share, guaranteed/best effort Objective: Understand implications for distributed application performance and motivations/advantages of various models (reach, local)

Batch Scheduler Q s: Job size, Priority, Resource Type, etc. Job: #procs, memory, runtime, other Batch Scheduler PBS, LSF, Maui, SGE, NQS, etc. Large-Cluster #1 Large Cluster #3 Large Cluster #2 Batch Scheduling Idea Resources are expensive Value in achieving high resource utilization» Get more work done» Allow more users access Queue a set of Jobs» Allows Choice of best next job Choose jobs based on» Priority» Resource Requirement» Achieve a Performance Goal Stage files in, run, stage files out

Batch Scheduler for a Single Resource Submit, wait, run, complete Stage files in and out interactive mode What is the scheduling discipline?» FIFO simplest» Many others possible» Optimize for different metrics Comparing Scheduling Disciplines FIFO Shortest First Scheduling policy makes a big difference in perceived performance

FIFO Scheduling 1 1 1 1 5 3 1 Arrive LvQ Complete 1 1 2 2 2 5 3 5 10 4 10 11 5 11 12 6 12 13 7 13 14 Average Delay: 26/7 => 3.5 Shortest First Scheduling 1 1 1 1 5 3 1 Arrive LvQ Complete 1 1 2 2 2 5 3 9 14 4 5 6 5 6 7 6 7 8 7 8 9 Average Delay: 10/7 => 1.5

Other Scheduling Policies Random Earliest Deadline First Combined Aging and Priority Min-Max Suffrage Gaming the system Suppose you were in a shortest job first system» How would you get high throughput? Suppose you were in a FIFO system» How would you get high throughput? Suppose you needed to run a distributed/grid job involving resources from multiple batch scheduled resources, what would you do?

Complications: Resource Constraints Minimum Memory CPU speed Operating System => thins the pool of candidates => complicates the choices (matching) Increases the complexity of scheduling More constraints, the longer you wait Complications: Parallel Jobs 8 16 16 8 Require contiguous partition (interconnect topology: security, performance) Becomes a bin-packing problem More you ask for, the longer you wait

Complications: Advanced Reservations Request a set of resources» 64 nodes, >1GB memory, runtime = 1 hour» At 5pm PST on Friday, May 28, 2008 Why?» Coordinate a distributed job for a grid experiment, use of a instrument, etc. How does this affect the schedule?» Block all use of a specific set of nodes in the period» Prohibit schedules that might run into the period» Reduces efficiency Intelligent Scheduling There are holes in the schedule» Jobs run shorter, greedy bin packing limits, Advanced Reservations Backfilling: allow lower priority jobs to fill the holes in the schedule Maui Scheduler» Determines start times for large jobs» Backfills smaller jobs PBS is a pluggable scheduler framework» Explore backfill performance» Explore scheduler performance

Experimental Parameters NASA Workload (lots of real parallel computing) 64 node cluster, 256MB memory, flat network PBS w/ Maui Scheduler Job Mix» Large (30%), medium(40%), small(20%), debug and failed (<5%)» Range of processor numbers (correlated with size) Randomized delay for submission (random order)» Same order for each experiment FIFO vs. Maui

Overall Performance and Resource Utilization Intelligent Scheduling improves performance by about 10% Theoretical Minimum is another 20% lower User information is poor (runtime estimates) Overall resource utilization is pretty good SLA for Grid Application Request resources Queue Queue Queue Get Resources!!!! Run.. Run.. Run.. Relinquish resources Start all over again SLA: Request and wait for variable time (can be days), run for known quantity of resources, start over Asynchronous available after long delay, predictable quantity

Batch Queue Wait Time Problem: We want to know how long individual jobs will wait before they will acquire the resources then need» Perceived execution time is really affected by wait times Goal: Rigorous confidence bounds on the amount of time a specific job will wait in a batch queue before it is scheduled on a cluster or parallel machine.» Statistical nature implies that a quantifiable confidence range is necessary» Need an answer that applies to an individual job Better Goal: estimate percentiles and quantified confidence bounds» Statistical certainty at specified confidence levels» At most how long will I have to wait before my job runs with 95% confidence? How Well Does it Work? Examine the batch queue logs that record wait time Choose a quantile and a confidence level» 0.95 quantile with 95% confidence For each job» Calculate the upper limit on the quantile» Observe whether job s wait time is less than that limit For the entire trace, record the percentage of job wait times that are less than the prediction» Value should be less than quantile if method is working 5 sites and machines (NERSC, LANL, LLNL, SDSC, TACC) 9 years (96 through 05) 1,200,000+ jobs

Percentage of Correct Predictions 0.98 0.97 0.96 0.95 0.94 0.93 Quantifiable Confidence Percentage of Wait Times Correctly Predicted NERSC SP 3/01-3/03 LLNL SP 1/02-10/02 with 95% Confidence LANL O2K 12/99-4/00 SDSC SP 4/98-4/00 TACC LS 1/04-5/04 SDSC PGN Datastar 1/96-1/97 4/04-1/05 low regular reg-long debug low premium interactive blue small chammpq shared ir-shared scavenger schammpq normal express high low low normal high development systest q1l q16s q64l q256s standby normal interactive express Tgnormal 10000000 1000000 Capturing Dynamics Datastar Queue Times and Predictions April 2004 to March 2005 (96.1% correct) Measurements Prediction (95% upper) Time (seconds) 100000 10000 1000 100 10 1 April September March

Choosing the Best Worst Case TACC and Datastar Upper 95% Predictions 1000000 Thursday February 24, 2005 Time (seconds) 10000 100 345678 seconds = 4 Days Datastar 95% TACC 95% 12 seconds 1 Midnight 12:00 PM Midnight Choosing the Best Number of Processors Datastar 95% Predictions June 2004, 1-4 and 17-64 Processors 604800 1-4 Processors 17-64 Processors 518400 Time (seconds) 432000 345600 259200 172800 86400 0 June 1 June 15 June 30

Prediction Summary Combinations of quantiles provide a qualitative way to evaluate resources» If median and 95th percentile are lower, chances are job will start sooner Quantiles provide a quantitative way to predict possible outcomes» 45% chance that a job will start between the median and the 95th percentile Discussion: Open Questions and Ideas How could you include resources such as these in a distributed application? What are the implications of having an indeterminate wait? What are the implications of having longer queueing times for large resource requests? What are the local implications of Advanced Reservations? How can you do co-allocation? How does batch queue prediction help?

Basic Queueing Theory Delay 12 10 8 6 4 2 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Load As resource efficiency (or load) goes to 1, delay goes to infinity Fundamental tradeoff between resource utilization and latency? Summary Batch Scheduling Resource Model» High Resource Efficiency, but Indeterminate Wait Complications» Requirements; Parallel Jobs» Advance Reservations Batch Queue Wait Prediction» Early results are promising» What to do about them? Are they stable?