Dynamic Slot Tutorial. Condor Project, Computer Sciences Department, University of Wisconsin-Madison




Outline
- Why we need partitionable slots
- How they've worked since 7.2
- What's new in 7.8
- What's still left to do

What's the Problem?

Example Machine:
- 8 cores
- 8 gigabytes of memory
- 2 disks

The old way (still the default)

    $ condor_status
    Name               OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
    slot1@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.110   1024  0+00:45:04
    slot2@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:05
    slot3@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:06
    slot4@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:07
    slot5@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:08
    slot6@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:09
    slot7@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:10
    slot8@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000   1024  0+00:45:03

                 Total Owner Claimed Unclaimed Matched Preempting Backfill
    X86_64/LINUX     8     0       0         8       0          0        0
           Total     8     0       0         8       0          0        0

Problem: Job Image Size

Simple solution: Static non-uniform memory

    # condor_config
    NUM_SLOTS_TYPE_1 = 1
    NUM_SLOTS_TYPE_2 = 7
    SLOT_TYPE_1 = mem=4096
    SLOT_TYPE_2 = mem=auto

Result is

    $ condor_status
    Name               OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
    slot1@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.150   4096
    slot2@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot3@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot4@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot5@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot6@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot7@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585
    slot8@chevre.cs.wi LINUX  X86_64  Unclaimed  Idle      0.000    585

                 Total Owner Claimed Unclaimed Matched Preempting Backfill
    X86_64/LINUX     8     0       0         8       0          0        0
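The 585 MB figure follows from splitting whatever is left after the explicit 4096 MB slot evenly across the seven mem=auto slots. A minimal Python sketch of that arithmetic, assuming the machine advertises 8192 MB and that the split rounds down to whole megabytes (auto_slot_memory is an illustrative name, not an HTCondor function):

```python
def auto_slot_memory(total_mb, fixed_mb, num_auto_slots):
    """Split the memory left after the fixed-size slots evenly, in whole MB."""
    remaining = total_mb - sum(fixed_mb)
    return remaining // num_auto_slots


# Assumed numbers: 8192 MB machine, one 4096 MB slot, seven auto slots.
print(auto_slot_memory(8192, [4096], 7))
```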

Better, still not good
- Job requirements:
  - Requirements = memory > 2048
- How to steer small jobs to small slots?
  - Trivia question in ClassAds?
- How to pick correct sizes?
- Changes require startd restarts

[Figure: 8 GB machine partitioned into 5 slots (one 4 GB slot plus 1 GB slots), shown with a 4 GB job.]

[Figure: 8 GB machine partitioned into 5 slots (one 4 GB slot plus 1 GB slots).]

[Figure: 8 GB machine partitioned into 5 slots. 7 GB free, but the 4 GB job is idle.]
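The stranded-job situation in the figures can be reproduced with a toy first-fit matcher over static slots. This is only a sketch under stated assumptions: the slot sizes (one 4 GB slot and four 1 GB slots) are read off the figures, and first_fit is a hypothetical helper, not how the negotiator actually ranks slots.

```python
def first_fit(slots, job_mb):
    """Place a job in the first idle static slot that is large enough.

    Returns the slot index, or None if the job has to stay idle.
    """
    for i, slot in enumerate(slots):
        if slot["job_mb"] is None and slot["mem_mb"] >= job_mb:
            slot["job_mb"] = job_mb
            return i
    return None


slots = [{"mem_mb": m, "job_mb": None} for m in (4096, 1024, 1024, 1024, 1024)]
small = first_fit(slots, 1024)  # nothing steers the small job away from the big slot
big = first_fit(slots, 4096)    # the only 4 GB slot is already taken
used_mb = sum(s["job_mb"] or 0 for s in slots)
free_mb = sum(s["mem_mb"] for s in slots) - used_mb
print(small, big, free_mb)
```

With these numbers the 1 GB job lands in the 4 GB slot, so 7 GB of memory is unused while the 4 GB job cannot be placed anywhere.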

New: Partitionable slots
- Work in progress
- First landed in 7.2
- More work in 7.8
- Even more goodness to come
- But very usable now

The big idea
- One partitionable slot
- From which dynamic slots are made
- When a dynamic slot exits, it is merged back into the partitionable slot
- Split happens at claim time

(cont)
- Partitionable slots split on
  - Cpu
  - Disk
  - Memory
  - (Maybe more later)
- When you are out of one, you're out of slots
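The split-at-claim and merge-on-exit behavior can be sketched as a small Python model. It is a toy under stated assumptions, not the startd's implementation: the class name, the slot1_N naming, and the disk figures are illustrative, and the memory sizes are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionableSlot:
    """Toy model: leftover resources plus the dynamic slots carved from them."""
    cpus: int
    memory_mb: int
    disk_kb: int
    dynamic: dict = field(default_factory=dict)
    next_id: int = 1

    def claim(self, cpus, memory_mb, disk_kb):
        """Split off a dynamic slot at claim time; None if any resource is short."""
        if cpus > self.cpus or memory_mb > self.memory_mb or disk_kb > self.disk_kb:
            return None
        self.cpus -= cpus
        self.memory_mb -= memory_mb
        self.disk_kb -= disk_kb
        name = f"slot1_{self.next_id}"
        self.next_id += 1
        self.dynamic[name] = (cpus, memory_mb, disk_kb)
        return name

    def release(self, name):
        """Merge a finished dynamic slot back into the partitionable slot."""
        cpus, memory_mb, disk_kb = self.dynamic.pop(name)
        self.cpus += cpus
        self.memory_mb += memory_mb
        self.disk_kb += disk_kb


slot = PartitionableSlot(cpus=8, memory_mb=8192, disk_kb=1_000_000)
names = [slot.claim(1, mb, 16384) for mb in (1024, 2048, 1024)]
print(names, slot.memory_mb)       # leftover memory after three claims
print(slot.claim(1, 8192, 16384))  # too big for the leftovers
slot.release("slot1_2")
print(slot.memory_mb)              # memory returns when the dynamic slot exits
```

Because claim checks every resource, running out of any one of cpus, memory, or disk is enough to stop further dynamic slots from being created.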

3 types of slots
- Static (i.e. the usual kind)
- Partitionable (i.e. the leftovers)
- Dynamic (the usable ones)
  - Dynamically created
  - But once created, static

[Figure: 8 GB partitionable slot, with a 4 GB portion labeled.]

[Figure: 8 GB partitionable slot, with a 5 GB portion labeled.]

How to configure

    NUM_SLOTS = 1
    NUM_SLOTS_TYPE_1 = 1
    SLOT_TYPE_1 = cpus=100%
    SLOT_TYPE_1_PARTITIONABLE = true

Looks like

    $ condor_status
    Name     OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
    slot1@c  LINUX  X86_64  Unclaimed  Idle      0.110   8192

                 Total Owner Claimed Unclaimed Matched
    X86_64/LINUX     1     0       0         1       0
           Total     1     0       0         1       0

When running

    $ condor_status
    Name       OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
    slot1@c    LINUX  X86_64  Unclaimed  Idle      0.110   4096
    slot1_1@c  LINUX  X86_64  Claimed    Busy      0.000   1024
    slot1_2@c  LINUX  X86_64  Claimed    Busy      0.000   2048
    slot1_3@c  LINUX  X86_64  Claimed    Busy      0.000   1024

All this in 7.2
- What are the problems?
  - Slow matching
  - Broken for parallel universe
  - Dedicated slot users broken
  - Fragmentation
  - Selection of dynamic slot sizes tricky

Fixed in 7.8
- Matching faster
  - Set CLAIM_PARTITIONABLE_LEFTOVERS = false to make it slow again
- Parallel universe fixed
- Dedicated slot users fixed
- Support for defragging!

Answer to trivia question
- Requirements = (Memory > 1024)
  - How can the startd parse it? It can't!

THIS IS BIG!
- Memory requirements deprecated
- Don't do
  - Requirements = memory > 1024
- Generates a warning now:

    $ condor_submit submit7
    Submitting job(s)
    WARNING: your Requirements expression refers to TARGET.Memory. This is obsolete. Set request_memory and condor_submit will modify the Requirements expression as needed.

Instead, use

    request_memory = 2048    # MB

Same is true for disk, cpus

    request_disk = 16384     # KB
    request_cpus = 1

Requirements automatically fixed
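One way to picture "Requirements automatically fixed" is that each request_* setting contributes a clause comparing the machine's resource with the job's request. The Python sketch below only illustrates that idea. The exact clauses condor_submit emits depend on the version, so the clause shape and the helper name with_resource_clauses are assumptions.

```python
def with_resource_clauses(user_requirements, resources):
    """Append one 'TARGET.<Attr> >= Request<Attr>' clause per requested resource."""
    clauses = [f"(TARGET.{attr} >= Request{attr})" for attr in resources]
    parts = ([f"({user_requirements})"] if user_requirements else []) + clauses
    return " && ".join(parts)


# Hypothetical user expression; the resource names follow request_cpus/memory/disk.
print(with_resource_clauses('OpSys == "LINUX"', ["Cpus", "Memory", "Disk"]))
```

The point of the design is that the job states what it needs as plain numbers, and the matching expression is derived from them instead of being hand-written.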

I have to change all my submit files?
- There's a knob for that
  - JOB_DEFAULT_REQUESTMEMORY
  - JOB_DEFAULT_REQUESTDISK
  - JOB_DEFAULT_REQUESTCPUS
- Submit-side defaults
- Can be expressions

I don't want to change the config file

    JOB_DEFAULT_REQUESTMEMORY = ifthenelse(MemoryUsage =!= UNDEFINED, MemoryUsage, 1)
    JOB_DEFAULT_REQUESTCPUS = 1
    JOB_DEFAULT_REQUESTDISK = DiskUsage
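Read as plain logic, these defaults say: use the job's measured memory usage if it has one, otherwise 1 MB; ask for one cpu; ask for as much disk as the job currently uses. A minimal Python sketch of that reading, where a dict stands in for the job ClassAd and None stands in for the ClassAd undefined value (apply_request_defaults is an illustrative name):

```python
UNDEFINED = None  # stand-in for the ClassAd undefined value


def apply_request_defaults(job_ad):
    """Fill in Request* attributes that the submit file left out (toy model)."""
    ad = dict(job_ad)
    usage = ad.get("MemoryUsage", UNDEFINED)
    # Mirrors: ifthenelse(MemoryUsage =!= UNDEFINED, MemoryUsage, 1)
    ad.setdefault("RequestMemory", usage if usage is not UNDEFINED else 1)
    ad.setdefault("RequestCpus", 1)
    ad.setdefault("RequestDisk", ad.get("DiskUsage", UNDEFINED))
    return ad


# A job with no requests at all, then one that already asked for 4 cpus.
print(apply_request_defaults({"DiskUsage": 2048}))
print(apply_request_defaults({"MemoryUsage": 300, "DiskUsage": 2048, "RequestCpus": 4}))
```

Values the user set explicitly are left alone; only missing requests receive a default.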

What about the startd side?
- Startd has a say, too:

    MODIFY_REQUEST_EXPR_REQUESTCPUS = quantize(RequestCpus, {1})
    MODIFY_REQUEST_EXPR_REQUESTMEMORY = quantize(RequestMemory, {TotalSlotMem / TotalSlotCpus / 4})
    MODIFY_REQUEST_EXPR_REQUESTDISK = quantize(RequestDisk, {1024})
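The effect of quantize on a memory request can be sketched in Python. This follows the documented behavior of the ClassAd quantize() function as understood here: a scalar second argument rounds up to a multiple of it, and a list returns the first member that is large enough, falling back to a multiple of the last member. Treat it as an approximation, not the ClassAd library itself. The 8192 MB and 8 cpu figures are the example machine from earlier slides.

```python
import math


def quantize(value, quanta):
    """Round value up, mimicking ClassAd quantize() for numeric arguments."""
    if not isinstance(quanta, (list, tuple)):
        # Scalar form: smallest integral multiple of quanta that is >= value.
        return math.ceil(value / quanta) * quanta
    for q in quanta:
        # List form: first member that is already large enough.
        if q >= value:
            return q
    last = quanta[-1]
    return math.ceil(value / last) * last


# Quantum for the example machine: TotalSlotMem / TotalSlotCpus / 4
quantum = 8192 // 8 // 4
print(quantum, quantize(1500, [quantum]), quantize(1, [1]))
```

A 1500 MB request is rounded up to the next multiple of the 256 MB quantum, so jobs with slightly different requests end up with identically sized dynamic slots.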

Why quantize?
- Allow slot reuse

    Name       OpSys  Arch    State      Activity  LoadAv  Mem
    slot1@c    LINUX  X86_64  Unclaimed  Idle      0.110   4096
    slot1_1@c  LINUX  X86_64  Claimed    Busy      0.000   2048
    slot1_2@c  LINUX  X86_64  Claimed    Busy      0.000   1024
    slot1_3@c  LINUX  X86_64  Claimed    Busy      0.000   1024

Also holds for cpus
- Much easier way to do whole machine
- Basically same as memory requirements
- Easier to set up than parallel universe

Fragmentation

    Name       OpSys  Arch    State      Activity  LoadAv  Mem
    slot1@c    LINUX  X86_64  Unclaimed  Idle      0.110   4096
    slot1_1@c  LINUX  X86_64  Claimed    Busy      0.000   2048
    slot1_2@c  LINUX  X86_64  Claimed    Busy      0.000   1024
    slot1_3@c  LINUX  X86_64  Claimed    Busy      0.000   1024

Now I submit a job that needs 8 GB: what happens?

Solution: New Daemon
- condor_defrag (new in 7.8)
  - One daemon defrags whole pool
- Central manager is a good place to run it
- Scans pool, tries to fully defrag some startds
- Only looks at partitionable machines
- Admin picks some % of pool that can be whole

Oh, we got knobs
- DEFRAG_DRAINING_MACHINES_PER_HOUR (default is 0)
- DEFRAG_MAX_WHOLE_MACHINES (default is -1)
- DEFRAG_SCHEDULE
  - graceful (obey MaxJobRetirementTime, default)
  - quick (obey MachineMaxVacateTime)
  - fast (hard-killed immediately)
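How the first two knobs might interact can be shown with a toy calculation. This is not the condor_defrag algorithm, which has more inputs than the slide lists. It is a sketch assuming the rate knob caps how many machines start draining per hour, and that the whole-machine knob stops draining once enough machines are already whole, with -1 meaning no cap. The function name drains_this_hour is hypothetical.

```python
def drains_this_hour(fragmented, whole_now, per_hour, max_whole=-1):
    """How many fragmented machines a toy defragger would start draining this hour."""
    budget = int(per_hour)
    if max_whole >= 0:
        # Stop once the pool already has the desired number of whole machines.
        budget = min(budget, max(0, max_whole - whole_now))
    return min(budget, fragmented)


print(drains_this_hour(fragmented=10, whole_now=0, per_hour=0))               # default rate
print(drains_this_hour(fragmented=10, whole_now=3, per_hour=2, max_whole=4))  # capped by whole machines
print(drains_this_hour(fragmented=10, whole_now=0, per_hour=2))               # capped by rate only
```

Under this reading the default rate of 0 means no draining happens until the admin raises it.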

Defrag vs. Preemption
- Defrag can be general purpose
  - Looks only at startds, not at demand
  - Can also preempt non-partitionable slots (if so configured)
- Negotiator preemption looks at 2 jobs

Advanced Topics
- More than one partitionable slot
- Mix and match partitionable slots
- Overcommitting partitionable slots
- Parallel universe

Future work
- Claiming partitionable slots
  - So RANK-based preemption works
- condor_q -analyze
- More knobs!

Thank you