KISTI Supercomputer TACHYON Scheduling scheme & Sun Grid Engine




KISTI Supercomputer TACHYON Scheduling scheme & Sun Grid Engine
Supercomputing Infrastructure Support Office, Junwon Yoon (jwyoon@kisti.re.kr), 2014.07.15

Scheduling (batch job processing) / Distributed resource management
Features of job schedulers (SW):
- Broad scope
- Support for algorithms
- Capability to integrate with a standard resource manager
- Sensitivity to compute node and interconnect architecture
- Scalability
- Fair-share capability
- Efficiency
- Dynamic capability
- Support for preemption
(Source: "Job Scheduling in HPC Clusters," Dell Power Solutions, February 2005)

Sun Grid Engine
Open source batch-queuing system, developed and supported by Sun Microsystems (later Oracle)
SGE history:
- CODINE (COmputing in DIstributed Networked Environments) - 1991
- GRD (Global Resource Director) - 1996
- Merged with Gridware - 1999
- Acquired by Sun Microsystems - August 2000
- Sun renamed the product Grid Engine and released a free version - 2001
- Oracle acquired Sun - January 2010
- By the end of 2010, Oracle had closed the open source community, stopped shipping source code, and increased the license fees
- In January 2011, Univa announced that it had hired the core Grid Engine development team, who had worked on Grid Engine for several years

Job scheduling in SGE
Tachyon2: SGE 6.2u6 / Tachyon1: SGE 6.1u5
Before release 6.2 the scheduler ran as a separate daemon (sge_schedd) alongside the qmaster; from 6.2 on it runs within the qmaster.
Scheduling a job has two distinct stages: job selection and job scheduling.

Sun Grid Engine Overview
Queue: a logical abstraction that aggregates a set of job slots across one or more execution hosts
Slots: a container for jobs that execute on a single host
Default queue configuration: slot count set equal to the CPU count
Standard job types: batch, interactive, parallel, checkpoint
Terminology: cluster queue (e.g. all.q), queue instance (e.g. all.q@node004)
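These objects can be inspected directly on a live system; a quick, hedged sketch (the queue and host names are just the defaults mentioned above):

qconf -sql        # list cluster queues (e.g. all.q)
qconf -sq all.q   # show a cluster queue's configuration, including its slots setting
qstat -f          # list queue instances (e.g. all.q@node004) with their used/total slots
qhost             # list execution hosts with their CPU counts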

Host Group & Queue Configuration in SGE
Host group management: qconf -ahgrp, -mhgrp, -dhgrp, -shgrp
qconf -m{q,e,p,ckpt} <name>
  -m opens the configuration in a text editor; q: queue, e: execution host, p: parallel environment, ckpt: checkpointing environment
  Switch options: a: add, m: modify, d: delete, r: replace, s: show
Queue management: qconf -[aq, mq, dq, sq] queuename  // create, modify, delete, show a queue
Host group, PE, and userset lists can be modified; if a queue's user_lists is NONE (the default), every user may submit to it
Usersets are managed per queue group under qmaster/usersets (# qconf -[au, mu, du, su] user1,user2,... user_lists)
Queue attributes such as qtype, slots, shell, shell_start_mode, prolog, epilog, complex_values, and resources can be modified
h_rt (wall-clock limit) on Tachyon 1st: long queue 168 hours, normal queue 48 hours
The long queue accepts jobs using 1 CPU or more, the normal queue requires 17 CPUs or more; smaller jobs cannot run there
qconf -[ahgrp, mhgrp] @hostgroup, qconf -shgrpl  // create or modify a host group, list host groups
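As a worked illustration of the commands above, a minimal, hedged sketch of defining a host group and a queue on top of it (the names @compute and test.q are hypothetical):

qconf -ahgrp @compute   # opens an editor; set hostlist to the execution hosts of the group
qconf -aq test.q        # opens an editor; set hostlist to @compute plus slots, h_rt, user_lists, etc.
qconf -shgrpl           # verify: list all host groups
qconf -sq test.q        # verify: show the new queue's configuration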

Scheduling Decisions

Policy Components

Sun Grid Engine Scheduler: Grid Engine tickets
- All policies are defined using tickets
- Jobs get tickets from all the various policies
- Jobs with more tickets are more important
- The administrator controls the total number of tickets in the system
- The number of tickets assigned to each policy determines how important each of the available policies is
- To disable a policy within the scheduler, assign zero tickets to it
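The ticket pools themselves live in the scheduler configuration; a hedged sketch of how an administrator would inspect and adjust them (the parameter names are the standard SGE 6.x scheduler parameters quoted later in this deck):

qconf -ssconf | grep weight_ticket   # show weight_tickets_share, weight_tickets_functional, weight_ticket
qconf -msconf                        # edit the scheduler configuration in a text editor
# setting a policy's ticket pool (e.g. weight_tickets_functional) to 0 disables that policy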

Three Classes of Policies
- Ticket Policies (Entitlement): Share Tree (fair-share), Functional Ticket, Override Ticket
- Urgency Policies: Deadline, Wait time, Resource urgency
- Custom Policies: POSIX priority, which lets the administrator push a particular job to the front of the pending job list

Ticket Policies (Job Selection): Share Tree (Fair-Share) Policy
- Start with N tickets and divvy them up across the tree
- Jobs are sorted based on ticket count
- Has memory (history) of past usage
- Leaf nodes must be project or user nodes

[root@sge03qs pe]# qconf -ssconf | grep weight_ticket
weight_tickets_functional 0
weight_tickets_share 100000
weight_ticket 0.010000
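The share tree is a separate configuration object; a minimal, hedged sketch of viewing and editing it (the node shown is hypothetical):

qconf -sstree   # dump the current share tree definition
qconf -mstree   # edit the share tree in a text editor
# each node entry looks roughly like:
#   id=1
#   name=projectA     (a hypothetical project leaf node)
#   type=0
#   shares=60
#   childnodes=NONE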

Ticket Policies (Job Selection): Functional Ticket Policy
- Start with N tickets, divided into four categories: users, departments, projects, jobs
- By default all categories have equal weight:
  weight_tickets_functional 0
  weight_user 0.250000
  weight_project 0.250000
  weight_department 0.250000
  weight_job 0.250000
- Tickets within each category are divided among all jobs
- The ticket counts for each job are summed across categories; the highest count wins
- No memory (history) of past usage
- By default, the functional ticket policy is inactive (weight_tickets_functional 0)
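To activate the functional policy, give it a nonzero ticket pool and assign functional shares to users (or projects/departments); a hedged sketch, with alice as a hypothetical user:

qconf -msconf                      # set weight_tickets_functional to a nonzero value, e.g. 10000
qconf -muser alice                 # set the fshare field of the user object
qconf -suser alice | grep fshare   # verify the functional share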

Ticket Policies (Job Selection): Override Policy
- Used to make temporary changes
- Override tickets disappear when the job exits
- An admin can assign extra tickets to a user, project, department, or job
- qalter can also be used to add override entitlements to a pending job
- share_override_tickets: controls whether the job count dilutes the override ticket count; the default is TRUE

[root@sge03 pe]# qconf -ssconf | grep share
weight_tickets_share 100000
share_override_tickets TRUE
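A hedged sketch of granting override tickets in practice (alice and job ID 12345 are hypothetical; the qalter form requires operator or manager rights):

qconf -muser alice       # set the oticket field to give all of alice's jobs extra override tickets
qalter -ot 50000 12345   # assign override tickets directly to one pending job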

Relevant parameters

Three Classes of Policies
- Ticket Policies (Entitlement): Share Tree (fair-share), Functional Ticket, Override Ticket
- Urgency Policies: Wait time, Deadline, Resource urgency
- Custom Policies: POSIX priority, which lets the administrator push a particular job to the front of the pending job list

Urgency: Wait Time Policy
As a job remains in the pending queue, the wait time policy increases the urgency for that job. It can be useful for preventing job starvation.
U_wait = T_wait x W_wait
  U_wait: wait-time urgency
  T_wait: the time the job has spent pending since submission
  W_wait: wait-time weighting factor
weight_waiting_time 100.000000
weight_urgency 0.100000
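A rough worked example (assuming the waiting time is counted in seconds, as in the wtcontr term listed at the end of this deck, and using the Tachyon2 value weight_waiting_time = 100): a job that has been pending for one hour contributes

\[ U_{wait} = T_{wait} \times W_{wait} = 3600 \times 100 = 360000 \]

so the longer a job waits, the larger its wait-time urgency grows.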

Urgency: Deadline Policy
The deadline is the time by which the job must be scheduled. To submit a job with a deadline, a user must be a member of the deadlineusers group.
U_deadline = W_deadline / (T_deadline - T_now)
  U_deadline: deadline urgency
  T_deadline: deadline time
  T_now: current time (both given in Unix time, in seconds)
  W_deadline: deadline weighting factor
weight_deadline 3600000.000000
weight_urgency 0.100000
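A hedged sketch of submitting a deadline job (alice is a hypothetical user; the -dl argument uses the [[CC]YY]MMDDhhmm[.SS] format):

qconf -au alice deadlineusers    # add the user to the deadlineusers access list
qsub -dl 201407201200 myjob.sh   # ask that the job be scheduled by 2014-07-20 12:00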

Urgency: Resource Urgency Policy
If some resources in a cluster are particularly valuable, it might be advantageous to make sure those resources stay as busy as possible.
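In SGE this is expressed through the urgency column of the complex (resource) configuration: jobs that request a resource with a high urgency value receive a correspondingly larger resource-urgency contribution (the rrcontr term listed at the end of this deck). A hedged sketch (the gpu complex below is hypothetical):

qconf -sc   # show complex attributes; the last column of each line is its urgency value
qconf -mc   # edit the complex configuration in a text editor
# e.g. a line such as
#   gpu  gpu  INT  <=  YES  YES  0  1000
# makes every job that requests gpu considerably more urgent than one that does not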

Three Classes of Policies
- Entitlement (ticket) based: Share Tree (fair-share), Functional Ticket, Override Ticket
- Urgency Policies: Wait time, Deadline, Resource urgency
- Custom Policies: POSIX priority, which lets the administrator push a particular job to the front of the pending job list

Combining Policies
The final dispatch priority assigned to every pending job is determined by combining the contributions of the entitlement, urgency, and custom policies:
P = Ne x We + Nu x Wu + Nc x Wc
  Ne: entitlement priority,  We: entitlement weighting factor  # weight_ticket 0.010000
  Nu: urgency priority,      Wu: urgency weighting factor      # weight_urgency 0.100000
  Nc: custom priority,       Wc: custom weighting factor       # weight_priority 1.000000
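As an illustration (the normalized values below are hypothetical; the weights are the Tachyon values from the table on the next slide), a pending job with normalized entitlement 0.5, normalized urgency 0.8, and normalized POSIX priority 0.5 would receive

\[ P = 0.5 \times 0.01 + 0.8 \times 0.1 + 0.5 \times 1.0 = 0.585 \]

With weight_priority = 1, weight_urgency = 0.1, and weight_ticket = 0.01, an administrator-set POSIX priority dominates, urgency comes second, and the ticket policies effectively act as a tie-breaker.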

Scheduler weighting factors

Reference in text   Weighting factor       Parameter name              Tachyon1    Tachyon2
W_deadline          Deadline               weight_deadline             3600000     3600000
W_wait              Wait-time              weight_waiting_time         0           100
W_e                 Entitlement (Ticket)   weight_ticket               0.01        0.01
W_u                 Urgency                weight_urgency              0.1         0.1
W_c                 Custom (POSIX)         weight_priority             1           1
                                           weight_tickets_share        100000      100000
                                           weight_tickets_functional   0           0
                                           share_override_tickets      TRUE        TRUE

ref.) Job Priorities and Tickets
urg   = rrcontr + wtcontr + dlcontr
tckts = ftckt + otckt + stckt
job_priority = weight_urgency * normalized_urgency_value
             + weight_ticket * normalized_ticket_value
             + weight_priority * normalized_posix_priority_value

ntckts: the total number of tickets, in normalized fashion
tckts: the total number of tickets currently assigned to the job
ovrts: the override tickets as assigned by the -ot option of qalter
otckt: the override portion of the total number of tickets currently assigned to the job
ftckt: the functional portion of the total number of tickets currently assigned to the job
stckt: the share portion of the total number of tickets currently assigned to the job
share: the share of the total system to which the job is currently entitled
nurg: the job's total urgency value, in normalized fashion
urg: the job's total urgency value
rrcontr: the urgency contribution reflecting the job's overall resource requirement
wtcontr: the urgency contribution reflecting the job's waiting time
dlcontr: the urgency contribution reflecting the job's deadline initiation time
deadline: the deadline initiation time of the job, as specified with the qsub -dl option
npprior: the job's -p priority, in normalized fashion
ppri: the job's -p priority as specified by the user
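These per-job values can be inspected directly with qstat; a minimal, hedged sketch (SGE 6.x options):

qstat -urg   # urgency columns: nurg, urg, rrcontr, wtcontr, dlcontr, deadline
qstat -ext   # ticket columns: ntckts, tckts, ovrts, otckt, ftckt, stckt, share
qstat -pri   # priority columns: nurg, npprior, ntckts, ppri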