Slurm License Management




Slurm 2013 User Group
Bill Brophy, Bull

Background
- Software and the associated licenses are EXPENSIVE
  - Nearly $160 billion will be spent by American companies on software purchases this year
  - Companies overspend by nearly 30% on software license agreements and maintenance (embedded licenses, etc.)
- Software vendors are concerned about lost revenue
  - Commercial value of pirated & overused software rose to $63.4 billion in 2011
  - Increasing the number & frequency of software audits

Background
- License Managers were developed to guard ISVs' resources
  - Prevent usage of unlicensed software
  - Prevent overuse of software
  - Manage licenses from various vendors
- Problems with integrating Resource Managers with License Managers
  - License Managers do not provide an open interface
  - Solutions often introduce race conditions
  - Solutions often involve a great deal of overhead
  - ISVs, the License Manager's clients, prefer to sell more licenses
- FlexNet Publisher (formerly FLEXlm) is the major license manager

Current Slurm License Management
- License information parameter in slurm.conf
  - License names can optionally be followed by a colon and count
  - Multiple license names should be comma separated, e.g. Licenses=Intel_Compiler:4,TotalView
- salloc, srun & sbatch support for licenses
  - -L, --licenses=<license>, e.g. --licenses=intel_compiler:2,totalview
- Reservations can be used to restrict license usage
  - Licenses=<license>
  - LICENSE_ONLY flag
- No integration with License Managers
  - Acceptable if licenses are used exclusively within Slurm
  - Potential problems & inefficiency if licenses are also used outside Slurm
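The license syntax on this slide can be illustrated with a small Python sketch. It only parses and formats strings in the `name[:count]` form shown above; the helper names `parse_licenses` and `format_licenses_option` are hypothetical and not part of Slurm, and treating a missing count as one license is an assumption stated in the code.

```python
def parse_licenses(spec):
    """Parse a Licenses-style string such as 'Intel_Compiler:4,TotalView'."""
    licenses = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, count = item.partition(":")
        # Assumption: a name without an explicit count stands for one license.
        licenses[name] = int(count) if sep else 1
    return licenses


def format_licenses_option(requested):
    """Build a --licenses=... argument from a {name: count} mapping."""
    parts = [f"{name}:{count}" for name, count in requested.items()]
    return "--licenses=" + ",".join(parts)


if __name__ == "__main__":
    print(parse_licenses("Intel_Compiler:4,TotalView"))
    print(format_licenses_option({"intel_compiler": 2, "totalview": 1}))
```

The resulting option string could be passed to salloc, srun or sbatch as on the slide; the sketch does not invoke any Slurm command itself.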

Development Plan
- Bull has initiated a Slurm License Management project
  - Based on consultation with SchedMD
  - Input from the development community welcomed
- Project will consist of multiple phases
  - First phase will introduce new license structures
  - Second phase will integrate Slurm with License Manager(s), e.g. FlexNet Publisher
- Hongjia Cao has begun work on a plugin

Phase 1
- Two new license structures are defined
  - System license structure
  - Cluster license structure
- Include structures in database
  - Lays the groundwork for including in Associations
- Populate new structures using sacctmgr interface
- slurmctld notified of license changes
- slurmctld to use existing structures

Phase 2
- Integration of Slurm with License Manager
  - Initially only FlexNet Publisher
- Still in discussion stage
  - Involves considering the interface with the license manager
  - Providing the communication protocols between Slurm and the license manager
- Solution must be efficient & scalable

License Acquisition
[Diagram: Application, License Manager, Vendor Daemon]
1. Request Vendor Daemon info
2. License Manager returns information
3. Request for license
4. Grant or deny request

Slurm in License Acquisition
[Diagram: License Manager, Vendor Daemon, Slurm, Application]

Slurm License Management Issues
- License request occurs AFTER Slurm has started the job
- Slurm has no knowledge of non-Slurm license users
- License usage counts are very dynamic
- License Managers cater to software vendors (primary focus is license limit enforcement)
- FlexNet has a command called lmstat to retrieve information, which can be very slow if FlexNet is handling many applications
  - A possible alternative could be parsing the FlexNet log file (lmgr)
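As a rough illustration of what retrieving usage information from lmstat involves, the Python sketch below extracts free-license counts from lmstat-style text. The line format in `SAMPLE` is an assumption about typical lmstat output, and `parse_lmstat` is a hypothetical helper; the sketch works on a captured string and does not run lmstat itself.

```python
import re

# Assumed line format (not taken from the slides), e.g.:
#   Users of matlab:  (Total of 10 licenses issued;  Total of 3 licenses in use)
USAGE_RE = re.compile(
    r"Users of (?P<feature>\S+):\s+"
    r"\(Total of (?P<issued>\d+) licenses? issued;\s+"
    r"Total of (?P<used>\d+) licenses? in use\)"
)


def parse_lmstat(text):
    """Return {feature: free_count} for every usage line found in `text`."""
    free = {}
    for match in USAGE_RE.finditer(text):
        issued = int(match.group("issued"))
        used = int(match.group("used"))
        free[match.group("feature")] = issued - used
    return free


SAMPLE = """\
Users of matlab:  (Total of 10 licenses issued;  Total of 3 licenses in use)
Users of totalview:  (Total of 1 license issued;  Total of 1 license in use)
"""

if __name__ == "__main__":
    print(parse_lmstat(SAMPLE))
```

Because usage counts are very dynamic, any value obtained this way is only a snapshot and may be stale by the time a job starts.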

Possible Solution: LSF Network Floating License Management
- Use an external program (ELIM) to obtain the number of licenses currently available
  - Configure an external load index containing the number of free licenses on each host
  - ELIM periodically informs LSF of the number of available licenses
- Configure a dedicated queue to run jobs requiring a floating software license
  - Queue definitions: REQUEUE_EXIT_VALUES parameter related to license denial codes
  - For each job in the queue, LSF reserves a software license before dispatching a job, and releases the license when the job finishes
- A batch job may fail to allocate a license due to an interactive job (race condition)
  - If a job exits with one of the values in REQUEUE_EXIT_VALUES, LSF will requeue the job
- David Bigagli was one of the designers & developers of LSF the License Scheduler
- http://www.ccs.miami.edu/hpc/lsf/7.0.6/admin/licensemgt.html
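The requeue behavior described on this slide can be sketched in Python as a simple retry loop. This is a toy model, not LSF code: `run_with_requeue` is a hypothetical function, the exit code 99 is an invented stand-in for a license denial code, and the attempt limit is an assumption added so the loop always ends.

```python
def run_with_requeue(job, requeue_exit_values, max_attempts=5):
    """Run `job` (a callable returning an exit code) until it exits with a
    code outside `requeue_exit_values` or the attempt limit is reached.

    Returns (attempts_used, last_exit_code).
    """
    code = None
    for attempt in range(1, max_attempts + 1):
        code = job()
        if code not in requeue_exit_values:
            # Success or an unrelated failure: do not requeue.
            return attempt, code
    return max_attempts, code


if __name__ == "__main__":
    # Hypothetical job: denied a license twice (exit 99), then succeeds.
    exit_codes = iter([99, 99, 0])
    print(run_with_requeue(lambda: next(exit_codes), {99}))
```

The loop shows why the race condition is tolerable but costly in this scheme: a job that loses a license to an interactive user is started, fails, and has to be dispatched again.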


Supplemental Information
- The following slides give brief descriptions of ideas for integrating a resource manager, such as Slurm, with a license manager
  - Several of these ideas came from the Slurm developers' forum
  - Other ideas were located in web searches on this topic
- Additional suggestions welcome:
  - Bill Brophy <bill.brophy@bull.com>
  - Danny Auble <da@schedmd.com>
  - David Bigagli <david@schedmd.com>

Approach 1: Gary Brown
- Reservation & Commit model
  - Slurm "reserves" licenses through external license manager (FlexNet)
  - The running job actually "checks out" the reserved licenses
- Issues
  - Race condition between running job checkout & external job checkout
  - Requires new model for license managers
  - Changes required to ALL Independent Software Vendors' (ISVs) software

Approach 2: Mark Olesen method for the Gridengine
- Sophisticated Perl script observes license manager using lmstat
- A load sensor adjusts complex values (licenses available to the system) for external jobs using licenses
- Resource manager calculates internal count
- Available complex values = complex values - internal count
- Issues
  - Delay in reports
  - Race conditions between external & internal license allocation
- http://gridengine.info/files/mark_olesen-howto-licenses-n1ge.html
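The bookkeeping in this approach amounts to two subtractions, sketched below in Python. The function names are hypothetical, the reading that the load sensor sets the complex value to the total minus external usage is an interpretation of the slide, and clamping at zero is an added assumption to cover stale reports.

```python
def complex_value(total_licenses, external_in_use):
    """Licenses available to the system after external (non-scheduler) use.

    Assumption: the load sensor reports total minus external usage,
    never going below zero.
    """
    return max(total_licenses - external_in_use, 0)


def available(complex_val, internal_count):
    """Available complex values = complex values - internal count."""
    return max(complex_val - internal_count, 0)


if __name__ == "__main__":
    # Hypothetical numbers: 10 licenses, 3 used outside the scheduler,
    # 4 already allocated to jobs by the resource manager.
    system_share = complex_value(10, 3)
    print(available(system_share, 4))
```

Since the external count comes from periodic lmstat polling, the result can lag behind reality, which is the "delay in reports" issue listed on the slide.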

Approach 3: Hongjia Cao Approach
- Modify the vendor option file
  - Reserve the number of licenses (features, in FlexNet terms) configured in Slurm
  - Use a randomly generated project name
- On job resource allocation, reserve licenses in resource manager
  - Set the environment variable LM_PROJECT to the project name to check out licenses
- On job resource deallocation, the licenses reserved to the project of the job are taken back
  - Reservation in vendor option file is deleted
  - lmreread executed
- Issues
  - Race condition on vendor option file updates
  - Scalability
  - Possible scheduling performance impact
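A minimal Python sketch of the option-file handling in this approach follows. It is not the plugin's actual code: all function names are hypothetical, the `RESERVE <count> <feature> PROJECT <name>` line is an assumption about the options-file syntax, and the sketch edits text in memory rather than touching a real option file or running lmreread.

```python
import uuid


def new_project_name(prefix="slurm"):
    """Generate a random project name for one job (prefix is hypothetical)."""
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


def reserve_line(count, feature, project):
    """Assumed options-file syntax for reserving licenses for a project."""
    return f"RESERVE {count} {feature} PROJECT {project}"


def add_reservation(options_text, line):
    """Return the option file text with one reservation line appended."""
    lines = options_text.splitlines()
    lines.append(line)
    return "\n".join(lines) + "\n"


def remove_reservation(options_text, project):
    """Return the option file text without RESERVE lines for `project`."""
    kept = [
        line for line in options_text.splitlines()
        if not (line.startswith("RESERVE ") and line.split()[-1] == project)
    ]
    return "\n".join(kept) + ("\n" if kept else "")


def job_environment(base_env, project):
    """Copy the environment and set LM_PROJECT for the job's checkout."""
    env = dict(base_env)
    env["LM_PROJECT"] = project
    return env


if __name__ == "__main__":
    project = new_project_name()
    text = add_reservation("", reserve_line(2, "totalview", project))
    print(text, end="")
    print(job_environment({}, project))
    print(repr(remove_reservation(text, project)))
```

In a real deployment each change to the option file would have to be followed by lmreread, and concurrent updates would need locking, which is where the race condition and scalability issues on the slide come from.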