LSF with OS Partitioning and Virtualization

Feb 2011

This document describes how LSF works in OS partitioning and virtualization environments. It focuses on Solaris containers and AIX partitions, because these are the environments most commonly used by SAS Grid Manager customers who choose to use OS partitioning and virtualization.

LSF Version

The LSF version covered in this report is LSF 7.0. Most of the testing was done with LSF 7.02, but the results are expected to be applicable to LSF 7.05 and 7.06 as well.

1. Summary

SOLARIS: The only known configuration in which LSF works well (daemons and functionality) is a non-global Solaris container that uses a dedicated, fixed number of CPUs.

AIX: The known configuration in which LSF works well (daemons and functionality) is LPARs and DLPARs with dedicated processors. In more detail:

1. With LPARs, DLPARs and micro-partitions, all LSF daemons can run, and LSF is able to dispatch and run jobs on the partitions just as on a physical host.
2. LSF cannot run on WPARs.
3. Specific to CPU utilization reporting:
   a. In micro-partitions with shared processors (capped or uncapped), the CPU utilization reported by LSF is not consistent with mpstat output, so LSF CPU-based scheduling may not work well.
   b. On LPARs and DLPARs with dedicated processors, LSF works well and reports CPU utilization correctly. LSF also reports the number of processors correctly after CPU allocation is changed dynamically on DLPARs.
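As a quick, hedged illustration (the host name below is hypothetical and not taken from the original tests), the values the summary refers to can be inspected directly with standard LSF commands: lshosts shows the number of CPUs LSF has detected, and lsload shows the CPU utilization index (ut) that CPU-based scheduling relies on.

    # LSF's view of a host, partition or zone (hostA is a hypothetical host name)
    lshosts hostA        # static view: check the ncpus column
    lsload -l hostA      # dynamic view: check the ut (CPU utilization) index

If these numbers do not match what the operating system reports inside the partition or zone, CPU-based scheduling decisions are made on misleading data; that is exactly the situation described for CPU-capped containers and shared-processor micro-partitions in the sections below.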

2. Solaris Containers

This refers to the Solaris Containers introduced in Solaris 10.

2.1. Global zone

In a global zone of Solaris 10, LSF works well without any known issues.

2.2. Non-global zone

In a non-global zone of Solaris 10, there are known limitations.

2.2.1. Privileges

The root user in a non-global zone of a Solaris container does not have all the privileges that the root user in the global zone has. For LSF to work in a non-global zone, all required privileges must be configured for the root user. A typical issue is that the root user lacks the proc_priocntl privilege. Make sure this privilege is assigned to the root user of the non-global zone, for example through the zonecfg command (a combined sketch follows the Dedicated CPUs subsection below).

2.2.2. Resource Management

When all required privileges are available for the root user (as in 2.2.1), all LSF daemons can run and LSF is able to dispatch and run jobs in a non-global zone just as on a physical host. However, the number of CPUs and the CPU utilization detected by LSF may not be correct for a non-global zone, depending on which resource management mechanism is configured. This may interfere with LSF CPU-based scheduling. The resource management features provided by Solaris 10 include a combination of Fair Share Scheduler, CPU caps, dedicated CPUs, and resource pools. The only known configuration in which LSF works well in a non-global Solaris container is a dedicated, fixed number of CPUs.

Dedicated CPUs

The dedicated-cpu resource specifies that a subset of the system's processors should be dedicated to a non-global zone while it is running. When the zone boots, the system dynamically creates a temporary pool for use while the zone is running. Either a fixed number of CPUs or a range, such as 2-4 CPUs, can be specified for the number of dedicated CPUs. When dedicated CPUs are configured with a fixed number of CPUs (min = max), LSF works well and correctly reports the number of CPUs and CPU usage in the non-global container. To be tested: if a range is specified for the number of CPUs (min != max), LSF does not report the number of CPUs and CPU usage accurately when CPUs are added to or removed from the container.
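The following zonecfg session is a minimal sketch of the configuration described above: granting the proc_priocntl privilege to the zone and dedicating a fixed number of CPUs to it. The zone name z1 and the CPU count are assumptions for illustration, not values from the original tests. Run it from the global zone; the zone must be rebooted for the changes to take effect.

    # Run in the global zone; z1 is a hypothetical non-global zone
    zonecfg -z z1
    zonecfg:z1> set limitpriv="default,proc_priocntl"
    zonecfg:z1> add dedicated-cpu
    zonecfg:z1:dedicated-cpu> set ncpus=4
    zonecfg:z1:dedicated-cpu> end
    zonecfg:z1> verify
    zonecfg:z1> commit
    zonecfg:z1> exit
    zoneadm -z z1 reboot

Setting ncpus to a single number gives the fixed (min = max) allocation that LSF handles well; setting a range such as ncpus=2-4 is the untested case noted above.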

Capped CPUs

The capped-cpu resource provides an absolute, fine-grained limit (up to two decimal places) on the amount of CPU resources that can be consumed by a zone. By design, CPU capping for Solaris containers makes all CPUs in the global zone visible to each of the individual containers. LSF therefore detects the number of CPUs for the container to be the total number of CPUs on the physical host. The CPU utilization LSF reports is always consistent with mpstat output, which is the host/global-level metric for CPU-capped Solaris containers. There are no Solaris commands that show the CPU utilization specific to a single container when CPU capping is in effect.

To illustrate, suppose a physical host has 2 CPUs and there are two Solaris containers, z1 and z2, each with a CPU cap of 0.8. If z1 has a job running that uses all of the CPU available to that container (0.8), and z2 does not have any jobs running, the CPU utilization seen from both z1 and z2 is the host-level value, 40%. This is what mpstat reports on z1 and z2, and LSF reports the same number. In this case, everything else being equal, LSF has an equal probability of dispatching new jobs to z1 and z2, even though z1 has no capacity left to run more jobs and z2 has its full capacity available. This may be mitigated by the job slots configured for z1 and z2: for example, if both are manually configured with one job slot, then because the job slot on z1 is in use and the one on z2 is available, LSF will dispatch the new job to z2 (a job-slot sketch follows the Resource Pool subsection below). In general, however, LSF CPU-based scheduling does not work in this environment. CPU capping for Solaris containers and LSF are essentially conflicting workload management tools and should not be used together.

Fair Share Scheduler

Fair Share Scheduler can be used to control the allocation of available CPU resources among zones, based on their importance. This importance is expressed by the number of shares of CPU resources assigned to each zone. When Fair Share Scheduler is configured, the situation is similar to capped CPUs from LSF's perspective: all CPUs are visible to the sharing zones, and global-level CPU utilization is reported in the zones. This is another example of conflicting workload management tools, and therefore LSF CPU-based scheduling does not work in this environment.

Resource Pool

Resource pools can be configured to include a processor set with an optional scheduling class. Dynamic resource pools provide a mechanism for dynamically adjusting each pool's resource allocation in response to system events and application load changes. A zone can then be associated with a configured resource pool. Though not tested, if a non-global zone is associated with a static resource pool whose processor set has a fixed number of CPUs, LSF may work well in this zone. In all other cases, LSF CPU-based scheduling may not work well.
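Returning to the capped-CPU example with z1 and z2, the job-slot mitigation mentioned there can be sketched with an lsb.hosts fragment like the one below; this is illustrative configuration, not taken from the original tests. Each container is limited to one job slot via the MXJ column, and badmin reconfig applies the change.

    # Host section of lsb.hosts (illustrative; z1 and z2 are the container host names)
    Begin Host
    HOST_NAME   MXJ   r1m   pg   ls   tmp   DISPATCH_WINDOW   # Keywords
    z1          1     ()    ()   ()   ()    ()
    z2          1     ()    ()   ()   ()    ()
    End Host

This only caps how many jobs each container accepts; it does not make the reported CPU utilization meaningful, so it is a workaround rather than a fix for CPU-based scheduling.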

3. AIX Partitions

3.1. AIX Partitions Overview

There are several types of AIX partitions:

Logical partitions (LPAR): introduced with AIX 5.1 and POWER4 technology. Multiple operating systems, including AIX and Linux, can run in separate partitions on the same system without interference. However, a reboot is required to move resources between LPARs.

Dynamic logical partitions (DLPAR): introduced in AIX 5.2. More system flexibility is achieved by being able to move CPUs, I/O adapters and memory between partitions dynamically, without rebooting the LPARs.

Micro-partitions: introduced in AIX 5.3. Micro-Partitioning maps virtual processors to physical processors, and the virtual processors, rather than the physical processors, are assigned to the partitions. With IBM eServer p5 servers, you can choose the following types of partitions:

Dedicated processor partition: whole physical processors are assigned to a particular logical partition.

Shared processor partition: the physical processors are virtualized and then assigned to partitions. The sharing mode can be capped or uncapped.

Workload partitions (WPAR): introduced in AIX 6.1. WPAR is a purely software partitioning solution provided by the operating system. WPARs provide a way to partition one AIX operating system instance into multiple environments, each environment being a WPAR. WPARs can be created within multiple AIX instances of the same physical server, whether those instances run in dedicated LPARs or in micro-partitions (see Figure 1, copyright IBM). A powerful feature of WPAR is WPAR mobility with checkpoint/restart.
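As a hedged aside (not part of the original text), the partition attributes that matter for LSF, dedicated versus shared processors and capped versus uncapped mode, can be checked on the partition itself with lparstat -i. The output below is an abridged, illustrative sample rather than captured output, and the partition name is hypothetical.

    lparstat -i
      Partition Name                 : lpar01
      Type                           : Dedicated-SMT
      Mode                           : Capped
      Entitled Capacity              : 4.00
      Online Virtual CPUs            : 4
      Active Physical CPUs in system : 16

A shared Type combined with a capped or uncapped Mode indicates a shared-processor micro-partition, which is the case where LSF's CPU utilization reporting is inconsistent with mpstat (see section 3.2.3).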

Figure 1. (Copyright by IBM)

Two types of WPAR are available: application WPARs and system WPARs. An application WPAR is a lightweight environment suitable for the execution of one or more processes. It is transient and has the same lifetime as the application. A system WPAR is almost a full AIX environment. Each system WPAR has dedicated writable file systems, although it can share the global environment's /usr and /opt file systems in read-only mode. When a system WPAR is started, an init process is created for the WPAR, which in turn spawns other processes and daemons.

3.2. LSF on AIX Partitions

With LPARs, DLPARs and micro-partitions, all LSF daemons can run, and LSF is able to dispatch and run jobs on the partitions just as on a physical host. LSF cannot run on WPARs.

3.2.1. LPARs with dedicated processors

On LPARs (whether on micro-partitions or on a physical machine) with dedicated processors, LSF works well and reports CPU utilization correctly.

3.2.2. DLPARs with dedicated processors

On DLPARs (whether on micro-partitions or on a physical machine) with dedicated processors, LSF works well and reports CPU utilization correctly. LSF also reports the number of processors correctly when CPU allocation is changed dynamically on DLPARs.

3.2.3. Micro-partitions

In a micro-partition configured with shared processors, LSF detects the number of CPUs to be the same as the number of configured virtual processors, and it reports CPU utilization independently for each partition. However, the reported CPU utilization is not consistent with mpstat output, so LSF CPU-based scheduling may not work well.

3.2.4. WPARs

Application WPARs are transient and not suitable for server applications such as the LSF daemons. LSF also cannot run on system WPARs, because LSF needs access to devices such as /dev/mem and /dev/kmem, and these are typically removed in WPARs following IBM best-practice recommendations.
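To see the inconsistency described in 3.2.3, or to confirm that LSF has picked up a dynamic processor change on a DLPAR, the AIX-side commands below can be compared with the lshosts/lsload output shown in the Summary sketch; this is a hedged illustration, not output from the original tests.

    # AIX view of the partition, to compare against LSF's lshosts/lsload output
    lsdev -Cc processor      # processors currently visible to the partition
    mpstat 5 3               # OS-level CPU utilization over three 5-second samples

On a DLPAR with dedicated processors, the processor count and utilization should agree with LSF's ncpus and ut values; on a shared-processor micro-partition they generally will not, which is why CPU-based scheduling is not recommended there.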