
IBM Platform HPC: Best Practices for integrating with Intel Xeon Phi™ Coprocessors

Date: August 22, 2013 (Revision 1.3)
Author: Gábor Samu (gsamu@ca.ibm.com)
Reviewer: Mehdi Bozzo-Rey (mbozzore@ca.ibm.com)

I. Background
II. Infrastructure Preparation
III. Intel Software tools deployment
IV. IBM Platform HPC: Xeon Phi monitoring, workloads
   A. IBM Platform HPC built-in Xeon Phi monitoring
   B. IBM Platform LSF ELIM: Xeon Phi monitoring, job scheduling (dynamic resources)
   C. IBM Platform LSF: Xeon Phi job scheduling (LSF configuration)
   D. IBM Platform LSF: Xeon Phi job submission
Appendix A: IBM Platform HPC and Intel Xeon Phi Integration Scripts
Copyright and Trademark Information

I. Background

IBM Platform HPC is a complete, end-to-end HPC cluster management solution. It includes a rich set of out-of-the-box features that empower high performance technical computing users by reducing the complexity of their HPC environment and improving their time to solution. IBM Platform HPC includes the following key capabilities:

Cluster management
Workload management
Workload monitoring and reporting
System monitoring and reporting
MPI libraries (includes IBM Platform MPI)
Integrated application scripts/templates for job submission
Unified web portal

Intel Xeon Phi (Many Integrated Core, MIC) is a coprocessor architecture developed by Intel Corporation that provides higher aggregate performance than alternative solutions. It is designed to simplify application parallelization while at the same time delivering a significant performance improvement. The distinct features of the Intel Xeon Phi design are the following:

It comprises many smaller, lower-power Intel processor cores
It contains wider vector processing units for greater floating point performance per watt
With its innovative design, it delivers higher aggregated performance
It supports data-parallel, thread-parallel and process-parallel workloads, and increased total memory bandwidth

This document provides an example configuration of IBM Platform HPC for an environment containing systems equipped with Xeon Phi. The document is broken down into three broad sections:

Infrastructure Preparation
Xeon Phi monitoring using the IBM Platform HPC monitoring framework
Xeon Phi workload management using IBM Platform LSF

The overall procedure was validated with the following software versions:

IBM Platform HPC 3.2 (Red Hat Enterprise Linux 6.2)
Intel(R) MPSS version 2.1.6720-12.2.6.32-220 (Intel software stack for Xeon Phi)
Intel(R) Cluster Studio XE for Linux version 2013 Update 1
Intel(R) MPI version 4.1.0.024

II. Infrastructure Preparation

For the purpose of this document, the example IBM Platform HPC cluster is configured as follows:

IBM Platform HPC head node: mel1
Compute node(s): compute000, compute001

Both compute000 and compute001 are equipped as follows:

1 Intel Xeon Phi co-processor card
2 Gigabit Ethernet NICs (no InfiniBand(R) present)

The infrastructure is configured using the following networks:

IP Ranges                    Description                       Comments
192.0.2.2-192.0.2.50         Cluster private network           Provisioning, monitoring, computation
192.0.2.51-192.0.2.99        Xeon Phi network (bridged)        Computation
192.0.2.100-192.0.2.150      Out-of-Band management network    IPMI network

mel1 has two network interfaces configured:

eth0 (public interface)
eth1 (private interface)

compute000 and compute001 will have two network interfaces configured:

eth1 (private interface); configured during provisioning
bridged interface; to be configured post provisioning

The following steps assume that mel1 has been installed with IBM Platform HPC 3.2 and that the original configuration has not been modified. Additionally, the steps below rely upon the IBM Platform HPC CLI/TUI tools. All of the operations performed in this document using the CLI/TUI tools can also be performed using the IBM Platform HPC Web Console.
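Before creating the node group template, it can be worth confirming that the head node interfaces match the plan above. A minimal spot check using standard Linux tooling only (nothing Platform HPC specific is assumed here):

# ip addr show eth0
# ip addr show eth1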

1. Create a new node group template for compute hosts equipped with Xeon Phi. Create a copy of the default package-based compute node group template compute-rhel-6.2-x86_64 named compute-rhel-6.2-x86_64_xeon_phi.

# kusu-ngedit -c compute-rhel-6.2-x86_64 -n compute-rhel-6.2-x86_64_xeon_phi
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/.updatenics
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/fstab.kusuappend
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/hosts
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/hosts.equiv
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/passwd.merge
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/group.merge
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/shadow.merge
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_config
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_dsa_key
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_rsa_key
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_key
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_dsa_key.pub
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_key.pub
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/ssh/ssh_host_rsa_key.pub
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/root/.ssh/authorized_keys
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/root/.ssh/id_rsa
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/kusu/etc/logserver.addr
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/lsf/conf/lsf.conf
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/lsf/conf/hosts
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/lsf/conf/lsf.shared
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/lsf/conf/lsf.cluster.mel1_cluster1
......
Distributing 75 KBytes to all nodes.

2. Add the Intel MPSS packages to the default software repository managed by IBM Platform HPC. This is required to automate the deployment of Intel MPSS to the Xeon Phi equipped nodes.

# cp *.rpm /depot/contrib/1000/

# ls -la *.rpm
-rw-r--r-- 1 root root  16440156 May 8 13:55 intel-mic-2.1.6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root   3298216 May 8 13:55 intel-mic-cdt-2.1.6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root    522844 May 8 13:55 intel-mic-flash-2.1.386-2.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root  10255872 May 8 13:55 intel-mic-gdb-2.1.6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root 182208656 May 8 13:55 intel-mic-gpl-2.1.6720-12.el6.x86_64.rpm
-rw-r--r-- 1 root root   2300600 May 8 13:55 intel-mic-kmod-2.1.6720-12.2.6.32.220.el6.x86_64.rpm
-rw-r--r-- 1 root root    280104 May 8 13:55 intel-mic-micmgmt-2.1.6720-12.2.6.32.220.el6.x86_64.rpm
-rw-r--r-- 1 root root    254776 May 8 13:55 intel-mic-mpm-2.1.6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root  10863724 May 8 14:10 intel-mic-ofed-card-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root   1489992 May 8 14:10 intel-mic-ofed-dapl-2.0.36.7-1.el6.x86_64.rpm
-rw-r--r-- 1 root root     44528 May 8 14:10 intel-mic-ofed-dapl-devel-2.0.36.7-1.el6.x86_64.rpm
-rw-r--r-- 1 root root    220712 May 8 14:10 intel-mic-ofed-dapl-devel-static-2.0.36.7-1.el6.x86_64.rpm
-rw-r--r-- 1 root root    108940 May 8 14:10 intel-mic-ofed-dapl-utils-2.0.36.7-1.el6.x86_64.rpm
-rw-r--r-- 1 root root      5800 May 8 14:10 intel-mic-ofed-ibpd-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root  14730200 May 8 14:10 intel-mic-ofed-kmod-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root    102052 May 8 14:10 intel-mic-ofed-kmod-devel-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root      8536 May 8 14:10 intel-mic-ofed-libibscif-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root      5620 May 8 14:10 intel-mic-ofed-libibscif-devel-6720-12.2.6.32-220.el6.x86_64.rpm
-rw-r--r-- 1 root root  55163240 May 8 13:55 intel-mic-sysmgmt-2.1.6720-12.2.6.32-220.el6.x86_64.rpm

# kusu-repoman -u -r "rhel-6.2-x86_64"
Refreshing repository: rhel-6.2-x86_64. This may take a while...

3. Define the Out-of-Band management network using kusu-netedit. The configuration of an Out-of-Band management network is highly recommended. Refer to Chapter 6 (Networks) in the IBM Platform HPC 3.2 Administering IBM Platform HPC guide for details on configuring a BMC network.

# kusu-netedit -a -n 192.0.2.0 -s 255.255.255.0 -i bmc -t 192.0.2.100 -e "BMC network" -x "-bmc"

4. Next, modify the node group compute-rhel-6.2-x86_64_xeon_phi to add the following Optional Packages for deployment and to enable the Out-of-Band management network. The kusu-ngedit tool is used for this purpose; it presents the administrator with a TUI from which the package selection and network selection can be performed.

Networks TUI screen:
Enable network "bmc" for node group "compute-rhel-6.2-x86_64_xeon_phi"

Optional Packages TUI screen:
intel-mic-micmgmt
intel-mic-mpm-2.1.6720
intel-mic-2.1.6720
intel-mic-cdt-2.1.6720
intel-mic-flash-2.1.386
intel-mic-gdb-2.1.6720
intel-mic-gpl
intel-mic-kmod
intel-mic-sysmgmt-2.1.6720
libstdc++ (needed by the Intel MPSS software)

5. For each Xeon Phi device, you must assign a static IP address. The IP addresses selected for this example are on the cluster network 192.0.2.0. The Xeon Phi device IP addresses are added to IBM Platform HPC as unmanaged devices using the kusu-addhost command. This ensures that the Xeon Phi hostnames are added to /etc/hosts on each cluster node, and prevents the IPs from being allocated by IBM Platform HPC to other devices.

Hostname           IP address
compute000         192.0.2.11
compute000-mic0    192.0.2.51
compute001         192.0.2.12
compute001-mic0    192.0.2.52

# kusu-addhost -s compute000-mic0 -x 192.0.2.51
Setting up dhcpd service...
Setting up dhcpd service successfully...
Setting up NFS export service...
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Distributing 8 KBytes to all nodes.
Updating installer(s)

# kusu-addhost -s compute001-mic0 -x 192.0.2.52
Setting up dhcpd service...
Setting up dhcpd service successfully...
Setting up NFS export service...
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Distributing 8 KBytes to all nodes.
Updating installer(s)
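In a larger cluster, the same kusu-addhost invocation can be scripted. A minimal sketch, assuming the hostname and IP numbering follow the table above (adjust the loop values to your node count):

# for i in 0 1; do kusu-addhost -s compute00${i}-mic0 -x 192.0.2.$((51 + i)); done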

6. IBM Platform HPC manages the network interfaces of all compute nodes, but it does not currently support the management of bridged network interfaces. It is necessary to define a bridge on the compute nodes so that the Xeon Phi devices can be accessed over the network. This is mandatory, for example, when running Xeon Phi native MPI workloads. The following procedure automates the configuration of the network bridge and the necessary Xeon Phi configuration to utilize the bridge. The procedure supports a maximum of one Xeon Phi device per node. Two steps are involved:

Create a post-install script that triggers the run of /etc/rc.local and add it to the node group of your choice
Create rc.local.append for the same node group, under the appropriate cfm directory on the installer

Appendix A contains details on where to obtain the example post-install script and the rc.local.append contents.

** WARNING: The following changes will prevent Platform HPC from managing network interfaces on the compute nodes. **

Copy the example post_install.sh script to /root on the IBM Platform HPC head node.

# cp post_install.sh /root

Start kusu-ngedit and edit the Xeon Phi specific node group compute-rhel-6.2-x86_64_xeon_phi. Add the script post_install.sh as a Custom Script.

Copy the example rc.local.xeon_phi script to the appropriate CFM directory with the filename rc.local.append. This ensures that the contents of rc.local.xeon_phi are appended to the rc.local file on the respective compute nodes. In this case, the file must be copied to the CFM directory for the node group compute-rhel-6.2-x86_64_xeon_phi.

# cp rc.local.xeon_phi /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/rc.local.append

Next, execute kusu-cfmsync to make the change take effect.

[root@mel1 ~]# kusu-cfmsync -f
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/etc/rc.local.append
Distributing 1 KBytes to all nodes.
Updating installer(s)

7. To ensure a consistent host name space, use the CFM framework to propagate the /etc/hosts file from the IBM Platform HPC head node to all known Xeon Phi devices. On the IBM Platform HPC head node, perform the following operations:

# cp /etc/hosts /shared/hosts
# mkdir -p /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/intel/mic/filesystem/base/etc/rc.d

In /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/intel/mic/filesystem/base/etc/rc.d, create the file rc.sysinit.append containing the following:

cp /shared/hosts /etc/hosts

**Note: The above steps must be repeated under the /etc/cfm/installer-rhel-6.2-x86_64 file system if the IBM Platform HPC head node is also equipped with Xeon Phi.**

The updates to the Xeon Phi configuration files are propagated to all nodes in the node group compute-rhel-6.2-x86_64_xeon_phi. On the IBM Platform HPC head node, execute kusu-cfmsync.

# kusu-cfmsync -f
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/intel/mic/filesystem/base/etc/rc.d/rc.sysinit.append
Distributing 0 KBytes to all nodes.
Updating installer(s)
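For reference, the rc.sysinit.append file described in step 7 can be created in a single step with a here-document (any editor works equally well; repeat under /etc/cfm/installer-rhel-6.2-x86_64 if required):

# cat > /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/opt/intel/mic/filesystem/base/etc/rc.d/rc.sysinit.append << 'EOF'
cp /shared/hosts /etc/hosts
EOF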

8. Provision all Xeon Phi equipped nodes using the node group template compute-rhel-6.2-x86_64_xeon_phi. Note that once the nodes are discovered by kusu-addhost, the administrator must exit from listening mode by pressing Control-C. This completes the node discovery process.

# kusu-addhost -i eth0 -n compute-rhel-6.2-x86_64_xeon_phi -b
Scanning syslog for PXE requests...
Discovered Node: compute000
Mac Address: 00:1e:67:49:cc:83
Discovered Node: compute001
Mac Address: 00:1e:67:49:cc:e5
^C
Command aborted by user...
Setting up dhcpd service...
Setting up dhcpd service successfully...
Setting up NFS export service...
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Distributing 100 KBytes to all nodes.
Updating installer(s)

9. If passwordless SSH as 'root' to the Xeon Phi devices is needed, then the following step must be performed prior to the generation of the Intel MPSS configuration. Copy the public SSH key for the root account from the head node to all nodes in the compute-rhel-6.2-x86_64_xeon_phi node group.

# ln -s /opt/kusu/etc/.ssh/id_rsa.pub /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/root/.ssh/id_rsa.pub
# kusu-cfmsync -f
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/root/.ssh/id_rsa.pub
Distributing 0 KBytes to all nodes.
Updating installer(s)
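At this point a quick sanity check can confirm that the Intel MPSS packages were deployed to the compute nodes and, once the coprocessors have booted with the updated configuration, that passwordless SSH to a card works. This is a suggested spot check only; lsgrun assumes the LSF base is already running on the compute nodes:

# lsgrun -m "compute000 compute001" rpm -qa 'intel-mic*'
# ssh compute000-mic0 hostname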

III. Intel Software tools deployment

It is recommended to install the Intel Software tools and Intel MPI to a common shared file system. IBM Platform HPC configures a default NFS share, /shared, which is common to all compute nodes managed by the software. With the procedure above, /shared is mounted and available on all nodes in the cluster, including to the Xeon Phi co-processors. Here, the native Intel Software tools and Intel MPI installation programs are used; no further installation detail is provided in this document.

As part of the installation of the Intel Software tools, you may be required to install additional 32-bit libraries. If this is the case, the required yum command can be executed across the nodes as in the example below.

# lsrun -m "mel1 compute000 compute001" yum -y install libstdc++.i686
Loaded plugins: product-id, security, subscription-manager
Updating certificate-based repositories.
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package libstdc++.i686 0:4.4.6-3.el6 will be installed
--> Processing Dependency: libm.so.6(GLIBC_2.0) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libm.so.6 for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libgcc_s.so.1(GLIBC_2.0) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libgcc_s.so.1(GCC_4.2.0) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libgcc_s.so.1(GCC_3.3) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libgcc_s.so.1(GCC_3.0) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libgcc_s.so.1 for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.4) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.3.2) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.3) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.2) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.1.3) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.1) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6(GLIBC_2.0) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: libc.so.6 for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: ld-linux.so.2(GLIBC_2.3) for package: libstdc++-4.4.6-3.el6.i686
--> Processing Dependency: ld-linux.so.2 for package: libstdc++-4.4.6-3.el6.i686
--> Running transaction check
---> Package glibc.i686 0:2.12-1.47.el6 will be installed
--> Processing Dependency: libfreebl3.so(NSSRAWHASH_3.12.3) for package: glibc-2.12-1.47.el6.i686

--> Processing Dependency: libfreebl3.so for package: glibc-2.12-1.47.el6.i686
---> Package libgcc.i686 0:4.4.6-3.el6 will be installed
--> Running transaction check
---> Package nss-softokn-freebl.i686 0:3.12.9-11.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package               Arch    Version           Repository              Size
================================================================================
Installing:
 libstdc++             i686    4.4.6-3.el6       kusu-compute-default    298 k
Installing for dependencies:
 glibc                 i686    2.12-1.47.el6     kusu-compute-default    4.3 M
 libgcc                i686    4.4.6-3.el6       kusu-compute-default    110 k
 nss-softokn-freebl    i686    3.12.9-11.el6     kusu-compute-default    116 k

Transaction Summary
================================================================================
Install       4 Package(s)

Total download size: 4.8 M
Installed size: 14 M
Downloading Packages:
--------------------------------------------------------------------------------
Total                                            21 MB/s | 4.8 MB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Warning: RPMDB altered outside of yum.
  Installing : libgcc-4.4.6-3.el6.i686                                      1/4
  Installing : glibc-2.12-1.47.el6.i686                                     2/4
  Installing : nss-softokn-freebl-3.12.9-11.el6.i686                        3/4
  Installing : libstdc++-4.4.6-3.el6.i686                                   4/4
Installed products updated.

Installed:
  libstdc++.i686 0:4.4.6-3.el6

Dependency Installed:
  glibc.i686 0:2.12-1.47.el6
  libgcc.i686 0:4.4.6-3.el6
  nss-softokn-freebl.i686 0:3.12.9-11.el6

Complete!

IV. IBM Platform HPC: Xeon Phi monitoring, workloads

A. IBM Platform HPC built-in Xeon Phi monitoring

NOTE: The following section requires that Fix Pack hpc-3.2-build216840 is applied to the IBM Platform HPC head node. This is available via the IBM Fix Central site.

IBM Platform HPC provides rudimentary Xeon Phi monitoring capabilities via the Web-based console out of the box. These capabilities are not enabled by default at the time of installation. To enable the monitoring capabilities, the following steps must be performed on the head node.

Add the line "hasmic=true" to /usr/share/pmc/gui/conf/pmc.conf, then run:

# sed -i s?unselect=?unselect=mics,?g /usr/share/pmc/gui/conf/prefconf/hostlistdtdiv_default.properties
# sed -i s?unselect=?unselect=mics,?g /usr/share/pmc/gui/conf/prefconf/hostlistprovisiondtdiv_default.properties
# pmcadmin stop
# pmcadmin start

The IBM Platform HPC Web Console incorrectly assumes that 'micinfo' is located in /usr/bin, whereas the current Intel MPSS installs micinfo to /opt/intel/mic/bin. Here, a wrapper script that calls /opt/intel/mic/bin/micinfo is distributed to /usr/bin on all nodes within the compute-rhel-6.2-x86_64_xeon_phi node group.

Create a script micinfo with the following contents:

#!/bin/sh
/opt/intel/mic/bin/micinfo
exit 0

# mkdir -p /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/usr/bin

Copy the micinfo script to the appropriate CFM directory and set execute permissions.

# cp micinfo /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/usr/bin
# chmod 755 /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/usr/bin/micinfo
# kusu-cfmsync -f
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/compute-rhel-6.2-x86_64_xeon_phi/usr/bin/micinfo
Distributing 0 KBytes to all nodes.
Updating installer(s)

The IBM Platform HPC Web Console then shows a "MIC" tab that displays metrics for each Xeon Phi device on a per-host basis.
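As a quick command-line cross-check of what the Web Console displays, the wrapper can also be invoked directly on a Xeon Phi equipped node. This is a spot check only; lsrun requires LSF to be running on the node:

# lsrun -m compute000 /usr/bin/micinfo | head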

B. IBM Platform LSF ELIM: Xeon Phi monitoring, job scheduling (dynamic resources)

IBM has prepared an example ELIM script for IBM Platform HPC (and IBM Platform LSF) that leverages the Intel MPSS tools to provide metrics for both monitoring and job scheduling. Download details for the example ELIM script can be found in Appendix A. The example ELIM has been validated on systems with IBM Platform HPC 3.2/IBM LSF Express 8.3, Intel MPSS 2.1 build 4346-16 and 2 Xeon Phi co-processor cards per node. It reports back the following metrics:

Total number of Xeon Phi co-processors per node
Number of cores per Xeon Phi co-processor
Xeon Phi CPU temperature (Celsius)
Xeon Phi CPU frequency (GHz)
Xeon Phi total power (Watts)
Xeon Phi total free memory (MB)
Xeon Phi CPU utilization (%)

Below, the IBM Platform HPC hpc-metric-tool is used to configure the monitoring of the Xeon Phi specific metrics. Download details for the script (mic_metric_add.sh) that automates the configuration of the Intel MIC metrics can be found in Appendix A. The script requires the ELIM script, dynelim.intelmic, as input. The following is executed on the IBM Platform HPC head node.

# chmod 755 /root/dynelim.intelmic
# chmod 755 /root/mic_metric_add.sh
# ./mic_metric_add.sh /root/dynelim.intelmic
Adding External Metric Summary:
Name: num_mics
LSF resource Mapping: default
ELIM file path: /root/dynelim.intelmic
LSF resource interval: 60
LSF resource increase: n
Display Name: Number of MICs
External Metric is added, please run "hpc-metric-tool apply" to apply the change to cluster.

Adding External Metric Summary:
Name: miccpu_temp0

LSF resource Mapping: default
ELIM file path: /root/dynelim.intelmic
LSF resource interval: 60
LSF resource increase: n
Display Name: MIC0 CPU temp Celsius
External Metric is added, please run "hpc-metric-tool apply" to apply the change to cluster.

Adding External Metric Summary:
Name: micnum_cores1
LSF resource Mapping: default
ELIM file path: /root/dynelim.intelmic
LSF resource interval: 60
LSF resource increase: n
Display Name: MIC1 Number of cores
External Metric is added, please run "hpc-metric-tool apply" to apply the change to cluster.

Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
New file found: /etc/cfm/installer-rhel-6.2-x86_64/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.num_mics
New file found: /etc/cfm/installer-rhel-6.2-x86_64/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.micnum_cores1
New file found: /etc/cfm/installer-rhel-6.2-x86_64/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccore_freq0
New file found: /etc/cfm/installer-rhel-6.2-x86_64/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccpu_util1
New file found: /etc/cfm/installer-rhel-6.2-x86_64/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccpu_temp1
New file found: /etc/cfm/lsf-master-candidate/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccore_freq1
New file found: /etc/cfm/lsf-master-candidate/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccpu_temp0
New file found: /etc/cfm/lsf-master-candidate/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.micfree_mem0
Distributing 333 KBytes to all nodes.
Updating installer(s)

Setting up dhcpd service...
Setting up dhcpd service successfully...
Setting up NFS export service...
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Distributing 89 KBytes to all nodes.
Updating installer(s)

Running the commands above results in multiple ELIMs being written to the $LSF_SERVERDIR directory (/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc) with names:

elim.num_mics
elim.miccpu_temp0
......

As each ELIM is an instance of dynelim.intelmic, you only need to retain elim.num_mics. The following steps clean up the redundant ELIM scripts.

Clean up the ELIM scripts from the CFM template directories. Here, all elim.mic* files are removed and only elim.num_mics is retained (executed on the IBM Platform HPC head node):

# cd /etc/cfm
# find ./ -name "elim.mic*" -print | xargs rm -f

Clean up the ELIMs from $LSF_SERVERDIR on all nodes (executed on the IBM Platform HPC head node):

# lsgrun -m "compute000 compute001" find /opt/lsf/8.3/linux2.6-glibc2.3-x86_64/ -name "elim.mic*" -exec rm -f {} \;

Update the CFM (executed on the IBM Platform HPC head node):

# kusu-cfmsync -f
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh

Removing orphaned file: /opt/kusu/cfm/1/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.micnum_cores1
Removing orphaned file: /opt/kusu/cfm/1/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccore_freq0
Removing orphaned file: /opt/kusu/cfm/1/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccpu_util1
Removing orphaned file: /opt/kusu/cfm/1/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.miccpu_temp1
Removing orphaned file: /opt/kusu/cfm/1/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.mictotal_power1
Removing orphaned file: /opt/kusu/cfm/7/opt/lsf/8.3/linux2.6-glibc2.3-x86_64/etc/elim.micfree_mem0
Updating installer(s)

Restart the LIM daemons so that the ELIM changes take effect.

# lsadmin limshutdown all
Do you really want to shut down LIMs on all hosts? [y/n] y
Shut down LIM on <mel1>... done
Shut down LIM on <compute000>... done
Shut down LIM on <compute001>... done

** NOTE: The warning messages below in the output of lsadmin limstartup may be ignored. **

# lsadmin limstartup all
Aug 14 11:37:44 2013 23856 3 8.3 do_resources: /opt/lsf/conf/lsf.shared(340): Resource name processes reserved or previously defined. Ignoring line
Do you really want to start up LIM on all hosts? [y/n] y
Start up LIM on <mel1>...
Aug 16 15:37:49 2013 25092 3 8.3 do_resources: /opt/lsf/conf/lsf.shared(340): Resource name processes reserved or previously defined. Ignoring line
done
Start up LIM on <compute000>...
Aug 14 11:37:50 2013 88229 3 8.3 do_resources: /opt/lsf/conf/lsf.shared(340): Resource name processes reserved or previously defined. Ignoring line
done
Start up LIM on <compute001>...
Aug 14 11:37:50 2013 63077 3 8.3 do_resources: /opt/lsf/conf/lsf.shared(340): Resource name processes reserved or previously defined. Ignoring line
done
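Before looking at the reported values, a quick way to confirm that the new resource names are now known to the cluster is to list the LIM resources and filter for the MIC metrics (lsinfo is part of base LSF):

# lsinfo | grep mic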

The output of the IBM Platform LSF 'lsload' command shows the metrics as expected: # lsload -l HOST_NAME status r15s r1m r15m ut pg io ls it tmp swp mem root maxroot processes clockskew netcard iptotal cpuhz cachesize diskvolume processesroot ipmi powerconsumption ambienttemp cputemp num_mics miccpu_temp0 miccore_freq0 mictotal_power0 micfree_mem0 miccpu_util0 micnum_cores0 miccpu_temp1 miccore_freq1 mictotal_power1 micfree_mem1 miccpu_util1 micnum_cores1 ngpus gpushared gpuexcl_thrd gpuprohibited gpuexcl_proc gpumode0 gputemp0 gpuecc0 gpumode1 gputemp1 gpuecc1 gpumode2 gputemp2 gpuecc2 gpumode3 gputemp3 gpuecc3 hostid ip mac osversion abbros cpumode fanrate rxpackets txpackets rxbytes txbytes droppedrxpackets droppedtxpackets errorrxpackets errortxpackets overrunrxpackets overruntxpackets rxpacketsps txpacketsps rxbytesps txbytesps gpumodel0 gpumodel1 gpumodel2 gpumodel3 gpudriver compute001 ok 0.0 0.0 0.0 5% 0.0 12 0 1 8584M 2G 31G 8591.0 1e+04 725.0 0.0 4.0 6.0 2100.0 2e+04 2e+06 717.0-1.0-1.0-1.0-1.0 1.0 46.0 1.1 74.0 2662.6 0.0 57.0 - - - - - - - - - - - - - - - - - - - - - - - 010a0c01 eth0:;eth1:9.21.51.37;eth2:;eth3:;eth4:;eth5:; eth0:00:1e:67:49:cc:e5;eth1:00:1e:67:49:cc:e6;eth2:00:1e:67:49:cc:e7;eth3:00:1e:67:49:cc:e8;eth4:00: 1E:67:0C:BE:20 Red_Hat_Enterprise_Linux_Server_release_6.2_(Santiago) RedHat6 06/2d -1 eth0:290901;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:26842;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:338700;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:2668;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; - - - - - compute000 ok 0.0 0.1 0.0 4% 0.0 10 0 1 8584M 2G 31G 8591.0 1e+04 723.0 0.0 4.0 6.0 1200.0 2e+04 1e+07 715.0-1.0-1.0-1.0-1.0 1.0 54.0 1.1 67.0 2662.4 0.0 57.0 - - - - - - - - - - - - - - - - - - - - - - - 010a0b01 eth0:;eth1:9.21.51.36;eth2:;eth3:;eth4:;eth5:; eth0:00:1e:67:49:cc:83;eth1:00:1e:67:49:cc:84;eth2:00:1e:67:49:cc:85;eth3:00:1e:67:49:cc:86;eth4:00 :1E:67:0C:BA:D8 Red_Hat_Enterprise_Linux_Server_release_6.2_(Santiago) RedHat6 06/2d -1 eth0:293788;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:26151;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:338976;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:2679;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; - - - - -

mel1 ok 0.2 0.3 0.3 7% 0.0 60 3 2 28G 33G 25G 3e+04 5e+04 1021.0 0.0 4.0 6.0 1200.0 2e+04 2e+06 938.0-1.0-1.0-1.0-1.0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 010a0a01 eth0:192.0.2.10;eth1:9.21.51.35;eth2:;eth3:;eth4:;eth5:; eth0:00:1e:67:31:42:cd;eth1:00:1e:67:31:42:ce;eth2:00:1e:67:31:42:cf;eth3:00:1e:67:31:42:d0;eth4:00 :1E:67:0C:BB:80 Red_Hat_Enterprise_Linux_Server_release_6.2_(Santiago) RedHat6 06/2d -1 eth0:3416828;eth1:13903274;eth2:0;eth3:0;eth4:0;eth5:0; eth0:11413427;eth1:29766156;eth2:0;eth3:0;eth4:0;eth5:0; eth0:331724;eth1:2318337;eth2:0;eth3:0;eth4:0;eth5:0; eth0:13522166;eth1:42629433;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:125;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:9;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:3;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:1;eth2:0;eth3:0;eth4:0;eth5:0; eth0:0;eth1:0;eth2:0;eth3:0;eth4:0;eth5:0; - - - - - It is now possible to display the newly added resources in the IBM Platform HPC Web Console. In the screenshot below, the Xeon Phi specific metrics are displayed on the respective host status lines.

C. IBM Platform LSF: Xeon Phi job scheduling (LSF configuration)

The following steps describe the IBM Platform LSF configuration needed to support Xeon Phi. All LSF hosts equipped with Xeon Phi must be tagged with the Boolean resource "mic". This allows users submitting Xeon Phi specific workloads to IBM Platform LSF to request a system equipped with Xeon Phi. Additionally, it is necessary to enable per-slot resource reservation for the defined resource 'num_mics'.

Edit /etc/cfm/templates/lsf/default.lsf.shared and make the following updates:

Begin Resource
RESOURCENAME   TYPE      INTERVAL   INCREASING   DESCRIPTION                # Keywords
mic            Boolean   ()         ()           (Intel MIC architecture)
......
End Resource

Edit /etc/cfm/templates/lsf/default.lsf.cluster and make the following updates:

Begin Host
HOSTNAME         model   type   server   r1m   mem   swp   RESOURCES   #Keywords
XXX_lsfmc_XXX    !       !      1        3.5   ()    ()    (mg)
compute000       !       !      1        3.5   ()    ()    (mic)
compute001       !       !      1        3.5   ()    ()    (mic)
......
End Host

Edit /etc/cfm/templates/lsf/lsbatch/default/configdir/lsb.resources and make the following updates:

Begin ReservationUsage
RESOURCE   METHOD   RESERVE
num_mics   -        Y
End ReservationUsage

Edit /etc/cfm/templates/lsf/lsbatch/default/configdir/lsb.params and add the following parameter:

Begin Parameters
......
RESOURCE_RESERVE_PER_SLOT=Y
End Parameters

Run the command kusu-addhost -u to make the changes take effect.

# kusu-addhost -u
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Updating installer(s)
Setting up dhcpd service...
Setting up dhcpd service successfully...
Setting up NFS export service...
Running plugin: /opt/kusu/lib/plugins/cfmsync/getent-data.sh
Distributing 102 KBytes to all nodes.
Updating installer(s)

D. IBM Platform LSF: Xeon Phi job submission

This section discusses the methodology for submitting both "offload" and "native" type workloads to Xeon Phi coprocessor equipped nodes in an IBM Platform LSF cluster. In simple terms:

With the "offload" model, the executable runs on the host processor and offloads specific work to the Xeon Phi coprocessor.
With the "native" model, the executable runs natively on the Xeon Phi coprocessor.

For both "offload" and "native" type workloads, it is assumed that the following configuration for IBM Platform LSF exists (see Section C: IBM Platform LSF: Xeon Phi job scheduling above):

Non-shared dynamic resource 'num_mics', which counts the number of Xeon Phi coprocessor cards available on a system (online state).
Non-shared Boolean resource 'mic', which is configured for nodes equipped with Xeon Phi coprocessor(s).
RESOURCE_RESERVE_PER_SLOT=Y has been configured.

i. Offload Example I:

The following example shows a simple offload binary ('omp_numthreadsofl') being launched under IBM Platform LSF. The binary has been compiled using the Intel Compiler (intel-compilerproc-117-13.0-1.x86_64) and is launched using the IBM Platform LSF bsub command, requesting the Boolean resource "mic" and rusage on the resource num_mics equal to 1. Note that it is currently not possible to target a specific Xeon Phi device at runtime when running an offload executable; this is a limitation of the current Intel tools.

$ bsub -I -R "select[mic] rusage[num_mics=1]" /shared/omp_numthreadsofl -t 16
Job <1083> is submitted to default queue <medium_priority>.
<<Waiting for dispatch...>>
<<Starting on compute000>>
Hello World from thread = 0
Hello World from thread = 11
Number of threads on node compute000-mic0 = 16
Hello World from thread = 2
Hello World from thread = 1
Hello World from thread = 4
Hello World from thread = 9
Hello World from thread = 8
Hello World from thread = 10
Hello World from thread = 5
Hello World from thread = 6
Hello World from thread = 7
Hello World from thread = 3
Hello World from thread = 13
Hello World from thread = 12
Hello World from thread = 14
Hello World from thread = 15

ii. Offload Example II:

The following shows an example of an Intel MPI offload binary being launched under IBM Platform LSF. The binary has been compiled using the Intel Compiler (intel-compilerproc-117-13.0-1.x86_64) and is launched using the IBM Platform LSF bsub command, requesting the following:

Boolean resource "mic".
Resource reservation (rusage) on the resource num_mics equal to 1 (per slot).

Two processors (MPI ranks), one processor per node. Note that each MPI rank will use offload if a Xeon Phi is available.

$ bsub -n 2 -R "select[mic] rusage[num_mics=1] span[ptile=1]" -I mpiexec.hydra /shared/mixedofl_demo
Job <1082> is submitted to default queue <medium_priority>.
<<Waiting for dispatch...>>
<<Starting on compute000>>
Hello from thread 0 out of 224 from process 0 out of 2 on compute000
Hello from thread 94 out of 224 from process 0 out of 2 on compute000
Hello from thread 8 out of 224 from process 0 out of 2 on compute000
Hello from thread 78 out of 224 from process 0 out of 2 on compute000
Hello from thread 14 out of 224 from process 0 out of 2 on compute000
Hello from thread 70 out of 224 from process 0 out of 2 on compute000
Hello from thread 1 out of 224 from process 0 out of 2 on compute000
Hello from thread 57 out of 224 from process 0 out of 2 on compute000
Hello from thread 113 out of 224 from process 0 out of 2 on compute000
Hello from thread 72 out of 224 from process 0 out of 2 on compute000
Hello from thread 16 out of 224 from process 0 out of 2 on compute000
Hello from thread 43 out of 224 from process 1 out of 2 on compute001
Hello from thread 98 out of 224 from process 1 out of 2 on compute001

iii. Native Examples:

IBM has devised an example job wrapper script which allows users to launch jobs targeted to Xeon Phi under IBM Platform LSF. The example job wrapper script assumes that IBM Platform LSF has been configured as per Section C: IBM Platform LSF: Xeon Phi job scheduling above. Download instructions for the example ELIM, dynelim.intelmic, and the job wrapper script (mic.job) can be found in Appendix A. The job wrapper makes the following assumptions:

Support for a maximum of 2 Xeon Phi devices per node.
The job wrapper script assumes that there is a shared $HOME directory for the user running the job, so the wrapper will not function for the user 'root'.
Each Xeon Phi equipped LSF host is running an ELIM which reports back the number of Xeon Phi devices available in the node (dynamic resource 'num_mics').
Jobs requiring a Xeon Phi are submitted to LSF with the correct resource requirements. For Xeon Phi jobs, bsub -n N translates to a request for N Xeon Phi devices. Jobs must also be submitted with the corresponding rusage[num_mics=1] resource (assuming the configuration in Section C above). For example, to submit a job which requires 2 Xeon Phi coprocessors:
$ bsub -n 2 -R "select[mic] rusage[num_mics=1]" <PATH_TO>/mic.job <PATH_TO>/a.out

Note that 2 job slots will also be allocated on the nodes selected.
Intel MPI (runtime) must be available if MPI ranks are to be run on Xeon Phi (co-processor mode).

The job wrapper script has been tested with both native Xeon Phi binaries (leveraging the Intel MPSS 'micnativeloadex' utility) and Intel MPI Xeon Phi co-processor mode jobs. Xeon Phi jobs submitted to IBM Platform LSF are marked with the Xeon Phi hostname(s) using the IBM Platform LSF bpost command. This provides rudimentary control over access to devices: once a Xeon Phi has been marked for use by a job, it is used exclusively for the duration of the job.

Native MPI Example ('Co-processor mode'):

The following shows an example of a native Xeon Phi MPI binary being launched under IBM Platform LSF. The binary has been compiled using the Intel Compiler (intel-compilerproc-117-13.0-1.x86_64) and is launched using the 'mic.job' wrapper. The resource requirement string for the job requests the Boolean resource "mic" and rusage on the resource num_mics equal to 1.

$ bsub -n 2 -I -R "select[mic] rusage[num_mics=1]" /shared/mic.job /shared/cpuops_mpi
Job <975> is submitted to default queue <medium_priority>.
<<Waiting for dispatch...>>
<<Starting on compute001>>
- current ops per sec [avg 35.24] 33.66 35.44 35.24 35.24 35.24 35.44 35.24 35.24 35.24 35.44 35.24 35.44 35.24 35.44 35.24 35.24 35.24 35.44 35.24 35.24 35.24 35.44 35.24 35.24 35.24 35.44 35.24 35.24 35.24 35.24 35.24 35.24
......

During execution of the job we see the following: bpost of the Xeon Phi hostname(s) to the job (this is performed by the job wrapper script, 'mic.job'):

# bjobs -l 975

Job <975>, User <hpcadmin>, Project <default>, Status <RUN>, Queue <medium_priority>, Interactive mode, Command </shared/mic.job /shared/cpuops_mpi>, Share group charged </hpcadmin>
Wed Aug 14 12:48:13: Submitted from host <mel1>, CWD <$HOME>, 2 Processors Requested, Requested Resources <select[mic] rusage[num_mics=1]>;
Wed Aug 14 12:48:15: Started on 2 Hosts/Processors <compute001> <compute000>;
Wed Aug 14 12:48:29: Resource usage collected.
                     MEM: 5 Mbytes; SWAP: 357 Mbytes; NTHREAD: 7

PGID: 188945; PIDs: 188945 PGID: 188946; PIDs: 188946 188948 188969 PGID: 188970; PIDs: 188970 PGID: 188971; PIDs: 188971 SCHEDULING PARAMETERS: r15s r1m r15m ut pg io ls it tmp swp mem loadsched - - - - - - - - - - - loadstop - - - - - - - - - - - root maxroot processes clockskew netcard iptotal loadsched - - - - - - - - loadstop - - - - - - - - cpuhz cachesize diskvolume processesroot ipmi powerconsumption ambienttemp cputemp loadsched - - - - - - loadstop - - - - - - num_mics miccpu_temp0 miccore_freq0 mictotal_power0 micfree_mem0 loadsched - - - - - loadstop - - - - - miccpu_util0 micnum_cores0 miccpu_temp1 miccore_freq1 mictotal_power1 loadsched - - - - - loadstop - - - - - micfree_mem1 miccpu_util1 micnum_cores1 ngpus gpushared loadsched - - - - - loadstop - - - - - gpuexcl_thrd gpuprohibited gpuexcl_proc gpumode0 gputemp0 gpuecc0 loadsched - - - - - - loadstop - - - - - - gpumode1 gputemp1 gpuecc1 gpumode2 gputemp2 gpuecc2 gpumode3 gputemp3 loadsched - - - - - - - - loadstop - - - - - - - - gpuecc3 loadsched - loadstop - EXTERNAL MESSAGES: MSG_ID FROM POST_TIME MESSAGE ATTACHMENT 0 hpcadmin Aug 14 12:48 compute001-mic0 N 1 hpcadmin Aug 14 12:48 compute000-mic0 N

bhosts -l compute000 HOST compute000 STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW ok 16.00-16 1 1 0 0 0 - CURRENT LOAD USED FOR SCHEDULING: r15s r1m r15m ut pg io ls it tmp swp mem root maxroot Total 0.0 0.0 0.0 0% 0.0 16 0 27 8584M 2G 31G 8591.0 1e+04 Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M 0.0 0.0 processes clockskew netcard iptotal cpuhz cachesize diskvolume Total 722.0 0.0 4.0 6.0 1200.0 2e+04 1e+07 Reserved 0.0 0.0 0.0 0.0 0.0 0.0 0.0 processesroot ipmi powerconsumption ambienttemp cputemp num_mics Total 714.0-1.0-1.0-1.0-1.0 0.0 Reserved 0.0 0.0 0.0 0.0 0.0 1.0... # bhosts -l compute001 HOST compute001 STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW ok 16.00-16 1 1 0 0 0 - CURRENT LOAD USED FOR SCHEDULING: r15s r1m r15m ut pg io ls it tmp swp mem root maxroot Total 0.0 0.0 0.0 0% 0.0 17 0 32 8584M 2G 31G 8591.0 1e+04 Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M 0.0 0.0 processes clockskew netcard iptotal cpuhz cachesize diskvolume Total 731.0 0.0 4.0 6.0 1200.0 2e+04 2e+06 Reserved 0.0 0.0 0.0 0.0 0.0 0.0 0.0 processesroot ipmi powerconsumption ambienttemp cputemp num_mics Total 717.0-1.0-1.0-1.0-1.0 0.0 Reserved 0.0 0.0 0.0 0.0 0.0 1.0 You can see above that num_mics has 1 unit reserved (as expected) on both compute000 and compute001. Native (non-mpi) Example: The following shows an example of a native Xeon Phi binary (non-mpi) being launched under IBM Platform LSF. The binary has been compiled using the Intel Compilers, Intel

MPI (intel-mpi-intel64-4.1.0p-024.x86_64), and is launched using the 'mic.job' wrapper. The resource requirement string for the job requests the Boolean resource "mic" and rusage on the resource num_mics equal to 1.

$ bsub -I -R "select[mic] rusage[num_mics=1]" /shared/mic.job /shared/fibo
Job <1078> is submitted to default queue <medium_priority>.
<<Waiting for dispatch...>>
<<Starting on compute000>>
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
Remote process returned: 0
Exit reason: SHUTDOWN OK

During execution of the job you see the following: 'bpost' of the Xeon Phi hostnames to the job (this is performed by the job wrapper script).

# bjobs -l 1078

Job <1078>, User <hpcadmin>, Project <default>, Status <RUN>, Queue <medium_priority>, Interactive mode, Command </shared/mic.job /shared/fibo>, Share group charged </hpcadmin>
Wed Aug 14 17:43:54: Submitted from host <mel1>, CWD <$HOME>, Requested Resources <select[mic] rusage[num_mics=1]>;
Wed Aug 14 17:43:57: Started on <compute000>;

SCHEDULING PARAMETERS:

r15s r1m r15m ut pg io ls it tmp swp mem loadsched - - - - - - - - - - - loadstop - - - - - - - - - - - root maxroot processes clockskew netcard iptotal loadsched - - - - - - - - loadstop - - - - - - - - cpuhz cachesize gpumode1 gputemp1 gpuecc1 gpumode2 gputemp2 gpuecc2 gpumode3 gputemp3 loadsched - - - - - - - - loadstop - - - - - - - - gpuecc3 loadsched - loadstop - EXTERNAL MESSAGES: MSG_ID FROM POST_TIME MESSAGE ATTACHMENT 0 hpcadmin Aug 14 17:43 compute000-mic0 N # bhosts -l compute000 HOST compute000 STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW ok 16.00-16 1 1 0 0 0 - CURRENT LOAD USED FOR SCHEDULING: r15s r1m r15m ut pg io ls it tmp swp mem root maxroot Total 0.0 0.0 0.0 3% 0.0 23 1 2 8600M 2G 31G 0.0 0.0 Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M - - processes clockskew netcard iptotal cpuhz cachesize diskvolume Total 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Reserved - - - - - - - processesroot ipmi powerconsumption ambienttemp cputemp num_mics Total 0.0 0.0 0.0 0.0 0.0 0.0 Reserved - - - - - 1.0 gpuecc0 gpumode1 gputemp1 gpuecc1 gpumode2 gputemp2 gpuecc2 Total 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Reserved - - - - - - -

gpumode3 gputemp3 gpuecc3 Total 0.0 0.0 0.0 Reserved - - - You see above that num_mics has 1 unit reserved (as expected).
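All of the examples in this section use interactive submission (bsub -I) so that the job output is visible in the terminal; the same workloads can equally be submitted as regular batch jobs. A minimal variation of the native example above, writing the output to a file (the output file name is arbitrary):

$ bsub -o fibo.%J.out -R "select[mic] rusage[num_mics=1]" /shared/mic.job /shared/fibo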

Appendix A: IBM Platform HPC and Intel Xeon Phi Integration Scripts

The package containing all example scripts referred to in this document is available for download at IBM Fix Central (http://www.ibm.com/support/fixcentral/). The scripts are provided as examples as per the terms indicated in the file LICENSE. The package contains the following example scripts:

dynelim.intelmic: dynamic ELIM script to collect Intel MIC related metrics (HPC, LSF)
mic.job: Intel MIC job wrapper (LSF)
mic_metric_add.sh: script to facilitate the rapid addition of Intel MIC metrics (HPC)
rc.local.xeon_phi: script portion to configure the network bridge, and to configure Intel MPSS to use the network bridge
post_install.sh: custom script for IBM Platform HPC; this forces the execution of rc.local on the first boot after the CFM has executed

Copyright and Trademark Information

Copyright IBM Corporation 2013

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.