Job Scheduling on a Large UV 1000. Chad Vizino, Pittsburgh Supercomputing Center. SGI User Group Conference, May 2011.
1 Job Scheduling on a Large UV 1000 Chad Vizino SGI User Group Conference May 2011
2 Overview
- About PSC's UV 1000
- Simon
- UV Distinctives
- UV Operational issues
- Conclusion
3 PSC's UV Blacklight
4 Blacklight after router installation
5 Lots of cables
6 Blacklight Hardware
- Installed September 2010
- Routers installed December
- x 16TB SSIs
  - 128 Blades per SSI
  - 8GB per core
  - 2048 physical cores per SSI
- Dual socket 8-way Intel Xeon 2.27GHz (Nehalem)
  - 16 physical cores per blade
  - 32 Hyper-Threaded cores per blade
7 Current SCRATCH
- Lustre TB
- 8 Servers
- IB SDR connection via Blacklight
- 2 x DDN 8550
- New deployment coming
8 New SCRATCH
- Imminent deployment
- Runs drives at 95% of spindle speed
- See Michael Levine's talk on Blacklight
9 User Environment
- Login node
  - Dual quad core Intel Xeon 2.4GHz (Westmere)
  - 24 GB memory
- Common /usr/users ($HOME), /usr/local (packages managed with modules)
10 Login node
- Access/edit files
- Compile codes
- Submit/monitor jobs
- Users may not log in to compute nodes (UV SSIs)
  - Interactive jobs via qsub -I are allowed
- Runs Torque server and scheduler processes
11 Software
- SUSE Linux Enterprise Server 11.1
- SGI Performance Suite 1.1
- Torque Resource Manager (with local mods)
- Simon scheduler (locally developed)
12 About Simon
- Locally developed job scheduler
- Work started 10 years ago
- Integrated with Torque
- Ported to various architectures
  - Compaq AlphaServer SC (RMS)
  - Cray XT3 (CPA)
  - SGI Altix 4700 (cpusets)
13 UV Distinctive #1: Cpusets
- Job assigned to whole blades
  - Users request ncpus and walltime limits
  - Get more memory by requesting more blades
- Memory enforcement
  - Job killed when cpuset memory_pressure > 0
- Cpuset is cpu exclusive
- Cpuset is mem exclusive
- Lessons learned from Altix 4700 experience
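The whole-blade rule above implies some simple arithmetic for users who need memory rather than cores. The following is a minimal sketch, assuming the figures from the hardware slide (16 physical cores per blade, 8GB per core); the function name `blades_for_request` is hypothetical and this is not Simon's code.

```python
import math

CORES_PER_BLADE = 16   # physical cores per blade (hardware slide)
GB_PER_CORE = 8        # memory per core (hardware slide)

def blades_for_request(ncpus=0, mem_gb=0):
    """Return (blades, ncpus_to_request) covering both core and memory needs."""
    gb_per_blade = CORES_PER_BLADE * GB_PER_CORE        # 128 GB per blade
    by_cores = math.ceil(ncpus / CORES_PER_BLADE)       # blades needed for cores
    by_memory = math.ceil(mem_gb / gb_per_blade)        # blades needed for memory
    blades = max(by_cores, by_memory, 1)                # always at least one blade
    return blades, blades * CORES_PER_BLADE

# A 16-core job that needs 300 GB must ask for three blades' worth of cores.
print(blades_for_request(ncpus=16, mem_gb=300))
```

In this sketch the memory need, not the core count, drives the request: the job would ask for ncpus=48 even if it only runs 16 processes.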
14 Cpusets facilitate repeatable performance
15 Hard to achieve repeatable performance!
16 More on Simon
- Written in TCL
- About 4,200 lines of code
- Integration with Torque
- Backfill
- Reservations
- Stuffing control (QOS)
- Co-scheduling software licenses
- Flexible walltime support
17 Torque Integration Features
- Linux kernel job integration
  - Mom calls job_create() with Torque job id
  - Enables use of ja by users
  - csacom -j `printf %x <torque_job_id>`
- Limiting process threads
  - Java garbage collection threads
  - -XX:ParallelGCThreads=N
  - Thread_factor set on queue
  - Limit = thread_factor*ncpus
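The thread limit on this slide is a single multiplication, shown here as a minimal sketch. The function names and the example values are hypothetical, and setting the JVM's parallel GC thread count equal to the job's ncpus is an assumption made for illustration, not a stated PSC policy.

```python
def thread_limit(ncpus, thread_factor):
    """Per-job thread cap as given on the slide: thread_factor * ncpus."""
    return thread_factor * ncpus

def jvm_gc_flag(ncpus):
    """JVM option that caps parallel garbage-collection threads.

    Using ncpus as the cap is an assumption for this example.
    """
    return f"-XX:ParallelGCThreads={ncpus}"

# Example values only: a 16-core job on a queue with thread_factor 2.
print(thread_limit(ncpus=16, thread_factor=2))
print(jvm_gc_flag(16))
```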
18 Distinctive #2: Dealing with Hyper-Threads
19 Hyper-Threads and Jobs
- Users specify physical core count
  - qsub -l ncpus=N
  - N must be multiple of 16
- PBS_NCPUS (N)
- PBS_HT_NCPUS (N*2)
  - mpirun -np $PBS_NCPUS
  - Or, mpirun -np $PBS_HT_NCPUS
- Mom daemon creates cpuset with Hyper-Thread cpu count (N*2)
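The relationship between the requested core count and the two environment variables can be sketched as follows. This is an illustration only: the function name `job_environment` is hypothetical, and how the local Torque modifications actually validate the request is not described in the slides.

```python
CORES_PER_BLADE = 16   # physical cores per blade (hardware slide)

def job_environment(ncpus):
    """Validate a physical-core request and derive the two PBS variables."""
    if ncpus <= 0 or ncpus % CORES_PER_BLADE != 0:
        raise ValueError("ncpus must be a positive multiple of 16")
    return {
        "PBS_NCPUS": str(ncpus),         # physical cores (N)
        "PBS_HT_NCPUS": str(ncpus * 2),  # Hyper-Thread cpus in the cpuset (N*2)
    }

print(job_environment(32))
```

A job that runs one MPI rank per physical core would pass the first value to mpirun; a job that wants to use every Hyper-Thread would pass the second.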
20 CPU Numbering from topology output
  CPU  Blade     PhysID CoreID APIC-ID Family Model Speed L1(KiB) L2(KiB) L3(KiB)
     0 r001i01b  d/32i
     1 r001i01b  d/32i
     2 r001i01b  d/32i
     3 r001i01b  d/32i
   ...
    13 r001i01b  d/32i
    14 r001i01b  d/32i
    15 r001i01b  d/32i
   ...
  2048 r001i01b  d/32i
  2049 r001i01b  d/32i
  2050 r001i01b  d/32i
  2051 r001i01b  d/32i
  2052 r001i01b  d/32i
   ...
  2060 r001i01b  d/32i
  2061 r001i01b  d/32i
  2062 r001i01b  d/32i
  2063 r001i01b  d/32i
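The listing shows CPUs 0-15 and 2048-2063 on the same blade, which suggests that each physical core's Hyper-Thread sibling is numbered 2048 higher (2048 being the physical core count per SSI). A minimal sketch of that mapping follows. It assumes the pattern continues linearly for every blade, which the slides show only for the first blade; the function name is hypothetical.

```python
PHYSICAL_CORES_PER_SSI = 2048   # from the hardware slide
CORES_PER_BLADE = 16            # from the hardware slide

def blade_cpus(blade):
    """Return (physical_cpus, hyperthread_siblings) for a blade index.

    Assumes blade b owns physical cpus 16*b .. 16*b+15 and that each
    sibling is offset by the physical core count of the SSI.
    """
    first = blade * CORES_PER_BLADE
    physical = list(range(first, first + CORES_PER_BLADE))
    siblings = [cpu + PHYSICAL_CORES_PER_SSI for cpu in physical]
    return physical, siblings

physical, siblings = blade_cpus(0)
print(physical[0], physical[-1], siblings[0], siblings[-1])
```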
21 Blade scheduling
22 System Hierarchy and Scheduling
- Rack, IRU, Blade, Memnode, Cpus
- Boot blade (1st blade of each SSI) not scheduled
- IO blades (have IB cards) not scheduled
- Simon maintains list of free and in-use memory nodes per SSI
- Simon manipulates nodeset resource
23 Nodeset resource used for job placement
- Simon places jobs using nodeset
  - Mems:cpus
  - nodeset=2-3:16-31,
  - Used by pbs_mom to construct cpuset on Blacklight node
- Queues can have a memnode mask
  - Target specific memnodes (blades)
  - Debug jobs on blade 127 (1/2 memory)
  - Also on other nodes with < 128GB (full memory)
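The mems:cpus form of the nodeset value can be unpacked with a small parser. This is a minimal sketch written for illustration, not pbs_mom's code; the function names are hypothetical, and the parser simply tolerates the trailing comma seen in the slide's truncated example.

```python
def expand(ranges):
    """Expand a range list such as '16-31,40' into a list of integers."""
    values = []
    for part in ranges.split(","):
        part = part.strip()
        if not part:                      # skip empty pieces (trailing comma)
            continue
        low, _, high = part.partition("-")
        values.extend(range(int(low), int(high or low) + 1))
    return values

def parse_nodeset(value):
    """Split 'mems:cpus' into (memory node list, cpu list)."""
    mems, _, cpus = value.partition(":")
    return expand(mems), expand(cpus)

mems, cpus = parse_nodeset("2-3:16-31,")
print(mems, cpus[0], cpus[-1])
```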
24 PMM: A text based monitor
  1 (bl0) 2 (bl1)=partition
  =RACK
  IRU
  ******** ******** ******** *******. ******** ******** ******** *******.
  ******** ******** ******** ******** ******** ******** ******** ********

  ******** ******** ******** ******** ******** ******** ******** ********
  B*xxx*** ******** ******** ***.**** B*xxx*** ******** ******** ********

  4567CDEF=HEX BLADE # Key: *=allocated B=boot
  AB.=free x=not scheduled
25 Blacklight Racks
26 Blacklight IRUs
27 Blacklight 3D Monitor See Blacklight3DMonitor.avi
28 UV Distinctive #3: Lots of Hardware
29 Database Holds Static Configuration Data
- SQLite SQL database engine
- Provides one place to get SSI configuration information for both SSIs
- Easy access to topology command output
  - Each SSI
- Integration with Simon planned
- Used by pmm and Blacklight 3D Monitor
30 Database Tables (all in under 500 kilobytes!)
- Partitions
- Blades
- Cpus
- Cpusets
- Memnodes
- Devices
- Routers
31 Partitions Table
  sqlite> select * from partitions limit 1;
  partition_num = 1
  serial = UV
  hostname = bl0.psc.teragrid.org
  blades = 128
  routers = 96
  cpus = 4096
  mem_total_gb =
  io_risers = 5
  infiniband_controllers = 6
  network_controllers = 2
  scsi_controllers = 1
  usb_controllers = 8
  vga_gpus = 1
32 Blades Table
  sqlite> select * from blades limit 1;
  blade_num = 0
  partition_num = 1
  blade_name = r001i01b00
  rack = 1
  iru = 1
  blade = 0
  asic = UVHub 2.0
  nasid = 0
  cpus = 32
  memory_kb =
  configured = 0
  comment = boot
33 Cpusets and Memnodes Tables
  sqlite> select * from cpusets limit 1;
  cpuset_num = 0
  partition_num = 1
  cpuset_name = boot
  mems = 0-1
  cpus = 0-15,

  sqlite> select * from memnodes limit 1;
  memnode_num = 0
  blade_num = 0
  partition_num = 1
  cpuset_num = 0
  mem_total_kb =
34 Memnodes and Cpus Tables
  sqlite> select * from memnodes limit 1;
  memnode_num = 0
  blade_num = 0
  partition_num = 1
  cpuset_num = 0
  mem_total_kb =

  sqlite> select * from cpus limit 1;
  cpu_num = 0
  memnode_num = 0
  blade_num = 0
  partition_num = 1
  cpuset_num = 0
  physid = 0
  coreid = 0
  apic_id = 0
  family = 6
  model = 46
  speed = 2266
  l1 = 32d/32i
  l2 = 256
  l3 = 24576
35 Devices and Routers Tables
  sqlite> select * from devices limit 1;
  blade_num = 0
  partition_num = 1
  pci_address = 0000:01:00.0
  x_server_display = -
  device = Intel Gigabit Network Connection

  sqlite> select * from routers limit 1;
  router_num = 0
  partition_num = 1
  router_name = r001i01r00
  rack = 1
  upos = 1
  router = 0
  class = NL5Router
36 Database Queries
- Facilitates blade name and cpu/memnode translation
- Look up last job use by blade
- Helps answer:
  - What blades did a job use?
  - What memnodes and partition correspond to a given blade name?
  - Which blades have less memory than expected after boot?
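The last question in the list ("Which blades have less memory than expected after boot?") maps naturally onto a single query. The sketch below uses Python's sqlite3 module with an in-memory database and a reduced blades table. The column names come from the Blades Table slide, but the rows, the memory values, and the 128 GB threshold are made up for illustration, and the function name is hypothetical.

```python
import sqlite3

# Assumed threshold for this example: 128 GB per blade, expressed in KB.
EXPECTED_KB = 128 * 1024 * 1024

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blades (blade_num INTEGER, partition_num INTEGER, "
    "blade_name TEXT, memory_kb INTEGER, comment TEXT)"
)
# Illustrative rows only; real values would come from topology output.
conn.executemany(
    "INSERT INTO blades VALUES (?, ?, ?, ?, ?)",
    [
        (0, 1, "r001i01b00", 134217728, "boot"),
        (1, 1, "r001i01b01", 134217728, ""),
        (2, 1, "r001i01b02", 67108864, ""),
    ],
)

def short_blades(connection, expected_kb):
    """Return (blade_name, memory_kb) for blades below the expected memory."""
    cursor = connection.execute(
        "SELECT blade_name, memory_kb FROM blades "
        "WHERE memory_kb < ? ORDER BY blade_num",
        (expected_kb,),
    )
    return cursor.fetchall()

print(short_blades(conn, EXPECTED_KB))
```

On a real system the reported memory per blade is typically somewhat below the nominal size, so a production check would compare against a per-blade baseline rather than a fixed constant.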
37 Operations: Pre-job scan (prologue script)
- Cpuset coherency at startup
- Tmpfs
  - RAM-based file system based on cpuset's memory
  - /dev/tmpfs/<job_id> directory
  - Created at job start
  - Destroyed at job end (also scan for orphans in prologue)
- Lustre check
- Save job script for future reference
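The orphan scan mentioned above can be sketched as a directory sweep: any per-job directory whose job id is not in the set of active jobs is removed. This is an illustration under assumptions, not the PSC prologue script. The function name and job ids are hypothetical, and the demonstration runs against a temporary directory rather than the real /dev/tmpfs.

```python
import os
import shutil
import tempfile

def remove_orphans(tmpfs_root, active_job_ids):
    """Delete per-job directories whose job id is no longer active."""
    removed = []
    for name in sorted(os.listdir(tmpfs_root)):
        path = os.path.join(tmpfs_root, name)
        if os.path.isdir(path) and name not in active_job_ids:
            shutil.rmtree(path)
            removed.append(name)
    return removed

# Demonstration on a throwaway directory with made-up job ids.
with tempfile.TemporaryDirectory() as root:
    for job_id in ("1001", "1002", "1003"):
        os.mkdir(os.path.join(root, job_id))
    print(remove_orphans(root, active_job_ids={"1002"}))
```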
38 Operations: Memory failures
- Check at boot time via topology command difference checker
- Watch memlog via Simple Event Correlator (SEC)
  - SEC updates system db so we can keep track of failures
  - Provides place holder so we don't forget about them
  - Remove from db after hardware replaced
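A boot-time difference check of the kind described can be sketched with Python's difflib: compare a saved baseline of the topology output with the current output and report only the lines that changed. The function name and the sample lines are hypothetical and do not reproduce the real format of the topology command.

```python
import difflib

def topology_changes(baseline, current):
    """Return the added and removed lines between two topology snapshots."""
    diff = difflib.unified_diff(
        baseline.splitlines(),
        current.splitlines(),
        fromfile="baseline",
        tofile="current",
        n=0,            # no context lines, only the changes
        lineterm="",
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

# Made-up snapshot lines: the second blade reports less memory after boot.
before = "blade0 mem_kb=100\nblade1 mem_kb=100"
after = "blade0 mem_kb=100\nblade1 mem_kb=50"
print(topology_changes(before, after))
```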
39 Future Plans
- Develop database integration
- Predictive walltime scheduling
  - Mitigate long drain times
  - D. Tsafrir, Y. Etsion, and D. G. Feitelson, "Backfilling using system-generated predictions rather than user runtime estimates." IEEE Trans. Parallel & Distributed Syst. 18(6), pp , Jun
- Topology-aware scheduling algorithms
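Predictive walltime scheduling replaces the user's requested walltime with a system-generated estimate when planning backfill. A minimal sketch follows, assuming a simple predictor that averages the runtimes of a user's most recent completed jobs and never exceeds the requested limit. This illustrates the idea only. It is not Simon's implementation (the slides list this as future work), and the function name and sample runtimes are hypothetical.

```python
def predict_runtime(history, requested, k=2):
    """Estimate a job's runtime in seconds from the user's recent jobs.

    history   -- runtimes of the user's completed jobs, oldest first
    requested -- the walltime limit the user asked for
    k         -- how many of the most recent jobs to average
    """
    recent = history[-k:]
    if not recent:
        return requested            # no history: fall back to the request
    return min(requested, sum(recent) / len(recent))

# Made-up history: the two latest jobs ran 300 s and 500 s.
print(predict_runtime([100, 300, 500], requested=3600))
```

Shorter, more realistic estimates let the scheduler fit more backfill jobs in front of a large reservation, which is how this approach could mitigate long drain times.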
40 More PSC! Michael Levine giving customer keynote on Blacklight, Wednesday at 9:00am.
