GRID Computing: CAS Style
- Elaine Willis
CS4CC3 Advanced Operating Systems Architectures
Laboratory 7
GRID Computing: CAS Style

[Figure: The CAS Grid Computer. Labels: campus trunk; C.I.S. router; "birkhoff" server; 100BT ethernet; "gigabyte" Ethernet switch; 'penguin' FrontEnd; node 1 to node 8. (lab8fig1.cfl, wfsp/jan)]

McMaster University
Hamilton, Ontario L8S 4K
Done: Round-Robin
File: 4cc04lb7.doc   Date: 25oct04/nm   Revision Level: 01
2004/2005 CS 4CC3/6CC3 -- Laboratory 7

Introduction

This lab will give you an understanding of what the hype behind grid computing is about. Grid computing is not a new idea; the concept has been around the research world for a long time, but it has lacked generalized real-life tools. Named for the ubiquity of the electric power grid, grid computing represents a flexible and scalable architecture that collects and concentrates available computational resources to solve business and mission-critical computational challenges.

Several definitions of grid computing can be found on the Internet. For example, a grid can be defined as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources. By organizing hundreds or thousands of interconnected heterogeneous computers as a single, unified computational resource, grid computing offers a cost-effective approach to solving compute-intensive problems while consolidating and simplifying distributed resource management.

A well-known grid computing project is SETI (Search for Extraterrestrial Intelligence). Here, PC users worldwide donate unused processor cycles to help the search for signs of extraterrestrial life by analyzing signals coming from outer space. The project relies on individual users volunteering the unused processing power of their computers. This method allows the researchers to maximize their processing capabilities while minimizing their operating costs. Many other real-life projects, addressing problems such as smallpox, cancer, and anthrax, are in place and in use.

Objectives

- Understand grid computing
- Familiarize yourself with Grid Engine
- Create and submit a task for a grid to solve

CAS Setup

As stated in the introduction, grid computing requires a cluster of computers along with appropriate tools to manage those nodes transparently.
In our department (CAS), we have one such cluster, which can provide us with a tremendous amount of processing power. It is a combination of 9 machines: one acting as the main server (i.e. the frontend) and 8 nodes that act as supporting compute nodes. See Figure 1 for a diagram.

Table 1. CAS Grid Computing Component Specifications

Penguin:
* 2 x 2.4GHz Xeon processor
* 4GB RAM main memory
* 2 x 36GB RPM SCSI disks
* 2 gigabit ethernet interfaces

For Each Node:
* 2 x 2.4GHz Xeon
* 1GB RAM
* 1 x 80GB IDE disk
* 1 gigabit ethernet
Figure 1. CAS Grid Computing Setup

PART I Introduction to the Grid Engine

The introduction and the McMaster setup provide the background needed to start your grid computing lab. Notice that the introduction gave the famous example of SETI as a real-life use of grid computing.

1. As a first question, please survey the Internet to find another real-life implementation or use of grid computing. Your answer should state which group is using it and for what purpose, and explain how they are taking advantage of grid computing.

Grid computing requires special software that is specific to the computing project for which the grid is being used. In our case the software is the Grid Engine project. The Grid Engine project is an open source community effort to facilitate the adoption of distributed computing solutions. Sun developed the initial versions of the software, which turned out to be extremely successful and was free of charge to use. Sun now sponsors the Grid Engine open source project and develops its own Enterprise Edition, for which licenses are required. This management software allows us to submit a job transparently, without having to worry about how the task is split up within the cluster.

Therefore, to prepare you for Part II, Part I is used to review the documentation that comes with the Grid Engine. This documentation can be found on the cs4cc3 website at:
Please open the above link and begin reading to answer the following questions. The answers you obtain here will be required to complete the second section, so do take the time to understand them.

2. Grids can be classified broadly into three different classes. Please list those classes, and identify which is supported by the Grid Engine project and which are supported by Sun Grid Engine Enterprise Edition.

3. At the bottom of page 22 the documentation gives an analogy of how the Grid Engine behaves. The analogy relates a money-center (i.e. a bank's behavior) to the Grid Engine by describing a typical day in a bank and relating each scenario to the grid software. Your job is to come up with another scenario which clearly shows that you understand the task of the Grid Engine software and the big picture of this lab.

4. Identify the four types of hosts that are fundamental to the Sun Grid system. Discuss their differences and what each is used for. In our case, what is penguin.cas.mcmaster.ca classified as? What are the eight nodes behind it classified as?

5. What are the three daemons that must be running on a master host? Are they running on penguin? What command did you use to check that? What two daemons must be running on an execution host?

6. In general, when using daemons and standard Internet applications, you tell the OS which port to listen on for certain activities. In UNIX the /etc/services file is used for that purpose. Note that one daemon runs on all hosts using the Grid Engine. Discuss what that daemon is used for. Determine which port penguin is set to listen on for SGE (Sun Grid Engine) TCP traffic.

7. Can a host be both an execution host and a master host? In general, can a host be part of two groups, or are the groups mutually exclusive?

8. What are Sun Grid Engine queues used for? Explain how queues and jobs are tied together (i.e. how one can affect the other).
   As a submit host, do we need to worry about the management of queues? If we do, explain how; if we don't, explain who does it for us.

9. What are the two ways of operating the Grid Engine (i.e. modes of operation)?

10. An account has been created for use in this lab. Log onto penguin and redirect the display to your terminal. Your logon information is the following:

    user: cs4cc3st
    password: moores.law

    Make sure to SSH and redirect the root display to your current terminal. Please determine how to do this and write the command you used as part of this answer.

11. Once you log in, you must run a specific script in order to be able to execute the Grid Engine commands. This script simply updates your system environment to add the appropriate paths. Do the following:

    source /usr/local/sge-5.3p6/default/common/settings.csh

    You should now be able to execute the set of SGE binaries that are installed. Execute QMON, the Graphical User Interface (GUI) that will help us manage our cluster and submit jobs. If the X-window GUI for qmon, which is illustrated in Figure 2, does not appear on your X server display after several minutes, or complains with some sort of error, the Grid Engine may need to be restarted. (Note: the same error occurs if the above source command has not been issued in your filespace.) Please advise the TA or, more appropriately, Derek Lipiec, who has root access and can restart SunGE on penguin.

Figure 2. QMON main GUI window for using SGE

The above window should be shown once QMON has started successfully.

12. Explain what the following terms mean:
    a. Cluster
    b. Cell
    c. Manager
    d. Job Class
    e. Operator

13. Discuss what a queue represents in terms of the Grid Engine. Explain two different ways of verifying the status of your queues.

This concludes PART I of the lab. Although very little practical programming has been done so far, much more will be done in Part II. Record your answers for this section for inclusion in the lab report, which is to be completed at the end of Part II and submitted one week later via WebCT.

PART II Running Programs on the Grid Engine

Now that you have had the chance to play around with Grid Engine in Part I, we will put that knowledge to use. The core of any cluster or enterprise grid is the Distributed Resource Manager (DRM). Sun Grid Engine and its open source version, Grid Engine, are both examples of excellent DRMs. You can think of Grid Engine as an extra layer above parallel environment libraries such as PVM, MPI, and Globus. This extra layer provides a graphical frontend that allows you to manage your resources more effectively.
As illustrated above, you are connecting to penguin, which acts as the frontend (i.e. the logon node). As a user you never have to communicate directly with the computing nodes. The following three experiments will show you how you can use the Grid Engine to simply submit a job and wait for results. Any task can be submitted to the Grid Engine, but to gain its full power you should use a parallel environment such as MPI. Essentially, a developer simply has to learn to use a library such as MPI, compile the code locally, and then submit it to SGE. You should already be familiar with parallel environments through previous programming exercises, labs, or other courses.

A Parallel Environment (PE) is a software package designed for parallel computing in a network of computers, which allows execution of shared-memory and distributed-memory parallelized applications. A variety of systems have evolved over the past years into viable technology for distributed and parallel processing on various hardware platforms. The most commonly used parallel environments are Parallel Virtual Machine (PVM), Message Passing Interface (MPI), and OpenMP. All these systems show different characteristics and have unique requirements. To be able to handle arbitrary parallel jobs running on top of such systems, the Sun Grid Engine system provides a flexible and powerful interface that satisfies these various needs. Arbitrary PEs can be interfaced with Grid Engine as long as suitable startup and stop procedures are provided, which is what we will do for MPI in the last part of this lab.

The purpose of the rest of this lab is to see how SGE facilitates the submission of multiple types of jobs. We will test jobs of all sorts, including standard bash commands, binary submissions, and submission to a parallel environment such as MPI.
Therefore, to get started, log back in to the class account and make sure step 11 from Part I has been completed before going on. Execute QMON as well. Before practicing the submission of jobs, you should check the status of the available queues and how busy the grid system is. From your knowledge of Part I, please describe which nodes are being used as computational nodes and how many slots each node is given (a screenshot with an explanation is probably the best approach).
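Besides the QMON Queue Control window, the queue status can also be listed from the command line. The following is a minimal sketch, assuming the settings file from step 11 has been sourced so that qstat is on your path. The name show_queues and the QSTAT variable are hypothetical and not part of SGE; QSTAT exists only so the sketch can be dry-run on a machine without Grid Engine.

```shell
#!/bin/sh
# show_queues: print the full queue listing via "qstat -f".
# QSTAT is a hypothetical override used only to dry-run this sketch
# on a machine without SGE; unset, it falls back to the real qstat.
show_queues() {
    "${QSTAT:-qstat}" -f
}

# Dry run: echo stands in for qstat and prints the argument it was given.
QSTAT=echo show_queues
```

On penguin you would call show_queues without setting QSTAT, so that the real qstat is used.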
What does the orange bar signify on the cnode2 queue status?

1. Shell Script job submission

The easiest and most straightforward submissions are simple shell scripts. By default you cannot submit binaries to SGE; it will return an error (there are ways around this, as you will see in the next section). To get a feel for a simple submission, download the script sleeper.sh from the CS4CC3 website and place it in your home directory. Open the script; it should be obvious what it does. Please include a brief note in your discussion on what sleeper.sh did.

Before submitting the job, open the Job Control window as well as the Queue Control window. With these two windows open on the side, you will be able to see the job being completed. To submit this job, use your knowledge from question 14 of Part I and submit the job whichever way you prefer. Notice that when the job is done, a new file called sleeper.sh.o[job] has been generated in your current working directory. View the content of that file and explain it in your write-up. If you get two output files from submitting your job, it is because an error has occurred. The .o extension refers to the standard output, whereas the .e extension refers to the standard error.

Now, to see that SGE decides where each task will run, submit the sleeper.sh script repeatedly (maybe a dozen times in a row) and, with QMON, open the Queue Control window to see how the queues dynamically fill up and how they empty out as the jobs gradually finish.

To get some practice, develop a script with your group that can be submitted to the Grid Engine. The script does not have to be long or use any form of parallelism (just something similar to sleeper.sh). In your write-up you are required to submit that script, a screenshot of your submission or queue status (either one, GUI or CLI), and the output file(s) that are generated.

2. Binary Submission

This section, like the previous one, will show that you are not limited to scripting languages but can also submit binaries. Again, we will not take advantage of any sort of parallelism yet, but you will see a new method of submitting a task. Go to the 4CC3 website and download the machineeps.c file. Look at the code and determine the purpose of the while loop. Please include what is meant by machine epsilon, and the purpose of that loop, in your discussion for this section. After analyzing the code, compile it to create the corresponding binary.

The following section is to be done strictly at the command line interface. To submit your work correctly, you should log the following commands and their corresponding output (hint: use the script command, which records a typescript of your session).
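As background for the machine epsilon discussion above, the general idea of such a loop can be sketched in a few lines. This is not the course's machineeps.c, which may be written differently; it is a minimal illustration using awk, assuming awk does its arithmetic in double precision as most implementations do. The function name machine_eps is hypothetical.

```shell
#!/bin/sh
# machine_eps: repeatedly halve eps until adding half of it to 1
# no longer changes the sum, then print the last eps that still did.
machine_eps() {
    awk 'BEGIN {
        eps = 1
        while (1 + eps / 2 > 1) {
            # eps/2 is still large enough to be seen next to 1, so keep halving
            eps = eps / 2
        }
        printf "%.6e\n", eps
    }'
}

machine_eps
```

For IEEE 754 double precision the mathematical value of this quantity is 2^-52, roughly 2.22e-16; the figure printed on a given machine depends on the floating-point type the interpreter or compiler actually uses.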
Although you can simply run the program locally, what happens if this were a more computationally intensive program that a single CPU on a standalone machine could not handle? This is where the beauty of SGE comes into play. At the command prompt, type the qsub command and press return without specifying a job script. You will then see a secondary shell prompt where you can type the name of the binary file you want to submit. You can then press return and continue to enter more binary or shell commands. When you are done specifying your job, press control-d, i.e.

    % qsub
    machineeps.exe
    <CTR-D>
    your job 104 ( STDIN ) has been submitted

To confirm that your job has been submitted, type qstat -ne as soon as you hit CTR-D; it will show you the status of your used queues. If nothing shows up yet, keep entering that command until you see a change in queue status. You should see something like:

    [cs4cc3st@penguin ~] qstat -ne
    job-id prior name user state submit/start at queue master ja-task-id
    greetings merizzn qw 10/24/ :59:57
    [cs4cc3st@penguin ~] qstat -ne
    job-id prior name user state submit/start at queue master ja-task-id
    greetings merizzn qw 10/24/ :59:57
    [cs4cc3st@penguin ~] qstat -ne
    job-id prior name user state submit/start at queue master ja-task-id
    greetings cs4cc3st t 10/24/ :00:12 cnode3.q MASTER
    0 greetings cs4cc3st t 10/24/ :00:12 cnode3.q SLAVE
    0 greetings cs4cc3st t 10/24/ :00:12 cnode3.q SLAVE
    0 greetings cs4cc3st t 10/24/ :00:12 cnode3.q SLAVE
    0 greetings cs4cc3st t 10/24/ :00:12 cnode3.q SLAVE
    [cs4cc3st@penguin ~]
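The interactive qsub session described above reads the job text from standard input, so the same submission can be written as a pipe. The following is a minimal sketch under that assumption. The function name submit_stdin and the QSUB variable are hypothetical and not part of SGE; QSUB exists only so the sketch can be dry-run where Grid Engine is not installed.

```shell
#!/bin/sh
# submit_stdin COMMAND...: feed one command line to qsub on standard input,
# which is what typing the command at the qsub prompt and pressing
# control-d does interactively.
# QSUB is a hypothetical override for dry runs; unset, the real qsub is used.
submit_stdin() {
    printf '%s\n' "$*" | "${QSUB:-qsub}"
}

# Dry run: with QSUB=cat the job text is printed instead of being submitted.
QSUB=cat submit_stdin machineeps.exe
```

On penguin, calling submit_stdin machineeps.exe without QSUB set would hand the line to the real qsub, which should answer with the same kind of "your job ... has been submitted" message as the interactive session.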
3. Parallel Environment Submission

To reduce the complexity of this lab, the Grid Engine has already been configured to accept parallel jobs. Your task, once again, is to submit transparently and wait for the results. From your screenshots above, and from looking at your queues after submitting the previous two parts, you will have noticed that one submission results in one queue being used. In this section you will see one submission that results in multiple queues being used. We will ignore the details of the code and will not go into the details of parallel programming. Here we are strictly looking at how SGE can be used to submit complex algorithms to a larger, more efficient infrastructure.

For the environment to work correctly you need two files and an environment variable:

1. First move to your home directory and check whether you already have a .login file. If so, open it and make sure the following line is contained inside (at the end):

    setenv TMPDIR ~/SGElab/

2. Again without going into the details of MPI, a key file that must be provided in order to execute tasks on multiple CPUs is the machines file for MPI. The scripts that have been created for your use assume its location to be $TMPDIR/machines. So, inside your home directory, please create a new file called machines and add the following lines:

    penguin
    cnode1
    cnode3
    cnode4

Below is the script that you will be executing:

    #!/bin/sh
    # Your job name
    #$ -N MPI_jacobi
    # Use current working directory
    #$ -cwd
    # Join stdout and stderr
    #$ -j y
    # pe request for MPICH. Set your number of processors here.
    #$ -pe mpich 4
    # Run job through the tcsh shell
    #$ -S /bin/tcsh
    # The following is for reporting only. It is not really needed
    # to run the job. It will show up in your output file.
    echo "Got $NSLOTS processors."
    echo "Machines:"
    cat $TMPDIR/machines
    # Use full pathname to make sure we are using the right mpirun
    /usr/local/mpich/bin/mpirun -np $NSLOTS \
        -machinefile $TMPDIR/machines jacobi.exe
    # Commands to do something with the data after the program has finished.

When submitting parallel environment (PE) jobs you need to tell SGE about it. The SGE administrator will have already provided a parallel environment for your programs to run in, but you need to ask for it. That is what the #$ -pe mpich 4 line means in the above script: use the mpich PE that is already set up, and run the script on 4 processors.

With the Queue Control window open next to a shell prompt, submit your job, constantly refresh your Queue Control window, and observe what happens. Please discuss what happens. To help you with your discussion, another MPI script is provided which can be changed a little. Open the greetings.sh script and change the #$ -pe mpich 4 line to some other value between 2 and 8 (e.g. #$ -pe mpich 6). Submit the program with your changed value and observe the number of queues that are taken up.

This completes the lab on the Grid Engine. Although only introductory issues were looked at, its power extends much further, and many more experiments can be run to really see the performance gains of using such a system.

Acknowledgements

Thanks go to Mr. Nicholas Merizzi, a graduate student in the Applied Computersystems Group (ACsG), for his conception, design, and implementation of this laboratory. His attention to detail has produced a very instructive mechanism for understanding group communications in a networked environment. (wfsp/2004)

File: 4cc04lb7.doc   Revision Level: 01   Date: 25oct04 / wfsp
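The change to the #$ -pe mpich line of greetings.sh described in the parallel environment section can also be made with a small helper instead of an editor. The following is a minimal sketch, assuming the line has exactly the form "#$ -pe mpich <number>" at the start of a line. The function name set_pe_slots and the demonstration file are hypothetical; the demonstration works on a throwaway file, not on the real greetings.sh.

```shell
#!/bin/sh
# set_pe_slots FILE N: rewrite the "#$ -pe mpich <count>" line of FILE
# so that the job requests N slots.
set_pe_slots() {
    file=$1
    n=$2
    # The lab asks for a value between 2 and 8.
    case $n in
        [2-8]) ;;
        *) echo "slot count must be between 2 and 8" >&2; return 1 ;;
    esac
    tmp=$file.tmp.$$
    # [$] matches a literal dollar sign; write to a temporary file and
    # rename it rather than relying on a non-portable in-place option.
    sed 's/^#[$] -pe mpich [0-9][0-9]*/#$ -pe mpich '"$n"'/' "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Demonstration on a throwaway file with the same directive layout.
demo=/tmp/pe_demo.$$.sh
printf '%s\n' '#!/bin/sh' '#$ -N greetings' '#$ -pe mpich 4' > "$demo"
set_pe_slots "$demo" 6
grep -- '-pe' "$demo"
rm -f "$demo"
```

After running set_pe_slots greetings.sh 6 in the directory holding the script, the job would be submitted as before and the Queue Control window watched for the number of queues taken up.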
More informationHPCC - Hrothgar Getting Started User Guide MPI Programming
HPCC - Hrothgar Getting Started User Guide MPI Programming High Performance Computing Center Texas Tech University HPCC - Hrothgar 2 Table of Contents 1. Introduction... 3 2. Setting up the environment...
More informationMaxwell compute cluster
Maxwell compute cluster An introduction to the Maxwell compute cluster Part 1 1.1 Opening PuTTY and getting the course materials on to Maxwell 1.1.1 On the desktop, double click on the shortcut icon for
More informationObelisk: Summoning Minions on a HPC Cluster
Obelisk: Summoning Minions on a HPC Cluster Abstract In scientific research, having the ability to perform rigorous calculations in a bearable amount of time is an invaluable asset. Fortunately, the growing
More informationMatlab on a Supercomputer
Matlab on a Supercomputer Shelley L. Knuth Research Computing April 9, 2015 Outline Description of Matlab and supercomputing Interactive Matlab jobs Non-interactive Matlab jobs Parallel Computing Slides
More informationINF-110. GPFS Installation
INF-110 GPFS Installation Overview Plan the installation Before installing any software, it is important to plan the GPFS installation by choosing the hardware, deciding which kind of disk connectivity
More informationIntroduction to Linux and Cluster Basics for the CCR General Computing Cluster
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St Buffalo, NY 14203 Phone: 716-881-8959
More informationDebugging and Profiling Lab. Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma carlos@tacc.utexas.edu
Debugging and Profiling Lab Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma carlos@tacc.utexas.edu Setup Login to Ranger: - ssh -X username@ranger.tacc.utexas.edu Make sure you can export graphics
More informationGetting Started with HC Exchange Module
Getting Started with HC Exchange Module HOSTING CONTROLLER WWW.HOSTINGCONROLLER.COM HOSTING CONTROLLER Contents Introduction...1 Minimum System Requirements for Exchange 2013...1 Hardware Requirements...1
More informationLSKA 2010 Survey Report Job Scheduler
LSKA 2010 Survey Report Job Scheduler Graduate Institute of Communication Engineering {r98942067, r98942112}@ntu.edu.tw March 31, 2010 1. Motivation Recently, the computing becomes much more complex. However,
More informationCDH installation & Application Test Report
CDH installation & Application Test Report He Shouchun (SCUID: 00001008350, Email: she@scu.edu) Chapter 1. Prepare the virtual machine... 2 1.1 Download virtual machine software... 2 1.2 Plan the guest
More informationBuffalo Technology: Migrating your data to Windows Storage Server 2012 R2
Buffalo Technology: Migrating your data to Windows Storage Server 2012 R2 1 Buffalo Technology: Migrating your data to Windows Storage Server 2012 R2 Contents Chapter 1 Data migration method:... 3 Chapter
More informationIDS 561 Big data analytics Assignment 1
IDS 561 Big data analytics Assignment 1 Due Midnight, October 4th, 2015 General Instructions The purpose of this tutorial is (1) to get you started with Hadoop and (2) to get you acquainted with the code
More informationQ N X S O F T W A R E D E V E L O P M E N T P L A T F O R M v 6. 4. 10 Steps to Developing a QNX Program Quickstart Guide
Q N X S O F T W A R E D E V E L O P M E N T P L A T F O R M v 6. 4 10 Steps to Developing a QNX Program Quickstart Guide 2008, QNX Software Systems GmbH & Co. KG. A Harman International Company. All rights
More informationPlanning the Installation and Installing SQL Server
Chapter 2 Planning the Installation and Installing SQL Server In This Chapter c SQL Server Editions c Planning Phase c Installing SQL Server 22 Microsoft SQL Server 2012: A Beginner s Guide This chapter
More informationHPC at IU Overview. Abhinav Thota Research Technologies Indiana University
HPC at IU Overview Abhinav Thota Research Technologies Indiana University What is HPC/cyberinfrastructure? Why should you care? Data sizes are growing Need to get to the solution faster Compute power is
More informationCHAPTER 4 PERFORMANCE ANALYSIS OF CDN IN ACADEMICS
CHAPTER 4 PERFORMANCE ANALYSIS OF CDN IN ACADEMICS The web content providers sharing the content over the Internet during the past did not bother about the users, especially in terms of response time,
More informationBatch Scheduling and Resource Management
Batch Scheduling and Resource Management Luke Tierney Department of Statistics & Actuarial Science University of Iowa October 18, 2007 Luke Tierney (U. of Iowa) Batch Scheduling and Resource Management
More informationPentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System
Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System By Jake Cornelius Senior Vice President of Products Pentaho June 1, 2012 Pentaho Delivers High-Performance
More informationinsync Installation Guide
insync Installation Guide 5.2 Private Cloud Druva Software June 21, 13 Copyright 2007-2013 Druva Inc. All Rights Reserved. Table of Contents Deploying insync Private Cloud... 4 Installing insync Private
More informationIUCLID 5 Guidance and support. Installation Guide Distributed Version. Linux - Apache Tomcat - PostgreSQL
IUCLID 5 Guidance and support Installation Guide Distributed Version Linux - Apache Tomcat - PostgreSQL June 2009 Legal Notice Neither the European Chemicals Agency nor any person acting on behalf of the
More informationTSM for Windows Installation Instructions: Download the latest TSM Client Using the following link:
TSM for Windows Installation Instructions: Download the latest TSM Client Using the following link: ftp://ftp.software.ibm.com/storage/tivoli-storagemanagement/maintenance/client/v6r2/windows/x32/v623/
More informationHillstone StoneOS User Manual Hillstone Unified Intelligence Firewall Installation Manual
Hillstone StoneOS User Manual Hillstone Unified Intelligence Firewall Installation Manual www.hillstonenet.com Preface Conventions Content This document follows the conventions below: CLI Tip: provides
More informationLOCKSS on LINUX. CentOS6 Installation Manual 08/22/2013
LOCKSS on LINUX CentOS6 Installation Manual 08/22/2013 1 Table of Contents Overview... 3 LOCKSS Hardware... 5 Installation Checklist... 6 BIOS Settings... 9 Installation... 10 Firewall Configuration...
More informationIntroduction to Running Computations on the High Performance Clusters at the Center for Computational Research
! Introduction to Running Computations on the High Performance Clusters at the Center for Computational Research! Cynthia Cornelius! Center for Computational Research University at Buffalo, SUNY! cdc at
More informationInstalling IBM Websphere Application Server 7 and 8 on OS4 Enterprise Linux
Installing IBM Websphere Application Server 7 and 8 on OS4 Enterprise Linux By the OS4 Documentation Team Prepared by Roberto J Dohnert Copyright 2013, PC/OpenSystems LLC This whitepaper describes how
More informationREQUIREMENTS AND INSTALLATION OF THE NEFSIS DEDICATED SERVER
NEFSIS TRAINING SERIES Nefsis Dedicated Server version 5.1.0.XXX Requirements and Implementation Guide (Rev 4-10209) REQUIREMENTS AND INSTALLATION OF THE NEFSIS DEDICATED SERVER Nefsis Training Series
More informationPuTTY/Cygwin Tutorial. By Ben Meister Written for CS 23, Winter 2007
PuTTY/Cygwin Tutorial By Ben Meister Written for CS 23, Winter 2007 This tutorial will show you how to set up and use PuTTY to connect to CS Department computers using SSH, and how to install and use the
More informationLab 3 Routing Information Protocol (RIPv1) on a Cisco Router Network
Lab 3 Routing Information Protocol (RIPv1) on a Cisco Router Network CMPE 150 Fall 2005 Introduction Today you are going to be thrown into using Cisco s Internetwork Operating System (IOS) to configure
More informationNETWRIX EVENT LOG MANAGER
NETWRIX EVENT LOG MANAGER QUICK-START GUIDE FOR THE ENTERPRISE EDITION Product Version: 4.0 July/2012. Legal Notice The information in this publication is furnished for information use only, and does not
More informationSemantic based Web Application Firewall (SWAF - V 1.6)
Semantic based Web Application Firewall (SWAF - V 1.6) Installation and Troubleshooting Manual Document Version 1.0 1 Installation Manual SWAF Deployment Scenario: Client SWAF Firewall Applications Figure
More informationThe QueueMetrics Uniloader User Manual. Loway
The QueueMetrics Uniloader User Manual Loway The QueueMetrics Uniloader User Manual Loway Table of Contents 1. What is Uniloader?... 1 2. Installation... 2 2.1. Running in production... 2 3. Concepts...
More informationDisaster Recovery Cookbook Guide Using VMWARE VI3, StoreVault and Sun. (Or how to do Disaster Recovery / Site Replication for under $50,000)
Disaster Recovery Cookbook Guide Using VMWARE VI3, StoreVault and Sun. (Or how to do Disaster Recovery / Site Replication for under $50,000) By Scott Sherman, VCP, NACE, RHCT Systems Engineer Integrated
More informationOracle Virtual Desktop Infrastructure. VDI Demo (Microsoft Remote Desktop Services) for Version 3.2
Oracle Virtual Desktop Infrastructure VDI Demo (Microsoft Remote Desktop Services) for Version 2 April 2011 Copyright 2011, Oracle and/or its affiliates. All rights reserved. This software and related
More informationIntroduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research
Introduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St
More informationSynchronizer Installation
Synchronizer Installation Synchronizer Installation Synchronizer Installation This document provides instructions for installing Synchronizer. Synchronizer performs all the administrative tasks for XenClient
More informationTable of Contents. FleetSoft Installation Guide
FleetSoft Installation Guide Table of Contents FleetSoft Installation Guide... 1 Minimum System Requirements... 2 Installation Notes... 3 Frequently Asked Questions... 4 Deployment Overview... 6 Automating
More informationGlobalSCAPE DMZ Gateway, v1. User Guide
GlobalSCAPE DMZ Gateway, v1 User Guide GlobalSCAPE, Inc. (GSB) Address: 4500 Lockhill-Selma Road, Suite 150 San Antonio, TX (USA) 78249 Sales: (210) 308-8267 Sales (Toll Free): (800) 290-5054 Technical
More informationIntelligent Power Protector User manual extension for Microsoft Virtual architectures: Hyper-V 6.0 Manager Hyper-V Server (R1&R2)
Intelligent Power Protector User manual extension for Microsoft Virtual architectures: Hyper-V 6.0 Manager Hyper-V Server (R1&R2) Hyper-V Manager Hyper-V Server R1, R2 Intelligent Power Protector Main
More informationSetup Cisco Call Manager on VMware
created by: Rainer Bemsel Version 1.0 Dated: July/09/2011 The purpose of this document is to provide the necessary steps to setup a Cisco Call Manager to run on VMware. I ve been researching for a while
More informationBeyond Windows: Using the Linux Servers and the Grid
Beyond Windows: Using the Linux Servers and the Grid Topics Linux Overview How to Login & Remote Access Passwords Staying Up-To-Date Network Drives Server List The Grid Useful Commands Linux Overview Linux
More informationEMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution Release 3.0 User Guide P/N 300-999-671 REV 02 Copyright 2007-2013 EMC Corporation. All rights reserved. Published in the USA.
More informationPharos Control User Guide
Outdoor Wireless Solution Pharos Control User Guide REV1.0.0 1910011083 Contents Contents... I Chapter 1 Quick Start Guide... 1 1.1 Introduction... 1 1.2 Installation... 1 1.3 Before Login... 8 Chapter
More informationInstallation Guidelines (MySQL database & Archivists Toolkit client)
Installation Guidelines (MySQL database & Archivists Toolkit client) Understanding the Toolkit Architecture The Archivists Toolkit requires both a client and database to function. The client is installed
More informationEVALUATION ONLY. WA2088 WebSphere Application Server 8.5 Administration on Windows. Student Labs. Web Age Solutions Inc.
WA2088 WebSphere Application Server 8.5 Administration on Windows Student Labs Web Age Solutions Inc. Copyright 2013 Web Age Solutions Inc. 1 Table of Contents Directory Paths Used in Labs...3 Lab Notes...4
More informationMartinos Center Compute Clusters
Intro What are the compute clusters How to gain access Housekeeping Usage Log In Submitting Jobs Queues Request CPUs/vmem Email Status I/O Interactive Dependencies Daisy Chain Wrapper Script In Progress
More informationDeploying System Center 2012 R2 Configuration Manager
Deploying System Center 2012 R2 Configuration Manager This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
More informationDistributed convex Belief Propagation Amazon EC2 Tutorial
6/8/2011 Distributed convex Belief Propagation Amazon EC2 Tutorial Alexander G. Schwing, Tamir Hazan, Marc Pollefeys and Raquel Urtasun Distributed convex Belief Propagation Amazon EC2 Tutorial Introduction
More informationdbx SN Azure Setup Guide
dbx SN Azure Setup Guide Rev 1.0 Oct 2014 XtremeData, Inc. 999 Plaza Dr., Ste. 570 Schaumburg, IL 60173 www.xtremedata.com Overview... 3 Virtual machine setup... 3 Step 1: Launch Virtual machine (node)...
More information