SEE-GRID-SCI. Cluster installation and configuration. SEE-GRID-SCI Training Event, Yerevan, Armenia, July 2008
1 SEE-GRID-SCI: Cluster Installation and Configuration. SEE-GRID-SCI Training Event, Yerevan, Armenia, July 2008. Mikayel Gyurjyan, Institute for Informatics and Automation Problems, National Academy of Sciences of the Republic of Armenia. The SEE-GRID-SCI initiative is co-funded by the European Commission under the FP7 Research Infrastructures contract no
2 Outline
- Common cluster kits
- OSCAR installation and configuration overview
- MPI-compliant library: MPICH basics
- PBS (Portable Batch System)
- Ganglia introduction and overview
- Cluster parallel computing libraries
- ArmCluster architecture
3 Cluster Software Stack (diagram)
4 Common Cluster Kits
Rocks: NPACI Rocks is an open-source toolkit for building and managing Linux clusters. Its primary goal is to enable users with limited experience to build and manage their own clusters. Rocks provides a turnkey solution that allows clusters to be deployed in an automated fashion.
OSCAR (Open Source Cluster Application Resources) enables users to install a Beowulf-type HPC cluster. It also contains everything needed to administer and program this type of cluster. OSCAR's flexible package-management system has a rich set of prepackaged applications and utilities, which means you can get up and running without laboriously installing and configuring complex cluster administration and communication packages. Current release: version 5.0 (November 2006).
5 The OSCAR Project (1)
A snapshot of the best-known methods for building, programming, and using HPC clusters. An easy-to-install software bundle containing everything needed to install, build, maintain, and use a Linux cluster.
6 The OSCAR Project (2)
What does it do? OSCAR is a cluster software packaging utility. It automatically configures software components, reduces the time to build a cluster, reduces the need for expertise, reduces the chance of incorrect software configuration, and increases consistency from one cluster to the next.
What will it do in the future? Maintain a cluster information database, work as an interface not just for installation but also for maintenance, and accelerate software package integration into clusters.
7 The OSCAR Project (3)
Installs on a number of operating systems:
- Fedora Core 5, Fedora Core 4
- Mandriva Linux 2006
- Red Hat Enterprise Linux 4 (x86, ia64, x86_64), Red Hat Enterprise Linux 3 (x86, ia64, x86_64)
- SuSE Linux 10.0
- Scientific Linux 4, Scientific Linux 3
- CentOS 4, CentOS 3
Nodes are installed using SystemImager. Includes a packet filter.
8 OSCAR Components
- Core infrastructure/management: System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher, OSCAR Database (ODA), OSCAR Package Downloader (OPD)
- Administration/configuration: SIS, C3, OPIUM, Kernel-Picker, and cluster services (dhcp, nfs, ntp, ...); security: Pfilter, OpenSSH
- HPC services/tools: parallel libraries (MPICH, LAM/MPI, PVM, Open MPI), OpenPBS/Maui, Torque, SGE, HDF5, Ganglia, Clumon, other 3rd-party OSCAR packages
9 Installation Overview
1. Install Red Hat
2. Download OSCAR
3. Print/read the documentation
4. Copy RPMs to the server
5. Run the wizard (install_cluster)
6. Build an image per client type (partition layout, HD type)
7. Define clients (network info, image binding)
8. Set up networking (collect MAC addresses, configure DHCP, build a boot floppy)
9. Boot and build the clients
10. Complete setup (post-install)
11. Run the test suite
12. Use the cluster
10 OSCAR: Step by Step
Log on to the server as root, then:
mkdir -p /tftpboot/rpm
copy all Red Hat RPMs from the CDs to /tftpboot/rpm
download OSCAR
tar zxvf oscar-x.x.tar.gz
cd oscar-x.x
./configure
make install
This puts things in /opt/oscar by default (can be altered with the --prefix flag should you require). Then run:
./install_cluster <interface>
11 OSCAR: Step by Step After untarring, run the install_cluster script
12 OSCAR: Step by Step After starting services, the installer drops you into the GUI wizard.
13 OSCAR: Step by Step Step 1: Select OSCAR packages to install
14 OSCAR: Step by Step Step 2: Configure Selected OSCAR Packages
15 OSCAR: Step by Step Step 3: Install OSCAR Server Packages This button installs the selected OSCAR packages on the server. You can monitor the progress of the installation using the messages displayed in the console windows used to start the OSCAR Installation Wizard.
16 OSCAR: Step by Step Step 4: Build OSCAR Client Image Build the image with default or custom RPM lists and disk table layouts. Image build progress is displayed.
17 OSCAR: Step by Step Step 5: Define OSCAR clients Associate image(s) with network settings.
18 OSCAR: Step by Step Step 6: Setup Networking Collect MAC addresses and configure DHCP
19 OSCAR: Step by Step Intermediate step: network-boot the client nodes. If the nodes are PXE-capable, select the NIC as the boot device. Don't make this a static change, however. Otherwise, just use the autoinstall floppy disk; it is less convenient than PXE, but a reliable failsafe.
20 OSCAR: Step by Step Intermediate step: boot the nodes from floppy or PXE (if available).
21 OSCAR: Step by Step Step 7: Complete Cluster Setup Output displayed in terminal window.
22 OSCAR: Step by Step Step 8: Run Test Suite
23 OSCAR: Step by Step Add OSCAR Clients
24 OSCAR: Step by Step Delete OSCAR Clients
25 MPI-Compliant Libraries
MPICH: developed by Argonne National Laboratory. Supports MPI 1.2 and part of MPI 2.0. Bindings for C, C++ and Fortran.
LAM: developed by the University of Notre Dame. Written in C++.
26 MPICH Basics (1): Architecture of MPICH
MPICH consists of:
- Libraries and include files: libmpi, mpi.h
- Compiler wrappers: mpicc (C), mpiCC (C++), mpif77 (Fortran 77), mpif90 (Fortran 90). Example: mpicc helloworld.c -o helloworld. These know where the relevant include and library files are.
- Runtime launcher: mpirun. Usage: mpirun [mpirun_options...] <progname> [options...]. Key arguments: -np <number of processes> and -machinefile <file of node names>. mpirun implements the SPMD paradigm by starting a copy of the program on each node; the program must therefore do any differentiation itself (using the MPI_Comm_size() and MPI_Comm_rank() functions).
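The compile-and-run cycle above can be sketched as a short session. This assumes a working MPICH installation; the source file hello.c and the node names are made-up examples, and the session only runs on a real cluster:

```shell
# Compile with the wrapper compiler, which already knows where the MPI
# headers and libraries live (hello.c is a hypothetical source file).
mpicc hello.c -o hello

# A machinefile lists one node hostname per line (names are assumed):
cat > machines <<EOF
node01
node02
EOF

# Start 4 copies of the same program (SPMD) across the listed nodes;
# each copy learns its role from MPI_Comm_rank()/MPI_Comm_size().
mpirun -np 4 -machinefile machines ./hello
```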
27 MPICH Basics (2): Architecture of MPICH
MPICH also includes:
- Tools for measuring performance: mpptest, goptest
- Debugging: dbx, gdb, xxgdb, TotalView. Starting jobs under a debugger: mpirun -dbg=gdb a.out
- Analysis tools: MPE (Multi-Processing Environment)
- Logfile viewers: Jumpshot
28 PBS (Portable Batch System)
What is PBS? PBS is a distributed workload management system. As such, PBS handles the management and monitoring of the computational workload on a set of one or more computers.
Different flavors of PBS: PBS Pro, OpenPBS, Torque (Scalable PBS).
PBS provides queuing, scheduling, and monitoring of jobs. The standard OpenPBS scheduler can be replaced with the Maui scheduler.
29 OpenPBS (1)
OpenPBS is a batch-job and computer-system resource management package. While you are using the cluster, you will need OpenPBS to control your jobs. OpenPBS consists of four major components: commands, the job Server, the job executor, and the job Scheduler.
30 OpenPBS (2)
Commands: PBS supplies both command-line commands and a graphical interface.
Job Server: the Server's main function is to provide the basic batch services, such as receiving/creating a batch job, modifying the job, protecting the job against system crashes, and running the job.
Job Executor: the job executor is the daemon which actually places the job into execution.
Job Scheduler: the Job Scheduler is another daemon which contains the site's policy controlling which job is run, and where and when it is run.
31 OpenPBS Job Flow (diagram): the user node runs qsub to submit a job; the Server queues it; the Scheduler tells the Server what to run; the job executes on the compute nodes, with the first execution host acting as "mother superior"; job output is returned to the user via rcp/scp.
32 Maui
OpenPBS gives us the flexibility to extend its own scheduling abilities by integrating with Maui. Maui is an advanced scheduler that allows us to implement policy-based scheduling. It also implements what is known as backfill: backfill will force jobs that have been queued for a very long time to run. Maui also has much greater fair-share overseeing power, much improved job and queue statistics, and advanced statistics gathering and reporting tools.
33 How PBS Handles a Job
The user determines the resource requirements for a job and writes a batch script. The user submits the job to PBS with the qsub command. PBS places the job into a queue based on its resource requests and runs the job when those resources become available. The job runs until it either completes or exceeds one of its resource-request limits. PBS copies the job's output into the directory from which the job was submitted and optionally notifies the user (e.g. by e-mail) that the job has ended.
34 How to Submit a Job to the Queue
1. Make sure you have the appropriate software
2. Gather any necessary data files
3. Create a PBS script
4. Submit the job (qsub)
5. Monitor your job's progress
6. Get the output
35 1. Appropriate Software
You need software compiled for Linux. Choice of compiler: Intel (Fortran 77/90/95, C/C++), Portland Group (Fortran 77/90/HPF, C/C++), Absoft (Fortran 77/90/95). Multiprocessor support: MPI libraries, parallel libraries, Myrinet libraries.
36 2. Necessary Data Files
Provide the needed input files, data files, etc. Consider using local /tmp instead of accessing /home through the network. This is important for frequently accessed input or output files, and also important if using many nodes.
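The /tmp staging pattern above can be sketched as a small script. The file names are made up, and a tr command stands in for the real computation:

```shell
#!/bin/sh
# Sketch: stage input to node-local /tmp, compute there, copy results back.
SRC=$(mktemp -d)                       # stands in for a network-mounted home dir
echo "hello cluster" > "$SRC/input.dat"

WORK="/tmp/myjob.$$"                   # node-local scratch directory
mkdir -p "$WORK"
cp "$SRC/input.dat" "$WORK/"           # one read over the network filesystem
tr 'a-z' 'A-Z' < "$WORK/input.dat" > "$WORK/output.dat"  # heavy I/O stays on local disk
cp "$WORK/output.dat" "$SRC/"          # one write back when the job finishes
rm -rf "$WORK"
cat "$SRC/output.dat"                  # prints HELLO CLUSTER
```

This way the network filesystem sees only two transfers per node, no matter how much intermediate I/O the job does.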
37 3. Create a PBS Script (1)
First, an aside: what is a shell script? For that matter, what is a shell? A shell is a text-based user interface: you type commands, it executes them. A shell script is a series of shell commands; instead of being typed directly at the prompt, the commands are placed in a file. A PBS job script is a special shell script.
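As a minimal illustration (the variable and its value are arbitrary), a shell script is just commands in a file:

```shell
#!/bin/sh
# A minimal shell script: each line is a command the shell executes in
# order, exactly as if you had typed it at the prompt.
NAME="cluster user"        # variables work the same as at the interactive prompt
echo "Hello, $NAME"
hostname                   # any command you could type can go in a script
```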
38 3. Create a PBS Script (2)
A PBS job script has two main parts:
- PBS directives: start with #PBS (and are commented out with #-PBS). They tell PBS how to run the job.
- Shell commands: the actual job logic, i.e. what to do, how to do it, and where to do it.
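Putting the two parts together, a minimal job script might look like the sketch below. The job name and resource values are illustrative, not prescribed by the slides; because #PBS directives are ordinary comments to the shell, the script also runs outside PBS:

```shell
#!/bin/sh
# --- Part 1: PBS directives (comments to the shell, instructions to PBS) ---
#PBS -N demo_job                      # job name (illustrative)
#PBS -l nodes=1,walltime=00:10:00     # modest resource request (illustrative)
#PBS -j oe                            # merge stdout and stderr into one file

# --- Part 2: shell commands, the actual job logic ---
cd "${PBS_O_WORKDIR:-.}"              # PBS sets PBS_O_WORKDIR; fall back to . elsewhere
echo "Job started on $(hostname)"
```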
39 3. Create a PBS Script (3)
PBS directives can specify many things.
Output and notification:
#PBS -j oe (put stdout and stderr in the same file)
#PBS -M address (mail address to notify)
#PBS -m bea (send mail on job begin, end, and abort)
Resource requirements: number and type of nodes, memory requirements, time limitations, etc.
40 3. Create a PBS Script (4)
Resource requirement examples:
#PBS -l nodes=1
#PBS -l nodes=8:ppn=2:myrinet
#PBS -l walltime=24:00:00
#PBS -l cput=384:00:00 (16 processors * 24 hrs)
#PBS -l pmem=256mb
#PBS -l mem=4096mb (16 processors * 256 mb)
41 3. Create a PBS Script (5)
After the PBS directives, add the commands that you want run: create directories, copy files, execute programs, and possibly add looping, nested jobs, etc.
42 4. Submit the Job
Before submitting, you may want to check the current state of the cluster:
qstat -Q
Submit the job:
qsub yourfile.pbs
Then monitor, modify, or cancel it:
qstat -u yourusername
qalter -l resourcelist jobnumber
qdel jobnumber
43 5. Monitor Your Job's Progress
Queue monitoring:
qstat -u yourusername
qstat -a jobnumber
qstat -f jobnumber
qstat -h
qdel jobnumber
44 6. Get the Output
PBS sends you a notification when your job begins, ends, or is aborted. PBS also creates output files in addition to your normal program output: stdout goes to filename.ojobnum and stderr to filename.ejobnum.
File management: you will quickly accumulate lots of files. Group output files by job, and include copies of the job script and input files.
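One way to follow the file-management advice above is a small helper that moves a job's .o/.e files into their own directory. The job number and file names are made-up examples following the filename.ojobnum pattern:

```shell
#!/bin/sh
# Group a finished job's files into one directory named after the job number.
cd "$(mktemp -d)"                      # work in a scratch dir for this demo
JOBNUM=123
touch "myjob.o$JOBNUM" "myjob.e$JOBNUM" myjob.pbs   # stand-ins for real files
mkdir -p "job_$JOBNUM"
mv "myjob.o$JOBNUM" "myjob.e$JOBNUM" "job_$JOBNUM/" # stdout/stderr files
cp myjob.pbs "job_$JOBNUM/"            # keep a copy of the script that produced them
ls "job_$JOBNUM"
```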
45 A Sample PBS Batch Submission Script (1)
#!/bin/csh
#
# file: pbs.template
#
# purpose: template for PBS (Portable Batch System) script
#
# remarks: a line beginning with # is a comment;
# a line beginning with #PBS is a pbs command;
# assume (upper/lower) case to be sensitive;
#
# use: submit job with
# qsub pbs.template
#
# job name (default is name of pbs script file)
#PBS -N myjob
#
# resource limits: number of CPUs to be used
#PBS -l ncpus=25
#
# resource limits: amount of memory to be used
#PBS -l mem=213mb
46 A Sample PBS Batch Submission Script (2)
# resource limits: max. wall clock time during which job can be running
#PBS -l walltime=3:20:00
#
# path/filename for standard output
#PBS -o mypath/my.out
#
# path/filename for standard error
#PBS -e mypath/my.err
#
# queue name, one of {submit, special, express}
# The default queue, "submit", need not be specified
#PBS -q submit
#
# send me mail when job begins
#PBS -m b
# send me mail when job ends
#PBS -m e
# send me mail when job aborts (with an error)
#PBS -m a
47 A Sample PBS Batch Submission Script (3)
# export all my environment variables to the job
#PBS -V
# The following reflect the environment where the user ran qsub:
# PBS_O_HOST      The host where you ran the qsub command.
# PBS_O_LOGNAME   Your user ID where you ran qsub.
# PBS_O_HOME      Your home directory where you ran qsub.
# PBS_O_WORKDIR   The working directory where you ran qsub.
#
# These reflect the environment where the job is executing:
# PBS_ENVIRONMENT Set to PBS_BATCH to indicate the job is a batch job, or
#                 to PBS_INTERACTIVE to indicate the job is a PBS interactive job.
# PBS_O_QUEUE     The original queue you submitted to.
# PBS_QUEUE       The queue the job is executing from.
# PBS_JOBID       The job's PBS identifier.
# PBS_JOBNAME     The job's name.
###
48 Ganglia Introduction and Overview
A scalable distributed monitoring system, targeted at monitoring clusters and grids. Uses a multicast-based listen/announce protocol. Depends on open standards:
- XML
- XDR (compact, portable data transport)
- RRDtool (Round Robin Database)
- APR (Apache Portable Runtime)
- Apache HTTPD server
- PHP-based web interface
49 Ganglia Architecture
- gmond: metric-gathering agent installed on individual servers
- gmetad: metric-aggregation agent installed on one or more specific task-oriented servers
- Apache web frontend: metric presentation and analysis server
Ported to various platforms (Linux, FreeBSD, Solaris, others).
50 Ganglia Architecture (diagram): a web client talks to the Apache web frontend, which polls a gmetad; each gmetad polls the gmond agents running on the nodes of each cluster, with failover to alternate sources when a poll fails.
51 Ganglia (1) (screenshot)
52 Ganglia (2) (screenshot)
53 Ganglia (3) (screenshot)
54 Cluster Parallel Computing Libraries
Mathematical:
- FFTW (Fastest Fourier Transform in the West)
- BLAS (Basic Linear Algebra Subprograms)
- PBLAS (Parallel Basic Linear Algebra Subprograms)
- ATLAS (Automatically Tuned Linear Algebra Software)
- SPRNG (Scalable Parallel Random Number Generator)
- MPITB (MPI toolbox for MATLAB)
- LAPACK (Linear Algebra Package)
- MKL (Intel Math Kernel Library)
- ScaLAPACK: a parallel version of LAPACK; a collection of Fortran subroutines built on BLACS (Basic Linear Algebra Communication Subprograms), LAPACK, and the message-passing interface
55 Cluster Parallel Computing Libraries
Molecular dynamics solvers:
- GAMESS (General Atomic and Molecular Electronic Structure System)
- NAMD: parallel, object-oriented molecular dynamics code
- Gromacs (GROningen MAchine for Chemical Simulations): a molecular dynamics simulation package
Quantum chemistry software: Gaussian, Q-Chem
Weather modelling: MM5
56 ArmCluster Architecture (diagram)
57 Accessing the Cluster
Access the cluster head node from your desktop computer using an SSH client program:
- Windows: SSH Secure Shell (SecureCRT), PuTTY, OpenSSH (Cygwin)
- Mac OS X and Linux: OpenSSH
User authentication (login): use your cluster userid and password.
58 Data Transfer
Use the tools provided by the SSH client on your desktop computer: SSH Secure Shell has a graphical user interface, PuTTY has a command-line tool called pscp.exe, and OpenSSH (Cygwin, Linux, Mac OS X) has the command scp. Use sftp to transfer files between the cluster and other Unix machines. You will probably need dos2unix to fix text files after transfer.
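The dos2unix step can be approximated with tr when dos2unix itself is not installed; the snippet below simulates a DOS-format file to show the effect:

```shell
#!/bin/sh
# Windows text files end lines with CRLF (\r\n); Unix tools expect LF (\n).
# Deleting the carriage returns gives the same result as dos2unix for
# plain CRLF files.
cd "$(mktemp -d)"
printf 'line one\r\nline two\r\n' > transferred.txt  # simulate a DOS-format file
tr -d '\r' < transferred.txt > fixed.txt             # strip carriage returns
wc -l < fixed.txt                                    # both lines still present
```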
59 ArmCluster User Policy
Users are allowed to log in remotely from the Internet. All users must use their own user account to log in. The access point (frontend) is used only for logging in, simple editing of program source code, preparing the job-dispatching script, and dispatching jobs to the compute nodes. No foreground or background jobs may be run on it. Dispatching of jobs must be done via the OpenPBS system.
More informationRunning applications on the Cray XC30 4/12/2015
Running applications on the Cray XC30 4/12/2015 1 Running on compute nodes By default, users do not log in and run applications on the compute nodes directly. Instead they launch jobs on compute nodes
More informationBeyond Windows: Using the Linux Servers and the Grid
Beyond Windows: Using the Linux Servers and the Grid Topics Linux Overview How to Login & Remote Access Passwords Staying Up-To-Date Network Drives Server List The Grid Useful Commands Linux Overview Linux
More informationJUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert
Mitglied der Helmholtz-Gemeinschaft JUROPA Linux Cluster An Overview 19 May 2014 Ulrich Detert JuRoPA JuRoPA Jülich Research on Petaflop Architectures Bull, Sun, ParTec, Intel, Mellanox, Novell, FZJ JUROPA
More informationExample of Standard API
16 Example of Standard API System Call Implementation Typically, a number associated with each system call System call interface maintains a table indexed according to these numbers The system call interface
More informationLiving in a mixed world -Interoperability in Windows HPC Server 2008. Steven Newhouse stevenn@microsoft.com
Living in a mixed world -Interoperability in Windows HPC Server 2008 Steven Newhouse stevenn@microsoft.com Overview Scenarios: Mixed Environments Authentication & Authorization File Systems Application
More informationCATS-i : LINUX CLUSTER ADMINISTRATION TOOLS ON THE INTERNET
CATS-i : LINUX CLUSTER ADMINISTRATION TOOLS ON THE INTERNET Jiyeon Kim, Yongkwan Park, Sungjoo Kwon, Jaeyoung Choi {heaven, psiver, lithmoon}@ss.ssu.ac.kr, choi@comp.ssu.ac.kr School of Computing, Soongsil
More informationMOSIX: High performance Linux farm
MOSIX: High performance Linux farm Paolo Mastroserio [mastroserio@na.infn.it] Francesco Maria Taurino [taurino@na.infn.it] Gennaro Tortone [tortone@na.infn.it] Napoli Index overview on Linux farm farm
More informationHow to Install an OSCAR Cluster Quick Start Install Guide Software Version 4.2 Documentation Version 4.2
How to Install an OSCAR Cluster Quick Start Install Guide Software Version 4.2 Documentation Version 4.2 http://oscar.sourceforge.net/ oscar-users@lists.sourceforge.net The Open Cluster Group http://www.openclustergroup.org/
More informationKaspersky Endpoint Security 8 for Linux INSTALLATION GUIDE
Kaspersky Endpoint Security 8 for Linux INSTALLATION GUIDE A P P L I C A T I O N V E R S I O N : 8. 0 Dear User! Thank you for choosing our product. We hope that this documentation will help you in your
More informationSOSFTP Managed File Transfer
Open Source File Transfer SOSFTP Managed File Transfer http://sosftp.sourceforge.net Table of Contents n Introduction to Managed File Transfer n Gaps n Solutions n Architecture and Components n SOSFTP
More informationSTUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu
More informationIntroduction to Running Computations on the High Performance Clusters at the Center for Computational Research
! Introduction to Running Computations on the High Performance Clusters at the Center for Computational Research! Cynthia Cornelius! Center for Computational Research University at Buffalo, SUNY! cdc at
More informationArchitecture and Mode of Operation
Software- und Organisations-Service Open Source Scheduler Architecture and Mode of Operation Software- und Organisations-Service GmbH www.sos-berlin.com Scheduler worldwide Open Source Users and Commercial
More informationOnCommand Performance Manager 1.1
OnCommand Performance Manager 1.1 Installation and Setup Guide For Red Hat Enterprise Linux NetApp, Inc. 495 East Java Drive Sunnyvale, CA 94089 U.S. Telephone: +1 (408) 822-6000 Fax: +1 (408) 822-4501
More informationProgram Grid and HPC5+ workshop
Program Grid and HPC5+ workshop 24-30, Bahman 1391 Tuesday Wednesday 9.00-9.45 9.45-10.30 Break 11.00-11.45 11.45-12.30 Lunch 14.00-17.00 Workshop Rouhani Karimi MosalmanTabar Karimi G+MMT+K Opening IPM_Grid
More informationInstallation Guide. McAfee VirusScan Enterprise for Linux 1.9.0 Software
Installation Guide McAfee VirusScan Enterprise for Linux 1.9.0 Software COPYRIGHT Copyright 2013 McAfee, Inc. Do not copy without permission. TRADEMARK ATTRIBUTIONS McAfee, the McAfee logo, McAfee Active
More information13.1 Backup virtual machines running on VMware ESXi / ESX Server
13 Backup / Restore VMware Virtual Machines Tomahawk Pro This chapter describes how to backup and restore virtual machines running on VMware ESX, ESXi Server or VMware Server 2.0. 13.1 Backup virtual machines
More informationRingStor User Manual. Version 2.1 Last Update on September 17th, 2015. RingStor, Inc. 197 Route 18 South, Ste 3000 East Brunswick, NJ 08816.
RingStor User Manual Version 2.1 Last Update on September 17th, 2015 RingStor, Inc. 197 Route 18 South, Ste 3000 East Brunswick, NJ 08816 Page 1 Table of Contents 1 Overview... 5 1.1 RingStor Data Protection...
More informationHPC system startup manual (version 1.30)
HPC system startup manual (version 1.30) Document change log Issue Date Change 1 12/1/2012 New document 2 10/22/2013 Added the information of supported OS 3 10/22/2013 Changed the example 1 for data download
More informationOracle EXAM - 1Z0-102. Oracle Weblogic Server 11g: System Administration I. Buy Full Product. http://www.examskey.com/1z0-102.html
Oracle EXAM - 1Z0-102 Oracle Weblogic Server 11g: System Administration I Buy Full Product http://www.examskey.com/1z0-102.html Examskey Oracle 1Z0-102 exam demo product is here for you to test the quality
More informationUsing Parallel Computing to Run Multiple Jobs
Beowulf Training Using Parallel Computing to Run Multiple Jobs Jeff Linderoth August 5, 2003 August 5, 2003 Beowulf Training Running Multiple Jobs Slide 1 Outline Introduction to Scheduling Software The
More informationIn order to upload a VM you need to have a VM image in one of the following formats:
What is VM Upload? 1. VM Upload allows you to import your own VM and add it to your environment running on CloudShare. This provides a convenient way to upload VMs and appliances which were already built.
More informationRed Hat System Administration 1(RH124) is Designed for IT Professionals who are new to Linux.
Red Hat Enterprise Linux 7- RH124 Red Hat System Administration I Red Hat System Administration 1(RH124) is Designed for IT Professionals who are new to Linux. This course will actively engage students
More informationNOC PS manual. Copyright Maxnet 2009 2015 All rights reserved. Page 1/45 NOC-PS Manuel EN version 1.3
NOC PS manual Copyright Maxnet 2009 2015 All rights reserved Page 1/45 Table of contents Installation...3 System requirements...3 Network setup...5 Installation under Vmware Vsphere...8 Installation under
More informationHow To Install Linux Titan
Linux Titan Distribution Presented By: Adham Helal Amgad Madkour Ayman El Sayed Emad Zakaria What Is a Linux Distribution? What is a Linux Distribution? The distribution contains groups of packages and
More informationThe Asterope compute cluster
The Asterope compute cluster ÅA has a small cluster named asterope.abo.fi with 8 compute nodes Each node has 2 Intel Xeon X5650 processors (6-core) with a total of 24 GB RAM 2 NVIDIA Tesla M2050 GPGPU
More informationHPC at IU Overview. Abhinav Thota Research Technologies Indiana University
HPC at IU Overview Abhinav Thota Research Technologies Indiana University What is HPC/cyberinfrastructure? Why should you care? Data sizes are growing Need to get to the solution faster Compute power is
More informationIntroduction to SDSC systems and data analytics software packages "
Introduction to SDSC systems and data analytics software packages " Mahidhar Tatineni (mahidhar@sdsc.edu) SDSC Summer Institute August 05, 2013 Getting Started" System Access Logging in Linux/Mac Use available
More informationIntroduction to ACENET Accelerating Discovery with Computational Research May, 2015
Introduction to ACENET Accelerating Discovery with Computational Research May, 2015 What is ACENET? What is ACENET? Shared regional resource for... high-performance computing (HPC) remote collaboration
More informationBatch Scheduling and Resource Management
Batch Scheduling and Resource Management Luke Tierney Department of Statistics & Actuarial Science University of Iowa October 18, 2007 Luke Tierney (U. of Iowa) Batch Scheduling and Resource Management
More informationCopyright 1999-2011 by Parallels Holdings, Ltd. All rights reserved.
Parallels Virtuozzo Containers 4.0 for Linux Readme Copyright 1999-2011 by Parallels Holdings, Ltd. All rights reserved. This document provides the first-priority information on Parallels Virtuozzo Containers
More informationEnterprise Content Management System Monitor. Server Debugging Guide. 20.09.2013 CENIT AG Bettighofer, Stefan
Enterprise Content Management System Monitor Server Debugging Guide 20.09.2013 CENIT AG Bettighofer, Stefan 1 Table of Contents 1 Table of Contents... 2 2 Overview... 3 3 The Server Status View... 3 4
More informationTechnical Specification Data
Equitrac Office 4.1 SOFTWARE SUITE Equitrac Office Software Suite Equitrac Office Suite Equitrac Office Small Business Edition (SBE) Applications Any size network with single or multiple accounting and/or
More informationSystem Software for High Performance Computing. Joe Izraelevitz
System Software for High Performance Computing Joe Izraelevitz Agenda Overview of Supercomputers Blue Gene/Q System LoadLeveler Job Scheduler General Parallel File System HPC at UR What is a Supercomputer?
More informationThis document describes the new features of this release and important changes since the previous one.
Parallels Virtuozzo Containers 4.0 for Linux Release Notes Copyright 1999-2011 by Parallels Holdings, Ltd. All rights reserved. This document describes the new features of this release and important changes
More information24/08/2004. Introductory User Guide
24/08/2004 Introductory User Guide CSAR Introductory User Guide Introduction This material is designed to provide new users with all the information they need to access and use the SGI systems provided
More informationAcronis Backup & Recovery 10 Server for Linux. Installation Guide
Acronis Backup & Recovery 10 Server for Linux Installation Guide Table of Contents 1. Installation of Acronis Backup & Recovery 10... 3 1.1. Acronis Backup & Recovery 10 components... 3 1.1.1. Agent for
More informationPARALLELS SERVER BARE METAL 5.0 README
PARALLELS SERVER BARE METAL 5.0 README 1999-2011 Parallels Holdings, Ltd. and its affiliates. All rights reserved. This document provides the first-priority information on the Parallels Server Bare Metal
More informationInternational Outlook
International Outlook Maurizio Davini Dipartimento di Fisica e INFN Pisa maurizio.davini@df.unipi.it Components Extensions for clustering Configurations,installation and administration Monitoring Batch
More informationThe Penguin in the Pail OSCAR Cluster Installation Tool
The Penguin in the Pail OSCAR Cluster Installation Tool Thomas Naughton and Stephen L. Scott Computer Science and Mathematics Division Oak Ridge National Laboratory Oak Ridge, TN {naughtont, scottsl}@ornl.gov
More informationHigh Performance Computing with Sun Grid Engine on the HPSCC cluster. Fernando J. Pineda
High Performance Computing with Sun Grid Engine on the HPSCC cluster Fernando J. Pineda HPSCC High Performance Scientific Computing Center (HPSCC) " The Johns Hopkins Service Center in the Dept. of Biostatistics
More informationIntroduction to Matlab Distributed Computing Server (MDCS) Dan Mazur and Pier-Luc St-Onge guillimin@calculquebec.ca December 1st, 2015
Introduction to Matlab Distributed Computing Server (MDCS) Dan Mazur and Pier-Luc St-Onge guillimin@calculquebec.ca December 1st, 2015 1 Partners and sponsors 2 Exercise 0: Login and Setup Ubuntu login:
More informationCONNECTING TO DEPARTMENT OF COMPUTER SCIENCE SERVERS BOTH FROM ON AND OFF CAMPUS USING TUNNELING, PuTTY, AND VNC Client Utilities
CONNECTING TO DEPARTMENT OF COMPUTER SCIENCE SERVERS BOTH FROM ON AND OFF CAMPUS USING TUNNELING, PuTTY, AND VNC Client Utilities DNS name: turing.cs.montclair.edu -This server is the Departmental Server
More informationFast Setup and Integration of ABAQUS on HPC Linux Cluster and the Study of Its Scalability
Fast Setup and Integration of ABAQUS on HPC Linux Cluster and the Study of Its Scalability Betty Huang, Jeff Williams, Richard Xu Baker Hughes Incorporated Abstract: High-performance computing (HPC), the
More informationRelease Notes for McAfee(R) VirusScan(R) Enterprise for Linux Version 1.9.0 Copyright (C) 2014 McAfee, Inc. All Rights Reserved.
Release Notes for McAfee(R) VirusScan(R) Enterprise for Linux Version 1.9.0 Copyright (C) 2014 McAfee, Inc. All Rights Reserved. Release date: August 28, 2014 This build was developed and tested on: -
More information